Week 4: Item types
Objective and subjective testing
Objective test: Common test types
Subjective test: Common test types
Where to start?
Look at the specifications
Consult past papers/inventory of the
content of previous tests
Find appropriate texts
Objective vs subjective testing
Objective and subjective are terms used to refer to the
scoring of tests.
Objective tests usually have only one correct answer, so they can be
scored mechanically.
Ease of marking (e.g. by computer) is one important
reason for their popularity among examining bodies
responsible for testing large numbers of candidates.
Reliability is somewhat easier to achieve in the marking
of objective test items.
The question of how valid the items are may be of
considerable concern.
Eg. How far do items like this reflect the real use of language in
everyday life?
Complete the sentences by putting the best word in each blank.
‘Is your home still in Cairo?’
‘Yes, I’ve been living here ……….. 1986.’
A. for B. on C. in D. at E. since
Language does not function in this way in real-life situations. The
item tests students’ knowledge of language forms and how
language works rather than their ability to respond
appropriately to real questions (i.e. it tests grammar rather than
communication).
On the whole, objective tests require far more careful
preparation than subjective tests (for a subjective test, examiners
tend to spend relatively little time setting the questions but
considerable time marking).
In an objective test, testers spend a great deal of time
constructing the items as carefully as possible (i.e. attempting
to anticipate the various reactions of the test-takers at each
stage), but they are rewarded with ease of marking.
Obviously, objective testing has both strengths and
weaknesses.
An objective test will be a very poor test if:
➢ The test items are poorly written
➢ Irrelevant areas and skills are emphasised in the test simply because they are
‘testable’, and
➢ It is confined to language-based usage and neglects the communicative skills involved.
It should never be claimed that objective tests can do those
tasks which they are not intended to do.
They can never test the ability to communicate in the target
language (TL), nor can they evaluate actual performance.
A good classroom test will usually contain both subjective
and objective test items.
Item Types
The method used for testing a language ability may affect the
student’s score. This is called the ‘method effect’, and it should
be reduced as much as possible.
At the moment, understanding of the test method effect is
still so rudimentary that it is not possible to recommend
particular methods for testing particular language abilities.
In general, the more different methods a test employs, the
more confidence we can have that the test is not biased
towards one particular method or to one particular sort of
learner.
Problems with particular Item types
General Problems – problems which apply to all test types
1. What an item is actually testing: an item may test something
which was not intended.
For example, this item is supposed to test spelling:
Rearrange the following letters to make English words
RUFTI RSOEH MSAPT
TOLSO RIEWT PAHYP
This item may test spelling, but it also tests intelligence, the ability to
do anagrams, and vocabulary.
It is common in high-level proficiency tests for intelligence to be tested
as well as, or instead of, language.
Similarly, background knowledge is frequently tested instead of reading.
2. Each item should be independent of the others: success on one
item should not depend on success on another.
3. Instructions for all items must be clear. Students often
fail a test or an item not because their language is poor,
but because they do not understand what they are meant
to do.
Objective test:
Item format: Classifications
Item types
Free response items: Completion, Short response
Fixed response items: True/False item, MCQ, Matching
MCQ Item
Choose the best answer.
Stem: Don’t _______ during assembly.
Options: A. talks B. talk C. talked D. talking
Key: B (talk)
Distractors: A, C and D
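The parts of an MCQ item named above (stem, options, key, distractors) can also be written down as a small data structure. The following is only a minimal Python sketch: the class name MCQItem and its field names are illustrative assumptions, not part of any testing library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MCQItem:
    """Hypothetical container for one multiple-choice item."""
    stem: str                # the sentence or question containing the gap
    options: dict[str, str]  # option letter -> option text
    key: str                 # letter of the one correct option

    def __post_init__(self) -> None:
        # An item whose key is not among its options cannot be marked.
        if self.key not in self.options:
            raise ValueError(f"key {self.key!r} is not one of the options")

    @property
    def distractors(self) -> dict[str, str]:
        """Every option except the key."""
        return {letter: text for letter, text in self.options.items()
                if letter != self.key}

    def mark(self, response: str) -> int:
        """Return 1 if the response is the key, otherwise 0."""
        return int(response == self.key)


item = MCQItem(
    stem="Don't _______ during assembly.",
    options={"A": "talks", "B": "talk", "C": "talked", "D": "talking"},
    key="B",
)
print(sorted(item.distractors))         # ['A', 'C', 'D']
print(item.mark("B"), item.mark("D"))   # 1 0
```

Storing a single key per item makes the scoring mechanical, which is exactly why the key must be the only defensible answer.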
A. Objective test types
1. Multiple-choice
Most important requirement: the ‘correct’ answer
must be genuinely correct. Avoid dubious
answers.
Eg. Which is the odd one out?
A. rabbit B. hare C. bunny D. deer
The test writer might have planned that D was the odd one out,
but good language learners might choose C because
‘bunny’ is baby language.
Item writers must ensure that if the answer key
gives just one correct answer, then there is only
one correct answer
Eg. “Why hasn’t your mother come?”
“Well, she said she ___________ leave the baby.”
A. can’t B. won’t C. couldn’t D. mayn’t
According to the textbook, C is the correct answer because of the
rules of reported speech. However, many of the native speakers
to whom the item was presented said that A and B were
perfectly acceptable.
Multiple-choice items should be presented in context.
Eg. Select the option closest in meaning to the underlined word:
Come back soon. (underlined word: soon)
A. shortly B. later C. today D. tomorrow
The lack of context makes it unclear whether B is
really wrong.
It would be clearer as follows:
Fill in the blank with the most suitable option:
Visitor: Thank you very much for such a
wonderful visit.
Hostess: We were so glad you could come.
Come back _______.
A. soon B. later C. today D. tomorrow
Each option should fit equally well into the
stem
Eg. Someone who designs houses is a ___________.
A. designer B. builder C. architect D. plumber
The correct answer is C, but it does not fit into the stem: ‘architect’ needs ‘an’, not ‘a’.
Some items do not test what they are
intended to test.
Eg. (After a text about trees)
Who gets food from trees?
A. Only man B. Only animals C. Man and animals
Whatever the text says, it is surely common knowledge that both
humans and animals get food from trees.
Other objective-type items
2. Dichotomous items
True/False or Yes/No items are generally
unsatisfactory because there is a 50% possibility of
getting any item right by chance alone.
In order to learn anything about a student’s ability, it is
necessary to have a large number of such items in order to
discount the effects of chance.
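The effect of chance on dichotomous items can be worked out with the binomial distribution. The sketch below assumes a candidate who guesses blindly on every item and a pass mark of 60%; both the function name prob_at_least and the pass mark are assumptions made for illustration.

```python
from math import comb


def prob_at_least(correct_needed: int, n_items: int, p_guess: float = 0.5) -> float:
    """Probability of getting at least `correct_needed` of `n_items`
    right when every answer is a blind guess with success rate `p_guess`."""
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess) ** (n_items - k)
        for k in range(correct_needed, n_items + 1)
    )


# Chance of reaching a (hypothetical) 60% pass mark by guessing alone.
for n in (5, 10, 50):
    needed = (6 * n + 9) // 10  # smallest whole number of items >= 60% of n
    print(n, needed, round(prob_at_least(needed, n), 3))
```

With 5 True/False items a pure guesser reaches 60% half of the time; with 10 items the chance is about 0.377, and with 50 items it is lower again. This is why a large number of such items is needed before the score says anything about ability.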
3. Matching
Items where students are given a list of possible answers which
they have to match with some other list of words, phrases,
sentences, paragraphs or visual clues.
Eg. Match the four words on the left with those on the right to
make other English words.
1. car A. room
2. cup B. pet
3. bed C. dress
4. night D. board
Disadvantage – once three of the items have been accurately
matched, the fourth pair is correct by default.
Good practice – to give more alternatives than the matching
task requires.
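The benefit of giving extra alternatives can be shown with a little counting. This sketch assumes that each alternative may be used at most once and that the candidate guesses blindly; the function names are hypothetical.

```python
from math import perm


def chance_all_correct(n_prompts: int, n_options: int) -> float:
    """Chance of matching every prompt correctly by blind guessing,
    when each option can be used at most once."""
    return 1 / perm(n_options, n_prompts)


def options_left_for_last(n_prompts: int, n_options: int) -> int:
    """Options still unused when only one prompt remains to be matched."""
    return n_options - (n_prompts - 1)


# Four prompts with four options, then four prompts with six options.
for n_options in (4, 6):
    print(
        n_options,
        options_left_for_last(4, n_options),
        chance_all_correct(4, n_options),
    )
```

With four options the last pair has only one option left, so it is correct by default; with six options the candidate still has to choose among three.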
Advantages of Objective tests
High reliability
Can test a wider range of topics, skills, elements of language
Students who are less proficient can answer objective items
without too much difficulty
Objective tests are easily administered and scored
Objective tests are useful for summative testing, as they can cover a
wide range of topics, skills and elements of language within a
reasonable amount of time
Objective test items can easily be subjected to item analysis
Disadvantages of Objective tests
Difficult to write items that measure higher cognitive
abilities, such as analysis, synthesis, and evaluation
Writing good items takes considerable time and cost
Encourage guessing by weak students
Objective tests are less useful for formative testing
Do not lend themselves to tests of language
proficiency, especially in organising ideas, cohesion,
coherence and connected discourse
Subjective Tests
Structured Question
1. Define ‘validity’ in one sentence.
2. Give two advantages and disadvantages of objective
tests.
Essays:
1. Discuss the strengths and weaknesses of subjective
tests.
2. Why is it important to learn English?
4. Information Transfer
Used mostly in reading and listening comprehension tasks.
Candidates have to transfer material from the text onto a
chart, table, form or map.
Such tasks resemble real-life activities and are therefore much used
in test batteries which include authentic tasks.
Answers can be objectively marked (if they consist of just
names and numbers) or subjectively marked (if they take the form
of phrases and short sentences).
Main problem: the task can be complicated. Sometimes
candidates spend so much time working out what should go
where in a table that they do not manage to solve what is
linguistically an easy problem.
Another problem: the task may be culturally or cognitively biased.
For example, a candidate might be asked to listen to a
description of someone’s journey through a town and to
mark the route on a map. Students who are unfamiliar with maps or
not good at map-reading are at a disadvantage.
5. Ordering Tasks
Candidates are asked to put a group of words, phrases, sentences or
paragraphs in order.
Such tasks are typically used to test simple or complex grammar,
reference and cohesion, or reading comprehension.
Ordering tasks are difficult to construct because it is not easy to
provide words or phrases which only make sense in one order.
Eg. Put the following words in order to complete the sentence.
She gave__________________________________
book her yesterday mother the to
6. Editing
Often consists of sentences or passages in which errors
have been introduced which the candidate has to identify.
Can take the form of MCQ or can be more open
A common method – ask students to identify one error
in each line of a text, either by marking the text, or by
writing a correction beside each appropriate line.
The main difficulty with this kind of item is to make sure
that there is only one mistake per line.
7. Gap-filling
Refers to tests in which the candidate is given a short passage in which
some words or phrases have been deleted. The candidate’s task is
to restore the missing words.
The deletions have been specially selected by the test writer to test
chosen aspects of language such as grammar or reading
comprehension.
Main problem – ensuring each gap leads students to write the
expected word.
Another problem: candidates may not be able to think of an
answer, not because their language is poor but because the word
simply does not spring to mind.
8. Cloze
Refers to tests in which words are deleted mechanically. Each
nth word is deleted regardless of what the function of that
word is.
One problem with nth word deletion tasks – the choice of
the first deletion can have an effect on the validity of the test.
Some versions of the test, e.g. those with a high proportion of
function words deleted, may be fairly easy for competent language
users to restore, whereas other versions may have lost a high
proportion of content words, which may prove to be
irretrievable even for native speakers.
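The influence of the first deletion can be seen by generating two versions of the same passage. The following is a minimal sketch of nth-word deletion; the function name make_cloze, the sample passage and the very small value of n are assumptions chosen to keep the example short, and punctuation is not handled.

```python
def make_cloze(text: str, n: int, start: int) -> tuple[str, list[str]]:
    """Delete every n-th word, beginning with the word at index `start`
    (0-based). Returns the gapped text and the list of deleted words."""
    words = text.split()
    answers = []
    for i in range(start, len(words), n):
        answers.append(words[i])
        words[i] = "_____"
    return " ".join(words), answers


# Hypothetical passage; only the position of the first deletion changes.
passage = ("Each candidate reads the passage and restores "
           "the words that were removed")
for start in (0, 1):
    gapped, answers = make_cloze(passage, n=2, start=start)
    print(start, answers)
    print(gapped)
```

Starting at the first word removes mostly content words (reads, passage, restores, words), while starting at the second word removes mostly function words (the, and, the, that), even though the passage and the deletion rate are identical.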
Another disadvantage is that an nth-word deletion cloze test is not
easily amended. If the tester decides to reinstate a difficult word and
delete another nearby, then the principle of nth-word deletion is
being flouted.
Marking cloze tests can be difficult - there may be many possible
answers for any one gap, and there is often disagreement as to what
answers are acceptable.
Unless the aim of the cloze test is to test overall language proficiency
(as advocated by Oller, 1979), such tests may be a wasteful way of
testing, because few of the items in any one passage may test the aspects
of language with which the tester is concerned.
9. C-test
Involves mechanical deletion, but this time it is every second
word which is mutilated, and half of each mutilated word
remains in the text in order to give the candidate a clue as to
what is missing.
Suffers from the same disadvantages as cloze and gap-filling.
See Sample
Sample of C-test
A C-test is a type of language test in which the students read
a brief paragraph in the target language. The first two
sentences are left intact.
There_ _ _ _ , every ot_ _ _ word i_ printed int_ _ _ , but
f_ _ each alte_ _ _ _ _ word, on_ _ the fi_ _ _ half o_ the
wo_ _ is wri_ _ _ , and t_ _ second ha_ _ is indi_ _ _ _ _ by a
bl_ _ _ space repre_ _ _ _ _ _ _ each let_ _ . T_ _ students’
abi_ _ _ _ to fi_ _ in t_ _ blank spa_ _ is
tho_ _ _ _ to b_ a mea_ _ _ _ of th_ _ _ language profi_ _ _ _ _ _.
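The mutilation rule used in a C-test can be sketched in a few lines. This is an illustration only: it assumes that the first half of a word is kept, that the extra letter of an odd-length word is deleted, and that the input contains no punctuation. The function names are hypothetical, and the intact lead-in sentences are not handled.

```python
def mutilate(word: str) -> str:
    """Keep the first half of a word and replace the rest with blanks.
    Assumption: for odd-length words the extra letter is deleted."""
    keep = len(word) // 2
    return word[:keep] + "_" * (len(word) - keep)


def make_c_test(sentence: str) -> str:
    """Mutilate every second word, starting with the second word."""
    words = sentence.split()
    return " ".join(
        mutilate(word) if index % 2 == 1 else word
        for index, word in enumerate(words)
    )


print(make_c_test("every other word is printed intact"))
# every ot___ word i_ printed int___
```

Each underscore stands for one deleted letter, so the candidate can see how long the missing part of the word is.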
10. Dictation
Can only be fair to students if it is presented in the same way to
all the students, i.e. by having the material on tape.
Difficult to mark objectively.
Marking dictation is time-consuming and boring.
An alternative to the dictation test: do not ask students to write
down the words verbatim, but to write down the main points,
as in note-taking.
This is a more authentic listening task than most traditional
dictations, but it gives rise to problems in marking.
11. Short-answer Questions (SAQ)
SAQ are items that are open-ended, where candidates have to
think up the answer for themselves.
Answers may range from a word or phrase to one or two
sentences.
An important point to remember when designing short-
answer questions is that the candidates must know what is
expected of them.
For example, in the following item it is not at all clear what
is wanted:
Rewrite the following sentence, starting with the words provided. The new
sentence must be as close as possible to the original.
It was John who saved my life.
If it _____________________________________
Reading and listening comprehension can be tested using short-answer
questions.
The answers can be revealing – as they often show textual
misunderstandings which would never have occurred to the test
writer.
However, the marking of such items is often very difficult –
since there are many ways of saying the same thing, and many
acceptable alternative answers, some of which may not have
been anticipated by the item writer.
B. Subjectively marked Tests
1. Compositions and Essays
Writing the prompts for written compositions seems much easier than
writing MCQ items.
Eg. “Travel broadens the mind.” (J. Smith) Discuss
There are many disadvantages with this kind of task:
Terminology – candidates may be unfamiliar with the conventions behind
the technical use of the word ‘discuss’, and so will not know what is
expected.
The instructions lack information that the candidates need if they are to be
able to do justice to themselves.
Candidates need to know how long the essay should be and whether marks
will be deducted if it is too short.
Audience: students need to know for whom the essay is to be written,
so that they can choose between a colloquial and an academic style.
Candidates also need to know how the essay is marked. Are marks
awarded for the structure of the essay and the ability to present a
good argument, or solely for the use of grammar and vocabulary?
Candidates need to know all these things in order to decide
whether to use easy, well-known structures so as not to be
penalized for errors, or whether to take risks because extra
marks are awarded for the use of complex and creative language
It would be better if it was presented in the following way:
Write a formal essay for your English teacher saying whether you agree
with the saying, “Travel broadens the mind”.
You should write about 200 to 250 words.
Marks will be awarded for:
structure of the essay and use of paragraphs (20%)
appropriacy of style (20%)
clarity of argument (20%)
range of grammar and vocabulary (20%)
accuracy of grammar and vocabulary (20%)
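A marking scheme of this kind turns directly into a weighted total. In the sketch below the weights come from the scheme above, but the 0 to 5 band scale, the criterion keys and the function name essay_mark are assumptions added for illustration.

```python
# Weights from the marking scheme above, as percentages of the total mark.
WEIGHTS = {
    "structure": 20,
    "style": 20,
    "argument": 20,
    "range": 20,
    "accuracy": 20,
}


def essay_mark(bands: dict[str, int], max_band: int = 5) -> float:
    """Convert one band per criterion (0..max_band, an assumed scale)
    into a percentage mark using the weights above."""
    if set(bands) != set(WEIGHTS):
        raise ValueError("exactly one band is needed for every criterion")
    return sum(WEIGHTS[name] * bands[name] / max_band for name in WEIGHTS)


# Hypothetical script: strong structure and argument, weaker accuracy.
print(essay_mark(
    {"structure": 4, "style": 3, "argument": 4, "range": 3, "accuracy": 2}
))  # 64.0
```

Publishing the weights tells candidates how much a risk with complex language can gain or cost them, and it gives markers a common basis for awarding marks.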
2. Summaries
Often used to test reading or listening comprehension and
writing skills.
Commonly used for an integrated test of comprehension and
writing.
Summary writing closely replicates many real-life activities,
but there are two main problems:
If a candidate writes a poor summary, it may be impossible
to know whether this is because of poor comprehension or
poor writing skills.
Marking a summary is not easy.
3. Oral Interviews
Need to be carefully structured so that the aspects of the test
which are considered important are covered with each student,
and each student is tested in a similar way.
It is not fair if some students are only required to make simple but
appropriate comments, while others are forced to use complex language
which betrays their inadequacies.
Interviewers need to be trained to put candidates at their ease, to
get a genuine conversation going without saying much themselves,
to manage to appear interested in each interview and to know
how to ask questions which will elicit the language required.
Advantages of Subjective tests
Easier to construct. Less time needed to construct.
Useful to test higher cognitive abilities
Reduce guessing and copying since answers are not provided
Useful in identifying students’ weaknesses in writing and
speaking
Disadvantages
Limited topics/areas can be tested, thereby encouraging
‘spotting’
Subjective tests have lower reliability and validity, as
only limited areas/topics can be tested
Take a long time to score
Labour intensive
Examiners may be affected by ‘halo’ and ‘horn’ effects,
especially when the identity of the students is known
Difficult to come up with a good marking scheme