PRINCIPLES OF LANGUAGE ASSESSMENT
Dr.VMS
Brown’s Model
COMPONENTS OF LANGUAGE ASSESSMENT
1. Practicality,
2. Reliability,
3. Validity,
4. Authenticity and
5. Washback
1. PRACTICALITY
 An effective test is practical. This means that it
 Is not excessively expensive,
 Stays within appropriate time constraints,
 Is relatively easy to administer, and
 Has a scoring/evaluation procedure that is
specific and time-efficient.
PRACTICALITY
 A test that is prohibitively expensive is
impractical. A test of language proficiency that
takes a student five hours to complete is
impractical-it consumes more time (and money)
than necessary to accomplish its objective. A test
that requires individual one-on-one proctoring is
impractical for a group of several hundred test-
takers and only a handful of examiners. A test
that takes a few minutes for a student to take
and several hours for an examiner to evaluate
is impractical for most classroom situations.
2. RELIABILITY
 A reliable test is consistent and dependable. If
you give the same test to the same student or
matched students on two different occasions, the
test should yield similar results. The issue of
reliability of a test may best be addressed by
considering a number of factors that may
contribute to the unreliability of a test. Consider
the following possibilities (adapted from Mousavi,
2002, p. 804): fluctuations in the student, in
scoring, in test administration, and in the test
itself.
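This consistency can be quantified. As a rough illustration (not part of the original slides, with invented scores), the sketch below estimates test-retest reliability as the Pearson correlation between two administrations of the same test to the same students; values near 1.0 suggest a dependable test, while low values signal unreliability.

    # Illustrative sketch: test-retest reliability as a Pearson correlation.
    # The score lists are hypothetical, for demonstration only.
    from math import sqrt

    def pearson(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sqrt(sum((x - mx) ** 2 for x in xs))
        sy = sqrt(sum((y - my) ** 2 for y in ys))
        return cov / (sx * sy)

    first_sitting  = [78, 85, 62, 90, 71, 55]   # same six students,
    second_sitting = [80, 83, 65, 88, 69, 58]   # two occasions
    print(f"test-retest reliability: {pearson(first_sitting, second_sitting):.2f}")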
2.1 STUDENT-RELATED RELIABILITY
 The most common learner-related issue in
reliability is caused by temporary illness,
fatigue, a “bad day,” anxiety, and other
physical or psychological factors, which may
make an “observed” score deviate from one’s
“true” score. Also included in this category are
such factors as a test-taker’s “test-wiseness” or
strategies for efficient test taking (Mousavi,
2002, p. 804).
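The "observed" versus "true" score distinction can be pictured in classical test theory terms (an aside added here, not from the slides): each observed score is a true score plus an error term produced by factors such as fatigue or anxiety. The sketch below, with invented numbers, shows how a larger error component pushes observed scores further from true scores.

    # Illustrative sketch of observed = true score + error.
    # True scores and error spreads are hypothetical.
    import random

    random.seed(0)
    true_scores = [80, 65, 90, 72]

    for label, error_sd in [("calm testing day", 2), ("anxious or fatigued day", 10)]:
        observed = [round(t + random.gauss(0, error_sd)) for t in true_scores]
        print(label, "-> observed:", observed, "| true:", true_scores)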
2.2 RATER RELIABILITY
 Human error, subjectivity, and bias may enter into the
scoring process. Inter-rater unreliability occurs when two
or more scorers yield inconsistent scores of the same
test, possibly for lack of attention to scoring criteria,
inexperience, inattention, or even preconceived biases.
In Brown's placement test example, the initial
scoring plan for the dictations was found to be
unreliable-that is, the two scorers were not applying
the same standards.
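One rough way to check whether two scorers are applying the same standards (an illustration added here, not part of the slides, with hypothetical ratings) is to compare their marks on the same set of scripts, for example through their rate of exact agreement and the average gap between their scores.

    # Illustrative inter-rater check on the same five essays (invented ratings).
    rater_a = [4, 3, 5, 2, 4]
    rater_b = [4, 2, 5, 2, 3]

    exact_agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
    print(f"exact agreement: {exact_agreement:.0%}")   # 60% with these ratings

    # Large average gaps suggest the scoring criteria are being
    # interpreted differently and the rubric needs clarification.
    mean_gap = sum(abs(a - b) for a, b in zip(rater_a, rater_b)) / len(rater_a)
    print(f"mean score gap: {mean_gap:.1f} band(s)")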
2.3 TEST ADMINISTRATION RELIABILITY
 Unreliability may also result from the conditions in
which the test is administered. I once witnessed the
administration of a test of aural comprehension in
which a tape recorder played items for comprehension,
but because of street noise outside the building,
students sitting next to windows could not hear the
tape accurately. This was a clear case of unreliability
caused by the conditions of the test administration.
Other sources of unreliability are found in
photocopying variations, the amount of light in
different parts of the room, variations in temperature,
and even the condition of desks and chairs.
2.4 TEST RELIABILITY
 Sometimes the nature of the test itself can cause
measurement errors. If a test is too long, test-takers
may become fatigued by the time they reach the later
items and hastily respond incorrectly. Timed tests may
discriminate against students who do not perform well
on a test with a time limit. We all know people (and
you may be included in this category!) who “know” the
course material perfectly but who are adversely
affected by the presence of a clock ticking away. Poorly
written test items (that are ambiguous or that have
more than one correct answer) may be a further source
of test unreliability.
3. VALIDITY
 By far the most complex criterion of an effective test-and
arguably the most important principle-is validity, “the extent
to which inferences made from assessment results are
appropriate, meaningful, and useful in terms of the purpose of
the assessment” (Gronlund, 1998, p. 226). A valid test of reading
ability actually measures reading ability-not 20/20 vision, nor
previous knowledge in a subject, nor some other variable of
questionable relevance. To measure writing ability, one might
ask students to write as many words as they can in 15
minutes, then simply count the words for the final score. Such
a test would be easy to administer (practical), and the scoring
quite dependable (reliable). But it would not constitute a valid
test of writing ability without some consideration of
comprehensibility, rhetorical discourse elements, and the
organization of ideas, among other factors.
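To make the word-count example concrete (a toy sketch, not from the original text), the scorer below is perfectly practical and reliable, since any two raters running it get the same number, yet it ignores comprehensibility and organization entirely, which is exactly why it is not a valid measure of writing ability.

    # Toy "writing test" scorer: practical and reliable, but not valid.
    def word_count_score(essay: str) -> int:
        # Counts words only; says nothing about rhetoric or coherence.
        return len(essay.split())

    sample = "The the the cat cat sat sat sat on on mat mat mat mat"
    print(word_count_score(sample))  # scores 14 despite being incoherent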
3.1 CONTENT-RELATED EVIDENCE
 If a test actually samples the subject matter
about which conclusions are to be drawn, and if it
requires the test-takers to perform the behavior
that is being measured, it can claim content-
related evidence of validity, often popularly
referred to as content validity (e.g., Mousavi,
2002; Hughes, 2003). You can usually identify
content-related evidence observationally if you
can clearly define the achievement that you are
measuring.
3.2 CRITERION-RELATED EVIDENCE
 A second form of evidence of the validity of a test may be
found in what is called criterion-related evidence, also
referred to as criterion-related validity, or the extent
to which the “criterion” of the test has actually been
reached. Brown notes that most classroom-based
assessment with teacher-designed tests fits the
concept of criterion-referenced
assessment. In such tests, specified classroom
objectives are measured, and implied predetermined
levels of performance are expected to be reached (80
percent is considered a minimal passing grade).
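As a small illustration of criterion-referenced scoring (hypothetical objectives, with the 80 percent cut-off mentioned above), each classroom objective is checked against the predetermined performance level rather than against other students' results:

    # Criterion-referenced check: compare each objective to a preset criterion.
    # Objective names and scores are hypothetical.
    CRITERION = 0.80  # minimal passing level cited in the slide

    objective_scores = {
        "past tense narration": 0.85,
        "listening for gist": 0.72,
        "requesting politely": 0.90,
    }

    for objective, score in objective_scores.items():
        status = "criterion reached" if score >= CRITERION else "criterion not reached"
        print(f"{objective}: {score:.0%} -> {status}")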
3.3 CONSTRUCT-RELATED EVIDENCE
 A third kind of evidence that can support
validity, but one that does not play as large a role
for classroom teachers, is construct-related validity,
commonly referred to as construct validity. A
construct is any theory, hypothesis, or model that
attempts to explain observed phenomena in our
universe of perceptions. Constructs may or may
not be directly or empirically measured-their
verification often requires inferential data.
3.4 CONSEQUENTIAL VALIDITY
 As well as the above three widely accepted forms of
evidence that may be introduced to support the validity of
an assessment, two other categories may be of some interest
and utility in your own quest for validating classroom tests.
Messick (1989), Gronlund (1998), McNamara (2000), and
Brindley (2001), among others, underscore the potential
importance of the consequences of using an assessment.
Consequential validity encompasses all the consequences of
a test, including such considerations as its accuracy in
measuring intended criteria, its impact on the preparation
of test-takers, its effect on the learner, and the (intended
and unintended) social consequences of a test’s
interpretation and use.
3.5 FACE VALIDITY
 An important facet of consequential validity is the
extent to which “students view the assessment as fair,
relevant, and useful for improving learning”
(Gronlund, 1998, p. 210), or what is popularly known
as face validity. “Face validity refers to the degree to
which a test looks right, and appears to measure the
knowledge or abilities it claims to measure, based on
the subjective judgment of the examinees who take it,
the administrative personnel who decide on its use,
and other psychometrically unsophisticated observers”
(Mousavi, 2002, p. 244).
4. AUTHENTICITY
 A fourth major principle of language testing is
authenticity, a concept that is a little slippery to
define, especially within the art and science of
evaluating and designing tests. Bachman and
Palmer (1996, p. 23) define authenticity as “the
degree of correspondence of the characteristics of
a given language test task to the features of a
target language task,” and then suggest an
agenda for identifying those target language
tasks and for transforming them into valid test
items.
5. WASHBACK
 A facet of consequential validity, discussed above, is “the
effect of testing on teaching and learning” (Hughes, 2003, p.
1), otherwise known among language-testing specialists as
washback. In large-scale assessment, washback generally
refers to the effects that tests have on instruction in terms of
how students prepare for the test. “Cram” courses and
“teaching to the test” are examples of such washback.
Another form of washback that occurs more in classroom
assessment is the information that “washes back” to
students in the form of useful diagnoses of strengths and
weaknesses. Washback also includes the effects of an
assessment on teaching and learning prior to the
assessment itself, that is, on preparation for the
assessment.
5.1 WASHBACK/BACKWASH
 The term washback is commonly used in applied
linguistics, but it is rarely found in dictionaries.
 However, the word backwash can be found in certain
dictionaries and it is defined as “an effect that is not
the direct result of something” by Cambridge
Advanced Learner’s Dictionary.
 In dealing with principles of language assessment,
these two words can be used more or less interchangeably.
 Washback (Brown, 2004) or Backwash (Heaton, 1990)
refers to the influence of testing on teaching and
learning.
 The influence itself can be positive or
negative (Cheng et al. (Eds.), 2008:7-11)
5.2 POSITIVE WASHBACK
 Positive washback has a beneficial influence on
teaching and learning. It means teachers and
students have a positive attitude toward the
examination or test, and work willingly and
collaboratively towards its objective (Cheng &
Curtis, 2008:10).
 A good test should have a good effect.
5.3 NEGATIVE WASHBACK
 Negative washback has no beneficial influence
on teaching and learning (Cheng and
Curtis, 2008:9).
 Tests that have negative washback are
considered to have a negative influence on teaching
and learning.
Thank You