FUNDAMENTAL CONCEPTS AND PRINCIPLES IN LANGUAGE TESTING
Subject: Language Testing
Instructor: Nguyễn Thanh Tùng, Ph.D.
Class: TESOL 2014B
1. Phạm Phúc Khánh Minh
2. Nguyễn Trần Hoài Phương
3. Nguyễn Ngọc Phương Thành
4. Võ Thị Thanh Thư
5. Đỗ Thị Bạch Vân
6. Ngô Thảo Vy
CONTENTS
1. The importance of testing
2. Distinctions among test, evaluation and measurement
3. Qualities of a language test
1. The importance of testing
1.1. The relationship of testing and teaching
“Testing and teaching are so closely interrelated that it is impossible to work in either field without being constantly concerned with the other.” (Heaton, 1988)
Good tests of grammar, translation or language manipulation
Good communicative tests of language
1.2. The elements of a good classroom test
A good test should:
- enable teachers to increase their effectiveness by making adjustments in their teaching
- help to locate the precise areas of difficulty encountered by the class or by the individual student
- enable the teacher to ascertain which parts of the language programme have been found difficult by the class
- provide the students with an opportunity to show their ability to perform a certain task
1.3. Aspects to be tested
What should be tested?
Four skills in communicating: listening, speaking, reading, and writing
The language areas learnt: grammar and usage, vocabulary, and phonology
Language elements: nouns, verbs, adjectives, and so on
1.4. Testing the language skills
It is important to concentrate on types of test items which are relevant to the ability to use language for real-life communication, especially in oral interaction.
Ways of assessing performance in the four major skills may take the form of tests of:
- listening (auditory) comprehension (short utterances, dialogues, talks and lectures are given to the learners);
- speaking ability, usually in the form of an interview, a picture description, a role play, or a problem-solving task involving pair work or group work;
- reading comprehension (questions are set to test the students' ability to understand the gist of a text and to extract key information on specific points in the text); and
- writing ability, usually in the form of letters, reports, memos, messages, instructions, accounts of past events, etc.
It is the test constructor's task to assess the relative importance of these skills at the various levels and to devise an accurate means of measuring the student's success in developing these skills.
1.5. Testing language areas
In an attempt to isolate the language areas learnt, a considerable number of tests include sections on:
- Grammar and usage
- Vocabulary (concerned with word meanings, word formation and collocations)
- Phonology (concerned with phonemes, stress and intonation)
Grammar and usage
• to measure students' ability to recognize appropriate grammatical forms and to manipulate structures
Vocabulary
• to measure students' knowledge of the meaning of words and the patterns and collocations in which they occur
• may test their active or their passive vocabulary
Phonology
• might attempt to assess three sub-skills: the ability to recognise and pronounce the significant sound contrasts, the ability to recognise and use the stress patterns, and the ability to hear and produce the melody or patterns of the tunes (i.e. the rise and fall of the voice)
1.6. Language skills and language elements
Whether to test students' ability to handle the elements of the language or to test the integrated skills depends on both the level and the purpose of the test.
At all levels but the most elementary, it is generally advisable to include test items which measure the ability to communicate in the target language.
1.7. Main item types of tests
Recognition: to test the recognition of correct words and forms
Example: Choose the correct answer and write A, B, C or D.
I've been standing here ___ half an hour.
A. since B. during C. while D. for
Production: to test if students can produce the correct answer
Example: Complete each blank with the correct word.
I've been standing here ___ half an hour.
1.8. Sampling problems and avoiding traps
The test must cover an adequate and representative section of those areas and skills which it is desired to test.
A good test should never be constructed in such a way as to trap the students into giving an incorrect answer.
2. Distinctions among test, evaluation and measurement
- The three terms are often used synonymously
- For example: giving a test to evaluate students’ language proficiency
- Distinguishing among them is essential to the development and use of language tests
2.1. Measurement
The process of quantifying the characteristics of persons according to explicit procedures and rules
Features:
- Quantification
- Characteristics
- Rules and procedures
2.1.1. Quantification
- Assigning numbers
- Differs from qualitative descriptions such as visual presentations and verbal or non-verbal accounts
• Scales of measurement:
+ Assigning numbers
+ Non-numerical categories, etc.
2.1.2. Characteristics
Whatever attributes or abilities we measure, it is these attributes or abilities and not the people themselves that we are measuring
- Both physical and mental characteristics
- Mental attributes: aptitude, intelligence, motivation, attitude, fluency in speaking, etc.
- Mental abilities: being able to do something, i.e. performance on a set of mental tasks. The higher the degree of a given ability, the higher the probability of correct performance on tasks of lower difficulty or complexity
2.1.3. Rules and procedures
Quantification must be done according to explicit rules and procedures
The observation of an attribute must be replicable for other observers, in other contexts and with other individuals
Many types of measures: rankings, rating scales and tests
2.2. Test
“A psychological or educational test is a procedure designed to elicit certain behavior from which one can make inferences about certain characteristics of an individual.” (Carroll, 1968:46)
For example: The Interagency Language Roundtable (ILR) oral interview, a speaking test, consists of:
+ A set of elicitation procedures (activities, questions & topics)
+ A measurement scale of language proficiency (0 to 5)
A test:
- Is designed to obtain a specific sample of behavior
- Provides the means for focusing more closely on the specific language abilities that are of interest
- Is viewed as supplemental to other methods of measurement
- Is the best means of assuring the sufficiency of the sample of language obtained
For example: the ILR oral interview, the TOEFL, etc.
2.3. Evaluation
The systematic gathering of information for the purpose of making decisions
Evaluation requires:
- The ability of the decision maker
- The quality of the information: reliable and relevant
For example:
+ Educational decisions based on rumor rely on unreliable information
+ Sex and motivation are relevant to learning strategies
- Need not be based exclusively on quantitative information (verbal descriptions, overall impressions, ratings, test scores, etc.)
- Does not necessarily entail testing
- Tests can be used for purely descriptive purposes, not evaluative ones
It is important to distinguish the information-providing function of measurement from the decision-making function of evaluation.
2.4. Relationship among measurement, tests, and evaluation
1. Evaluation that involves neither tests nor measures
Ex: Qualitative descriptions of student performance
2. A non-test measure used for evaluation
Ex: Teacher ranking used for assigning grades
3. A test used for purposes of evaluation
Ex: Using an achievement test to determine student progress
4. A test not used for evaluation
Ex: Using a proficiency test as a criterion in SLA research
5. A non-test measure not used for evaluation
Ex: Assigning code numbers to school subjects
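The five cases above can be read as combinations of three yes/no questions: is the procedure a measure, is it a test, and is it used for evaluation? The sketch below is only an illustration of that reading; the function name `classify_use` and its parameters are hypothetical, and it assumes, as stated in 2.1.3, that a test is one type of measure.

```python
def classify_use(is_measure, is_test, for_evaluation):
    """Return the case number (1-5) from the list above, or None if no case applies."""
    if is_test and not is_measure:
        # Assumption taken from 2.1.3: tests are one type of measure.
        raise ValueError("a test is a type of measure")
    if for_evaluation:
        if not is_measure:
            return 1  # evaluation without tests or measures
        return 3 if is_test else 2
    if is_test:
        return 4  # test not used for evaluation
    return 5 if is_measure else None


# Teacher ranking used for assigning grades: a non-test measure used for evaluation.
print(classify_use(is_measure=True, is_test=False, for_evaluation=True))
# Proficiency test used as a criterion in SLA research: a test not used for evaluation.
print(classify_use(is_measure=True, is_test=True, for_evaluation=False))
```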
3. Qualities of a language test
Usefulness of the test:
- Reliability
- Construct validity
- Authenticity
- Interactiveness
- Impact
- Practicality
3.1. Reliability
Consistency of measurement
Consistent across different characteristics of the testing situation
Example:
Three examiners score the same student's performance. If one examiner gives 10/10 but another gives just 2/10, the scores are not consistent and would be considered unreliable indicators of the ability we want to measure.
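To make the idea of consistency of measurement concrete, here is a minimal sketch that checks how far apart the scores given to one performance by several examiners are. The function names, the sample scores and the one-point tolerance are all assumptions chosen for illustration; real reliability studies use more careful statistics.

```python
def score_range(scores):
    """Gap between the highest and lowest score given to one performance."""
    return max(scores) - min(scores)


def is_consistent(scores, tolerance=1):
    """Treat ratings as consistent when they differ by at most `tolerance` points."""
    return score_range(scores) <= tolerance


# Hypothetical scores (out of 10) from three examiners for the same performance.
close_ratings = [8, 8, 9]
scattered_ratings = [10, 2, 6]

print(is_consistent(close_ratings))      # True
print(is_consistent(scattered_ratings))  # False
```

A large gap between examiners suggests that the score depends on who marks the test and not only on the ability being measured.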
3.2. Validity
The extent to which the test measures what it is supposed to measure
- Content validity
- Construct validity
- Face validity
3.2.1. Content validity
The extent to which a test represents all facets of the tasks within the domain being tested
Example: A teacher gives students a final test, but the test covers only the material from the last 3 weeks. The test therefore has low content validity.
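One rough way to picture content validity is to compare what was taught with what the test samples. The sketch below assumes a hypothetical 15-week course and a final test drawn only from the last 3 weeks; the function name `content_coverage` is illustrative, and simple coverage is only a crude stand-in for representativeness.

```python
def content_coverage(taught_units, tested_units):
    """Fraction of the taught units that are represented in the test."""
    taught = set(taught_units)
    if not taught:
        raise ValueError("taught_units must not be empty")
    return len(taught & set(tested_units)) / len(taught)


# Hypothetical course: 15 weeks taught, final test samples weeks 13-15 only.
taught_weeks = range(1, 16)
tested_weeks = [13, 14, 15]

coverage = content_coverage(taught_weeks, tested_weeks)
print(f"{coverage:.0%} of the taught weeks are represented in the test")
```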
3.2.2. Construct validity
Construct validity pertains to the meaningfulness and appropriateness of the interpretations that we make on the basis of test scores.
Both of the following need to be considered:
- the construct definition
- the characteristics of the test task
3.2.3. Face validity
The extent to which a test is subjectively viewed as covering the concepts it purports to measure
Example: After a group of students sat a test, the teacher asked them for feedback, in particular whether they thought the test was a good one.
3.3. Authenticity
- the degree of correspondence of the characteristics of a given language test task to the characteristics of a target language use (TLU) task
- provides a link between test performance and the TLU tasks and domain to which we want to generalize
- the way test takers perceive the relative authenticity of a test task can facilitate their test performance
3.4. Interactiveness
Interactiveness is the extent and type of involvement of the test taker's individual characteristics in accomplishing a test task.
Interactiveness is at the heart of many current views of language teaching and language learning.
Interactiveness is a function of the extent and type of involvement of the test taker's language ability and affective schemata.
3.5. Impact
Micro level: individual
Macro level: society, education system
Washback (backwash): influence of testing on teaching and learning.
3.6. Practicality
Practicality is the relationship between the resources that will be required in the design, development, and use of the test and the resources that will be available for these activities.
A practical test is one whose design, development, and use do not require more resources than are available.
Types of resources: human resources, material resources, and time.
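The definition above amounts to a comparison between required and available resources. The following sketch illustrates that comparison under simple assumptions: resources are counted in plain units, and the resource names and figures are invented for the example.

```python
def shortfalls(required, available):
    """Return the resources for which more is required than is available."""
    return {
        name: amount - available.get(name, 0)
        for name, amount in required.items()
        if amount > available.get(name, 0)
    }


def is_practical(required, available):
    """A test is practical when no resource is required beyond what is available."""
    return not shortfalls(required, available)


# Hypothetical figures: human resources, material resources, and time.
required = {"raters": 3, "rooms": 2, "hours": 10}
available = {"raters": 2, "rooms": 2, "hours": 12}

print(is_practical(required, available))  # False
print(shortfalls(required, available))    # {'raters': 1}
```

Listing the shortfalls, not just a yes/no answer, shows which resource would have to be increased or which part of the test design would have to be scaled back.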
References
Heaton, J. B. (1988). Writing English language tests (New ed.). London: Longman.
Bachman, L. F. (1997). Fundamental considerations in language testing. Oxford: Oxford University Press.
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice: Designing and developing useful language tests. Oxford: Oxford University Press.
Thank you!