Evaluating Selection Techniques and Decisions
YZEL O. ALFECHE
BS PSYCHOLOGY 3
Learning Objectives
Understand how to determine the reliability of a test and the factors that
affect test reliability
Understand the five ways to validate a test
Learn how to find information about tests
Understand how to determine the utility of a selection test
Be able to evaluate a test for potential legal problems
Understand how to use test scores to make personnel selection decisions
Characteristics of Effective Selection Techniques
1. Reliable
2. Valid
3. Cost-efficient
4. Fair
5. Legally defensible
Reliability
Reliability is the extent to which a score from a selection measure is stable and free from error.
Test-Retest Reliability – repeated administration of the same
test will achieve similar results
Temporal stability: consistency of test scores across time
Alternate-Forms Reliability – two forms of the test are similar
Form stability: scores on two forms of a test are similar
Reliability
Internal Reliability – items within the test are measuring the
intended concept consistently
Item stability: responses to the same test items are consistent
Item homogeneity: test items measure the same construct
Scorer Reliability – two people scoring a test agree on the
test score
Methods to determine internal consistency:
Kuder-Richardson formula 20 (K-R 20) – used to determine the internal reliability of tests that use items with dichotomous answers (yes/no, true/false)
Split-half method – consistency of item responses is determined by comparing scores on one half of the items with scores on the other half
Spearman-Brown prophecy formula – used to correct reliability
coefficients resulting from the split-half method
Coefficient alpha – used to determine the internal reliability of tests that use interval or ratio scales (see the sketch below)
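A minimal sketch of two of these computations in Python, assuming item responses are stored as a respondents-by-items NumPy array (the function names and data layout are illustrative, not from the slides):

```python
import numpy as np

def split_half_reliability(items):
    """Split-half reliability: correlate scores on the odd-numbered
    items with scores on the even-numbered items, then apply the
    Spearman-Brown prophecy formula to correct for halving the test."""
    odd = items[:, 0::2].sum(axis=1)
    even = items[:, 1::2].sum(axis=1)
    r_half = np.corrcoef(odd, even)[0, 1]
    return (2 * r_half) / (1 + r_half)

def coefficient_alpha(items):
    """Cronbach's coefficient alpha from item and total-score variances."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)
```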
Validity
Validity is the degree to which inferences from test scores are justified by the evidence.
Content Validity – test items sample the content that they are
supposed to measure
Criterion Validity – test score is related to some measure of
job performance
Concurrent validity: correlates test scores with measures of job
performance for employees who are already on the job
Predictive validity: test scores of applicants are compared with a
future measure of job performance
Restricted range: narrow range of performance scores that makes it
difficult to obtain a significant validity coefficient
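The shrinking effect of a restricted range can be illustrated with a small simulation (all numbers hypothetical; the applicant-pool validity is set to about .50):

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate an applicant pool where test scores and later job
# performance correlate at about .50.
n = 1_000
test = rng.normal(size=n)
performance = 0.5 * test + rng.normal(scale=(1 - 0.5**2) ** 0.5, size=n)
full_r = np.corrcoef(test, performance)[0, 1]

# A concurrent design sees only people already on the job (here, the
# top 20% of scorers), so the range of test scores is restricted and
# the observed validity coefficient shrinks.
hired = test > np.quantile(test, 0.80)
restricted_r = np.corrcoef(test[hired], performance[hired])[0, 1]

print(f"full range r = {full_r:.2f}, restricted range r = {restricted_r:.2f}")
```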
Validity
Validity generalization (VG): a test found valid for a job in one location is valid for the same job in a different location
Synthetic validity: validity inferred on the basis of a match between job components and tests previously found valid for those components
Construct Validity – test actually measures the construct that
it purports to measure
Known-group validity: test scores from two contrasting groups
“known” to differ on a construct are compared
Face Validity – a test appears to be valid or job related
Barnum statements: statements so general that they can be true of almost everyone
What method of establishing validity is the best?
Cost-efficiency
If two or more tests have similar validities, then cost should
be considered.
Computer-adaptive testing (CAT)
A type of test taken on a computer in which the computer
adapts the difficulty level of questions asked to the test
taker’s success in answering previous questions
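A toy sketch of the adaptive idea (the names and the simple step rule are illustrative; operational CATs select items using item response theory models):

```python
def adaptive_test(answer_correctly, items_by_difficulty, n_questions=10):
    """Minimal adaptive loop: step difficulty up after a correct
    answer and down after an incorrect one, crediting harder items."""
    difficulty, score = 5, 0
    for _ in range(n_questions):
        question = items_by_difficulty[difficulty].pop()
        if answer_correctly(question):
            score += difficulty
            difficulty = min(difficulty + 1, 9)  # harder next question
        else:
            difficulty = max(difficulty - 1, 1)  # easier next question
    return score
```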
Establishing the Usefulness of a Selection Device
Taylor-Russell Tables – a series of tables based on the selection ratio, base rate, and test validity that yield information about the percentage of future employees who will be successful if a particular test is used.
Selection ratio – percentage of applicants an organization hires
selection ratio = number hired / number of applicants
Base rate – percentage of current employees who are considered
successful
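For example (hypothetical numbers), an organization that hires 25 of the 100 people who apply has a selection ratio of 25/100 = .25; if 80 of its 100 current employees are rated as successful, the base rate is .80.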
Establishing the Usefulness of a Selection Device
Proportion of correct decisions – a utility method that compares the percentage of times a selection decision was accurate with the percentage of successful employees
Lawshe Tables – use the base rate, test validity, and an applicant's percentile on a test to determine the probability of future success for that applicant
Establishing the Usefulness of a Selection Device
Brogden-Cronbach-Gleser Utility Formula – method of ascertaining the extent to which an organization will benefit from the use of a particular selection system; it combines the factors below (see the sketch after the list)
Number of employees hired per year (n)
Average tenure (t)
Test validity (r)
Standard deviation of performance in dollars (SDy)
Mean standardized predictor score of selected applicants (m)
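A minimal sketch of one common formulation, in which the estimated dollar gain is the product of these factors minus the total cost of testing (all numbers hypothetical):

```python
def bcg_utility(n, t, r, sd_y, m, cost_per_applicant=0.0, n_applicants=0):
    """Estimated dollar gain from using the selection system:
    (n)(t)(r)(SDy)(m) minus the total cost of testing applicants."""
    return n * t * r * sd_y * m - cost_per_applicant * n_applicants

# Hypothetical: 10 hires a year who stay 2 years, validity .40,
# SDy of $10,000, mean standardized predictor score of hires 1.0,
# and 100 applicants tested at $25 each.
print(bcg_utility(10, 2, 0.40, 10_000, 1.0, 25, 100))  # 77500.0
```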
Determining the Fairness of a Test
Measurement Bias
Group differences in test scores that are unrelated to the construct being measured
Adverse impact:
An employment practice that results in members of a protected class being negatively affected at a higher rate than members of the majority class (see the sketch below)
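One way to screen for this is to compare selection rates across groups (a sketch; the function name is hypothetical, and the .80 benchmark comes from the commonly cited four-fifths rule rather than from the slides):

```python
def adverse_impact_ratio(hired_protected, applied_protected,
                         hired_majority, applied_majority):
    """Ratio of the protected class's selection rate to the majority
    class's rate; ratios well below 1.0 (commonly below .80) are
    often taken as evidence of adverse impact."""
    rate_protected = hired_protected / applied_protected
    rate_majority = hired_majority / applied_majority
    return rate_protected / rate_majority

# Hypothetical: 20 of 100 protected-class applicants hired (.20)
# versus 40 of 100 majority-class applicants (.40).
print(adverse_impact_ratio(20, 100, 40, 100))  # 0.5 -> below .80
```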
Determining the Fairness of a Test
Predictive Bias
A situation in which the predicted level of job success falsely
favors one group over another
Single-group validity:
Predicts a criterion for one class of people but not for another
Differential validity:
Predicts a criterion for two groups (minorities and nonminorities),
but better for one of the two groups
Making the Hiring Decisions
Unadjusted Top-Down Selection
Top-down selection: selecting applicants in straight rank order of
their test scores
Compensatory approach: method of making selection decisions in which a high score on one test can compensate for a low score on another test
Rule of Three – the names of the top three applicants are given to a hiring authority, who can then select any of the three (see the sketch below)
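A minimal sketch of top-down selection on a compensatory composite (names, scores, and weights are hypothetical):

```python
def compensatory_ranking(applicants, weights):
    """Rank applicants top-down on a weighted composite score, so a
    high score on one test can offset a low score on another."""
    def composite(scores):
        return sum(s * w for s, w in zip(scores, weights))
    return sorted(applicants, key=lambda a: composite(a["scores"]),
                  reverse=True)

applicants = [
    {"name": "A", "scores": (90, 60)},  # strong test 1, weak test 2
    {"name": "B", "scores": (75, 80)},
    {"name": "C", "scores": (70, 70)},
]
ranked = compensatory_ranking(applicants, weights=(0.6, 0.4))
print([a["name"] for a in ranked])  # top-down order: ['A', 'B', 'C']
```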
Why should we use anything other than top-down selection? Shouldn't we always hire the applicants with the highest scores?
Making the Hiring Decisions
Passing Scores – the minimum test score an applicant must
achieve to be considered for hire
Multiple-cutoff approach: selection strategy in which applicants
must meet or exceed the passing score on more than one selection
test
Multiple-hurdle approach: selection practice of administering one test at a time so that applicants must pass that test before being allowed to take the next test (see the sketch below)
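A minimal sketch contrasting the two strategies (function names are hypothetical; `take_test(i)` stands in for administering test i and returning its score):

```python
def passes_multiple_cutoff(scores, passing_scores):
    """Multiple-cutoff: every test is given, and the applicant must
    meet or exceed the passing score on each one."""
    return all(s >= p for s, p in zip(scores, passing_scores))

def passes_multiple_hurdle(take_test, passing_scores):
    """Multiple-hurdle: tests are given one at a time; failing any
    test screens the applicant out before the next is administered."""
    for i, passing in enumerate(passing_scores):
        if take_test(i) < passing:
            return False
    return True
```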
Making the Hiring Decisions
Banding
Statistical technique based on the standard error of
measurement that allows similar test scores to be grouped
Standard error of measurement (SEM): the number of points that a test score could be off due to test unreliability (see the sketch below)
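A minimal sketch, assuming the common conventions that SEM = SD × √(1 − reliability) and that scores within 1.96 × SEM of the top score are treated as equivalent (the multiplier and numbers are illustrative):

```python
import math

def standard_error_of_measurement(sd, reliability):
    """Points a test score could be off due to unreliability."""
    return sd * math.sqrt(1 - reliability)

def band(scores, sd, reliability, multiplier=1.96):
    """Group scores that fall within multiplier * SEM of the top
    score; scores inside the band are treated as equivalent."""
    sem = standard_error_of_measurement(sd, reliability)
    top = max(scores)
    return [s for s in scores if s >= top - multiplier * sem]

# Hypothetical: SD = 10, reliability = .91 -> SEM = 3.0, band ≈ 5.9 pts
print(band([98, 95, 93, 88, 80], sd=10, reliability=0.91))  # [98, 95, 93]
```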
Is it ethical to hire a person with a lower test score because he or she seems to be a better personality fit for an organization?