M.Sc. POST GRADUATE DEGREE IN
NURSING
NURSING RESEARCH AND STATISTICS
TOOLS AND METHODS OF DATA COLLECTION
RELIABILITY & VALIDITY OF RESEARCH
INSTRUMENTS
Prepared by
Mrs. Deepa, Professor
OBJECTIVES
 describe the concept of validity
 explain different types of validity
 describe the concept of reliability
 explain factors affecting the reliability of a research
instrument
 illustrate methods of determining the reliability of an
instrument
 describe validity and reliability in qualitative research
DEFINITION
 Validity
 The accuracy of the measure in reflecting the concept
it is supposed to measure.
 Reliability
 Stability and consistency of the measuring
instrument.
 A measure can be reliable without being valid, but it
cannot be valid without being reliable.
CONCEPT OF VALIDITY
• Are we measuring what we think we are
measuring?
• Validity is the ability of an instrument to
measure what it is designed to measure.
• Key questions:
• Who decides whether an instrument is
measuring what it is supposed to measure?
• How can it be established that an instrument is
measuring what it is supposed to measure?
Face validity
 Just on its face the instrument appears to be a
good measure of the concept. “intuitive, arrived at
through inspection”
• e.g. Concept=pain level
• Measure=verbal rating scale “rate your pain
from 1 to 10”.
Face validity is sometimes considered a subtype
of content validity.
Question: is there any time when face validity is
not desirable?
Content validity
 Content of the measure is justified by other
evidence, e.g. the literature.
 Entire range or universe of the construct is
measured.
 Usually evaluated and scored by experts in the
content area.
 A CVI (Content Validity Index) of .80 or more is
desirable.
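The CVI computation can be sketched in a few lines. This is a minimal illustration with invented expert ratings, assuming the common convention of a 4-point relevance scale where a rating of 3 or 4 counts as "relevant" and the scale CVI is the average of the item CVIs:

```python
# Hypothetical expert relevance ratings on a 4-point scale
# (1 = not relevant ... 4 = highly relevant); rows = items, columns = experts.
ratings = [
    [4, 3, 4, 4],  # item 1
    [3, 4, 3, 4],  # item 2
    [2, 3, 4, 3],  # item 3
]

def item_cvi(item_ratings):
    """Proportion of experts rating the item relevant (3 or 4)."""
    return sum(1 for r in item_ratings if r >= 3) / len(item_ratings)

i_cvis = [item_cvi(item) for item in ratings]  # per-item CVI
s_cvi_ave = sum(i_cvis) / len(i_cvis)          # scale CVI (averaging method)
print(i_cvis, round(s_cvi_ave, 2))             # [1.0, 1.0, 0.75] 0.92
```

Here item 3 falls below the .80 threshold and would be flagged for revision, even though the scale-level CVI is acceptable.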
CONSTRUCT VALIDITY
A construct refers to a concept or characteristic that
can’t be directly observed, but can be measured by
observing other indicators that are associated with it.
Constructs can be characteristics of individuals, such
as intelligence, obesity, job satisfaction, or
depression; they can also be broader concepts
applied to organizations or social groups, such as
gender equality, corporate social responsibility, or
freedom of speech.
CONSTRUCT VALIDITY
Construct validity is about ensuring that the
method of measurement matches the construct
you want to measure. If you develop a
questionnaire to diagnose depression, you need
to know: does the questionnaire really measure
the construct of depression? Or is it actually
measuring the respondent’s mood, self-esteem,
or some other construct?
WAYS OF ARRIVING AT CONSTRUCT
VALIDITY
 Hypothesis-testing method
 Convergent and divergent method
 Multitrait-multimethod matrix
 Contrasted-groups approach
 Factor-analysis approach
CONVERGENT AND DIVERGENT
METHOD
 Measured constructs should correlate more highly
with measures of similar constructs than with
measures of different constructs.
 Example: optimism and self-efficacy should
correlate highly (convergent validity).
 Self-efficacy and anger should correlate weakly
(divergent validity).
MULTITRAIT-MULTIMETHOD MATRIX
 The multitrait-multimethod (MTMM) matrix is an
approach to examining construct validity developed by
Campbell and Fiske. It organizes convergent and
discriminant validity evidence for comparison of how a
measure relates to other measures.
 Multiple traits are used in this approach to examine (a)
similar or (b) dissimilar traits (constructs), so as to
establish convergent and discriminant validity between
traits. Similarly, multiple methods are used in this
approach to examine the differential effects (or lack
thereof) caused by method-specific variance.
CONTRASTED GROUPS APPROACH
/KNOWN GROUPS METHOD
This is a typical method of supporting construct validity:
evidence is provided when a test can discriminate
between a group of individuals known to have a
particular trait and a group known not to have it.
The known-groups method evaluates the test's ability to
discriminate between the groups, demonstrated by the
groups obtaining different mean scores on the test. For
example, a group of individuals known not to be
depressed should have lower scores on a depression
scale than the group known to be depressed.
CRITERION RELATED VALIDITY
 The extent to which a measure relates to a criterion
(usually set by the researcher).
 If the criterion set for professionalism in nursing is
belonging to nursing organizations and reading
nursing journals, then couldn't we just count
memberships and subscriptions to come up with a
professionalism score? Also referred to as
instrumental validity; the criteria should be clearly
defined by the researcher in advance.
Concurrent and predictive validity are often listed as
forms of criterion related validity.
CONCURRENT VALIDITY
Concurrent validity is a statistical method using
correlation, rather than a logical method.
Examinees who are known to be either masters or
non masters on the content measured by the test
are identified before the test is administered. Once
the tests have been scored, the relationship
between the examinees’ status as either masters or
non-masters and their performance (i.e., pass or
fail) is estimated based on the test. This type of
validity provides evidence that the test is classifying
examinees correctly. The stronger the correlation,
the greater the concurrent validity of the test.
PREDICTIVE VALIDITY
The ability of one measure to predict another future
measure of the same concept. This is another statistical
approach to validity that estimates the relationship of test
scores to an examinee's future performance as a master or
nonmaster.
Predictive validity considers the question, "How well does
the test predict examinees' future status as masters or non-
masters?" For this type of validity, the correlation that is
computed is based on the test results and the examinee’s
later performance. This type of validity is especially useful
for test purposes such as selection or admissions.
INTERNAL VALIDITY
 Internal validity is the extent to which a study
establishes a trustworthy cause-and-effect
relationship between a treatment and an outcome.
It also reflects that a given study makes it possible
to eliminate alternative explanations for a finding.
 Internal validity depends largely on the procedures
of a study and how rigorously it is performed.
EXTERNAL VALIDITY
 External validity refers to how well the outcome of a
study can be expected to apply to other settings. In
other words, this type of validity refers to how
generalizable the findings are. Rigorous research
methods can ensure internal validity, but external
validity may be limited by those same methods.
CONCEPT OF RELIABILITY
 Reliability is the extent to which a research tool is
consistent and stable, and hence predictable and accurate.
 The greater the degree of consistency and stability in a
research instrument, the greater the reliability.
 A scale or test is reliable to the extent that repeat
measurements made by it under constant conditions will
give the same result.
 Reliability is the degree of accuracy or precision in the
measurements made by a research instrument.
 The lower the degree of ‘error’ in an instrument, the higher
the reliability.
FACTORS AFFECTING RELIABILITY
RELIABILITY
Homogeneity, equivalence and stability of a measure
over time and subjects. The instrument yields the
same results over repeated measures and subjects.
Expressed as a correlation coefficient (degree of
agreement between times and subjects) 0 to +1.
Reliability coefficient expresses the relationship
between error variance, true variance and the
observed score.
The higher the reliability coefficient, the lower the error
variance; hence, the higher the coefficient, the
more reliable the tool. A coefficient of .70 or higher
is acceptable.
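The relationship between true variance, error variance, and the observed score can be illustrated numerically. In classical test theory the reliability coefficient is the proportion of observed-score variance that is true-score variance; the variance figures below are invented purely for illustration:

```python
# Classical test theory: observed variance = true variance + error variance.
true_variance = 8.0   # hypothetical
error_variance = 2.0  # hypothetical
observed_variance = true_variance + error_variance

# Reliability = share of observed variance that is true variance.
reliability = true_variance / observed_variance
print(reliability)  # 0.8 — lower error variance would push this toward 1.0
```

Halving the error variance to 1.0 would raise the coefficient to about .89, which shows concretely why lower error means higher reliability.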
STABILITY
 The same results are obtained over repeated
administration of the instrument.
• Test-retest reliability
• Parallel, equivalent, or alternate forms
TEST-RETEST RELIABILITY
 The administration of the same instrument to the
same subjects two or more times (under similar
conditions--not before and after treatment)
 Scores are correlated and expressed as a Pearson
r. (usually .70 acceptable)
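A minimal sketch of the computation, using invented scores from two administrations of the same tool to five subjects:

```python
def pearson_r(x, y):
    """Pearson product-moment correlation between paired scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

time1 = [12, 15, 11, 18, 14]  # hypothetical first administration
time2 = [13, 14, 12, 17, 15]  # hypothetical retest, similar conditions
r = pearson_r(time1, time2)
print(round(r, 2))  # 0.95 — above the usual .70 threshold
```

Because r here exceeds .70, the instrument would be judged stable over repeated administration.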
PARALLEL OR ALTERNATE FORMS
RELIABILITY
 Parallel or alternate forms of a test are administered
to the same individuals and scores are correlated.
 This is desirable when the researcher believes that
repeated administration will result in "test-wiseness."
Sample items: "I am able to tell my partner how I feel" and
"My partner tries to understand my feelings." Pearson's
correlation can be used to estimate the theoretical
reliability coefficient between parallel tests.
HOMOGENEITY
 Internal consistency (unidimensional)
• Item-total correlations
• Split-half reliability
• Cronbach’s alpha
• Kuder-Richardson coefficient
ITEM TO TOTAL CORRELATIONS
 Each item on an instrument is correlated to
total score--an item with low correlation may
be deleted. Highest and lowest correlations
are usually reported.
• Only important if you desire homogeneity of
items.
SPLIT HALF RELIABILITY
 Items are divided into two halves and then compared. Odd,
even items, or 1-50 and 51-100 are two ways to split items.
• Only important when homogeneity and internal
consistency are desirable.
• Cronbach’s Alpha:
• The average of all possible split half reliabilities for a set
of items
• By convention, a lenient cut-off of .60 is common in
exploratory research; alpha should be at least .70 or
higher for an "adequate" scale; and many
researchers require a cut-off of .80 for a "good"
scale.
CRONBACH’S ALPHA
 Likert scale or linear graphic response format.
 Compares the consistency of response of all items
on the scale.
 May need to be computed for each sample.
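As a sketch with invented Likert responses (four respondents, three items), alpha compares the sum of the item variances with the variance of the total scores:

```python
def sample_variance(xs):
    """Unbiased sample variance (n - 1 denominator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Hypothetical 5-point Likert data: rows = respondents, columns = items.
data = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
]
k = len(data[0])
item_vars = [sample_variance([row[j] for row in data]) for j in range(k)]
total_var = sample_variance([sum(row) for row in data])

# Cronbach's alpha: high when items co-vary strongly with the total.
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(round(alpha, 2))  # 0.94
```

Because alpha depends on how the items co-vary in a particular group of respondents, it should be recomputed for each new sample, as the slide notes.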
KUDER-RICHARDSON COEFFICIENT (KR-20)
 Estimate of homogeneity when items have a
dichotomous response, e.g. “yes/no” items.
 Should be computed for a test on an initial reliability
testing, and computed for the actual sample.
 Based on the consistency of responses to all of the
items of a single form of a test.
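A sketch with invented yes/no responses (1 = yes, 0 = no) from five examinees on four items; population variance is used for the totals, one common convention:

```python
# Hypothetical dichotomous responses: rows = examinees, columns = items.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
n = len(responses)
k = len(responses[0])

# p = proportion answering "yes" per item; q = 1 - p.
p = [sum(row[j] for row in responses) / n for j in range(k)]
pq = sum(pi * (1 - pi) for pi in p)

totals = [sum(row) for row in responses]
mean_t = sum(totals) / n
var_t = sum((t - mean_t) ** 2 for t in totals) / n  # population variance

kr20 = (k / (k - 1)) * (1 - pq / var_t)
print(round(kr20, 2))  # 0.41 — low, as expected for a short toy test
```

KR-20 is algebraically a special case of Cronbach's alpha: with 0/1 items, each item's variance reduces to p·q.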
EQUIVALENCE
 Consistency of agreement of observers using the
same measure or among alternate forms of a tool.
• Parallel or alternate forms (described under
stability)
• Inter rater reliability
INTER-RATER RELIABILITY
 Used with observational data.
 Concordance between two or more observers'
scores of the same event or phenomenon.
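Concordance can be quantified as simple percent agreement, or corrected for chance with Cohen's kappa (not named on the slide; assumed here as the usual chance-corrected statistic for two raters). A sketch with invented ratings:

```python
# Hypothetical classifications of six observed events by two raters.
rater_a = ["yes", "no", "yes", "yes", "no", "yes"]
rater_b = ["yes", "no", "no", "yes", "no", "yes"]
n = len(rater_a)

# Raw percent agreement.
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Chance agreement from each rater's marginal proportions.
categories = set(rater_a) | set(rater_b)
expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
               for c in categories)

kappa = (observed - expected) / (1 - expected)
print(round(observed, 2), round(kappa, 2))  # 0.83 0.67
```

Kappa (.67) is lower than raw agreement (.83) because some agreement is expected by chance alone, which is why kappa is generally preferred for observational data.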
IDEAL RESEARCH INSTRUMENT
VALIDITY & RELIABILITY IN QUALITATIVE
RESEARCH
 Credibility
 Transferability
 Dependability
 Confirmability
CREDIBILITY/ TRUSTWORTHINESS
It involves establishing that the results of research
are credible or believable. This is a classic example
of 'quality not quantity': credibility depends more on
the richness of the information gathered than on the
amount of data gathered. There are many
techniques to gauge the accuracy of the findings,
such as data triangulation, triangulation through
multiple analysts, and 'member checks'. In reality,
the participants/readers are the only ones who can
reasonably judge the credibility of the results.
TRANSFERABILITY:
Transferability refers to the degree to which the
research can be transferred to other contexts; this
judgment is made by readers of the research. The
reader notes the specific details of the research
situation and methods, and compares them to a
similar situation that they are more familiar with. If
the specifics are comparable, the original research
would be deemed more credible. It is essential that
the original researcher supplies a highly detailed
description of their situation and methods.
DEPENDABILITY
Dependability ensures that the research findings
are consistent and could be repeated. This is
measured by the standard to which the research is
conducted, analyzed, and presented. Each process
in the study should be reported in detail to enable
an external researcher to repeat the inquiry and
achieve similar results. This also enables
researchers to understand the methods and their
effectiveness.
CONFIRMABILITY
Confirmability questions how the research findings
are supported by the data collected. This is a
process to establish whether the researcher has
been biased during the study, given the
assumption that qualitative research allows the
researcher to bring a unique perspective to the
study. An external researcher can judge whether
this is the case by studying the data collected
during the original inquiry. To enhance the
confirmability of the initial conclusions, an audit
trail can be completed throughout the study to
demonstrate how each decision was made.
SUMMARY
When creating a question to quantify a goal, or
when deciding on a data instrument to capture
the answer to that question, two concepts are
universally agreed upon by researchers to be of
paramount importance. These two concepts are
called validity and reliability, and they refer to
the quality and accuracy of data instruments.
REFERENCES
 Denise F. Polit and Cheryl Tatano Beck. Essentials of
Nursing Research. 7th edition. Lippincott Williams &
Wilkins, 2017. Pages 373-376.
 Carmen G. Loiselle and Joanne Profetto-McGrath.
Canadian Essentials of Nursing Research. 2004.
Pages 307-311.
 Ranjit Kumar. Research Methodology: A Step-by-Step
Guide for Beginners. Sage Publications, New Delhi,
1999. Pages 136-143.
 Suresh Sharma. Nursing Research and Statistics.
Elsevier, 2011. Pages 216-218.