Critical Appraisal Level 2: Evaluating studies using the Critical Appraisal Skills Programme (CASP) Tools
What is Critical Appraisal? Critical appraisal is the process of systematically examining research papers before using the evidence to form a decision. Critical appraisal allows us to make sense of research evidence and close the gap between research and practice.
Why Critically Appraise? In evidence-based medicine the clinician should use the best available evidence to decide which treatment option is best for their patient. To determine the best treatment option we must have critical appraisal skills to assess the quality of research.
Advantages of Critical Appraisal: Provides a systematic way of assessing the validity, results and usefulness of published research. Allows us to improve healthcare quality. Encourages objective assessment of the usefulness of all information/evidence. Critical appraisal skills are not difficult to develop – mostly common sense!
Disadvantages of Critical Appraisal: Can be time-consuming. It may highlight that current practice is in fact ineffective. May highlight a lack of good evidence in an area of interest.
Help with Critical Appraisal: It is difficult to remember all the issues which must be taken into account when reading research papers. To help with this, the Public Health Resource Unit in Oxford has produced tools to evaluate most types of study as part of the Critical Appraisal Skills Programme (CASP). These are called CASP tools.
CASP Tools: CASP tools have been developed to help us appraise systematic reviews, RCTs, case-control studies and cohort studies. Tools for other types of study (e.g. qualitative research, economic evaluations and diagnostic studies) are also available.
CASP Tools (2): The tools break critical appraisal of each study type into 10-12 manageable questions. The questions in each tool will vary slightly as study designs are different. They are available at: http://www.sph.nhs.uk/what-we-do/public-health-workforce/resources/critical-appraisals-skills-programme
Randomised Controlled Trials (RCTs): RCTs are the best type of research study to determine if a specific intervention produced the desired outcome in a specific population. e.g. the HOPE study: does ramipril (the intervention) prevent myocardial infarction, stroke or death from cardiovascular causes (the outcomes) in patients who were at high risk for cardiovascular events but who did not have left ventricular dysfunction or heart failure (the population)?
RCTs (continued): RCTs should have a clearly defined research question in relation to the population, outcome and intervention. RCTs can minimise bias and are the most appropriate study design for investigating the effectiveness of a specific intervention or treatment. RCTs are, however, not automatically of good quality and should be appraised critically.
Appraising RCTs: Three broad issues should be considered when examining a report of an RCT: validity, results and relevance. The CASP tool prompts us to ask questions which will assess each of these issues in more detail.
Screening Questions in the CASP tool: Questions 1 and 2 in the RCT CASP tool are screening questions to assess whether the RCT asks a clearly focused study question (i.e. are the population, intervention and outcome being investigated clearly stated?) and whether an RCT is the most appropriate study design to answer this question. If the answer to either of these is “no” then it is probably not worth continuing with the rest of the questions.
Validity: When assessing validity, the methods used in the study are appraised. If the research methods are flawed then this may invalidate the results of the trial. To critically appraise the methodology of an RCT you will look at sample size, randomisation and baseline characteristics, blinding, follow-up, data collection and interventions. These elements are dealt with by questions 3-7 in the CASP tool for RCTs.
Randomisation and Baseline Characteristics (Q3 in CASP Tool): Randomisation reduces the possibility of bias. The method of randomisation should be robust, and neither the participants nor the clinicians should be aware which group patients will be in before randomisation. The most robust methods are computer-generated numbers or tables of random numbers (giving a 50:50 chance of being in either group). Stratification can be used to ensure similar baseline characteristics in both groups.
Randomisation and Baseline Characteristics (Q3 in CASP Tool, continued): The make-up of the treatment and control groups should be very similar, and the only difference should be the intervention under investigation in the study. This is so that we can be confident that the outcome is due to the intervention and not to any other confounding factors. Consider: Is the sample large enough? (see also Q7) Was the randomisation process robust? Was there stratification?
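To make the idea of a robust, computer-generated allocation more concrete, here is a minimal Python sketch of stratified, permuted-block randomisation. It is an illustration only: the strata labels, block size and arm names are invented for the example and are not taken from any particular trial or from the CASP guidance.

import random

def randomisation_list(n_per_stratum, strata=("age < 65", "age >= 65"), block_size=4):
    """Pre-generate a balanced allocation list for each stratum (hypothetical example)."""
    allocation = {}
    for stratum in strata:
        arms = []
        while len(arms) < n_per_stratum:
            # Each block holds equal numbers of both arms and is then shuffled,
            # keeping the allocation close to 50:50 throughout recruitment.
            block = ["treatment"] * (block_size // 2) + ["control"] * (block_size // 2)
            random.shuffle(block)
            arms.extend(block)
        allocation[stratum] = arms[:n_per_stratum]
    return allocation

# Example: print an allocation list of 8 participants per stratum.
for stratum, arms in randomisation_list(8).items():
    print(stratum, arms)

In practice such a list would be generated and held by someone independent of recruitment (for example via a central or telephone randomisation service) so that neither participants nor clinicians can predict the next assignment.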
Blinding (Q4 in CASP Tool): If the people involved in the study are not aware who is in the treatment or control group, this reduces the possibility of bias; ideally a study should be double-blind. Single-blind and open-label studies would be expected to have a higher chance of producing biased results. To blind patients and clinicians the intervention and placebo must appear identical – this is not always possible. Consider whether every effort was made to achieve blinding and whether it matters in the study being reviewed (i.e. could there be observer bias in the results that are used?).
Follow-up (Q5 in CASP tool): If a large number of participants withdraw from a trial before its conclusion, their loss may distort the results. Crossover between groups can also distort the results, as the effects of randomisation can be lost. In most cases intention-to-treat analysis should be used, i.e. the final results should be analysed according to the original randomisation.
Interventions (Q6 in CASP tool): It is important that, aside from the intervention under investigation, the groups are treated equally, as this means the outcome of the study can be attributed to the intervention. Consider: Were groups reviewed at the same time intervals? Were any other treatments allowed in either group?
Interventions (Q6 in CASP tool, continued): Also consider: If tests or measurements were used, were they conducted by appropriate personnel and were established tests used? Were assessments frequent enough to show a pattern of response? Is the duration of the study sufficient?
Sample Size (Q7 in CASP tool): A study should include enough participants so that the researchers can be reasonably sure that there is a high chance of detecting a beneficial effect. A power calculation should be carried out before the participants are enrolled to estimate how many people need to be recruited to achieve a certain level of certainty (the power is usually set at 80-90%).
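As a hedged illustration of what such a power calculation involves (the event rates and the 90% power below are invented for the example), the approximate sample size per group when comparing two proportions is:

\[ n \approx \frac{\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2}\,\bigl[p_1(1-p_1) + p_2(1-p_2)\bigr]}{(p_1 - p_2)^{2}} \]

For example, to detect a fall in event rate from \(p_1 = 0.10\) to \(p_2 = 0.08\) with \(\alpha = 0.05\) (\(z_{1-\alpha/2} = 1.96\)) and 90% power (\(z_{1-\beta} = 1.28\)):

\[ n \approx \frac{(1.96 + 1.28)^{2}\,(0.09 + 0.0736)}{(0.02)^{2}} \approx 4300 \text{ per group} \]

Small absolute differences in relatively uncommon outcomes therefore demand very large trials, which is why under-powered studies should be read with caution.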
Results: The results of an RCT should be scrutinised in a similar way to the methods. Broad considerations are: What are the results? How are the results presented? How precise are the results? This is dealt with by questions 8 and 9 in the CASP RCT tool.
Presentation of Results (Q8 in CASP tool): Results can be expressed as: Relative risk – if we hope that the intervention will lead to fewer adverse outcomes (e.g. MI, stroke, death) we want a relative risk (or hazard ratio) of less than 1. Absolute risk – the proportion of people experiencing an event in each group. Results may be reported as either of the above or as both.
Presentation of Results (2): The results presented should relate to the objectives set out in the original description of the study method. Relative or absolute measures may have little meaning in relation to clinical practice. Numbers needed to treat (NNT) may make measures more understandable. The NNT is the number of people who must be treated to produce one additional successful outcome.
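As a worked illustration of these measures (the figures are invented for the example): suppose 100 of 1000 control patients (10%) and 80 of 1000 treated patients (8%) experience the event.

\[ RR = \frac{0.08}{0.10} = 0.8, \qquad ARR = 0.10 - 0.08 = 0.02\ (2\%), \qquad NNT = \frac{1}{ARR} = \frac{1}{0.02} = 50 \]

So 50 patients would need to be treated to prevent one additional event, even though the relative risk reduction is 20%; quoting the relative measure alone can make the effect look larger than it is in absolute terms.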
Precision of Results (Q9 in CASP tool): Statistical tests are used to establish whether the results of a trial are real or whether they occurred purely by chance. There will always be some doubt, as the trial only looks at a sample of the population. Confidence intervals and p-values indicate the level of certainty around the results.
Confidence Intervals (CI): Inferences based on random samples are uncertain because different results would be obtained each time a study was repeated. CIs indicate the range of doubt around the results and represent the range of values within which the true value is likely to lie. Therefore, the narrower the range, the more convincing the results. If the CI for the treatment effect includes the value of no effect (e.g. a relative risk of 1, or a difference of 0), the study has failed to demonstrate a difference.
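Continuing the invented example above (80/1000 events versus 100/1000), an approximate 95% CI for the relative risk can be calculated on the log scale; the calculation is shown purely for illustration.

\[ \ln(RR) = \ln(0.8) \approx -0.223, \qquad SE \approx \sqrt{\tfrac{1}{80} - \tfrac{1}{1000} + \tfrac{1}{100} - \tfrac{1}{1000}} \approx 0.143 \]

\[ 95\%\ \text{CI} = e^{-0.223\,\pm\,1.96 \times 0.143} \approx 0.60 \text{ to } 1.06 \]

Because this interval includes 1 (no effect), a trial of this size would not have demonstrated a statistically significant difference despite the apparent 20% relative risk reduction, which ties back to the sample size calculation earlier.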
P-values: P-values describe the probability of obtaining a result at least as extreme as the one observed if there were truly no difference, i.e. how compatible the result is with chance alone. P<0.05 is usually described as statistically significant and means that the results are unlikely to be due to chance.
Statistics Used: There should be adequate statistical analysis of the relevant results, and the statistical analysis used should be appropriate to the trial size and design. The statistical tests used should be adequately described and referenced. Consider whether a statistician is listed as one of the authors.
Clinical vs. Statistical Significance: A statistically significant difference in favour of a drug does not always indicate a clinically significant difference. e.g. in a study of two antihypertensives a difference in BP of 1-2 mmHg may be statistically significant, but is this likely to confer significant clinical benefit? Also consider surrogate markers versus clinical endpoints (e.g. an increase in bone mineral density versus a decrease in fracture rate).
Relevance: If the methodology and results are acceptable then the applicability of the results to the local population should be considered. Some broad issues include: relevance to the local population, whether the outcomes considered are clinically important, and the risks versus benefits of treatment. Q10 in the CASP tool prompts us to consider the issues around the relevance of our study to our local population.
Relevance to Local Population: It is important to consider if there are any differences between the participants in the trial and the local population that would make it impossible to apply the results locally. Think about: Inclusion/exclusion criteria – e.g. age, ethnicity, co-morbidities, concomitant medication. Local healthcare provision – the setting may have been in a different healthcare system and it may not be possible to provide similar care locally. Control group – is the standard treatment used in the study better or worse than the local standard? Are realistic comparative doses used?
Importance of Outcomes: Were all clinically important outcomes considered? A single RCT is unlikely to address all the clinically important outcomes, but consider if the original question has been answered and if any other important outcomes have been missed out. Think about outcomes from the point of view of: the patient and their family/carers, the clinician using the treatment, policymakers, and the wider community.
Risks versus Benefits: Risks: Safety – the risk of serious side effects may outweigh the benefits of treatment (one can compare the NNT with the number needed to harm, NNH). Tolerability – what was the drop-out rate in the study compared to withdrawal rates on the current standard treatment? Is the benefit of the treatment large enough to outweigh these risks?
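As a hedged, invented illustration of weighing NNT against NNH: if the treatment from the earlier example (NNT = 50) also caused a serious adverse effect in 1% of treated patients versus 0.5% of controls, then

\[ NNH = \frac{1}{0.01 - 0.005} = 200 \]

i.e. roughly one additional serious harm for every four events prevented, a trade-off that may or may not be acceptable depending on the severity of the harm and of the event prevented.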
Other Considerations: Cost implications – financial information is not normally included in a trial; economic evaluations may be available. Simplicity of use – patient factors such as compliance, method of administration or complicated devices may limit the usefulness of a treatment in clinical practice. Quality of life data – gives more information from the patient’s perspective. Sponsorship – sponsorship of the research or author affiliations to drug companies may affect how the results are presented.
Look out for... A lot of work and funding goes into drug development and clinical trials, and drug companies and researchers will want a positive result from their hard work. When appraising RCTs beware of the negative aspects of the study being glossed over to make it appear positive. This can include: use of sub-group analysis, use of composite end points, and analysis of secondary outcomes.
Sub-group Analysis: Sub-group analysis is when results are broken down into patient sub-groups, e.g. the elderly, or patients with a history of stroke/diabetes. If the result of the overall trial is negative it may be positive in a specific sub-group. Trials may not be powered to detect differences in sub-groups containing small numbers of patients.
Use of Composite Endpoints: Sometimes the desired outcome of a clinical trial is relatively rare, e.g. fatal MI. To show a difference between intervention and control, huge numbers of participants in very long-term trials would be required. Composite end-points may be used instead, e.g. risk of MI, stroke, death or hospital admission due to cardiac causes. Not all the individual components are always of equal importance to all patients.
Analysis of Secondary Outcomes: The primary outcome is the most important outcome of an RCT. If no statistically significant increase in efficacy is observed with the new treatment, then secondary outcomes which are significant may be quoted in the results. The number of patients enrolled in a trial is based on the primary outcome. It is poor practice to disregard the primary outcome and quote a secondary outcome as the main result.
Remember! Healthcare decisions are not usually made on the basis of one trial. Other factors and other evidence may also have to be considered when making a decision
Appraising Other Publications: In addition to RCTs, other publications may be used to contribute to evidence-based decision making. In order of importance the usual hierarchy is: systematic reviews; RCTs; observational studies, e.g. cohort, case-control or cross-sectional studies; case reports and case studies; expert consensus.
Systematic Reviews: Systematic reviews seek to bring the same level of rigour to reviewing research evidence as should be used in producing research evidence. They: identify relevant published and non-published evidence; select studies for inclusion and assess the quality of each; and present a summary of the findings with due consideration.
Systematic Reviews (2): Meta-analysis is a statistical technique for combining the results of independent studies and is the technique normally used to combine the results of studies selected for a systematic review. The validity of a meta-analysis depends on the quality of the systematic review on which it is based. Like RCTs, systematic reviews should not be considered automatically to be of good quality and should be critically appraised.
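For context, the most widely used pooling method in meta-analysis is inverse-variance weighting; a fixed-effect sketch of the standard formula is given below, where \(\hat{\theta}_i\) is the effect estimate from study \(i\) (e.g. a log relative risk) and \(SE_i\) its standard error.

\[ \hat{\theta}_{\text{pooled}} = \frac{\sum_i w_i\,\hat{\theta}_i}{\sum_i w_i}, \qquad w_i = \frac{1}{SE_i^{2}}, \qquad SE\bigl(\hat{\theta}_{\text{pooled}}\bigr) = \frac{1}{\sqrt{\sum_i w_i}} \]

Larger, more precise studies therefore carry more weight, but the pooled estimate is only as trustworthy as the review’s search, selection and quality assessment (random-effects models are used instead when the included studies are heterogeneous).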
Observational Studies: These include cohort studies, case-control studies and cross-sectional studies. These studies lack the controlled design of RCTs, and the evidence which comes from this type of study is therefore considered less robust. In some cases observational studies provide the only evidence available (e.g. for emerging safety issues) and it is therefore also important that we are able to critically appraise these.
Over to you... Now use what you have learned from this presentation and the tools available online to critically appraise a randomised controlled trial and a cohort study.

NES Pharmacy, Critical Appraisal 2011
