Applied use of CUSUMs in surveillance
Paul Aylin, Co-Director of the Dr Foster Unit, Imperial College
London
Chair: Chris Sherlaw-Johnson, Senior Research Analyst,
Nuffield Trust
Applied use of CUSUMs in surveillance
Professor Paul Aylin
Dr Foster Unit at Imperial College London
p.aylin@imperial.ac.uk
Measuring Performance in Institutions
• Background
• Risk adjustment
• Analysis
• Alerting system
In an attempt to arrive at the truth, I
have applied everywhere for
information but in scarcely an
instance have I been able to obtain
hospital records fit for any purpose of
comparison. If they could be obtained
they would enable us to answer many
questions. They would show
subscribers how their money was
being spent, what amount of good
was really being done with it or
whether the money was not doing
mischief rather than good.
Florence Nightingale, 1863
Where healthcare measurement began
Key events
• Heart operations at the Bristol Royal Infirmary
“Inadequate care for one third of children”
• Harold Shipman
Murdered more than 200 patients
“The Killing Fields”
Could we detect Shipman by looking at the data?
• Provided with data for over 1000 GPs in
five health authorities
• One GP was Shipman, but we were
blinded as to which one.
• Investigate methods for routine
surveillance of mortality data at the
level of General Practitioner (GP),
Practice and Health Authority (HA).
Prospective surveillance and multiple testing
• Different to Bristol
• No prior hypothesis
• Statistical process control charts (SPC) among
the most widely used methods for sequential
analysis
CUSUM charts for practice level mortality rates
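A minimal sketch of how a chart of this kind could be computed, assuming a risk-adjusted log-likelihood-ratio CUSUM. The slides do not say which CUSUM variant was used, so the function name `risk_adjusted_cusum`, the odds ratio of 2 and the alert threshold of 4.0 are illustrative assumptions rather than the settings behind the charts shown.

```python
import math

def risk_adjusted_cusum(outcomes, expected_probs, odds_ratio=2.0, threshold=4.0):
    """Return the CUSUM path and the index of the first alert (or None).

    outcomes       : 1 if the patient died, 0 otherwise, in time order
    expected_probs : model-predicted probability of death for each patient
    odds_ratio     : increase in the odds of death the chart is tuned to detect (assumed)
    threshold      : the chart signals when it reaches this value (assumed)
    """
    path, alert_at, s = [], None, 0.0
    for i, (died, p) in enumerate(zip(outcomes, expected_probs)):
        # Log-likelihood ratio of "odds multiplied by odds_ratio" versus "as expected"
        denom = 1.0 - p + odds_ratio * p
        weight = math.log(odds_ratio / denom) if died else math.log(1.0 / denom)
        # The chart accumulates evidence but is never allowed to drop below zero
        s = max(0.0, s + weight)
        path.append(s)
        if alert_at is None and s >= threshold:
            alert_at = i
    return path, alert_at

# Hypothetical run: ten consecutive deaths among patients with a 10% expected risk
path, alert_at = risk_adjusted_cusum([1] * 10, [0.1] * 10)
print(alert_at)
```

Each death pushes the chart up and each survival pulls it down, by an amount that depends on how risky the patient was expected to be. Lowering the threshold gives earlier alerts at the cost of more false alarms.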
Key lessons
• Routine hospital admissions data are
good enough for performance
monitoring (Bristol)
• Statistical methods exist for spotting
outliers and detecting improvement
(Shipman)
• Make use of multiple data sources
• Risk adjustment important
Casemix
• Not all patients are alike: age, sex,
illness severity, comorbidity, frailty…
• Many of these factors affect your risk of
clinical events e.g. death, complications
• The distribution of these casemix
factors varies by hospital
• Comparing hospitals’ crude death rates
will therefore be misleading -> risk
adjust
Tricky. Why not compare processes of care instead?
• Patients care about outcomes
• Process measures describe what was
done to patient (drugs given, guidelines
followed, scans done, advice given)
• A large number of these comprise
quality of care, but which ones really
affect the outcome? Which ones should
we monitor?
• Less available in electronic data
Challenges - Casemix adjustment
Limited within administrative data?
• Age
• Sex
• Emergency/Elective
Data used in our UK risk modelling
• National routine hospital admissions
data (HES), updated monthly
• 15m records annually, 300+ fields
• Dx, ops, age, sex, emergency status,
area code, deaths. No lab or drugs
• ICD10 and OPCS coding systems
• Easy(ish) access, cheap, comp
• Quality much improved but varies
Casemix adjustment methods: evolving science/art
• Indirect standardisation by age and sex
-> SMR (19th Century to date)
• Computing and stats algorithms ->
regression methods (1970s onwards)
• Machine learning methods (1980s
onwards)
• We need i) the right list of casemix factors,
ii) defined and combined in the best way…
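As an illustration of the oldest method in the list above, here is a minimal sketch of indirect standardisation by age and sex. The reference death rates and admission counts are made-up numbers, and `indirect_smr` is a hypothetical helper name.

```python
def indirect_smr(observed_deaths, admissions_by_stratum, reference_rates):
    """Apply reference (e.g. national) stratum death rates to one hospital's admissions.

    Returns (SMR, expected deaths).
    """
    expected = sum(n * reference_rates[stratum]
                   for stratum, n in admissions_by_stratum.items())
    return observed_deaths / expected, expected

# Hypothetical reference death rates by (age band, sex)
reference_rates = {("<65", "F"): 0.01, ("<65", "M"): 0.02,
                   ("65+", "F"): 0.05, ("65+", "M"): 0.06}
# Hypothetical admissions for one hospital in the same strata
hospital_admissions = {("<65", "F"): 1000, ("<65", "M"): 1000,
                       ("65+", "F"): 500, ("65+", "M"): 500}

smr, expected = indirect_smr(102, hospital_admissions, reference_rates)
print(f"expected deaths {expected:.1f}, SMR {smr:.2f}")
```

An SMR above 1 means more deaths than the reference rates would predict for this hospital's age and sex mix. It says nothing about factors that were not in the strata.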
How does risk adjustment work for mortality?
• For each hospital, get actual
(‘observed’) number of deaths and
divide by predicted (or ‘expected’)
number of deaths: this is the SMR
• Derive expected number from risk
model where each patient’s probability
of death is estimated depending on
their set of casemix factors: sum these
probabilities of death for each hospital
to give the expected number of deaths
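The calculation described above can be sketched as follows. The records are hypothetical, and the predicted probabilities are assumed to come from a risk model fitted elsewhere.

```python
from collections import defaultdict

def smr_by_hospital(records):
    """records: iterable of (hospital, died, predicted_probability_of_death).

    Returns {hospital: observed deaths / expected deaths}.
    """
    observed = defaultdict(int)
    expected = defaultdict(float)
    for hospital, died, prob in records:
        observed[hospital] += died   # actual ('observed') deaths
        expected[hospital] += prob   # sum of patient-level predicted probabilities
    return {h: observed[h] / expected[h] for h in observed}

# Hypothetical patient-level records
records = [
    ("A", 1, 0.5), ("A", 0, 0.25), ("A", 0, 0.25),
    ("B", 1, 0.25), ("B", 1, 0.25),
]
print(smr_by_hospital(records))
```

Hospital A has one death against one expected; hospital B has two deaths against half a death expected, so its ratio is much higher.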
What might go into these risk models?
• age
• sex
• elective status
• socio-economic deprivation
• diagnosis subgroups or procedure subgroups
• comorbidity – e.g. Charlson, Elixhauser, Holman, DRG…
• number of prior emergency admissions
• palliative care
• year
• month of admission
• ethnic group
• source of admission (own home, care home, other hospital)
Our approach to risk modelling
• Build one model per outcome and per patient
(dx/op) group
• This allows for e.g. age to affect post-op
mortality differently from readmission for
stroke
• Try all the candidate variables and key
interactions and drop the unimportant ones
• Automate age grouping and technical issues
to allow industrial scale
• Update models at least yearly: things change
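A minimal sketch of the "one model per outcome and per patient group" idea, assuming logistic regression as the model form (the slides do not name it here). The fitting routine is a plain gradient-ascent stand-in for a statistics package, variable selection and age grouping are left out, and all names and data are hypothetical.

```python
import math
from collections import defaultdict

def fit_logistic(rows, n_iter=5000, lr=1.0):
    """rows: list of (features, event). Returns [intercept, coef_1, ...]."""
    beta = [0.0] * (len(rows[0][0]) + 1)
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for features, event in rows:
            x = [1.0] + list(features)
            p = 1.0 / (1.0 + math.exp(-sum(b * xi for b, xi in zip(beta, x))))
            for j, xj in enumerate(x):
                grad[j] += (event - p) * xj   # gradient of the log-likelihood
        beta = [b + lr * g / len(rows) for b, g in zip(beta, grad)]
    return beta

def fit_models_per_group(records):
    """records: list of (outcome, patient_group, features, event)."""
    grouped = defaultdict(list)
    for outcome, group, features, event in records:
        grouped[(outcome, group)].append((features, event))
    # A separate set of coefficients for every (outcome, patient group) pair
    return {key: fit_logistic(rows) for key, rows in grouped.items()}

# Hypothetical data: one feature (emergency admission = 1), outcome = death after stroke
records = (
    [("death", "stroke", (0,), 1)] * 1 + [("death", "stroke", (0,), 0)] * 19
    + [("death", "stroke", (1,), 1)] * 4 + [("death", "stroke", (1,), 0)] * 16
)
models = fit_models_per_group(records)
print(models[("death", "stroke")])
```

Because each group gets its own coefficients, the effect of a factor such as age or emergency status is free to differ between, say, post-operative mortality and readmission for stroke.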
Comorbidity adjustment
• MANY approaches tried: count ICD
codes; count ‘common’ or ‘important’ ICD
codes; count comorbidities; weight a subset
of comorbidities (which ones? What
weights?)
• Common indices include Charlson
(1987), Elixhauser (1998). Which is
better? Depends on outcome and patient group
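To make the difference between counting and weighting concrete, here is a minimal sketch. The condition names and weights are illustrative assumptions only; they are not the published Charlson or Elixhauser values.

```python
def comorbidity_scores(patient_conditions, weights):
    """Return (simple count, weighted score) over the conditions listed in `weights`."""
    recognised = [c for c in set(patient_conditions) if c in weights]
    count = len(recognised)
    weighted = sum(weights[c] for c in recognised)
    return count, weighted

# Hypothetical weights: a larger weight marks a condition assumed to raise risk more
illustrative_weights = {
    "heart_failure": 1,
    "diabetes": 1,
    "renal_disease": 2,
    "metastatic_cancer": 6,
}

patient = {"diabetes", "metastatic_cancer", "hypertension"}
print(comorbidity_scores(patient, illustrative_weights))
```

Two patients with the same count can have very different weighted scores, which is why the choice of subset and weights matters.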
How successful is the casemix adjustment?
ROC curve areas comparing ‘simple’, ‘intermediate’ and ‘complex’ models
derived from HES with models derived from clinical databases
[Figure: ROC curve area (axis from 0.5 to 1) by index procedure (CABG; AAA - unruptured; AAA - ruptured; colorectal excision for cancer) for four models: HES simple model (year, age, sex); HES intermediate model (including method of admission); HES full model; best model derived from clinical dataset.]
Aylin P; Bottle A; Majeed A. Use of administrative data or clinical databases as predictors of risk of death in hospital:
comparison of models. BMJ 2007;334: 1044
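For readers unfamiliar with the measure, the area under the ROC curve is the probability that a randomly chosen patient who died was given a higher predicted risk than a randomly chosen survivor. A minimal sketch of that definition follows, using made-up predictions; `roc_area` is a hypothetical helper, not the code used in the cited paper.

```python
def roc_area(died, predicted):
    """Area under the ROC curve by pairwise comparison (ties count as half)."""
    deaths = [p for d, p in zip(died, predicted) if d]
    survivors = [p for d, p in zip(died, predicted) if not d]
    wins = 0.0
    for p_death in deaths:
        for p_survivor in survivors:
            if p_death > p_survivor:
                wins += 1.0
            elif p_death == p_survivor:
                wins += 0.5
    return wins / (len(deaths) * len(survivors))

# Hypothetical outcomes and model predictions for four patients
print(roc_area([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.5]))
```

A value of 0.5 means the model discriminates no better than chance, and 1 means every death was ranked above every survivor.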
Comparison of HES vs clinical databases
Vascular surgery
• HES = 32,242
• National Vascular Database = 8,462
Aylin P; Lees T; Baker S; Prytherch D; Ashley S. Descriptive study
comparing routine hospital administrative data with the Vascular Society of
Great Britain and Ireland's National Vascular Database. Eur J Vasc
Endovasc Surg 2007;33:461-465
Bowel resection for colorectal cancer
• HES 2001/2 = 16,346
• ACPGBI 2001/2 = 7,635
• In the ACPGBI database, 39% of patients had missing data for the risk factors
Garout M; Tilney H; Aylin P. Comparison of administrative data with the
Association of Coloproctology of Great Britain and Ireland (ACPGBI)
colorectal cancer database. International Journal of Colorectal Disease
2008;23(2):155-63
How to identify outliers?
“Even if all surgeons are equally good, about half will
have below average results, one will have the worst
results, and the worst results will be a long way below
average”
• Poloniecki J. BMJ 1998;316:1734-1736
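The point of the quotation can be checked with a small simulation. This is a sketch under stated assumptions: 100 surgeons, 200 operations each and an identical true death rate of 5% are made-up values, and `simulate_equal_surgeons` is a hypothetical helper.

```python
import random

def simulate_equal_surgeons(n_surgeons=100, n_ops=200, true_rate=0.05, seed=1):
    """Observed death rates for surgeons who all share the same true rate."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_surgeons):
        deaths = sum(rng.random() < true_rate for _ in range(n_ops))
        rates.append(deaths / n_ops)
    return rates

rates = simulate_equal_surgeons()
mean_rate = sum(rates) / len(rates)
worse_than_average = sum(r > mean_rate for r in rates)
print(f"mean {mean_rate:.3f}, worst {max(rates):.3f}, "
      f"surgeons worse than average: {worse_than_average}")
```

All of the spread in the output is chance, which is why an outlier rule has to allow for the number of cases behind each rate.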
Funnel plot for surgeon-level adjusted return to theatre (RTT)
rates for hip replacement.
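A minimal sketch of how funnel control limits can be drawn around a target rate, assuming the normal approximation to the binomial. The slide does not state how its limits were derived or how the rates were adjusted, so the function names and the `z` multiplier are assumptions, and crude rates are used for simplicity.

```python
import math

def funnel_limits(target_rate, n_cases, z=1.96):
    """Lower and upper control limits for a unit with n_cases (normal approximation)."""
    se = math.sqrt(target_rate * (1.0 - target_rate) / n_cases)
    return max(0.0, target_rate - z * se), min(1.0, target_rate + z * se)

def outside_funnel(events, n_cases, target_rate, z=1.96):
    """True if the unit's observed rate falls outside the control limits."""
    lower, upper = funnel_limits(target_rate, n_cases, z)
    rate = events / n_cases
    return rate < lower or rate > upper

# Hypothetical surgeon: 20 returns to theatre in 100 operations, target rate 10%
print(funnel_limits(0.10, 100, z=2.0))
print(outside_funnel(20, 100, 0.10, z=2.0))
```

The limits narrow as the number of cases grows, which gives the plot its funnel shape: a surgeon with few cases needs a far more extreme rate to fall outside them.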
Pyramid Model Of Investigation To Find Credible Cause
Lilford et al. Lancet 2004; 363: 1147-54
1st Step: Does the coding reflect what happened to the patient?
2nd Step: Has something occurred locally to affect your casemix?
3rd Step: The Local Health Economy may treat patients differently than the rest of the country/region e.g. provision of hospices, etc
4th Step: Examine when other issues have occurred
Finally: An individual is rarely the cause of an alert. A Consultant name may be coded against the primary diagnosis but many individuals and teams are involved in the patient’s care