Better Data Leads to Better Analytics:
Three Ways to Improve Healthcare Data Quality
in an EDW
Written by
Jason B. Buskirk
Chief Operating Officer
Health Care DataWorks
Too often, organizations embark on Enterprise Data Warehouse (EDW) projects with the notion
that all their data needs will be met once the implementation is complete. It is understandable
why this thinking becomes pervasive throughout the organization. Typically, organizations
decide to take on such projects only after lengthy meetings, presentations and reviews
to bring together the myriad interests of their key stakeholders, followed by the due
diligence necessary to secure the funding and select the technology partner. After that much
effort, expectations run very high.
While an EDW undoubtedly will empower organizations to do more with their data than ever before
and the investment will pay dividends in terms of the value it brings, an EDW is only as good as the
data that is fed into it. Every organization will encounter data quality issues during or leading up to
EDW implementation, and these issues can delay the implementation. If data quality problems go
unaddressed, the organization will find that, when it comes time to extract the data, the data is not
as useful as expected. It is important to discover and address data quality issues as early as
possible. Not doing so becomes expensive, both in developers’ time and in the loss of trust
that can follow within the organization. Think of it this way: If you put bad data in,
you get bad data out, and the sooner you find the bad data, the better off your project will be. This
white paper details three ways to improve data quality in an EDW.
Establish realistic expectations
Improving data quality starts with understanding the data challenges and proactively
communicating and working with stakeholders to address potential pitfalls. Taking these steps
will contribute to a successful, cost-efficient and relatively smooth implementation that can
achieve results at a quicker pace.
It is important that everyone in the organization knows that the EDW will only be as effective
as the data that goes into it. This will help manage expectations and reduce potential frustration.
Everyone wants access to data that is relevant and understandable and that ultimately yields
actionable knowledge. But the reality is that an organization will not know how bad its data is until
it begins profiling the data that is to be extracted.
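As a rough illustration of what profiling can look like, the sketch below computes a fill rate and a value frequency count for one column of an extract. The function name, column names and sample rows are hypothetical and are not taken from any particular source system or product; a real profiling effort would run against the actual extracts.

```python
from collections import Counter


def profile_column(rows, column):
    """Summarize how completely and how consistently one column is populated."""
    values = [row.get(column) for row in rows]
    # Treat None and blank strings as missing values.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    fill_rate = len(present) / len(values) if values else 0.0
    return {
        "column": column,
        "rows": len(values),
        "fill_rate": fill_rate,
        "distinct_values": Counter(str(v).strip() for v in present),
    }


# Hypothetical sample extract; real data would come from the source systems.
sample_rows = [
    {"patient_id": "A1", "gender": "male", "discharge_date": "2012-01-04"},
    {"patient_id": "A2", "gender": "2", "discharge_date": ""},
    {"patient_id": "A3", "gender": None, "discharge_date": "2012-01-09"},
    {"patient_id": "A4", "gender": "female", "discharge_date": None},
]

gender_profile = profile_column(sample_rows, "gender")
discharge_profile = profile_column(sample_rows, "discharge_date")

for profile in (gender_profile, discharge_profile):
    print(profile["column"], profile["fill_rate"], dict(profile["distinct_values"]))
```

Even on four rows, a summary like this surfaces both kinds of problem discussed in this paper: columns that are only partly populated, and columns whose values follow more than one coding scheme.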
Know the causes of data issues
Virtually none of the data issues encountered in a data warehouse implementation are
technological in nature – they are operational. These operational causes of data issues generally
fall into two broad categories:
• Data collection requirements. Organizations have multiple systems capturing and
storing their electronic medical records, financial records and human resource information.
But these systems tend to operate in silos. This often contributes to issues around whether
and when data is collected. Some systems may require certain data elements to be populated, while
others may leave the same elements optional. This leads to sparse data sets that
could have limited usefulness in the future.
• Lack of standardization. Because myriad systems are in use and individual departments
can track data in different ways, problems with standardization often arise and take many
forms. For example, two units within a health system may track the same information – patient
gender. In one system, the information is input and categorized as “male” or “female.” In
the other system, gender is input as a “1” or “2.” Even though these issues can and
should be fixed during the extract process, identifying them and deciding how the data
should be stored in the data warehouse takes time, and the organization needs to
account for that time when planning the data warehouse project.
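The gender example above can be sketched as a small translation step in the extract process. This is a minimal illustration, not a prescribed implementation: the target codes "M", "F" and "U" are hypothetical, and the assumption that "1" means male and "2" means female is made only for the example. The real meaning of each source code has to be confirmed with the owners of that system.

```python
# Hypothetical mapping, of the kind a data quality group might agree on.
GENDER_MAP = {
    "male": "M",
    "female": "F",
    "1": "M",  # assumption for illustration: 1 = male
    "2": "F",  # assumption for illustration: 2 = female
}
UNKNOWN = "U"


def standardize_gender(raw):
    """Translate one raw source value to the standard code, or UNKNOWN."""
    if raw is None:
        return UNKNOWN
    key = str(raw).strip().lower()
    return GENDER_MAP.get(key, UNKNOWN)


def standardize_all(raw_values):
    """Return the standardized codes plus the raw values that had no mapping."""
    standardized = [standardize_gender(v) for v in raw_values]
    unmapped = sorted(
        {
            str(v)
            for v in raw_values
            if v is not None and str(v).strip() and standardize_gender(v) == UNKNOWN
        }
    )
    return standardized, unmapped


values, unmapped = standardize_all(["male", "2", "Female", 1, None, "X"])
print(values)    # standardized codes, one per input value
print(unmapped)  # raw values that still need a decision
```

Returning the unmapped values separately, rather than silently defaulting them, gives the organization a concrete list of decisions that still have to be made about how the data should be stored.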
Improve data quality
By taking the following steps before the implementation process begins, organizations can cleanse
and improve the quality of the data, positioning the organization for a successful enterprise data
warehouse project.
• Establish a governance body or data quality group to create consistent standards.
Most organizations do not have this in place prior to an EDW implementation. The body
or group should be composed of stakeholders who know which data is being collected,
how it is being categorized, how and where it is stored, and all the other details critical
to establishing an organization-wide standard. The goal should be to identify “bad” and
non-standardized data. Doing this sooner rather than later can ensure the most
cost-efficient implementation.
• Identify subject matter experts to play an ongoing role in the implementation
process. These should be individuals who understand the data and know how it can
be used. Make them part of the implementation team. They are valuable resources in
that they not only know the data, but also understand how existing operational systems
work. By including them on the team, you will identify data quality issues earlier in your
implementation. Their involvement will also help provide built-in credibility when it
comes time to go live.
It’s also important that these subject matter experts be freed from enough of their
other duties to devote the required attention to the implementation
process. It is an in-kind investment that is worthwhile because of the positive outcome
that will result.
• Standardize your data model up front. Having a
data model up front will not only accelerate the data
warehouse’s implementation timeline, it also will assist
the organization with the data issues mentioned earlier
by connecting multiple and disparate source systems.
Remember, data elements will be captured
inconsistently by different operational systems. When
the data model is populated, it will have a place to store
each data element regardless of the source system.
Data quality rules can then be implemented to populate
each element based on which source systems actually capture it.
In the example of gender mentioned earlier, the same
data element may be stored using different data values.
Possessing and populating a robust data model will
force an organization to standardize these data
elements and serve as a blueprint for how these data
elements should be handled. In this example, the data
model will have a conformed dimension to standardize
gender values.
Organizations have two options for obtaining a data model: They can build their own data
model or buy one. Health Care DataWorks, for instance, offers a mature data model that
has been proven over many years of effective use. Regardless of how the organization proceeds,
the data model needs to be in place up front in order for an organization to be ready for
the data quality issues that it should expect.
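To make the idea of a conformed dimension and an availability-based data quality rule more concrete, here is a minimal sketch. Everything named in it is hypothetical: the source system names ("emr", "billing"), the surrogate keys, the source priority order and the meaning of the numeric codes are assumptions made for illustration, and none of it describes any vendor's actual data model.

```python
# Conformed gender dimension: standard code -> hypothetical surrogate key.
GENDER_DIM = {"M": 1, "F": 2, "U": 3}

# Per-source translation tables. The numeric code meanings are an assumption.
SOURCE_CODE_MAPS = {
    "emr": {"male": "M", "female": "F"},
    "billing": {"1": "M", "2": "F"},
}

# Data quality rule: prefer the first source that actually holds a usable value.
SOURCE_PRIORITY = ["emr", "billing"]


def conformed_gender_key(raw_by_source):
    """Pick a gender dimension key from whichever source has a usable value."""
    for source in SOURCE_PRIORITY:
        raw = raw_by_source.get(source)
        if raw is None or str(raw).strip() == "":
            continue  # element not captured in this system; try the next one
        code = SOURCE_CODE_MAPS[source].get(str(raw).strip().lower())
        if code is not None:
            return GENDER_DIM[code]
    return GENDER_DIM["U"]  # nothing usable in any source


print(conformed_gender_key({"emr": "Female", "billing": "1"}))
print(conformed_gender_key({"emr": "", "billing": "1"}))
print(conformed_gender_key({}))
```

Because every source value resolves to a key in the same dimension, reports built on the warehouse see one set of gender values regardless of which operational system supplied the record.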
Conclusion
Organizations can expect data quality challenges when undertaking an EDW implementation. But
when they understand the potential pitfalls, remain committed to improving the quality of data, and
involve their internal experts and users in the process, they will be well on the way to adding value
to the entire organization in the most cost-effective and timely manner.
About the Author
Jason Buskirk is responsible for managing the day-to-day operations of Health Care DataWorks
(HCD) and leading product strategies for the company's pre-built analytics applications. He is one
of the company's founders.
Prior to HCD, Buskirk worked for Deloitte Consulting, where he implemented analytic applications
built using Oracle's Business Intelligence Enterprise Edition. Buskirk also served as Manager of
the Information Warehouse and Research Information Systems at the Wexner Medical Center at
The Ohio State University.
Buskirk holds a bachelor's degree in computer information systems from DeVry University.
About Health Care DataWorks
Health Care DataWorks, Inc., a leading provider of business intelligence solutions, empowers
healthcare organizations to improve their quality of care and reduce costs. Through its pioneering
KnowledgeEdge™ product suite, including its enterprise data model, analytic dashboards,
applications, and reports, Health Care DataWorks delivers an Enterprise Data Warehouse
necessary for hospitals and health systems to effectively and efficiently gain deeper insights
into their operations. For more information, visit www.hcdataworks.com.
Contacting Health Care DataWorks
Published: February 2012
©2012 Health Care DataWorks, Inc. ALL RIGHTS RESERVED
Phone
1-877-979-HCDW (4239)
Email
info@hcdataworks.com
Address
1801 Watermark Drive,
Suite 250
Columbus, OH 43215
Web
www.hcdataworks.com
