Data Analytics
Philip E. Bourne, PhD, FACMI
Associate Director for Data Science
National Institutes of Health
UM iSchool
November 3, 2015
Pre-reading
Health Informatics: Practical Guide for
Healthcare and Information
Technology Professionals
Chapter 3 Healthcare Data Analytics
William Hersh
This is a Conversation NOT a Lecture
We Have Been Successful
World Climate Report 2011
http://www.cnet.com/news/china-unseats-u-s-in-supercomputer-ranking/
Why Now?
Harnessing Data to Improve Health:
BD2K (Big Data to Knowledge)
NIH’s 6-year initiative to use data science to foster an
open digital ecosystem that will accelerate efficient,
cost-effective biomedical research to enhance health,
lengthen life, and reduce illness and disability
Programs and activities:
Advance discovery for biomedical research
Facilitate use and re-use of biomedical data
Develop analytical methods and software
Enhance biomedical data science training
Big Data in the Life Sciences …
This speaks to something more
fundamental than more data …
It speaks to new methodologies, new
skills, new emphasis, new cultures,
new modes of discovery …
The History of Computational
Biomedicine According to Bourne
Timeline: 1980s; 1990s; 2000s; 2010s; 2020
Discipline: Unknown; Expt. Driven; Emergent; Over-sold; A Service; A Partner; A Driver
The Raw Material: Non-existent; Limited/Poor; More/Ontologies; Big Data/Siloed; Open/Integrated
The People: No name; Technicians; Industry recognition; Data scientists; Academics
Searls (ed) The Roots in Bioinformatics Series PLOS Comp Biol
Consider what the expert
prophets are saying …
We are at a Point of Deception …
• Evidence:
– Google car
– 3D printers
– Waze
– Robotics
– Sensors
From: The Second Machine Age: Work, Progress,
and Prosperity in a Time of Brilliant Technologies
by Erik Brynjolfsson & Andrew McAfee
Example - Photography
Stages: Digitization, Deception, Disruption, Demonetization, Dematerialization, Democratization
Axes: Time; Volume, Velocity, Variety
Annotations along the curve:
– Digital camera invented by Kodak but shelved
– Megapixels & quality improve slowly; Kodak slow to react
– Film market collapses; Kodak goes bankrupt
– Phones replace cameras
– Instagram, Flickr become the value proposition
– Digital media becomes bona fide form of communication
We Are At a Point of Deception
The 6D Exponential Framework
Digitization of Basic & Clinical Research & EHRs
Deception
We Are Here
Disruption
Demonetization
Dematerialization
Democratization
Open science
Patient centered health care
What Are Some General Implications
of Such a Future?
• Open, collaborative science becomes increasingly important
• Data and associated analytics become increasingly valuable to scholarship
• Opportunities exist to improve the efficiency of the research enterprise and hence fund more research
• Current training content and modalities will not match supply to demand
• Balancing accessibility against security becomes more important yet more complex
An Example of That Promise:
Comorbidity Network for 6.2M Danes
Over 14.9 Years
Jensen et al 2014 Nat Comm 5:4022
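One common way to turn diagnosis records into a comorbidity network is to score each pair of diagnoses by how much more often they co-occur than independence would predict. The sketch below is illustrative only: the toy records, the diagnosis labels, and the function name `comorbidity_edges` are assumptions, and the relative-risk score is a generic measure, not necessarily the method used by Jensen et al.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records: patient id -> set of diagnoses (illustrative only).
patients = {
    "p1": {"diabetes", "hypertension"},
    "p2": {"diabetes", "hypertension", "copd"},
    "p3": {"copd"},
    "p4": {"hypertension"},
}

def comorbidity_edges(records):
    """Return {(a, b): relative_risk} for each diagnosis pair seen together."""
    n = len(records)
    # Number of patients carrying each diagnosis.
    prevalence = Counter(d for dx in records.values() for d in dx)
    # Number of patients carrying both diagnoses of a pair.
    pair_counts = Counter(
        pair for dx in records.values() for pair in combinations(sorted(dx), 2)
    )
    # Relative risk above 1 means the pair co-occurs more often than
    # expected if the two diagnoses were independent.
    return {
        (a, b): (count * n) / (prevalence[a] * prevalence[b])
        for (a, b), count in pair_counts.items()
    }

edges = comorbidity_edges(patients)
```

Each key in `edges` is an edge of the network and the value is its weight; a population-scale analysis would also need significance testing and temporal ordering, which this sketch leaves out.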
Data Analytics
“And that’s why we’re here today. Because something
called precision medicine … gives us one of the greatest
opportunities for new medical breakthroughs that we
have ever seen.”
President Barack Obama
January 30, 2015
Precision Medicine Initiative
• National Research Cohort
– >1 million U.S. volunteers
– Numerous existing cohorts (many funded by NIH)
– New volunteers
• Participants will be centrally involved in design and implementation of the cohort
• They will be able to share genomic data, lifestyle information, biological samples – all linked to their electronic health records
Data Analytics
Center of Excellence for Mobile
Sensor Data-to-Knowledge (MD2K)
Santosh Kumar, Ph.D.
Director, MD2K Center of Excellence
Professor & Moss Chair of Excellence in Computer Science
University of Memphis
https://datascience.nih.gov/bd2k/funded-programs/centers
MD2K Applications – CHF and Smoking
Example: BD2K Center Working Across Strategic Areas
Strategic Areas: Sustainability; Workforce Development & Diversity; Discovery & Innovation; Policy & Process; Leadership
Examples from the center:
– Research Objects in the Commons
– Voxel Wide Genome Scanning
– MRI standardization
– Over 100 Public Lectures
– Collaboration with a Minority Institution
– 185 Institutions Involved
– Genomic Data Sharing Policy
BD2K Targeted Software Topics
Supports innovative analytical methods and software tools
that address critical current and emerging needs of
biomedical research
2015 Topics (18 awards, U01s)
– Data Compression
– Data Provenance
– Data Visualization
– Data Wrangling
2016 Topics (U01s, under review)
– Data Privacy
– Data Repurposing
– Applying Metadata
– 2016: Crowdsourcing and interactive Digital Media
(UH2)
Goal: To strengthen the ability of a
diverse biomedical workforce to develop
and benefit from data science
Key Chapter Points
• Provenance
• Proof of the value of analytics is still forming
• Workforce shortage
Data Analytics
I not only use all the brains
I have, but all I can borrow.
– Woodrow Wilson
The Team
NIH…
Turning Discovery Into Health
philip.bourne@nih.gov
https://datascience.nih.gov/
http://www.ncbi.nlm.nih.gov/research/staff/bourne/


Editor's Notes

  • #7: Updated by ADDS group 8/25/15
  • #15: 16 million hospital inpatient events (24.5% of total), 35 million outpatient clinic events (53.6% of total) and 14 million emergency department events (21.9% of total)
  • #17: Photos: FC tweet; RK screen grab
  • #18: Images of people from Infographic (NOTE: Image is just a placeholder; Jill will tweak). Detailed Notes: National Research Cohort <<OR name of study>>; >1 million U.S. volunteers committed to participating in research; will combine a number of existing cohorts; will include Dept of Veterans Affairs Million Veteran Program (note: Veteran is singular per http://www.research.va.gov/MVP/)
  • #22: Detected 8 genetic variants influencing volume of brain structures to provide insight into brain development and neuropsychiatric dysfunction. MRI images from >30,000 people. Meta-analysis of GWAS data from >13,000 people. Replicated results with data from >17,000 people. Designed standardized protocols for image analysis, quality assessment, genetic imputation, and association. Developed 3D models for 1,500 subjects. Used freely available software for measurements.
  • #24: Short term: produce a searchable catalog of physical and virtual courses; fund diversity awards to work with BD2K Centers; expand IRP training started Jan 2015, e.g. Software Carpentry and Train the Trainers. Long term: evaluation.
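
The percentages in note #15 can be roughly reproduced from the rounded event counts. The sketch below is only a sanity check: the dictionary keys and the function name `shares` are assumptions, and the small differences from the quoted figures presumably arise because the note's percentages were computed from unrounded counts.

```python
# Rounded event counts in millions, as quoted in note #15.
events_millions = {"inpatient": 16, "outpatient": 35, "emergency": 14}

def shares(counts):
    """Return each category's share of the total as a percentage (1 decimal)."""
    total = sum(counts.values())
    return {name: round(100 * value / total, 1) for name, value in counts.items()}

pct = shares(events_millions)
# Compare against the quoted 24.5%, 53.6% and 21.9%.
for name, value in pct.items():
    print(f"{name}: {value}%")
```

The recomputed shares land within about half a percentage point of the quoted ones, which is consistent with rounding the counts to whole millions.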