V.2.2
Eric Little, PhD
Chief Data Officer
OSTHUS
eric.little@osthus.com
Data Lifecycle Management
Across The Enterprise
Slide 2
Pharma invests in R&D and has to make $ back over subsequent years
▪ Most R&D will fail, so risk is high
Law of Diminishing Returns
▪ R&D productivity is declining
▪ Harder treatments have greater costs, potentially lower returns
▪ Drugs with minimal improvements (not as many blockbusters + generics)
The Pharma Industry Is At A Tipping Point
From: Kelvin Stott - https://endpts.com/pharmas-broken-business-model-an-industry-on-the-brink-of-terminal-decline/
Slide 3
Reduce R&D costs through better use of data
▪ Many experiments are re-run because scientists cannot find existing data
▪ The cost of system integration is much higher than the cost of data integration
▪ Standardization upstream can significantly reduce costs downstream
Once data is available – automate as much as possible
Connect your internal data to other external data sources
▪ Many open-source resources can be modified more easily than they can be built from scratch
How To Help Remedy the Situation
Use the data you have before you generate more!
Start with recurring tasks – workflows, models, query patterns, analytics, etc., then build out!
Don’t reinvent the wheel! Build data communities!
Slide 4
THE MOVE FROM BIG DATA TO
BIG ANALYSIS
STATISTICAL
SEMANTICS
MACHINE
LEARNING
REASONING
Slide 5
Moving to Smart Data
Smart data can be added to existing systems
▪ Does not require replacement of existing tech
Smart data provides a separation of:
▪ Model Layer
▪ Data Layer
Link to the model layer
▪ Leave data in place
▪ Smart data links information from the models to instance-level data
Smart Data uses metadata in order to capture context about data
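
The model/data separation on this slide can be illustrated with a minimal sketch, assuming a toy in-memory setup. The names `MODEL_LAYER`, `DATA_LAYER`, `LINKS`, and `instances_of`, and all sample records, are hypothetical; they only show a model layer linking to data that stays in its source systems.

```python
# Hypothetical illustration: the model layer holds concepts and metadata,
# the data layer stays in its source systems, and links join the two.

# Model layer: concept definitions (metadata only, no measurements).
MODEL_LAYER = {
    "Sample": {"label": "Sample", "description": "Material under test"},
    "Measurement": {"label": "Measurement", "description": "Recorded value"},
}

# Data layer: records left in place in their (simulated) source systems.
DATA_LAYER = {
    "lims": {"row-17": {"sample_id": "S-001", "batch": "B-9"}},
    "instrument_db": {"run-42": {"sample_id": "S-001", "value": 3.2}},
}

# Links: (concept, system, record key). Nothing is copied out of the sources.
LINKS = [
    ("Sample", "lims", "row-17"),
    ("Measurement", "instrument_db", "run-42"),
]


def instances_of(concept):
    """Resolve a concept to its instance-level records through the links."""
    return [
        DATA_LAYER[system][key]
        for linked_concept, system, key in LINKS
        if linked_concept == concept
    ]
```

Calling `instances_of("Measurement")` follows the link into the simulated instrument database and returns the record stored there; the model layer itself never holds the measured value.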
Slide 6
Semantic Spectrum of Knowledge Organization Systems
• Deborah L. McGuinness. "Ontologies Come of Age". In Dieter Fensel, Jim Hendler, Henry Lieberman, and Wolfgang Wahlster, editors. Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential. MIT Press, 2003.
• Michael Uschold and Michael Gruninger. "Ontologies and semantics for seamless connectivity". SIGMOD Rec. 33, 4 (December 2004), 58-64. DOI=http://dx.doi.org/10.1145/1041410.1041420
• Leo Obrst. "The Ontology Spectrum". Book section in Roberto Poli, Michael Healy, and Achilles Kameas, editors. Theory and Applications of Ontology: Computer Applications. Springer Netherlands, 17 Sep 2010.
• Leo Obrst and Mills Davis. "Semantic Wave 2008 Report: Industry Roadmap to Web 3.0 & Multibillion Dollar Market Opportunities". 2008.
Sources
Slide 7
Advantages of Using This Tech
Use cases where customers report distinct improvement:
▪ Better defined terms
• Differentiates between Entities and Labels – more specific data dictionary
▪ Better taxonomic structure
• Hierarchies can be accurately captured – not buried in incorrect tables
▪ Query Federation
• Can easily use multiple data sources (integration)
▪ Query Faceting
• Query results can be easily refined (and shared)
▪ Better use of metadata
• Provides context for users
• Raw data is more valuable over time
▪ Makes data actionable across an enterprise
• Moves from local data (on people’s machines, in their heads) to explicit sharable resources
• Adding SMART DATA to BIG DATA provides the means to access and use the data
• Requires combining logical data with statistical data in order to find patterns of interest inside of large data sets
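
Three of the items above (entities versus labels, query federation, query faceting) can be sketched together in a few lines. This is a toy illustration only: the sources, the `ex:` identifiers, and the function names `federated_query` and `facet` are all hypothetical, and a real deployment would typically query remote stores rather than Python lists.

```python
from collections import Counter

# Hypothetical sources; each record refers to an entity by a stable identifier.
SOURCE_A = [
    {"entity": "ex:Aspirin", "assay": "solubility", "site": "Boston"},
    {"entity": "ex:Aspirin", "assay": "stability", "site": "Berlin"},
]
SOURCE_B = [
    {"entity": "ex:Ibuprofen", "assay": "solubility", "site": "Boston"},
]

# Entities are kept apart from their labels, so synonyms resolve to one entity.
LABELS = {
    "aspirin": "ex:Aspirin",
    "acetylsalicylic acid": "ex:Aspirin",
    "ibuprofen": "ex:Ibuprofen",
}


def federated_query(label, sources):
    """Look up the entity behind a label and collect matches from all sources."""
    entity = LABELS.get(label.lower())
    if entity is None:
        return []
    return [row for source in sources for row in source if row["entity"] == entity]


def facet(rows, field):
    """Count result rows per value of a field, for refining a result set."""
    return Counter(row[field] for row in rows)
```

Searching for either "aspirin" or "acetylsalicylic acid" reaches the same entity, and `facet(rows, "site")` then summarizes the combined results by site so that a user can narrow them down.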
Slide 8
A Semantic Framework can connect the entire enterprise using a common semantics
The Semantic Hub should only focus on metadata (not instance level data)
Benefits: Common Terms, Models, Queries, Rules and Results (End-to-End)
Integrating Data Across the Enterprise
Lab Instruments, Clinical Trials, Regulatory Affairs, Production, eArchiving
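
As a rough sketch of a hub that holds only metadata, the example below maps local field names from two departments onto common terms. The `HUB` contents, the field names, and `to_common_terms` are assumptions made for illustration; the instance-level records are passed in from outside and are never stored in the hub.

```python
# Hypothetical hub: metadata only. It records which local field in which
# system corresponds to which common term; it stores no instance-level data.
HUB = {
    "lab_instruments": {"smp_id": "sample identifier", "rt_min": "retention time"},
    "production": {"SampleNo": "sample identifier", "lot": "batch identifier"},
}


def to_common_terms(system, record):
    """Re-key a record from a source system using the hub's common terms.

    Fields without a mapping keep their local name.
    """
    mapping = HUB[system]
    return {mapping.get(field, field): value for field, value in record.items()}
```

With this mapping, a lab record keyed by `smp_id` and a production record keyed by `SampleNo` both end up under "sample identifier", which is what lets one query or rule apply to both.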
Slide 9
Lab Instrument Use Case –
Allotrope Framework
HPLC – UV
Mobile Phase Selection
Slide 10
Ontology for HPLC Example (Allotrope)
result
device
material
process
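
The four classes named on this slide (result, device, material, process) can be related to one another as subject-predicate-object triples. The sketch below is a toy illustration: the instance names and predicates such as `uses_device` are hypothetical and are not taken from the Allotrope ontologies.

```python
# Triples of (subject, predicate, object). The four class names come from the
# slide; every instance name and predicate here is a hypothetical placeholder.
TRIPLES = [
    ("hplc_run_1", "is_a", "process"),
    ("hplc_system_A", "is_a", "device"),
    ("mobile_phase_X", "is_a", "material"),
    ("chromatogram_1", "is_a", "result"),
    ("hplc_run_1", "uses_device", "hplc_system_A"),
    ("hplc_run_1", "has_input", "mobile_phase_X"),
    ("hplc_run_1", "has_output", "chromatogram_1"),
]


def match(s=None, p=None, o=None):
    """Return all triples matching the given pattern; None acts as a wildcard."""
    return [
        (ts, tp, to)
        for ts, tp, to in TRIPLES
        if (s is None or ts == s) and (p is None or tp == p) and (o is None or to == o)
    ]
```

For example, `match(s="hplc_run_1", p="has_output")` finds the result produced by the run, and `match(p="is_a", o="device")` lists every device.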
Slide 11
Clinical Trials Use Case –
AstraZeneca & MedImmune
Slide 12
Connecting The Dots Across AstraZeneca & MedImmune
For Clinical Trials
Slide 13
FAIR Principles Bring Together Clinical Trials Data Across Phases
Slide 14
Domain Knowledge Is Captured In Models
Slide 15
Production Use Case –
Manufacturing Data Integration
Slide 16
R&D and manufacturing often cannot easily share data
Competing systems can evolve, causing incompatibilities
Manufacturing data is often less complex than R&D data, but significantly higher in throughput
▪ QA/QC plays a major role
▪ Far more interpretation in R&D
▪ Manufacturing needs results fast
• Alarms
• Trends
▪ Manufacturing data is less retrospective
Manufacturing Data Vs. R&D Data
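
The "alarms" and "trends" that manufacturing needs quickly can be illustrated with two small checks over a series of readings. This is a simplified sketch under assumed rules (fixed limits for alarms, a strictly rising run for trends); the function names, limits, and window size are hypothetical and not a description of any particular QA/QC system.

```python
def alarms(values, low, high):
    """Return the indices of readings that fall outside the [low, high] limits."""
    return [i for i, v in enumerate(values) if v < low or v > high]


def rising_trend(values, window=3):
    """Return True if the last `window` readings are strictly increasing."""
    recent = values[-window:]
    if len(recent) < window:
        return False
    return all(a < b for a, b in zip(recent, recent[1:]))
```

An alarm reacts to a single out-of-limit reading, while the trend check can flag a drift before any limit is crossed; both need only the most recent data, which matches the point that manufacturing data is less retrospective.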
Slide 17
Regulatory Use Case –
Unstructured Data Integration
Slide 18
Regulatory compliance requires accessing and mining unstructured data
Linking unstructured data to other data provides significant advantages
▪ Text to DB links unstructured and structured data
▪ Text to Public Data Sources leverages open source research
Regulatory Compliance
Regulatory Documentation
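
A very simple form of "Text to DB" linking is dictionary matching: scan a document for names that also appear in a structured table and return the matching record IDs. The sketch below assumes a toy table and whole-word, case-insensitive matching; `COMPOUND_DB`, its records, and `link_text_to_db` are hypothetical, and production text mining would usually need synonym handling and disambiguation on top.

```python
import re

# Hypothetical structured records keyed by ID.
COMPOUND_DB = {
    "C-100": {"name": "aspirin", "status": "approved"},
    "C-200": {"name": "ibuprofen", "status": "approved"},
}


def link_text_to_db(text, db):
    """Return the IDs of records whose name occurs in the text as a whole word."""
    linked = []
    for record_id, record in db.items():
        pattern = r"\b" + re.escape(record["name"]) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            linked.append(record_id)
    return linked
```

Once a document is linked to record IDs, everything already known about those records in the database becomes reachable from the text, and vice versa.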
Slide 19
E-Archiving: Managing Data
Over Long Lifecycles
Slide 20
Data is made available for easier search and indexing (even after long periods of time)
Archiving is no longer a “vault” concept but is integrated within the Data Management Lifecycle
E-Archiving Using the Allotrope Data Framework
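
One way to keep archived data searchable is to index its metadata when it is archived. The sketch below builds a small inverted index over metadata values; the archive contents and the names `build_index` and `search` are hypothetical, and it does not represent how the Allotrope Data Framework stores or indexes data.

```python
from collections import defaultdict


def build_index(archive):
    """Map each lowercase metadata token to the set of record IDs containing it."""
    index = defaultdict(set)
    for record_id, metadata in archive.items():
        for value in metadata.values():
            for token in str(value).lower().split():
                index[token].add(record_id)
    return index


def search(index, term):
    """Return the sorted record IDs whose metadata contains the term."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical archived records described only by their metadata.
ARCHIVE = {
    "rec-1": {"technique": "HPLC UV", "year": 2012, "site": "Boston"},
    "rec-2": {"technique": "HPLC MS", "year": 2015, "site": "Berlin"},
}
```

After `index = build_index(ARCHIVE)`, a call such as `search(index, "hplc")` finds both records regardless of how long ago they were archived, because the lookup relies on the metadata rather than on the original application.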
Slide 21
Big Analysis Requires Hybrid Architectures
Semantic DBs
Unstructured Docs
Structured Data
Cloud DBs (NoSQL)
Analytics
Dashboards & Reports
Integration Layer
Slide 22
Data Science (machine learning, text analytics, clustering, etc.)
FAIR Data Is Now Accessible For Advanced Analytics
Linked Open Data
& Open APIs
Semantic
Graph DB
(Knowledge Graph)
Operational DBs
…
Unstructured
Documents
Analytics Tools
simulations
statistics
reasoning
Visualization
dashboards
exploration
search
…
Semi-structured
Data
Instrument
Data
Lightweight Semantic Integration Layer
(semantic RMDM, APIs, semantic indexing, data annotation, catalogues, metadata and linking)
Reporting
regulatory
internal
external
Slide 23
CONNECTING DATA, PEOPLE AND ORGANIZATIONS
Contact Information:
Email: eric.little@osthus.com
Web: www.osthus.com
www.biganalysis.com
Twitter: OntoEric
