APPROACHES TO DATA INTEGRATION
AND BRINGING TOGETHER
PRESENTED BY,
SHARADHA.M
I M PHARM
DEPT.OF PHARMACEUTICS
JSS COLLEGE OF PHARMACY
MYSORE
CONTENTS:
1. Approaches To Data Integration Standards
- OMG, I3C (RIP), LSIT & W3C
2. Bringing Together All Three Disciplines
APPROACHES TO DATA INTEGRATION STANDARDS
- OMG, I3C (RIP), LSIT, and W3C
• As stated on the OMG (Object Management Group) website
(http://www.omg.org/), a lack of data standards results
in data conversions, loss of information, and a lack of
interoperability.
• Current standards are XML (Extensible Markup
Language), LSID (Life Sciences Identifiers), and now
RDF (Resource Description Framework) from the W3C
(World Wide Web Consortium), which is extensible
though hard to implement. Substantial work on OO
(object-oriented) modeling of life science data types
takes place at the OMG’s LSR (Life Sciences Research)
group.
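To make the LSID standard mentioned above more concrete, here is a minimal sketch of how such an identifier can be split into its parts. It assumes the usual LSID layout, urn:lsid:authority:namespace:object with an optional trailing revision. The function name `parse_lsid`, the `Lsid` type, and the example identifier are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    """Parts of a Life Sciences Identifier (illustrative type, not a standard API)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text: str) -> Lsid:
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts."""
    parts = text.split(":")
    # An LSID has five colon-separated fields, or six when a revision is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    # The 'urn' and 'lsid' prefixes are matched case-insensitively.
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"LSID has an empty field: {text!r}")
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return Lsid(authority, namespace, object_id, revision)


# Hypothetical identifier: authority, namespace, and object are made up.
example = parse_lsid("urn:lsid:example.org:compounds:12345:2")
print(example.authority, example.namespace, example.object_id, example.revision)
```

The authority part is normally an Internet domain name owned by the organization that issues the identifier, which is what lets different databases mint identifiers without colliding.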
• The OMG adopts and publishes “interface”
specifications. Specifications may also be
chosen from existing products in a competitive selection
process. All interface specifications are freely available
to both members and nonmembers.
• Implementations must be available from an OMG
member. The OMG uses many approaches to the
object-oriented modeling of complex data types.
• The OMG has domain task force (DTF) groups
that deal with these specific types.
• Working groups are formed to address specific areas
of interest within the task force. Of course, whenever
there is potential for reuse of existing standards, it is
positively encouraged!
• The Life Sciences Research Domain Task Force (LSR DTF)
has several working groups: architecture and road map,
biochemical pathways, cheminformatics, gene
expression, sequence analysis, and single nucleotide
polymorphisms.
• Each working group has a corresponding chairperson
who champions requests for proposals (RFPs) from
any interested parties.
• The working group members identify key needs and
help with the building of RFPs from a “boilerplate”
standard document issued by the OMG. Anyone can
submit a letter of intent (LOI) to respond to an RFP;
however, to become a submitter, the organization
must become an OMG member.
• A typical OMG standards adoption process takes about
20 months. The gene expression RFP was issued on
10 March 2000 and became an available specification
on 16 November 2001.
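The 20-month figure can be checked against the two gene expression dates quoted above. The following is a small sketch of that arithmetic. The helper `whole_months_between` is a hypothetical name introduced here, and counting "whole calendar months" is an assumption about how the figure was rounded.

```python
from datetime import date


def whole_months_between(start: date, end: date) -> int:
    """Count complete calendar months from start to end (illustrative helper)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    # If the end day falls before the start day, the last month is incomplete.
    if end.day < start.day:
        months -= 1
    return months


# Dates quoted on the slide for the gene expression RFP.
rfp_issued = date(2000, 3, 10)
spec_available = date(2001, 11, 16)

elapsed_months = whole_months_between(rfp_issued, spec_available)
elapsed_days = (spec_available - rfp_issued).days

print(f"{elapsed_months} whole months ({elapsed_days} days)")
# prints "20 whole months (616 days)"
```

So the gene expression example matches the typical adoption time almost exactly: 20 whole months plus six days.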
• The LECIS (Laboratory Equipment Control Interface
Specification) standard is used by Creon as part of their
Q-DIS data standard support.
• There are many “open” tools out there, too: the
biomolecular sequence analysis standard (BSA) is at the
EBI in the form of Open BSA.
• The bibliographic query service standard (BQS) is
also at the EBI as Open BQS.
• The macromolecular structure standard is supported
by the Protein Data Bank as the Open MM toolkit.
• The reason that LSR works is not technology but
people: participation is essential for organizations,
individuals, and evangelists. OMG’s constitution is
both fair and equitable. Having a well-defined
process that is transparent in operation and allows open
sharing of information is the key to its success.
• The I3C (Interoperable Informatics Infrastructure
Consortium), like its website, no longer functions.
In the main, standards emerge with the
backing of one or two major vendors, and the
consumers follow.
• Very rarely, the consumers rally together and force
change upon the vendors.
• Finally, government bodies enforce mandatory
changes that we struggle to comply with (just ask any
CEO about Sarbanes–Oxley).
• The authors speculate about what would happen if the
FDA (Food and Drug Administration) stated that all
electronic submissions had to be in XML for 21 CFR
Part 11 compliance (Title 21 of the Code of Federal
Regulations, Part 11)!
• This is why information management and knowledge
management are so important to data standards.
Fig: The request for proposals life cycle.
BRINGING TOGETHER ALL THREE DISCIPLINES
• Overcoming the “three big reasons” is the first milestone
in bringing together information and knowledge
management with data standards.
• Domain-specific knowledge is also critical, and
cross-domain knowledge is even better.
• Finding the data architect who understands the process
and workflow of a chemist is like mining for a rare gem
among the seams of coal. These people are hard to find
and harder to retain.
• As expert disciplines mature and become more
accessible to younger scientists, multi-skilled
employees will gradually filter upward. However, as this
will take several years, the most widely used approach is
to lure staff from a parallel organization into the
business.
• The only downside is that new ways of thinking and
innovation are now at a premium.
• As with all successful projects, a small
“proof-of-concept” pilot that addresses key stakeholder
needs is the best way of gathering momentum to achieve
lasting change and progress.
• Fixing the time delay between compound submission
and biology IC50 (inhibitory concentration at which
50% of the enzyme is inhibited) results has a
better-defined scope than building a “science Google”
for all users.
Reference:
COMPUTER APPLICATIONS IN PHARMACEUTICAL
RESEARCH AND DEVELOPMENT BY SEAN EKINS,
Page No. 177-179.
THANK YOU…!
