The EPA CompTox Dashboard as a Data
Integration Hub for Environmental
Chemistry Data
Antony Williams¹, Andrew McEachran², Imran Shah¹, Richard Judson¹,
John Wambaugh¹, Nancy Baker³, George Helman², Chris Grulke¹,
Kamel Mansouri⁴, Grace Patlewicz¹, Ann Richard¹,
Jeremy Dunne¹ and Jeff Edwards¹
1) National Center for Computational Toxicology, U.S. Environmental Protection Agency, RTP, NC
2) Oak Ridge Institute for Science and Education (ORISE) Research Participant, RTP, NC
3) Leidos, RTP, NC
4) Integrated Laboratory Systems, Inc., RTP, NC
August 2018
ACS Fall Meeting, Boston
The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA
http://orcid.org/0000-0003-1423-330X
ENVR 152
National Center for Computational Toxicology
• National Center for Computational Toxicology (NCCT)
established in 2005 to integrate:
– High-throughput and high-content technologies
– Modern molecular biology
– Data mining and statistical modeling
– Computational biology and chemistry
• Researching computational approaches to rapidly
evaluate chemicals for potential risk
• Outputs: data, models, algorithms and
software applications
The CompTox Chemistry Dashboard
• A publicly accessible website providing:
– New entry portal for all NCCT dashboards
– ~762,000 chemicals with related property data
– Searchable by chemical, product use, gene and assay
(ToxCast)
– Experimental and predicted physicochemical property data
– “Bioactivity data” for the ToxCast/Tox21 project
– Generalized Read-Across (GenRA) module
– Links to other agency websites and public data resources
– “Literature” searches for chemicals using public resources
– “Batch searching” for thousands of chemicals
– DOWNLOADABLE Open Data for reuse and repurposing
2
CompTox Portal
3
CompTox Dashboard
https://comptox.epa.gov/dashboard
4
CompTox Dashboard
Chemicals
5
CompTox Dashboard
Products and Use Categories
6
CompTox Dashboard
Assays and Genes
7
Detailed Chemical Pages
8
Physicochemical properties
9
OPERA Predicted Properties
10
Detailed OPERA Prediction Reports
11
Access to Chemical Hazard Data
12
In Vitro Bioassay Screening
ToxCast and Tox21
13
In Vitro Bioassay Screening
ToxCast and Tox21
14
In Vitro Bioassay Screening
ToxCast and Tox21
15
Sources of Exposure to Chemicals
16
Sources of Exposure to Chemicals
17
Generalized Read-Across (GenRA)
18
Identifiers to Support Searches
19
Literature Searches and Links
20
Abstract Sifter – PubMed Integration
21
Abstract Sifter for Excel
22
External Links to ~80 websites
23
Integrated Linkouts
24
Not just chemical “structures”
• Chemicals in commerce, of interest to the
EPA, are not all easily represented by
structures
• Different chemical substances supported
– Chemical structures
– “UVCB chemicals” - Unknown or Variable Composition,
Complex Reaction Products and Biological Materials
– Metabolites and transformation products
– Homologous series as Markush Structures
– Curated classes of chemicals
25
UVCB Chemicals
26
Markush Structures
27
Transformation Products
28
Not just chemical “structures”
• Different chemical substances supported
– Chemical structures
– “UVCB chemicals” - Unknown or Variable Composition,
Complex Reaction Products and Biological Materials
– Metabolites and transformation products
– Homologous series as Markush Structures
– Curated classes of chemicals
• Lists of chemicals
– Submitted lists of chemicals – Federal, State, International
and other general lists
– Growing lists to support specific projects – e.g. ToxCast
phases, algal toxins, our publication datasets
29
List of Chemicals
30
Algal Toxins
31
Mass and Formula Searches
Supporting Mass Spectrometry
32
Advanced Searches
Mass Based Search
33
Advanced Searches
Mass Based Search
34
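A mass-based search of this kind generally matches a measured monoisotopic mass against candidate masses within a tolerance expressed in ppm. The sketch below illustrates that idea only. The function names, the small element table and the candidate list are assumptions made for illustration and are not Dashboard code or a Dashboard API.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope; a small illustrative subset.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "S": 31.97207100,
    "Cl": 34.96885268,
}


def monoisotopic_mass(formula: str) -> float:
    """Sum isotope masses for a simple formula such as 'C8H10N4O2'.

    Brackets, charges and isotope labels are not handled in this sketch.
    """
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def ppm_window(mass: float, ppm: float) -> tuple[float, float]:
    """Return the (low, high) mass range for a +/- ppm tolerance."""
    delta = mass * ppm / 1e6
    return mass - delta, mass + delta


def search_by_mass(candidates: dict[str, str], query_mass: float, ppm: float = 5.0) -> list[str]:
    """Return names of candidates whose monoisotopic mass lies inside the window."""
    low, high = ppm_window(query_mass, ppm)
    return sorted(
        name
        for name, formula in candidates.items()
        if low <= monoisotopic_mass(formula) <= high
    )


# Hypothetical candidate list: name -> molecular formula.
CANDIDATES = {"caffeine": "C8H10N4O2", "theophylline": "C7H8N4O2"}
hits = search_by_mass(CANDIDATES, query_mass=194.0804, ppm=5.0)
```

A formula search is the complementary case: all candidates sharing a formula share one monoisotopic mass, so they cannot be separated by mass alone and need further ranking.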
Batch Searching
• Singleton searches are useful but we work
with thousands of chemicals!
• Typical questions
– What are the SMILES strings for a list of 1000 chemicals?
– Do any chemicals in this list have a given type of data?
– What are the predicted logP values for a list of chemicals?
– Can I get chemical lists in Excel files? In SDF files?
35
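Once a batch search result has been downloaded, the typical questions above reduce to simple table lookups. The sketch below assumes a CSV export with hypothetical column names (INPUT, SMILES, LOGP_PRED); the real export columns depend on the options chosen in the batch search, and the values shown are illustrative placeholders, not Dashboard output.

```python
import csv
import io

# Illustrative export; column names and values are assumptions, not Dashboard output.
EXAMPLE_EXPORT = """INPUT,SMILES,LOGP_PRED
Caffeine,CN1C=NC2=C1C(=O)N(C)C(=O)N2C,-0.07
Benzene,c1ccccc1,2.13
Unknown substance,,
"""


def summarize_batch(csv_text, id_col="INPUT", smiles_col="SMILES", logp_col="LOGP_PRED"):
    """Return ({id: SMILES}, {id: logP}, [ids with no usable logP value])."""
    smiles, logp, missing = {}, {}, []
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row[id_col]
        if row.get(smiles_col):
            smiles[key] = row[smiles_col]
        value = row.get(logp_col) or ""
        try:
            logp[key] = float(value)
        except ValueError:
            # Empty or non-numeric cell: record that this chemical lacks the data type.
            missing.append(key)
    return smiles, logp, missing


smiles, logp, missing = summarize_batch(EXAMPLE_EXPORT)
```

Here `missing` answers the question of which chemicals in the list lack a given type of data, while `smiles` and `logp` answer the SMILES and logP questions.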
Batch Searching
36
Batch Searching
37
Real-Time Predictions
38
Real-Time Predictions
39
Crowdsourced Curation
40
Crowdsourced Curation
• Annotation of any table cell – admin panel
views highlighted cell
41
Our support for FAIR Data
42
Downloadable Data
43
Work in Progress
• Present work in development
– Real time prediction using OPERA models
– Structure/substructure/similarity search integration
– Ongoing expansion of chemicals
– Release of new ToxCast database (v3_2018)
– Addition of products data from tens of thousands of
Material Safety Data Sheets (MSDS)
– Analytical Data support
• Integration of analytical data for ToxCast/Tox21 data
• Spectral searching against predicted Mass Spectra
44
Prototype Development
45
Predicted Mass Spectra
http://cfmid.wishartlab.com/
• MS/MS spectra prediction for ESI+, ESI-, and EI
• Predictions generated and stored for >700,000
structures, to be accessible via Dashboard
46
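Spectral searching against predicted spectra needs a score that compares an observed MS/MS spectrum with each stored prediction. The slides do not say which scoring function is used, so the sketch below assumes a binned cosine (dot-product) similarity, a common choice for this task. The function names, bin width and peak lists are hypothetical.

```python
import math


def bin_spectrum(peaks, bin_width=0.01):
    """Sum intensities of (m/z, intensity) peaks that fall into the same m/z bin."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(observed, predicted, bin_width=0.01):
    """Cosine similarity between two peak lists; 0.0 if either spectrum is empty."""
    a = bin_spectrum(observed, bin_width)
    b = bin_spectrum(predicted, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_candidates(observed, predicted_library, bin_width=0.01):
    """Return (name, score) pairs sorted from best to worst match."""
    scores = {
        name: cosine_similarity(observed, peaks, bin_width)
        for name, peaks in predicted_library.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up peak lists for illustration only.
OBSERVED = [(138.07, 100.0), (110.07, 35.0), (195.09, 60.0)]
LIBRARY = {
    "candidate A": [(138.07, 90.0), (110.07, 40.0), (195.09, 55.0)],
    "candidate B": [(91.05, 100.0), (65.04, 20.0)],
}
ranking = rank_candidates(OBSERVED, LIBRARY)
```

Fixed-width binning is the simplest way to align peaks; a ppm-based peak matching step would be a natural refinement for high resolution data.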
Conclusion
• The EPA CompTox Dashboard provides access to
data for ~762,000 chemicals, ToxCast assays and
associated product use categories
• High-quality data from ongoing curation efforts
• An integration hub for multiple “modules”
– Experimental and predicted properties
– Human and Ecological Hazard data
– Exposure data – products, data in the environment
– In vitro bioassay data – ToxCast/Tox21
– Literature searching – Google Scholar and PubMed
– Specialized searches – mass/formula for analytical support
– Batch searching and Real Time Predictions
• Data and functionality increase with every release
47
Acknowledgments
• The NCCT CompTox Chemistry Dashboard
Development Team
• NERL scientists (Jon Sobus, Elin Ulrich) –
Mass Spectrometry
• Kamel Mansouri – OPERA models
• Todd Martin and Valery Tkachenko – TEST
predictions
• Nancy Baker – Abstract Sifter
48
Contact
Andrew McEachran
ORISE Postdoctoral Fellow
US EPA Office of Research and Development
mceachran.andrew@epa.gov
http://orcid.org/0000-0003-1423-330X
Antony Williams
US EPA Office of Research and Development
National Center for Computational Toxicology (NCCT)
Williams.Antony@epa.gov
ORCID: https://orcid.org/0000-0002-2668-4821
49
