Integrated omics analysis pipeline for model organism with Cytoscape
Kozo Nishida
RIKEN, Quantitative Biology Center (QBiC)
Kozo Nishida @ RECOMB2014, Nov 11, 2014
Goal
Reproducible and modifiable omics analysis pipeline in a single environment
[Figure: Omics experiment, Omics data analysis, Pathway data integration, Network analysis (www.genome.jp) (Rohn, 2012)]
Why?
• Each process is separate, so the whole analysis pipeline is NOT easy to reproduce.
• This is especially hard when a process must be modified and the results aggregated.
• Cytoscape is good software for pathway data integration and network analysis, but…
• it is NOT the best for the whole analysis pipeline: a Java app is NOT easy to modify.
• R is common for omics data preprocessing and analysis.
• Python is good for data aggregation.
• Both can be used for data integration and network analysis.
How? 
Seamless, reproducible, and modifiable IPython notebook environment
• Cytoscape is controlled from an IPython notebook
• Low-level access to Cytoscape with the cyREST app
• Omics analysis with Bioconductor R packages
• Pathway data integration with Python and a graph database
• KEGG-based pathway data integration with the KEGGscape app
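As a rough illustration of what "controlled from a notebook" can look like, the sketch below builds a small network as Cytoscape.js-style JSON and posts it to cyREST over HTTP. It is not the code used in the talk. The base URL (cyREST's usual local default), the `/networks` endpoint usage, and the helper names `build_network` and `post_network` are assumptions for illustration, and the node names are placeholders.

```python
import json
import urllib.request

# Assumption: cyREST is listening on its usual local default port.
BASE_URL = "http://localhost:1234/v1"


def build_network(name, edges):
    """Build a Cytoscape.js-style JSON network from (source, target) pairs."""
    node_ids = sorted({node for edge in edges for node in edge})
    return {
        "data": {"name": name},
        "elements": {
            "nodes": [{"data": {"id": n, "name": n}} for n in node_ids],
            "edges": [{"data": {"source": s, "target": t}} for s, t in edges],
        },
    }


def post_network(network, base_url=BASE_URL):
    """POST the network to a running Cytoscape and return the parsed JSON reply.

    Needs Cytoscape with the cyREST app running; it is not called on import.
    """
    request = urllib.request.Request(
        base_url + "/networks",
        data=json.dumps(network).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

In a notebook cell, something like `post_network(build_network("toy pathway", [("geneA", "geneB")]))` would then create the network in the running Cytoscape session, provided cyREST is installed and reachable at that address.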
cyREST and KEGGscape app
• cyREST provides a scripting-language interface to Cytoscape
• cyREST is useful and suitable for KEGG-based pathway data integration
• KEGGscape supports importing KEGG pathway XML (KGML) into Cytoscape
• Difference from CytoKEGG and CyKEGGparser
• CytoKEGG and CyKEGGparser have several additional features, but they are specialized for their own purposes and leave some pathways unsupported.
• KEGGscape simply imports and reconstructs each KEGG pathway as it is, for as many pathways as KEGG provides. (Currently supports all KEGG pathways.)
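To make concrete what a KGML import has to read, here is a minimal Python sketch that extracts entries (nodes) and relations (edges) from a KGML-like document. KEGGscape itself is a Cytoscape app and does much more, including layout and graphics; this only illustrates the file structure. The sample XML is hand-written for illustration, its identifiers are placeholders, and `parse_kgml` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written KGML-like fragment for illustration only; identifiers are placeholders.
SAMPLE_KGML = """<pathway name="path:eco00000" org="eco" number="00000" title="Toy pathway">
  <entry id="1" name="eco:b0001 eco:b0002" type="gene">
    <graphics name="geneA, geneB" type="rectangle" x="100" y="50"/>
  </entry>
  <entry id="2" name="eco:b0003" type="gene">
    <graphics name="geneC" type="rectangle" x="200" y="50"/>
  </entry>
  <relation entry1="1" entry2="2" type="ECrel"/>
</pathway>"""


def parse_kgml(text):
    """Return (title, nodes, edges) from a KGML string."""
    root = ET.fromstring(text)
    nodes = []
    for entry in root.findall("entry"):
        graphics = entry.find("graphics")
        nodes.append({
            "id": entry.get("id"),
            "type": entry.get("type"),
            # One entry may bundle several KEGG IDs, separated by spaces.
            "kegg_ids": entry.get("name", "").split(),
            "label": graphics.get("name") if graphics is not None else None,
        })
    edges = [
        (rel.get("entry1"), rel.get("entry2"), rel.get("type"))
        for rel in root.findall("relation")
    ]
    return root.get("title"), nodes, edges
```

The point to note is that a single gene entry can carry several KEGG gene IDs, which matters when experimental data are later mapped onto pathway nodes.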
Demo for E. coli 1
Mapping differentially expressed genes (between WT and lrp-) to KEGG
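The mapping step in this demo can be sketched roughly as follows: a pathway node is flagged when any of the KEGG gene IDs it bundles appears in the list of differentially expressed genes. This is an illustrative sketch, not the notebook shown in the demo. The function names, the column names `is_de` and `de_genes`, and the table-update body shape (assumed to match what cyREST accepts for updating a node table) are all assumptions.

```python
def flag_de_nodes(nodes, de_genes):
    """Mark nodes that contain at least one differentially expressed gene.

    nodes: list of dicts with 'name' and 'kegg_ids' (list of KEGG gene IDs).
    de_genes: iterable of KEGG gene IDs called differentially expressed.
    """
    de = set(de_genes)
    rows = []
    for node in nodes:
        hits = sorted(de.intersection(node["kegg_ids"]))
        rows.append({
            "name": node["name"],
            "is_de": bool(hits),
            "de_genes": ",".join(hits),
        })
    return rows


def table_update_payload(rows, key="name"):
    """Wrap rows in the body shape assumed for a cyREST node-table update."""
    return {"key": key, "dataKey": key, "data": rows}
```

Once a column such as `is_de` is in the node table, a visual style can map it to a fill color, which is one way to obtain highlighted enzyme nodes.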
Demo for E. coli 2
Mapping E. coli drug targets to KEGG
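Drug-target mapping is many-to-many: one drug can hit several gene products, and one pathway node can bundle several genes. A minimal sketch of that join is below, assuming the drug-target table has already been reduced to (drug, KEGG gene ID) pairs. The function names, the `drugs` column, and the input shape are hypothetical and not taken from the demo notebook.

```python
from collections import defaultdict


def drugs_by_gene(drug_targets):
    """Index (drug_name, kegg_gene_id) pairs by gene ID."""
    index = defaultdict(set)
    for drug, gene in drug_targets:
        index[gene].add(drug)
    return index


def annotate_nodes(nodes, index):
    """Attach a ';'-joined list of drugs to every pathway node.

    nodes: list of dicts with 'name' and 'kegg_ids' (list of KEGG gene IDs).
    """
    rows = []
    for node in nodes:
        drugs = set()
        for gene in node["kegg_ids"]:
            drugs |= index.get(gene, set())
        rows.append({"name": node["name"], "drugs": ";".join(sorted(drugs))})
    return rows
```

The resulting rows have the same shape as any other node-table column, so they can be combined with the expression mapping from the first demo.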
Another example: Arabidopsis thaliana
Mapping time-series metabolome profile 
to KEGG (http://goo.gl/jk01HP) 
Conclusions, Future work
• Constructed a reproducible (and flexible) omics analysis pipeline with the cyREST app.
• You can replace KEGG with WikiPathways, Reactome, or other pathway databases.
• Packaging Python and R utility functions
• py2cytoscape (github.com/idekerlab/py2cytoscape)
• More example IPython notebooks!
• Contributions are welcome; please see github.com/idekerlab/cy-rest-python
Acknowledgments 
• The Cytoscape consortium 
• Keiichiro Ono (UCSD) 
• cyREST, KEGGscape 
• Atsushi Fukushima (RIKEN CSRS) 
• AtMetExpress Arabidopsis thaliana metabolome database 
• Jun Sese (AIST CBRC) 
• Mentoring in “Tool Prototype for Integrated Database Analysis” project 
This project is supported by the National Bioscience Database Center (NBDC), Japan

Editor's Notes

  • #2: I'm Kozo Nishida, from RIKEN, Japan. I would like to introduce a new omics analysis environment project for Cytoscape.
  • #3: My project goal is to realize a reproducible and modifiable omics analysis pipeline in a single environment. These processes are the components of the pipeline.
  • #4: The reason I am doing this project is that each process is separate, so the whole analysis pipeline is NOT easy to reproduce. This is especially hard when you need to modify a process and aggregate the results that connect them. Of course, Cytoscape is good for the latter part of the pipeline, but it is NOT the best for the whole analysis pipeline, because the pipeline needs flexibility and a Java app requires compiling and is NOT easy to modify. For the former part of the pipeline, the R language is common for omics data preprocessing and analysis, and the Python language is good for data aggregation and can be used for general purposes. With these languages it is easy to modify the pipeline through trial and error.
  • #5: So I implemented a pipeline like the one in this image.
  • #6: Cytoscape users usually control Cytoscape through the GUI. In my case, Cytoscape is controlled programmatically from an IPython notebook through the cyREST app. I leave omics analysis to Bioconductor packages, and pathway data integration to Python and a graph database. The main pathway integration target is KEGG.
  • #7: You need to install the cyREST and KEGGscape apps to reproduce our pipeline. Python requires the cyREST interface to control Cytoscape. cyREST is useful and suitable for KEGG-based pathway data integration. Cytoscape does not support KEGG pathways by default, so you need to install the KEGGscape app. The CytoKEGG and CyKEGGparser apps also provide KEGG pathway support in Cytoscape 3, but they are specialized for their own workflows, and you may find them difficult to control from Python. So I recommend KEGGscape, which simply imports and reconstructs each KEGG pathway as it is. KEGGscape currently supports all KEGG pathways.
  • #8: Next I will show you two demos for E. coli. The first is a pipeline for mapping differentially expressed genes (between WT and the lrp mutant strain) to KEGG. First I import a KEGG pathway from the IPython notebook. At this stage the pipeline has not finished, so no data is integrated yet. Next I run the whole pipeline, and the differentially expressed gene table is merged. The yellow highlighted nodes are enzyme nodes that include differentially expressed genes.
  • #9: The second is a pipeline for mapping E. coli drug targets in DrugBank to KEGG. Again, I first import a KEGG pathway from the IPython notebook. At this stage the pipeline has not finished, so no data is integrated. Next I run the whole pipeline, and the E. coli drug targets in DrugBank are mapped to the KEGG pathway. This column shows the drugs, and the next column shows the target protein as a KEGG gene product. These pipelines are independent and you can combine them, but here for simplicity I separated them into two movies.
  • #10: RIKEN has rich resources for plant metabolomes, so I am also trying to construct a more complicated pipeline for Arabidopsis thaliana. I cannot show you all the metabolome data yet, but you can see a sample metabolome mapping IPython notebook from here.
  • #11: I showed you some reproducible omics analysis pipelines built with the cyREST app. I used KEGG as the example, but you can replace the target pathway database with WikiPathways, Reactome, or other databases. Of course, you can integrate all of these data with Python. I have just started this project, so the Python and R packaging is not finished yet. I hope to contribute to the py2cytoscape project. Increasing the number of example notebooks is also important for this project. If you are interested in contributing notebooks, please see this URL.
  • #12: I thank the following people. Thank you.