IRIDA: Canada’s federated
platform for genomic
epidemiology
William Hsiao, Ph.D.
William.hsiao@bccdc.ca
@wlhsiao
BC Centre for Disease Control Public Health Laboratory
and University of British Columbia
IRIDA Platform Overview
• IRIDA = Integrated Rapid Infectious Disease Analysis
• A free, open-source, standards-compliant, high-quality genomic epidemiology analysis platform to support real-time disease outbreak investigations
Core Functions:
• Management of strain and genomic sequence data
• Rapid processing and analysis of genomic data
• Informative display of genomic results
• Management of sample, case, and aggregate data (“metadata”)
Target audience:
• Public health agencies who need a platform to manage and
process genomic data
• Public health agencies who need a platform to use genomics for
outbreak investigations
[Diagram: IRIDA, with Sequencing Instruments, Web Application, Data Management, Built-in Analytical Tools, External Galaxy, and Command-line Tools]
10 simple rules (wish list) to build a
better public health microbiology
genomic epidemiology analysis
system
Download
Latest version at https://github.com/phac-nml/irida
1: Engage the Users Through the Entire Software
Development Cycle
[Diagram: National Public Health Agency, Provincial Public Health Agency, Academic/Public]
- Project Team has direct access to state-of-the-art research in academia
- Project Team is directly embedded in the user organization
2: Have A Simple User Interface
Line List View (under testing)
Timeline View (Conceptualization)
Selectable fields
Travel
Symptoms and Onset
Exposure Types
Hospitalization
Launch a pipeline
Be Like
3: Build a Robust, Extensible Platform
• IRIDA uses Galaxy to manage workflows
• Adding additional pipelines is relatively easy
• A standard API allows third-party tools to obtain data from IRIDA (e.g. IslandViewer and GenGIS)
[Diagram: IRIDA architecture. Components: Servlet Container, REST API, Central File Storage, Web Interface, Application Logic, Compute Cluster, Galaxy]
http://www.pathogenomics.sfu.ca/islandviewer/
http://kiwi.cs.dal.ca/GenGIS/Main_Page
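To make the idea of a third-party tool pulling data through a standard API concrete, here is a minimal sketch in Python. The base URL, the route, and the JSON layout are all assumptions made for illustration; they are not taken from IRIDA's actual REST API, so consult the IRIDA documentation for the real routes.

```python
import json
from urllib.parse import urljoin

# Hypothetical base URL; a real IRIDA instance defines its own address and routes.
BASE_URL = "https://irida.example.org/api/"


def samples_url(project_id: int) -> str:
    """Build the (assumed) URL that lists the samples of one project."""
    return urljoin(BASE_URL, f"projects/{project_id}/samples")


def sample_names(response_body: str) -> list[str]:
    """Extract sample names from an (assumed) JSON body of sample objects."""
    payload = json.loads(response_body)
    return [sample["name"] for sample in payload["samples"]]


# Canned response standing in for what a server might return.
example_body = json.dumps(
    {"samples": [{"id": 1, "name": "isolate-A"}, {"id": 2, "name": "isolate-B"}]}
)

print(samples_url(42))
print(sample_names(example_body))
```

Because the client depends only on the URL scheme and the response format, a tool such as a genome viewer could be pointed at any instance that exposes the same interface.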
4: Have Extensive Documentation
• Documentation should be available for:
• Users – step-by-step tutorials with screenshots / FAQ
• System Administrators – installation instructions / issue trackers
• Developers – open source, collaborative development / IRC Channel
• Easily accessible at https://irida.corefacility.ca/documentation/
5: Implement QC Throughout the Whole Application
• Genomics is sensitive and sequence data are inherently noisy
• Genomics is a rapidly advancing technology
• Standardizing pipelines is difficult and can stifle innovation
• Better to standardize the performance and reporting metrics and ensure any validated pipelines meet the testing criteria
• Developing a general QC testing module (RCQC) that uses ontology to standardize QC metrics (https://github.com/Public-Health-Bioinformatics/rcqc)
• Data provenance and version control (data + pipelines) are musts for diagnostic labs
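The idea of standardizing the reporting metrics rather than the pipelines can be sketched as follows. This is only an illustration: the metric names and thresholds are hypothetical, and the code does not reflect how RCQC is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    metric: str     # name of the reported metric
    minimum: float  # smallest acceptable value


# Hypothetical acceptance criteria; real thresholds depend on organism and assay.
CRITERIA = [
    Criterion("mean_coverage", 30.0),
    Criterion("fraction_genome_covered", 0.95),
]


def failed_criteria(report: dict[str, float], criteria: list[Criterion]) -> list[str]:
    """Return the metrics a report misses; an unreported metric counts as a failure."""
    failures = []
    for criterion in criteria:
        value = report.get(criterion.metric)
        if value is None or value < criterion.minimum:
            failures.append(criterion.metric)
    return failures


# Two different pipelines report the same standardized metrics.
pipeline_a = {"mean_coverage": 48.2, "fraction_genome_covered": 0.97}
pipeline_b = {"mean_coverage": 22.5, "fraction_genome_covered": 0.98}

print(failed_criteria(pipeline_a, CRITERIA))  # []
print(failed_criteria(pipeline_b, CRITERIA))  # ['mean_coverage']
```

Any pipeline, however it works internally, can be validated this way as long as it reports the agreed metrics.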
6: Build to Enable Collaboration
• Be able to compare pipelines
• Pipelines implemented using Galaxy – transparent and shareable
• Define QC criteria using ontology to compare different pipelines with the same purpose
• Be able to share data in standard formats to minimize data re-entry from one platform to another
• Federation of platforms using a standard API to share data and analysis results
7: Use Compatible Data Standards
• Sequence data are more compatible / shareable, but metadata are currently siloed and incompatible
• Collaboration and sharing are difficult when data are incompatible
• Compatibility != Sameness
• Use ontology to allow customization of term lists, but all terms with the same meaning (semantics) should have the same universal ID (e.g. a URL) to facilitate mapping of terms
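A small sketch of "Compatibility != Sameness": two labs keep different term lists, yet terms with the same meaning share one universal ID, so labels can be translated between them. The IDs and lab vocabularies below are hypothetical placeholders; a real deployment would use IDs from a published ontology.

```python
# Hypothetical universal term IDs (placeholders, not real ontology IDs).
FEVER = "https://example.org/terms/0001"
DIARRHEA = "https://example.org/terms/0002"

# Each lab keeps its own labels but maps them to the shared IDs.
LAB_A_TERMS = {"fever": FEVER, "diarrhea": DIARRHEA}
LAB_B_TERMS = {"pyrexia": FEVER, "loose stool": DIARRHEA}


def translate(label, source, target):
    """Map a label from one lab's term list to another's via the shared ID."""
    term_id = source[label]
    for target_label, target_id in target.items():
        if target_id == term_id:
            return target_label
    return None  # the target lab has no term with this meaning


print(translate("pyrexia", LAB_B_TERMS, LAB_A_TERMS))  # fever
```

Neither lab has to give up its preferred wording; only the mapping to the shared ID needs to be maintained.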
8: Implement Fine Grained Access Control
Detailed View Restricted View
E.g. User role permissions control visibility and editing of content
Authorization
• Industry-standard authentication and authorization mechanisms
• Local authorization per instance
• Method-level authorization
• Object-level authorization
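The difference between method-level and object-level authorization can be sketched as below. The role names, permissions, and functions are hypothetical and are not IRIDA's implementation; the sketch only shows that a check can attach both to an operation and to the specific object it touches.

```python
from dataclasses import dataclass, field

# Hypothetical roles and the actions each one allows.
ROLE_PERMISSIONS = {
    "manager": {"read", "edit"},
    "collaborator": {"read"},
}


@dataclass
class Project:
    name: str
    members: dict[str, str] = field(default_factory=dict)  # username -> role


def can(user: str, action: str, project: Project) -> bool:
    """Object-level check: the user's role on this specific project must allow the action."""
    role = project.members.get(user)
    return role is not None and action in ROLE_PERMISSIONS.get(role, set())


def rename_project(user: str, project: Project, new_name: str) -> None:
    """Method-level check: the operation itself refuses callers lacking 'edit'."""
    if not can(user, "edit", project):
        raise PermissionError(f"{user} may not edit {project.name}")
    project.name = new_name


outbreak = Project("outbreak-2015", {"alice": "manager", "bob": "collaborator"})
print(can("bob", "read", outbreak))    # True
print(can("bob", "edit", outbreak))    # False
print(can("carol", "read", outbreak))  # False
```

Here bob sees the project but cannot change it, and carol, who is not a member, sees nothing; this corresponds to the detailed and restricted views on the slide.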
9: Use Technology to Safeguard Patient Privacy
It’s easy to lose control of the Excel Line List: someone can make a copy of the content and pass it around without your knowledge; typos are common and cumulative!
Technology can control who sees what and when
Separate out sensitive patient data from pathogen sequence data, but be able to bring them together when necessary without resorting to emailing line lists!
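One way to picture keeping patient data apart from sequence data, while still joining them on demand, is the sketch below. The store layout, field names, and sample ID are hypothetical; in practice the two stores would be separate systems with their own access controls.

```python
# Pathogen sequence records: shareable, keyed by an opaque sample ID.
SEQUENCE_STORE = {
    "S-001": {"organism": "Salmonella enterica", "reads": "S-001.fastq"},
}

# Sensitive case data held in a separate store under the same key.
CASE_STORE = {
    "S-001": {"age_group": "30-39", "hospitalized": True},
}


def sample_view(sample_id, may_see_case_data):
    """Return sequence data, joined with case data only for authorized viewers."""
    view = dict(SEQUENCE_STORE[sample_id])  # copy so the store is never mutated
    if may_see_case_data:
        view.update(CASE_STORE.get(sample_id, {}))
    return view


print(sample_view("S-001", may_see_case_data=False))
print(sample_view("S-001", may_see_case_data=True))
```

The join happens at view time for each authorized request, so no combined line list ever needs to be exported or emailed.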
10: Have Multiple, Flexible Access Options
• There is no one-size-fits-all solution; having many platforms to choose from is a good thing (but data should be portable across platforms!)
• IRIDA is available in several different flavours:
Local Install
• Advantages: Full control of the system; your data never leave your centre
• Disadvantages: Computing infrastructure and IT support needed to maintain the resource
Virtual Machine
• Advantages: Full control of the system; easy to set up
• Disadvantages: Not really scalable if run on your own desktop; some performance loss
Cloud Instance
• Advantages: Full control of the system; does not require local computing infrastructure
• Disadvantages: Data go into a cloud environment; uploading to a cloud environment can be slow
Public Version
• Advantages: No setup required; upload your data and have it processed using Compute Canada resources
• Disadvantages: Data go into a public instance (data remain private to your account); upload can be slow
Acknowledgements
Project Leaders
Fiona Brinkman – SFU
Will Hsiao – PHMRL
Gary Van Domselaar – NML
University of Lisbon
João Carriço
National Microbiology Laboratory (NML)
Franklin Bristow
Aaron Petkau
Thomas Matthews
Josh Adam
Adam Olson
Tarah Lynch
Shaun Tyler
Philip Mabon
Philip Au
Celine Nadon
Matthew Stuart-Edwards
Morag Graham
Chrystal Berry
Lorelee Tschetter
Aleisha Reimer
Laboratory for Foodborne Zoonoses (LFZ)
Eduardo Taboada
Peter Kruczkiewicz
Chad Laing
Vic Gannon
Matthew Whiteside
Ross Duncan
Steven Mutschall
Simon Fraser University (SFU)
Melanie Courtot
Emma Griffiths
Geoff Winsor
Julie Shay
Matthew Laird
Bhav Dhillon
Raymond Lo
BC Public Health Microbiology &
Reference Laboratory (PHMRL) and BC
Centre for Disease Control (BCCDC)
Judy Isaac-Renton
Patrick Tang
Natalie Prystajecky
Jennifer Gardy
Damion Dooley
Linda Hoang
Kim MacDonald
Yin Chang
Eleni Galanis
Marsha Taylor
Cletus D’Souza
Ana Paccagnella
University of Maryland
Lynn Schriml
Canadian Food Inspection Agency (CFIA)
Burton Blais
Catherine Carrillo
Dominic Lambert
Dalhousie University
Rob Beiko
Alex Keddy
McMaster University
Andrew McArthur
Daim Sardar
European Nucleotide Archive
Guy Cochrane
Petra ten Hoopen
Clara Amid
European Food Safety Agency
Leibana Criado Ernesto
Vernazza Francesco
Rizzi Valentina
IRIDA Annual General Meeting
Winnipeg, April 8-9, 2015

Editor's Notes

  • #3: What is IRIDA?
  • #4: Inspired by Jenn’s keynote, I reworked my slides into the 10 simple rules format. Many systems are and will be available for analyzing public health microbiology data, and we have seen a few throughout this conference. So I thought I’d present what I think are some of the rules, and my wish list, for building a better public health genomic epidemiology platform, highlighting how some of this thinking applies to our implementation of a platform. Some of these rules have been implemented well in other applications.
  • #15: A large group of people contributed to this work
  • #16: We also have a wonderful group of advisors