Bio2RDF's namespace
   SPARQL endpoint




                 François Belleau
         Centre de Biologie Computationnelle
                     du CRCHUQ
You know them?

  How can we help people navigate the
huge cloud of bioinformatics databases?
2005 BioPAX.gif next to Semantic Web
   image vision of Tim Berners-Lee
Linked Data in 2007
Bio2RDF contribution in 2009
2011 Linked Data cloud
Databases of database names
●   PathGuide
●   Bioinformatics.ca Links Directory
●   Annual NAR Database issue
●   GO, UniProt, GenBank cross-reference lists
●   LSRN initiative
●   MIRIAM EBI project
●   BioPAX dataprovider community
●   Bio2RDF Linked Data space
Two interesting questions
●   Which namespaces are the most popular for
    identifying a database?

●   How far is the BioPAX community from adopting
    the new MIRIAM namespace standard?
Which namespaces are the most
popular for identifying a database?
Collection of namespaces used by the BioPAX
      data provider community.
How far is the BioPAX community from adopting
  the new MIRIAM namespace standard?
How did we do this?




   To answer a complex question
   we first need to build the
   database that will potentially
   answer it: a semantic mashup.
A mashup, in web development, is a web page,
or web application, that uses and combines
data, presentation or functionality from two or
more sources to create new services. The term
implies easy, fast integration, frequently using
open Application programming interfaces (API)
and data sources to produce enriched results
that were not necessarily the original reason for
producing the raw source data.


              http://en.wikipedia.org/wiki/Mashup_(web_application_hybrid)
Building a mashup is a lot easier when
using Semantic Web technologies like
RDF and SPARQL, which are designed for data
interoperability.
A three-step method
●   Get the data from the data provider and
    transform it into RDF; we use Talend, an open
    source, Eclipse-based ETL tool.
●   Load the data into a triplestore; many are
    available (Virtuoso, Sesame, Jena, store,
    Mulgara, etc.) to host your mashup.
●   Explore the new dataset using a specialised user
    interface (RelFinder, Virtuoso facet browser).
●   Design your SPARQL query and get the answer.
Data provider xref resources
●   http://www.ebi.ac.uk/miriam/main/ (XML format)
●   Bio2RDF DNS zone description file (text)
●   http://lsrn.org/ (RDF/XML)
●   http://www.geneontology.org/doc/GO.xrf_abbs
    (key/value format)
●   http://www.uniprot.org/docs/dbxref (key/value format)
●   http://www.ncbi.nlm.nih.gov/genbank/collab/db_xref/
    (HTML)
●   12 BioPAX providers (Reactome, Biomodels, Biocyc,
    Panther, INOH, etc)
Lesson #1

Produce RDF triples with a
  professional ETL tool.
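
To make the idea of "producing triples" concrete, here is a minimal sketch in plain Python rather than Talend (the tool the slides actually use). It assumes a key/value source laid out as blank-line-separated "key: value" stanzas; the base URI, the vocabulary URI, the "!" comment convention and the sample records are all hypothetical and chosen only for illustration.

```python
from urllib.parse import quote

# Hypothetical URIs, used only for this illustration.
BASE = "http://example.org/namespace/"
VOCAB = "http://example.org/vocab#"


def parse_records(text):
    """Split blank-line-separated 'key: value' stanzas into dicts."""
    records = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current stanza.
            if current:
                records.append(current)
                current = {}
            continue
        if line.startswith("!"):
            # Assumed comment marker; skip such lines.
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(records, id_key="abbreviation"):
    """Emit one N-Triples line per key/value pair of each record."""
    lines = []
    for rec in records:
        if id_key not in rec:
            continue
        subject = "<" + BASE + quote(rec[id_key].lower(), safe="") + ">"
        for key, value in rec.items():
            predicate = "<" + VOCAB + quote(key, safe="") + ">"
            lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return lines


# Made-up sample input in the assumed layout.
SAMPLE = """\
abbreviation: GO
database: Gene Ontology

abbreviation: UniProtKB
database: Universal Protein Knowledgebase
"""

triples = to_ntriples(parse_records(SAMPLE))
for line in triples:
    print(line)
```

Each record becomes one subject URI and each key becomes a predicate, which is roughly the mapping an ETL job would perform before the triples are loaded into a triplestore.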
Talend: free, open-source
       ETL software




       http://www.talend.com/index.php
Talend workflows to convert the GenBank HTML
      page and the MIRIAM XML to triples
Lesson #2

Publish with a SPARQL endpoint.
       (to get a free 5-star cup)
Load RDF triples into a triplestore
   (we use OpenLink Virtuoso)




   http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/
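
As a rough illustration of what loading triples from several providers into one store buys you, the sketch below uses a toy in-memory store in Python. It is not Virtuoso and does not use its API; the class name, the `ex:usesNamespace` predicate and the provider data are all made up for the example.

```python
from collections import Counter


class TinyTripleStore:
    """A toy in-memory stand-in for a triplestore (not Virtuoso)."""

    def __init__(self):
        self.triples = set()

    def load(self, triples):
        """Add (subject, predicate, object) tuples; duplicates collapse."""
        self.triples.update(triples)

    def match(self, s=None, p=None, o=None):
        """Yield triples matching the pattern; None acts as a wildcard."""
        for t in sorted(self.triples):
            if ((s is None or t[0] == s)
                    and (p is None or t[1] == p)
                    and (o is None or t[2] == o)):
                yield t


# Hypothetical predicate and made-up data from two imaginary providers.
USES = "ex:usesNamespace"
provider_a = [("ex:providerA", USES, "ns:uniprot"),
              ("ex:providerA", USES, "ns:go")]
provider_b = [("ex:providerB", USES, "ns:uniprot"),
              ("ex:providerB", USES, "ns:genbank")]

store = TinyTripleStore()
store.load(provider_a)
store.load(provider_b)

# Once merged, a single pattern match spans both sources.
popularity = Counter(o for _, _, o in store.match(p=USES))
most_common = popularity.most_common(1)
print(most_common)
```

Because both providers identify the namespace with the same term, counting across the merged data needs no extra join logic; a real triplestore offers the same benefit at scale, with SPARQL instead of a Python loop.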
Full text search




http://namespace.bio2rdf.org/fct/
Discover entity names
and browse the triplestore
Lesson #3

  Consume as you like.
HTTP GET to obtain RDF from a URI,
      SPARQL endpoint,
 SOAP services returning RDF,
  new Semantic Web software...
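
The first option, an HTTP GET that asks a URI for RDF, can be sketched with Python's standard library. The resource URI below is hypothetical, and whether a given server honours the `Accept` header (content negotiation) is an assumption that has to be checked per service.

```python
from urllib.request import Request


def build_rdf_request(uri, accept="application/rdf+xml"):
    """Prepare (but do not send) an HTTP GET asking for an RDF serialization."""
    return Request(uri, headers={"Accept": accept}, method="GET")


# Hypothetical resource URI, for illustration only.
req = build_rdf_request("http://example.org/resource/123")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen(req)` would perform the call; it is left out here so the sketch runs without network access.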
The SPARQL query needed
to draw the previous graph
 using the ManyEyes service




     http://namespace.bio2rdf.org/sparql
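
The query itself appears only as an image on the slide, so the sketch below does not reproduce it. It shows how a query could be sent to a SPARQL endpoint over the standard SPARQL protocol (a GET with a `query` parameter) and how a JSON result could be read. The endpoint address is assumed to be the one named on the slide and may no longer be live; the query is a generic predicate count, and the canned response is made up so the example runs offline.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint address (host as named on the slide); may not be live.
ENDPOINT = "http://namespace.bio2rdf.org/sparql"

# Generic illustrative query, not the one from the talk.
QUERY = """
SELECT ?p (COUNT(*) AS ?n)
WHERE { ?s ?p ?o }
GROUP BY ?p
ORDER BY DESC(?n)
LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Prepare a SPARQL protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def rows_from_results(payload):
    """Flatten a SPARQL JSON results document into a list of dicts."""
    data = json.loads(payload)
    variables = data["head"]["vars"]
    return [
        {v: binding[v]["value"] for v in variables if v in binding}
        for binding in data["results"]["bindings"]
    ]


request = build_sparql_request(ENDPOINT, QUERY)

# Made-up response in the SPARQL JSON results format, so no network is needed.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["p", "n"]},
    "results": {"bindings": [
        {"p": {"type": "uri", "value": "http://example.org/p"},
         "n": {"type": "literal", "value": "3"}},
    ]},
})

rows = rows_from_results(SAMPLE_RESPONSE)
print(rows)
```

The rows produced this way are plain dictionaries, which is a convenient shape for feeding a charting service or a spreadsheet.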
Use a SOAP service




http://namespace.bio2rdf.org/bio2rdf/services.wsdl
Discover your relations
graphically with RelFinder




 http://www.visualdataweb.org/relfinder.php
Conclusion

●   Building a mashup is easy with current
    software, but we still need the RDF data.
●   One SPARQL query against the proper triplestore
    (Bio2RDF's namespace mashup) could answer
    our two initial questions.
●   Why not consider publishing your own SPARQL
    endpoint to make semantic hackers' lives
    easier?
Acknowledgements


●   Bio2RDF is a community project available at http://bio2rdf.org
●   The community can be joined at
    https://groups.google.com/forum/?fromgroups#!forum/bio2rdf
●   This work was done under the supervision of Dr Arnaud Droit, assistant
    professor and director of the Centre de Biologie Computationnelle du
    CRCHUQ at Laval University, where a mirror of Bio2RDF is hosted.
●   Michel Dumontier, from the Dumontier Lab at Carleton University, also
    hosts a Bio2RDF server and currently leads the project.
●   Thanks to all members of the Bio2RDF community, and especially to
    Marc-Alexandre Nolin and Peter Ansell, the initial developers.
Come to Montreal in July 2013 with your
SPARQL endpoint and get a FREE cup!




     http://www.unbsj.ca/sase/csas/data/semantic-trilogy-2013/

Bio2RDF presentation at Combine 2012
