Primer for Predocs
17-19 January 2011
Rafael Jimenez
rafael@ebi.ac.uk
EnCORE presentation
DAS
Technical introduction to the Distributed Annotation System
Table of contents
• DAS
– Commands and queries
– Design principles
– Documentation
– Clients and servers
Commands and queries
DAS – Andy Jenkinson
23.08.18
Query model
Structured REST URL
– http://server/das/source/command?arguments
– servers, data sources, commands, parameters
Reference object
– e.g. “chromosome X”
Reference servers provide sequence
– http://server/das/source/sequence?segment=X:1,500
Annotation servers provide features
– http://server/das/source/features?segment=X:1,500
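The URL pattern above can be sketched in a few lines of Python. This is only an illustration: the helper name `das_url` is hypothetical, and `http://server` and `source` are the placeholders used on the slide, not a real server.

```python
from urllib.parse import urlencode


def das_url(server, source, command, **arguments):
    """Build a request URL of the form server/das/source/command?arguments.

    `das_url` is a hypothetical helper, not part of any DAS library.
    """
    url = f"{server.rstrip('/')}/das/{source}/{command}"
    if arguments:
        # safe=":," leaves segment ranges such as X:1,500 unescaped
        url += "?" + urlencode(arguments, safe=":,")
    return url


# A reference server is asked for sequence, an annotation server for features.
sequence_url = das_url("http://server", "source", "sequence", segment="X:1,500")
features_url = das_url("http://server", "source", "features", segment="X:1,500")
print(sequence_url)
print(features_url)
```

Both requests name the same reference object and range, so the features returned by an annotation server can be laid over the sequence returned by a reference server.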
Data model
Lightweight XML
http://server/das/source/features?segment=X:1,500
<SEGMENT id="X" start="1" stop="500">
  <FEATURE id="…">
    <TYPE id="…" category="…">…</TYPE>
    <METHOD id="…">…</METHOD>
    <START>…</START>
    <END>…</END>
  </FEATURE>
  <FEATURE id="…">
    …
  </FEATURE>
</SEGMENT>
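A client could read such a response with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree`; the sample document and all of its ids and values are invented for illustration, and `parse_features` is a hypothetical helper. A real response is expected to wrap the segment in additional envelope elements, which is why the code searches for FEATURE elements at any depth.

```python
import xml.etree.ElementTree as ET

# Invented sample data following the element layout shown on the slide.
SAMPLE = """<SEGMENT id="X" start="1" stop="500">
  <FEATURE id="f1">
    <TYPE id="exon" category="transcription">exon</TYPE>
    <METHOD id="manual">manual curation</METHOD>
    <START>10</START>
    <END>120</END>
  </FEATURE>
  <FEATURE id="f2">
    <TYPE id="snp" category="variation">SNP</TYPE>
    <METHOD id="predicted">prediction</METHOD>
    <START>300</START>
    <END>300</END>
  </FEATURE>
</SEGMENT>"""


def parse_features(xml_text):
    """Return one dict per FEATURE element found anywhere in the document."""
    root = ET.fromstring(xml_text)
    features = []
    for feature in root.iter("FEATURE"):
        features.append({
            "id": feature.get("id"),
            "type": feature.find("TYPE").get("id"),
            "method": feature.find("METHOD").get("id"),
            "start": int(feature.findtext("START")),
            "end": int(feature.findtext("END")),
        })
    return features


features = parse_features(SAMPLE)
for item in features:
    print(item)
```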
DAS Annotation source - Protein Feature Request
Non-positional feature
Positional feature
http://www.ebi.ac.uk/das-srv/uniprot/das/uniprot/features?segment=Q12345
DAS Reference source - Protein Sequence Request
http://www.ebi.ac.uk/das-srv/uniprot/das/uniprot/sequence?segment=Q12345
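The protein feature slide distinguishes positional from non-positional features. One way a client might separate them is sketched below. It assumes the convention that a non-positional feature reports START and END as 0; that convention should be checked against the DAS specification, and the feature records here are invented.

```python
def is_positional(start, end):
    """Assumption: a feature with START=0 and END=0 annotates the whole entry."""
    return not (start == 0 and end == 0)


# Invented records in the shape a client might build from a features response.
features = [
    {"id": "annotation-1", "start": 0, "end": 0},
    {"id": "domain-1", "start": 25, "end": 140},
]

positional = [f["id"] for f in features if is_positional(f["start"], f["end"])]
non_positional = [f["id"] for f in features if not is_positional(f["start"], f["end"])]
print("positional:", positional)
print("non-positional:", non_positional)
```

A viewer would typically draw the positional features along the sequence and list the non-positional ones separately.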
More DAS Commands
• Alignment, Structure and Interaction
• More …
http://server/das/source/entry_points
– entry_points – list of available “chromosomes | contigs | proteins | …”
http://server/das/source/types
– types – provides a summary of the feature types for a segment.
http://server/das/source/stylesheet
– stylesheet – gives hints to the DAS client about how to display the feature types. Clients may ignore it.
http://server/das/sources
– sources – lists the sources available on one DAS server. Replaces the original, underspecified dsn command.
http://www.biodas.org/wiki/DAS1.6
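Note that `sources` is addressed to the server as a whole, while the other commands are addressed to one data source. A small sketch of that difference follows; the function `command_urls` and the tuple `SOURCE_COMMANDS` are hypothetical names, and the server and source names are the slide's placeholders.

```python
# Commands from the slides that are addressed to a single data source.
SOURCE_COMMANDS = ("entry_points", "types", "stylesheet", "sequence", "features")


def command_urls(server, source):
    """Map each command name to its URL on the given server and source."""
    base = server.rstrip("/") + "/das"
    # Server-level command: no source name in the path.
    urls = {"sources": f"{base}/sources"}
    # Source-level commands: the source name precedes the command.
    for command in SOURCE_COMMANDS:
        urls[command] = f"{base}/{source}/{command}"
    return urls


urls = command_urls("http://server", "source")
for name, url in urls.items():
    print(f"{name:13s} {url}")
```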
Design principles
DAS Design Principles
• Data remains distributed
– “live” data
– data providers retain responsibility
– good for changing data
– spreads resources
• Easy for data providers to implement
– simple protocol
– lots of data providers
Documentation
BioDAS
http://www.biodas.org
Tutorials
http://www.biodas.org/wiki/DASWorkshop2010
Versions of DAS
[Figure: timeline from 2001 to 2011 showing two specification lines. DAS 1: versions 1.01, 1.53, 1.53E and 1.6. DAS/2: versions 2.0 and 2.1. Estimated source counts shown on the chart: ~8, ~250, ~380, ~650 and ~1300.]
DAS Specification 1.6
http://www.biodas.org/wiki/DAS1.6
Clients and servers
List of DAS Servers
DAS Client libraries
• Bio::Das::Lite (Perl)
• Dasobert (Java)
• jsDAS (Javascript)
List of DAS Clients
• Ensembl uses DAS to pull in genomic, gene and protein annotations. It also
provides data via DAS.
• Gbrowse is a generic genome browser, and is both a consumer and provider
of DAS.
• IGB is a desktop application for viewing genomic data.
• SPICE is an application for projecting protein annotations onto 3D structures.
• Dasty2 is a web-based viewer for protein annotations.
• Jalview is a multiple alignment editor.
• PeppeR is a graphical viewer for 3D electron microscopy data.
• DASMI is an integration portal for protein interaction data.
• DASher is a Java-based viewer for protein annotations.
• EpiC presents structure-function summaries for antibody design.
• STRAP is a STRucture-based sequence Alignment Program.
Protein sequence data
Dasty2
Genome sequence data
Ensembl
Protein structure data
Spice-Sisyphus
Protein-protein interaction data
iPfam
Sequence alignment data
Pfam
EMAP data
EMAP: The Edinburgh Mouse Atlas Project
Gene expression databases (EMAGE & GXD)
• DAS reference server
– EMAP - Ontology
• DAS annotation servers
– EMAGE
– GXD
Thank you!
Questions?
Proteomics Services Team

Editor's Notes

  • #5: Well-formed hierarchical URL: each server has one or more sources, and each source implements one or more commands. The sequence command provides sequence, and the features command provides sequence annotations. The stylesheet command allows the server to govern how features are rendered by the client. It works by specifying the type and colour of glyph to use for each type of feature. For instance, the COSMIC cancer mutation database DAS server specifies that substitutions should be drawn as crosses, whereas insertions are drawn as triangles.
  • #11: Live: warehouses allow fast access, but their data is often not in sync with the source database. Providers are responsible for the data, and clients are shielded from database changes. Good for rapidly changing data, e.g. ENCODE (cf. warehouses). It makes a lot of sense to spread resources given the topology of the network. The protocol is intrinsically simple. Dumb server: all it has to do is access its data and adapt it to XML, and existing implementations make that easy. Clever client: it handles the presentation of the data.
  • #15: Graphic representation of the evolution of "Versions of DAS". It gives a rough idea of when the different specifications were adopted and when DAS/2 started as an independent specification. It also shows an estimate of the available DAS sources per year for DAS 1 and DAS/2.
  • #27: Integrating biological data of various types and developing suitable bioinformatics tools are critical objectives for enabling research at the systems level. The European Network of Excellence ENFIN is developing an infrastructure to connect databases and platforms, enabling both the generation of new bioinformatics tools and the experimental validation of computational predictions. Beyond the use of common standards to format individual datasets, there is a need for sophisticated informatics platforms that can mine data across various domains, sources, formats and types. The aim of the EnCORE project is to integrate, across different disciplines, an extensive list of database resources and analysis tools in a computationally accessible and extensible manner, facilitating automated data retrieval and processing with a special focus on systems biology. The EnCORE platform is available as a collection of web services with a common standard format that is easy to integrate into workflow management software such as Taverna. EnCORE services are also accessible through EnVISION, a graphical web user interface that provides detailed information such as molecular interactions, biological pathways and computational models of pathways.