The Encyclopedia of Life: How realistic is it?
Ana Dal Molin
ENTO681 Seminar, Texas A&M University, 23 Feb 2009
This work is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
18(2) 2003
Why?
Imagine an electronic page for each species of organism on Earth, available everywhere by single access on command. Edward O. Wilson
E. O. Wilson’s idea
- Entries with genome, proteome, morphology, geographical distribution, habitat, phylogenetic position, ecological relationships and practical importance
- Communicate with other DBs
- Content peer-reviewed
- Taxonomy is underfunded for the size of the enterprise, and there are too few taxonomists
- E-types “accelerate as traditional taxonomic procedures (…) are replaced by high-resolution digital photography, nucleic acid sequencing and Internet publications”
- Three overlapping phases:
  1. The Catalog of Life (collaborative effort of sp2000, ITIS, CBD and GBIF)
  2. Inventories (All Species Foundation)
  3. Expand the EOL over the Catalog of Life
Just a matter of organizing existing information?
[Diagram labels: species, images, general information, description, genetics, museums, classification, literature, copyright, format, IUCN Red List, BHL]
“Many Internet taxonomy initiatives exist, perhaps too many.” J. Mallet & K. Willmott
Redundancy of tools?
- http://www.itis.gov ~483,000 names (Jan 2009)
- http://www.catalogueoflife.org 1.1 million names (includes LSIDs) (Dec 2008). Compiles several databases, including ITIS, GBIF, sp2000, CBD
- http://www.ubio.org Focus on searches
- http://www.ecoport.org
- http://www.cbd.int/gti/ From Rio’92 Earth Summit (UN). Several databases (separate programs)
- http://ispecies.org/ “iSpecies is a test of E O Wilson's idea of a web page for each species”
- http://nlbif.eti.uva.nl/bis Results from independent initiatives that use specific software: site inactive!
- http://www.biodiversitylibrary.org/ 10,207 titles, 10,000,000 pages (Nov 2008)
- http://www.gbif.org ~171,400,000 occurrence records (v. 1.2.3)
http://www.cria.org.br
http://www.lifemapper.org
http://antbase.org/
The overlap among multiple initiatives continues for:
- Keys
- Regional inventories / faunistic databases
- Taxon-specific information
- Museum-specific information (types, holdings)
- Literature databases
- Catalogs
- Tools
- Etc.
Are we lacking funds?
- NSF: Biodiversity Surveys and Inventories (BS&I), including support for Planetary Biodiversity Inventories: Mission to an (almost) unknown planet (PBI)
- NSF: PEET
- All Species Foundation Summit (Harvard, 2001)
- Earth Summit (CBD, 1992; RIO+10, 2002)
“Important people jet frequently to international biodiversity conferences in expensive locales, while few improvements in taxonomy are yet evident” (Mallet & Willmott)
C. Hine’s copy of the House of Lords report “What on Earth”: the flags mark mentions of information and communication technologies (in “Systematics as Cyberscience”, MIT, 2008)
Mallet & Willmott’s points
- Biologists need to seek consensus
- Do not fragment information
- Unitary taxonomy, DNA taxonomy and the Phylocode all argue that existing rules of nomenclature are inadequate / inefficient
- Is it sensible to add another requirement to the already slow process of describing new taxa?
- ICBN and ICZN rejected central registries in 1999
- The taxonomic impediment exists
  - Not for lack of money
  - Not for lack of purpose
  - Not for lack of structure
  - For lack of basic work
http://www.organismnames.com
(…) a unitary organization (…) and web taxonomy should replace printed taxonomy
Taxonomists lack goals that are both realistic and relevant. C. J. Godfray
Int J Syst Evol Microbiol + LPSN: http://www.bacterio.cict.fr http://www.dsmz.de/bactnom/bactname.htm
http://www.ipni.org a.k.a. Index Kewensis
Dreams of consumption: GenBank
GenBank is frequently referenced as what taxonomists should be doing…
However, GenBank is not an exclusive or central resource, nor is it free from redundancy with other DBs. Solution: synchronization.
“Taxonomic information could become much more unitary even under existing codes. GenBank and EMBL did not become primary sources of DNA sequence information by decree.” (Mallet & Willmott)
Dreams of consumption: PubMed
Is this possible?
[Diagram labels: Metadata, Data, Metadata repository, Name Index, Occurrence Index, Yellow Pages, Regional Atlas, Annotation Tools, Biosecurity Portal, Analysis Tools, Products]
LaSalle, 2008. Atlas of Living Australia, ICE2008 presentation
http://www.tdwg.org
- People lie
- People are lazy
- People are stupid
- Mission Impossible: know thyself
- Schemas aren't neutral
- Metrics influence results
- There's more than one way to describe something
Cory Doctorow
The fragility of metadata is an important concern because things such as the semantic web rely on data markup conventions being widely adopted and used with care, which, according to Doctorow, will not and cannot happen.
GenBank instructions: degrees latitude and longitude in format "d[d.dd] N|S d[dd.dd] W|E"
Examples from Page, R. http://iphylo.blogspot.com/2008/01/metacrap.html
- Ex. AY281248 - Australia: Gubbata, NSW (GPS: 33 38' 07'', 146 33' 12'')
  Translating: -33° 38' 7.08", +146° 33' 10.80" IS in Australia
- Ex. DQ502492 - Nicaragua: Rio San Juan, Near Isla de Diamante (ca. 15 km SE El Castillo on Rio San Juan), 10deg56'N
- Ex. DQ226041 - /lat_lon="6 28.06'N; 58 37.16'W"
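The gap between the free-text coordinates above and the requested pattern can be illustrated with a minimal Python sketch. It is not GenBank code: the function names and the regular expression are hypothetical, and the hemisphere letters for AY281248 are an assumption taken from the stated locality (Australia), since the record itself gives no N|S or W|E.

```python
import re

def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert degrees/minutes/seconds plus a hemisphere letter to signed decimal degrees."""
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # South and West are negative by convention.
    return -value if hemisphere.upper() in ("S", "W") else value

def to_lat_lon(lat, lon):
    """Format signed decimal degrees following the "d[d.dd] N|S d[dd.dd] W|E" pattern."""
    ns = "N" if lat >= 0 else "S"
    ew = "E" if lon >= 0 else "W"
    return f"{abs(lat):.2f} {ns} {abs(lon):.2f} {ew}"

# Hypothetical check for the quoted pattern; real validators may be stricter.
LAT_LON = re.compile(r"^\d{1,2}(\.\d+)? [NS] \d{1,3}(\.\d+)? [WE]$")

def is_well_formed(value):
    """Return True when the string matches the pattern sketched above."""
    return bool(LAT_LON.match(value))

# AY281248: 33 38' 07'', 146 33' 12'' (hemispheres S and E assumed from "Australia")
lat = dms_to_decimal(33, 38, 7, "S")
lon = dms_to_decimal(146, 33, 12, "E")
formatted = to_lat_lon(lat, lon)

print(formatted)                                   # 33.64 S 146.55 E
print(is_well_formed(formatted))                   # True
print(is_well_formed("6 28.06'N; 58 37.16'W"))     # False
```

The conversion itself is trivial; what the sketch cannot do is recover a hemisphere that the submitter never recorded, which is the point of the examples.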
Present criticisms of such initiatives
- Difficulty of inventorying everything (Wilson)
- Incongruence of species concept across taxa (Wilson)
- Quality control (Wilson)
- Information overload (Wilson)
- Lack of cooperation: competing proposals, organizations and websites abound (Mallet & Willmott)
- They have had no significant impact on the taxonomic process (Mallet & Willmott)
- Metadata are not reliable (Doctorow)
To that, add
- Make people able to get LSIDs (or the identifier required)
- Make people use LSIDs (or the identifier required)
- Make tools communicate
Recently, even the format of such central encyclopedias has been questioned: that they should be “wikis”
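For readers unfamiliar with LSIDs (Life Science Identifiers), the identifier is a URN of the form urn:lsid:authority:namespace:object, with an optional revision at the end. A minimal Python sketch of splitting one into its parts follows; the `LSID` type, the `parse_lsid` function and the example identifier are hypothetical illustrations, not part of any of the tools listed in these slides.

```python
from typing import NamedTuple, Optional

class LSID(NamedTuple):
    authority: str                  # who issued the identifier (a domain name)
    namespace: str                  # collection within that authority
    object_id: str                  # identifier of the record itself
    revision: Optional[str] = None  # optional version of the record

def parse_lsid(text):
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts."""
    parts = text.strip().split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if any(p == "" for p in parts[2:5]):
        raise ValueError(f"LSID has an empty component: {text!r}")
    revision = parts[5] or None if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)

# Made-up identifier, used only to show the structure.
example = parse_lsid("urn:lsid:example.org:names:12345")
print(example.authority, example.namespace, example.object_id)
```

Two tools can only "communicate" through such identifiers if both resolve the same authority and namespace, which is why the points above are about adoption rather than syntax.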
The biodiversity information pipeline
[Diagram labels: The capacity to deliver biodiversity information; How we are inputting biodiversity information]
LaSalle, 2008. Overcoming the taxonomic impediment. ICE2008 presentation
Questions
1. How realistic is it to have a web page for every species, including an image database that can ultimately be used in fingerprint-like fashion?
2. What exactly are the objectives behind the EOL, GBIF, and the other initiatives? Are they in fact overlapping?
3. Is this collaboration or:
   3a. Unnecessary split of resources?
   3b. Adding to the mess of linked data without actual information?
4. Can we learn from the example of other areas? Is our situation that different from astronomy or molecular databases, for example?
5. Do we need to change the way taxonomy is being done?
6. Do we need to change the way we deliver information? What are we doing wrong?
