Champaign, 27th July 2017
A Clearinghouse
for Taxonomy & Nomenclature
Markus Döring, GBIF
CLEARINGHOUSE MODEL
[Diagram labels: IPNI; Index Fungorum; ZooBank; Global Species Database 1, GSD 2, GSD 3, …; Provisional Datasets; Clearing House; Catalogue of Life (Reviewed); “Index of Names” (Reviewed); Extended Catalogue; arrows labelled “For review”]
Separates names & taxa
Different authorities: nomenclator & GSD
Integrate sources on distinct name
single consensus taxonomy
Provisional & authoritative sources
Offers a review queue to every source
NAMES
A unique name is based on:
Genus, species-, infraspecific epithet
Rank
Authorship regardless of spelling
Name types
Included: Linnean names, named hybrids, virus names
Excluded: hybrid formulas, cultivars, OTUs
NameID from nomenclator
Otherwise preliminary clearinghouse nameID
Major information (draft)
Code (ICN, ICZN, …)
Homotypic synonyms (basionym & replacement names)
Original publication
isFossil
Available
Conserved, rejected
Nomenclatural notes
Acacia (Juliflorae) aneura F. Muell. ex Benth. var. latifolia J.M. Black
Acacia (Juliflorae) aneura var. latifolia J.M. Black
Acacia aneura Benth. var. latifolia J. Black
Acacia aneura var. latifolia J. Black
Acacia aneura var. latifolia J.M. Black
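
The five spellings above are meant to resolve to one unique name. A minimal Python sketch of such a key follows, assuming the name has already been parsed into its parts. `ParsedName`, `normalize_authorship` and `name_key` are hypothetical names for illustration, not part of any clearinghouse API. In line with the editor's note about quadrinomials, the subgenus and the authorship of the species are left out. Treating authors as equal when they differ only in initials is just one possible reading of "authorship regardless of spelling".

```python
import re
from dataclasses import dataclass

# Tokens made only of initials, such as "J." or "J.M.", are dropped.
_INITIALS = re.compile(r"^(?:[A-Z]\.)+$")


@dataclass(frozen=True)
class ParsedName:
    """Hypothetical container for an already parsed name."""
    genus: str
    species: str
    infraspecific: str
    rank: str
    authorship: str  # authorship of the lowest-ranked epithet only


def normalize_authorship(authorship: str) -> str:
    """Drop initials and punctuation so 'J.M. Black' equals 'J. Black'."""
    tokens = [t for t in authorship.split() if not _INITIALS.match(t)]
    return " ".join(t.strip(".,").lower() for t in tokens)


def name_key(name: ParsedName) -> tuple:
    """Build a comparison key from epithets, rank and normalized authorship."""
    return (
        name.genus.lower(),
        name.species.lower(),
        name.infraspecific.lower(),
        name.rank.lower(),
        normalize_authorship(name.authorship),
    )


a = ParsedName("Acacia", "aneura", "latifolia", "var.", "J.M. Black")
b = ParsedName("Acacia", "aneura", "latifolia", "var.", "J. Black")
print(name_key(a) == name_key(b))
```

A real implementation would need fuzzier author matching (abbreviations, transliterations, "ex" authors); this sketch only shows where such a rule would plug in.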
TAXA
Single consensus view
Track historic concepts
Manage CoL classification in clearinghouse
Heterotypic synonymy
Assign taxon IDs
Computable rules based on synonymy & siblings
Reviewed and provisional taxon IDs might differ
Supplementary CoL information
Distribution
Vernacular names
Lifezone & fossil flags
CLEARINGHOUSE API
All services expose both
Reviewed data only
Extended data including provisional information
Nomenclatural services
Name matching
List homotypic synonymy
Correct spelling
Name history (nomenclatural events around a basionym)
Taxonomic services
Search
Browse children
List entire synonymy
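
The idea that every service exposes both a reviewed view and an extended view can be sketched as a single flag on each call. The following Python example is an in-memory stand-in under stated assumptions: `NameRecord`, `NameService`, the `reviewed_only` parameter and all field names are hypothetical, and "Aus bus", "Xus bus" and "Yus bus" are placeholder names rather than real data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameRecord:
    """Hypothetical record; field names are illustrative only."""
    name_id: str
    scientific_name: str
    reviewed: bool
    basionym_id: str  # names sharing a basionym are homotypic synonyms


class NameService:
    """In-memory stand-in for a nomenclatural service."""

    def __init__(self, records):
        self._records = list(records)

    def _view(self, reviewed_only: bool):
        # The reviewed view hides provisional records; the extended view shows all.
        return [r for r in self._records if r.reviewed or not reviewed_only]

    def match(self, scientific_name: str, reviewed_only: bool = True):
        """Exact, case-insensitive name matching within the chosen view."""
        wanted = scientific_name.strip().lower()
        return [r for r in self._view(reviewed_only)
                if r.scientific_name.lower() == wanted]

    def homotypic_synonyms(self, name_id: str, reviewed_only: bool = True):
        """List the other names in the view that share the basionym of name_id."""
        view = self._view(reviewed_only)
        target = next((r for r in view if r.name_id == name_id), None)
        if target is None:
            return []
        return [r for r in view
                if r.basionym_id == target.basionym_id and r.name_id != name_id]


service = NameService([
    NameRecord("n1", "Aus bus", True, "n1"),
    NameRecord("n2", "Xus bus", True, "n1"),
    NameRecord("n3", "Yus bus", False, "n1"),  # provisional record
])
print([r.name_id for r in service.homotypic_synonyms("n1")])
print([r.name_id for r in service.homotypic_synonyms("n1", reviewed_only=False)])
```

Keeping one code path with a view filter, rather than two separate services, is a design choice of this sketch; the slides do not say how the two views are implemented.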
STAGING WORKSPACE
Initial import of sources, cleaning & interpretation
Reader for ACEF, DwC-A, TCS
Clean whitespace, HTML entities, encoding issues
Interpret values (e.g. dates, rank, country, language)
Parse names (if not parsed yet, DwC only)
Match to clearinghouse names (assign nameID)
Integrity checks for review
Check name structure, accepted name for synonyms, …
Check for duplicate name within source or outer CoL
More existing workbench checks, add over time
Change metrics (# of new, updated & deleted names)
Editorial decisions
Map taxa to CoL classification
Block name
Change name status
Publish to clearinghouse
Staging API for review, preview & editorial decisions
[Diagram labels: Staging Workspace; Source; editorial control]
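
Three of the staging steps listed above (cleaning, the duplicate check within a source, and change metrics) can be sketched in a few lines of Python. This is an illustration under assumptions, not the actual importer: `clean_value`, `duplicate_names` and `change_metrics` are hypothetical helpers, each import is assumed to be a mapping from source record ID to name string, and the sample names are placeholders.

```python
import html
import re
from collections import defaultdict


def clean_value(raw: str) -> str:
    """Decode HTML entities and collapse runs of whitespace."""
    text = html.unescape(raw)
    return re.sub(r"\s+", " ", text).strip()


def duplicate_names(names: dict) -> dict:
    """Group record IDs by cleaned, case-folded name; keep groups of two or more."""
    groups = defaultdict(list)
    for record_id, name in names.items():
        groups[clean_value(name).lower()].append(record_id)
    return {name: ids for name, ids in groups.items() if len(ids) > 1}


def change_metrics(previous: dict, incoming: dict) -> dict:
    """Count new, updated and deleted names between two imports of one source."""
    new = incoming.keys() - previous.keys()
    deleted = previous.keys() - incoming.keys()
    updated = {k for k in incoming.keys() & previous.keys()
               if incoming[k] != previous[k]}
    return {"new": len(new), "updated": len(updated), "deleted": len(deleted)}


previous = {"1": "Aus bus", "2": "Aus cus", "3": "Aus dus"}
incoming = {"1": "Aus bus", "2": "Aus cus L.", "4": "Aus eus"}
print(clean_value("  Muell.  &amp;\tBenth. "))
print(change_metrics(previous, incoming))
```

Comparing on record IDs assumes that sources keep stable identifiers between imports; where they do not, the comparison would have to fall back on matched nameIDs.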
CLEARINGHOUSE WORKSPACE
Represents current best knowledge
Includes both reviewed & provisional placement
Maintains the CoL classification
Mints taxon identifiers
Community contributions
Fully versioned
Comments everywhere
Edits in areas without authority
Content bots adding provisional information
Basionym relations
BHL literature links
Review queues for sources
Editorial decisions
Manage authorization for taxonomic groups
Move taxa to other group (to change source authority)
Release CoL
[Diagram labels: Clearinghouse Workspace; review queue; Public; Staging Workspace; Source; editorial control]
COL RELEASE SPACE
Contains latest interim (monthly) release
Not archived
Archive of all annual releases
API & portal expose them
GSD decides name fidelity
Name exactly as provided
Name spelling from nomenclator
Homotypic synonymy added
[Diagram labels: CoL Release; Public; Clearinghouse Workspace; review queue; Public; Staging Workspace; Source; editorial control]
OPTIONS FOR EVOLUTION
Extended nomenclatural model
Nomenclatural acts
Type specimen
Name Author entities
Reference entities
More name relations
Concurrent taxon concepts
Multiple classifications
Multiple concepts per name
Allows precise identifications, national lists, etc.
Species images
Reference image selected by community


CoL+ clearinghouse


Editor's Notes

  • #2: T1-23-06 The Global Biodiversity Information Facility: A Review of Progress and Challenges in Plant Data Mobilization and Integration. Siro Masinde (1), Hobern Donald (1). (1) Global Biodiversity Information Facility (GBIF).
GBIF, the Global Biodiversity Information Facility (http://www.gbif.org/), is the leading aggregator of species occurrence data, including botanical and fungal data. As of January 2017, GBIF had more than 700 million records harvested from over 32,000 data resources published by more than 800 data providers. Since its establishment in 2002, GBIF has provided leadership in delivering tools, standards and best practices for mobilizing biodiversity data for free and open use by the research community and to support policy applications. Established tools and processes now support the aggregation and integration of species distribution data from natural history collections, field research, environmental genomics, citizen science and literature into a single, unified resource. Modern approaches to linked-open data management point the way to this data resource becoming the focal point for connecting the full spectrum of recorded information about species and communities. GBIF strives to provide well integrated data of high quality and in large quantity to enable users to query and summarize data, with answers that are as complete and accurate as possible. This can only be achieved if data providers worldwide mobilize large, high quality data across a wide spatial, temporal and taxonomic spectrum and publish them openly using interoperable standard data formats. We review GBIF progress in plant data mobilization and integration, and provide insights on future directions. We also highlight the challenges encountered and invite a discussion on future expectations of the botanical community and the role the community can play in developing and curating the GBIF database to meet the needs and expectations of users.
Keywords: data mobilization, data integration, data management, GBIF, linked-open data
  • #4: No subgenus or infraspecific hierarchy for quadrinomials. Also valid / correct name? Orthographic variants?