Janna Hastings, Colin Batchelor, Stefan Schulz, Christoph Steinbeck

Modularity requirements in bio-ontologies: a case study of ChEBI

Workshop on Modular Ontologies, ESSLLI, 12 August 2011
EBI is an Outstation of the European Molecular Biology Laboratory.

ChEBI: an ontology of biologically interesting chemicals

[Figure: excerpt of the ChEBI ontology. Classes shown on the "chemical entity" side: chemical substance, molecular entity, group, carbonyl compound, carboxylic acid, carboxy group. Classes shown on the "role" side: biological role, chemical role, application, pharmaceutical, solvent, antibacterial drug, cyclooxygenase inhibitor. The example entry cefpodoxime (CHEBI:606443) is connected to these classes by "has part" and "has role" relations.]


2     22.02.2012
Bio-ontologies are modular by design: domain and granularity

     Domain: Chemistry
     Granularity: Molecular entities; Substances
     Upper level type: Material entities; Functions and roles of chemical entities

They are characterised by large sizes and low expressivity

     Currently exported in EL++ (August 2011):

     Chemical entities        29132
     Roles                      596
     Subatomic particles         41
     Classes in total         29769

Classification practices in chemistry lead to high levels of multiple inheritance




ChEBI is growing bigger … and more expressive
                                    Image credit: Jonathan J. Dickau
Increased expressivity to enable automatic classification

     hydrocarbon equivalentTo
          molecule and has_atom only (carbon atom or hydrogen atom)

     peptide cation equivalentTo
          peptide and has_charge some double[> 0.0]




     carboxylic acid equivalentTo
          molecule and has_functional_group some carboxy group

     tricarboxylic acid equivalentTo
          molecule and has_functional_group exactly 3 carboxy group
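
As a rough illustration of what such definitions buy, the sketch below applies the hydrocarbon, carboxylic acid and tricarboxylic acid definitions to toy molecule descriptions. It is not an OWL reasoner: it assumes the listed atoms and groups are complete (a closed-world shortcut), and the function name `classify` and the input format are hypothetical choices made for this example only.

```python
from collections import Counter


def classify(atoms, functional_groups):
    """Return the defined classes that a toy molecule description satisfies.

    atoms: element symbols, e.g. ["C", "H", "H", "H", "H"]
    functional_groups: group names, e.g. ["carboxy group"]
    Both lists are assumed to be complete (closed-world shortcut).
    """
    classes = {"molecule"}

    # hydrocarbon: has_atom only (carbon atom or hydrogen atom)
    if all(atom in {"C", "H"} for atom in atoms):
        classes.add("hydrocarbon")

    n_carboxy = Counter(functional_groups)["carboxy group"]

    # carboxylic acid: has_functional_group some carboxy group
    if n_carboxy >= 1:
        classes.add("carboxylic acid")

    # tricarboxylic acid: has_functional_group exactly 3 carboxy group
    if n_carboxy == 3:
        classes.add("tricarboxylic acid")

    return classes


methane_like = classify(["C", "H", "H", "H", "H"], [])
triacid_like = classify(["C", "O", "H"], ["carboxy group"] * 3)
```

Anything that meets the "exactly 3" condition also meets the "some" condition, so the toy tricarboxylic acid is placed under carboxylic acid without that link being asserted by hand.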




Size explosion in asserted parts




Reasoning is required for classification and consistency validation
                  No definitional cycles
                        A part_of B part_of C part_of A

                  Enforcing disjointness
                         Chemical Entity disjoint_from Role …
                         Group disjoint_from Molecule …

                  No disallowed combinations of relations
                        A has_part B ; A conjugate_base_of B
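
Two of these checks can be sketched without a reasoner, treating relations as plain lists of (subject, object) pairs. This is a minimal sketch under that assumption; the function names `find_cycle` and `disallowed_pairs` are hypothetical, and the recursive search is meant for small examples rather than an ontology of ChEBI's size.

```python
def find_cycle(edges):
    """Return one cycle in a directed relation as a list of nodes, or None."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in graph.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: a cycle closes here
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        colour[node] = BLACK
        path.pop()
        return None

    for start in list(graph):
        if colour.get(start, WHITE) == WHITE:
            found = visit(start)
            if found:
                return found
    return None


def disallowed_pairs(has_part, conjugate_base_of):
    """Return (A, B) pairs asserted under both relations at once."""
    return sorted(set(has_part) & set(conjugate_base_of))


cycle = find_cycle([("A", "B"), ("B", "C"), ("C", "A")])
clashes = disallowed_pairs([("A", "B")], [("A", "B"), ("C", "D")])
```

Disjointness is left out of the sketch because it depends on the inferred class hierarchy, which is where a reasoner is needed.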


[Figure: reasoning time in seconds plotted against the number of fully defined classes]

Modularity and large ontologies
                  smaller modules = faster classification
A USEFUL module for maintenance

                  … is delineated by topic
                  … is comprehensible and easy to work with
                  … is self-contained for reasoning tasks
Subject-specific modules overlap

[Figure: overlapping topic modules labelled Immunology, Drugs, Metabolism and Carboxylic acids]

E.g. GO-SLIM approach
Self-contained modules include all axioms needed for classification and consistency checking

[Figure: components of a self-contained module: properties, parts, hierarchy, and upper-level constraints (e.g. disjointness)]
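
One simple way to approximate self-containment is to start from the terms of interest and pull in everything reachable through the hierarchy and parthood, plus any upper-level constraint that mentions a collected term. The sketch below assumes the ontology is available as plain dictionaries. The function name `extract_module` is hypothetical, and the toy edges only loosely follow the example classes shown earlier in the talk; they are not actual ChEBI content.

```python
def extract_module(seed_terms, is_a, has_part, disjoint_axioms):
    """Collect seed terms, their is_a / has_part closure, and relevant constraints."""
    module = set()
    stack = list(seed_terms)
    while stack:
        term = stack.pop()
        if term in module:
            continue
        module.add(term)
        stack.extend(is_a.get(term, ()))      # follow the hierarchy upwards
        stack.extend(has_part.get(term, ()))  # follow asserted parts
    # keep upper-level constraints that mention at least one collected term
    constraints = [(a, b) for a, b in disjoint_axioms if a in module or b in module]
    return module, constraints


# Toy data (assumed for illustration only)
IS_A = {
    "carboxylic acid": ["carbonyl compound"],
    "carbonyl compound": ["molecular entity"],
    "molecular entity": ["chemical entity"],
    "carboxy group": ["group"],
    "group": ["chemical entity"],
    "solvent": ["chemical role"],
    "chemical role": ["role"],
}
HAS_PART = {"carboxylic acid": ["carboxy group"]}
DISJOINT = [("chemical entity", "role"), ("group", "molecule")]

module, constraints = extract_module(["carboxylic acid"], IS_A, HAS_PART, DISJOINT)
```

Terms on the role side, such as solvent, stay outside the module, while the disjointness axiom between chemical entity and role is carried along so that the consistency check still has something to enforce.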


Ontology segmentation tools don’t work very well on ChEBI … yet

     Topic blind
     Modules too small or too big
     Out of memory
     Long processing times
     No tool support for recombined viewing/querying


Interrelating bio-ontologies requires modular imports




                                        Image credit: Rameesh Vyas
The MIREOT mechanism requires manual selection of module content
and manual updates when the source ontology changes

[Figure: three steps: choose terms, extract module, build ontology links]
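
The maintenance burden can be made concrete with a small sketch. It assumes that a MIREOT-style reference records three things per imported term: the source ontology, the source term, and the superclass under which the term is placed in the importing ontology. The class and function names are hypothetical, and the `example.org` IRI is a placeholder, not a real resource.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalTermReference:
    source_ontology: str    # IRI of the ontology the term comes from
    source_term: str        # IRI of the imported term
    target_superclass: str  # parent chosen by hand in the importing ontology


def stale_references(references, current_source_terms):
    """Return references whose source term is no longer present in the source."""
    return [ref for ref in references if ref.source_term not in current_source_terms]


# A hand-picked reference (the manual "choose terms" step)
cefpodoxime = ExternalTermReference(
    source_ontology="http://purl.obolibrary.org/obo/chebi.owl",
    source_term="http://purl.obolibrary.org/obo/CHEBI_606443",
    target_superclass="http://example.org/onto#Drug",  # placeholder IRI
)

# The check has to be re-run by someone each time the source is released
still_valid = stale_references([cefpodoxime], {cefpodoxime.source_term})
after_removal = stale_references([cefpodoxime], set())
```

Nothing in this arrangement notices a change in the source on its own, which is the gap that the talk's call for automatic update is aimed at.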


We need modular ontology views

[Figure: from Ontology O, automatic module extraction based on selection criteria yields View V1 (Topic, Editing); the view supports edit, validate, and write back to source]


Views can be imported and are then automatically updated

[Figure: module extraction from Ontology O1 (e.g. chemistry) yields View V1 (Topic, Editing); the view is imported into Ontology O2 (e.g. biology) and updated automatically]
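
The view idea can be sketched as a stored selection criterion that is re-evaluated against the source ontology each time it is read, instead of a copied list of terms. This is a minimal sketch under a toy model in which the ontology is a dictionary from term to topic tags; the class name `OntologyView`, the tags, and every term other than cefpodoxime are assumptions made for the example.

```python
class OntologyView:
    """A view keeps a selection criterion, not a copy of the selected terms."""

    def __init__(self, source, criterion):
        self.source = source        # dict: term -> set of topic tags (toy model)
        self.criterion = criterion  # callable(term, tags) -> bool

    def terms(self):
        # Evaluated on every call, so edits to the source show up automatically
        return {term for term, tags in self.source.items()
                if self.criterion(term, tags)}


# Toy source ontology O1 (assumed data)
chemistry = {
    "cefpodoxime": {"drug"},
    "citric acid": {"metabolism"},
}

drug_view = OntologyView(chemistry, lambda term, tags: "drug" in tags)
before = drug_view.terms()

# An edit to the source is visible through the view without re-extraction
chemistry["aspirin"] = {"drug"}
after = drug_view.terms()
```

An importing ontology that holds the view rather than a snapshot would see the new term on its next read, which is the behaviour the slide labels "Automatic update".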

How do we facilitate the development of tools for modular ontology engineering?







                               Thank you
                      Acknowledgements: BBSRC (funding)






Editor's Notes

  • #5 There are 29769 classes in the latest OWL file release. Of these, 28875 are descendants of chemical entity, 596 are roles, 41 are subatomic particles, and 257 are chemical entities not classified as chemical entities; thus the real count for chemical entities is 29132 (28875 + 257).
  • #9 Higher expressivity is not necessarily required for question answering, since the inferred hierarchy can be exported to OWL-EL for question answering.
  • #13 I am coming from the software engineering perspective in this talk. Modularity is a tool to design complex systems while focusing on local organisation.
  • #20 Tools are needed that can modularize existing ontologies for ease of maintenance and then recombine the modules for query answering. Shared terms between modules are represented only once. A good way of thinking about it: modular VIEWS on the overall ontology. Also needed is the ability to extract modules for import into other ontologies.
  • #21 Tools for modularization of existing ontologies for ease of maintenance, then recombination for query answering. Shared terms between modules are represented only once. A good way of thinking about it: modular VIEWS on the overall ontology. Ability to extract modules for import into other ontologies.