Relationship between the
      Semantic Web and NLP

                 Rajendra Akerkar
                 Technomathematics Research Foundation,
                 Kolhapur, India




March 17, 2009                      Akerkar: Sogndal Lecture   1
Structure of this talk
         Relationship between NLP and SW
         Inspiration: QA system and Haystack
         RDF Schema & NL Annotations
         Information Access Schemata
         Information Planning Schemata
         Integration
         Conclusion




The sense of the relationship

    Could the Semantic Web enhance the
     technical level of NLP technologies?

    Could NLP technologies help in delivering
     and using a better Semantic Web?




Purpose of the Semantic Web

    to help users
         locate,
         organize, and
         process information.


    belief:
         It should be grounded in the information access
          method humans are comfortable with
          — natural language.

Why natural language?

    It is intuitive,
    easy to use and rapidly deployable, and
    requires no specialized training.




Vision
    The Semantic Web equally accessible by
     computers using specialized languages and
     interchange formats, and humans using
     natural language.
         Ask a computer: “When was the king of Norway
          born?”
         “What’s the cheapest flight to Mumbai this
          month?”
    Retrieve “exact information”.


What synergistic opportunities exist between natural
language technology and the Semantic Web?
    State-of-the-art NL systems are capable of
     providing users intuitive access to a wealth of
     textual data using ordinary language.
    However, such systems are often hampered
     by
         the knowledge engineering bottleneck (wrappers to
          integrate new data sources),
         knowledge integration (from multiple sources), and
         being time-consuming.
    Here the Semantic Web comes in …

Semantic Web research

    Constructing, integrating, packaging, and
     exporting segments of knowledge to be
     usable by the entire world.
    NL technology can tap into this knowledge
     framework
         In return, it provides natural language information
          access for the Semantic Web.




SW: What is missing?

         Where in the loop is the human?
         How will we communicate with our software agents?
         How will we access information on the Semantic Web?

          Obviously, we cannot expect ordinary Semantic Web users to
          manually manipulate ontologies, query with formal logic
          expressions, etc.
          We would like to communicate with software agents in natural
          language…

         What is the role of natural language in the Semantic
          Web?


Mechanism for integrating NL into the
RDF
    Augmenting RDF property definitions
    Creating Information Access Schemata
         To bridge gap between NL & RDF
    Extension to mirror human question
     answering behaviour in the form of NL query
     plans.




Inspiration

    Question-Answering (QA) System

    Haystack System
         End-user Semantic Web platform
         aggregates all of a user’s information into a unified
          repository.




QA system

    The use of metadata is a common technique
     for rendering information fragments more
     amenable to processing by computer systems.

    Our approach
         natural language itself as metadata
         numerous advantages and opportunities:
                it preserves human readability and
                encourages non-expert users to engage in metadata
                 creation.


QA system

    Natural language annotations
         machine-parsable sentences and phrases that
          describe the content of various information
          segments.
         annotations serve as metadata
                describe the kinds of questions a particular piece of
                 knowledge is capable of answering.
    Contains natural language annotation
     technology

QA system

    “For pioneering contributions to the theory and
     practice of optimizing compiler techniques that laid
     the foundation for modern optimizing compilers and
     automatic parallel execution.” Frances E Allen was
     selected for the Turing award 2006.

    Annotation:
         Frances E Allen was selected for the Turing award in 2006.
         2006 Turing award



QA system

    The annotations allow the system to answer:

         What award did Allen receive in 2006?
         Who was selected for the Turing award in 2006?
         To whom was the Turing award given in 2006?
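
One way to picture how a single annotation can serve several question forms is sketched below in Python. It assumes, purely for illustration, that the annotation and each question have already been reduced to named slots by a parser; the slot names and the `answer` helper are hypothetical and are not the QA system's actual representation.

```python
from typing import Optional

# Slots assumed to be extracted from the annotation
# "Frances E Allen was selected for the Turing award in 2006."
ANNOTATION = {"person": "Frances E Allen", "award": "Turing award", "year": "2006"}


def answer(question: dict) -> Optional[str]:
    """Fill the single open slot (None) of a question from the annotation."""
    open_slots = [slot for slot, value in question.items() if value is None]
    if len(open_slots) != 1:
        return None
    # Every slot the question does specify must agree with the annotation.
    for slot, value in question.items():
        if value is not None and ANNOTATION.get(slot) != value:
            return None
    return ANNOTATION.get(open_slots[0])


# "Who was selected for the Turing award in 2006?"
who = answer({"person": None, "award": "Turing award", "year": "2006"})
# "What award did Allen receive in 2006?"
what = answer({"person": "Frances E Allen", "award": None, "year": "2006"})
```

The point of the sketch is only that different surface questions leave a different slot open while matching the same annotation.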




QA system

    Feature of natural language annotations
         any information segment can be annotated:
                not only text, but also images, multimedia …
    To provide uniform access to semi-structured
     resources on the Web
         a virtual database system
                integrates Web sources under a single query interface.




Haystack

    Aggregates a user’s information into a unified
     repository.
         e-mail, documents, calendar, and web pages.
    It is represented using RDF
         makes it easy for agents to access, filter, and
          process this information in an automated fashion.




Haystack

    “Present Tim the letter from the secretary I
     met with last Tuesday from TMRF.”
         Current IT can store all the information needed to answer the
          query
                Scattered amongst multiple systems
                Agent needs to communicate with
                    Email client
                    Calendar
                    File system
                    Directory server


Haystack

    Reduce the protocol barriers to information by
     standardizing on RDF as a common model
     for information
         agents are free to mine the semantics of a user’s
          various data sources
    End-user application for managing
     information
         serves as a powerful platform for experimenting
          with various information retrieval and user
          interface research problems

QA System & Haystack

    By incorporating natural language search
     capabilities into Haystack
    Demonstrate
         the usefulness of natural language search
         show its applicability to the Semantic Web




To endow Haystack with the ability to
answer
         What is the state bird of India?
         Tell me what the vision statement of TMRF is.
         Do you know Sogndal’s population?


    Easy on the Web
    But, for this data to be usable by any
     Semantic Web system, it must be
     restructured in terms of the RDF model.

Adenine

    Haystack’s programming language, designed to
     facilitate frequent manipulation of RDF data.
         Features of Lisp, Python, and Notation3.
         Basic data unit is the RDF triple.




Adenine :State class and the :bird property
     @prefix dc: <http://purl.org/dc/elements/1.1/>
     @prefix : <www.tourindia.com/data#>

     # Triples are enclosed in curly braces { } and expressed in
     # subject-predicate-object order.
     add { :State
          rdf:type rdfs:Class ;
          rdfs:label "State"
     }

     # A semicolon denotes that the next predicate-object pair is to
     # assume the last used subject.
     add { :bird
          rdf:type rdf:Property ;
          rdfs:label "State bird" ;
          rdfs:domain :State
     }
     # ... more property declarations

     # RDF literals are written as strings in double quotes.
     add { :india
          rdf:type :State ;
          dc:title "India" ;
          :bird "Peacock" ;
          :flower "Lotus" ;
          :population "1,147,995,904"
     # ... more information about India and its states
     }
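
The subject-sharing role of the semicolon can be mimicked in a few lines of Python. This is only an illustrative analogy, not Adenine itself: the `expand` helper is hypothetical, and triples are modelled as plain tuples.

```python
def expand(subject, pairs):
    """Turn one subject plus predicate-object pairs into full triples,
    the way an Adenine `add { ... }` block reuses the last subject."""
    return [(subject, predicate, obj) for predicate, obj in pairs]


india = expand(
    ":india",
    [
        ("rdf:type", ":State"),
        ("dc:title", "India"),
        (":bird", "Peacock"),
    ],
)
```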

Adenine unique feature

    Every Adenine instruction is encoded as a
     node in the RDF graph, and a sequence of
     instructions is expressed by adenine:next
     arcs between these instruction nodes.

    As a result, data and procedures can be
     embedded within the same RDF graph and
     can be distributed together.
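
A minimal Python sketch of this idea follows. It stores instruction nodes and ordinary data as triples in one set and recovers the instruction order by following `adenine:next` arcs. The `ex:op` predicate, the node names, and the operation labels are assumptions made for illustration; the slides do not show how Adenine encodes an instruction's contents.

```python
# One graph holds both "code" and data, as (subject, predicate, object) triples.
graph = {
    (":i1", "adenine:next", ":i2"),
    (":i2", "adenine:next", ":i3"),
    (":i1", "ex:op", "load"),      # ex:op is a hypothetical predicate
    (":i2", "ex:op", "filter"),
    (":i3", "ex:op", "return"),
    (":india", ":bird", "Peacock"),  # ordinary data in the same graph
}


def obj(subject, predicate):
    """Return the object of the (subject, predicate) arc, or None if absent."""
    for s, p, o in graph:
        if s == subject and p == predicate:
            return o
    return None


def instruction_sequence(first):
    """Follow adenine:next arcs starting from the first instruction node."""
    ops, node = [], first
    while node is not None:
        ops.append(obj(node, "ex:op"))
        node = obj(node, "adenine:next")
    return ops


sequence = instruction_sequence(":i1")
```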


The connection between the RDF schema and the
NL annotations in natural language schema
@prefix nl: <http://www.tmrfindia.org/sw/projects/enlight#>

add { :stateAttribute
    rdf:type nl:NaturalLanguageSchema ;
    # This annotation handles cases like "[state bird] of [India]"
    # and "[population] of [India]".
    nl:annotation @( :attribute "of" :state ) ;
    # Code to run to resolve state attribute
    nl:code :stateAttributeCode
}

# The definition of :attribute restricts the resource representing
# the attribute to be queried to have type rdf:Property.
# The rdfs:label property is used to resolve the actual literal,
# e.g., "State bird" or "population".
add { :attribute
    rdf:type nl:Parameter ;
    nl:domain rdf:Property ;
    nl:descriptionProperty rdfs:label
}

# :state restricts the resource to have type :State and to have
# the resolver dc:title.
add { :state
    rdf:type nl:Parameter ;
    nl:domain :State ;
    nl:descriptionProperty dc:title
}

# The identifier [state] will be bound to the value of the named
# parameter :state. The identifier [attribute] will be bound to the
# value of the named parameter :attribute.
method :stateAttributeCode :state = state :attribute = attribute
    # Ask the system what the [attribute] property of [state] is
    return (ask %{ attribute state ?x })



Question Answering
    What is the state bird of India?

         System parses the question and determines that
          :stateAttribute is the relevant natural language schema to
          invoke.
         System extracts the natural language bindings of :attribute
          and :state, which are “state bird” and “India”, respectively. These
          are further resolved into the RDF resources :bird and :india.
         As a response to the question, the method
          :stateAttributeCode is invoked with named parameter
          :attribute bound to :bird and named parameter :state
          bound to :india.
         The invoked method performs a query into Haystack’s RDF store,
          which returns “Peacock”, the state bird of India.
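
The four steps above can be sketched in Python as follows. This is a toy stand-in, not the QA system or Haystack: a regular expression replaces real parsing, a set of tuples replaces the RDF store, and the function names are hypothetical.

```python
import re

# Toy RDF store holding the triples from the Adenine example.
TRIPLES = {
    (":bird", "rdf:type", "rdf:Property"),
    (":bird", "rdfs:label", "State bird"),
    (":population", "rdf:type", "rdf:Property"),
    (":population", "rdfs:label", "Population"),
    (":india", "rdf:type", ":State"),
    (":india", "dc:title", "India"),
    (":india", ":bird", "Peacock"),
    (":india", ":population", "1,147,995,904"),
}


def resolve(text, rdf_type, description_property):
    """Map a phrase to the resource of rdf_type whose description matches it."""
    for s, p, o in TRIPLES:
        if p == description_property and o.lower() == text.lower():
            if (s, "rdf:type", rdf_type) in TRIPLES:
                return s
    return None


def state_attribute_code(state, attribute):
    """Counterpart of :stateAttributeCode: query the store for the value."""
    for s, p, o in TRIPLES:
        if s == state and p == attribute:
            return o
    return None


def answer(question):
    # Crude stand-in for the annotation @( :attribute "of" :state ).
    m = re.match(r"(?:what is the\s+)?(.+?)\s+of\s+(.+?)\??$",
                 question.strip(), re.IGNORECASE)
    if not m:
        return None
    attribute = resolve(m.group(1), "rdf:Property", "rdfs:label")
    state = resolve(m.group(2), ":State", "dc:title")
    if attribute is None or state is None:
        return None
    return state_attribute_code(state, attribute)


result = answer("What is the state bird of India?")
```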


User query is parsed by QA System

    So, a single natural language annotation is
     capable of answering a question.

    QA system is capable of normalizing different
     methods for requesting the same information.
         imperative (“Tell me...”),
         interrogative (“What is...”).
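
As a rough illustration of such normalization, the Python sketch below strips a few request lead-ins so that an imperative and an interrogative request reduce to the same core phrase. The real QA system relies on linguistic parsing; the pattern list here is a hypothetical simplification.

```python
import re

# Hypothetical lead-ins that carry no content of their own.
LEAD_INS = [
    r"^tell me (?:what )?",
    r"^do you know (?:what )?",
    r"^what is ",
]


def normalize(request: str) -> str:
    """Reduce a request to its core phrase, ignoring how it was asked."""
    text = request.strip().rstrip("?.!").lower()
    for pattern in LEAD_INS:
        text, count = re.subn(pattern, "", text)
        if count:
            break
    # Embedded questions end with the copula ("... of TMRF is").
    text = re.sub(r"\s+is$", "", text)
    return text.strip()


imperative = normalize("Tell me what the vision statement of TMRF is.")
interrogative = normalize("What is the vision statement of TMRF?")
```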



Natural language schema
# The method invoked by the NLS queries the RDF store for the resource
# of type :State that contains the maximal integer value for the
# property given by :comparisonAttribute.
add { :stateAttribute
      rdf:type nl:NaturalLanguageSchema ;
      nl:annotation @( :state " has the largest " :comparisonAttribute ) ;
      nl:code :maxComparisonAttributeCode
}

method :maxComparisonAttributeCode :comparisonAttribute = attribute
      return (ask %{
         rdf:type ?x :State ,
         adenine:argMax ?x ?y 1 xsd:int
            %{
               attribute ?x ?y
            }
      } @(?x))

This allows our system to answer the following questions:
    • Which state has the largest population?
    • Do you know what state has the largest area?
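
What an arg-max query of this kind computes can be sketched in Python as below. The state names and numbers are made-up placeholders rather than real data, and `max_comparison_attribute` is a hypothetical counterpart of the Adenine method, not a translation of Adenine's semantics.

```python
# Placeholder facts: made-up states with made-up integer values.
TRIPLES = [
    (":stateA", "rdf:type", ":State"), (":stateA", ":area", 10),
    (":stateB", "rdf:type", ":State"), (":stateB", ":area", 30),
    (":stateC", "rdf:type", ":State"), (":stateC", ":area", 20),
    (":cityD", "rdf:type", ":City"), (":cityD", ":area", 99),
]


def max_comparison_attribute(triples, attribute):
    """Return the resource of type :State with the largest integer value
    for the given attribute, or None if no state has that attribute."""
    states = {s for s, p, o in triples if p == "rdf:type" and o == ":State"}
    candidates = [(int(o), s) for s, p, o in triples
                  if s in states and p == attribute]
    return max(candidates)[1] if candidates else None


largest = max_comparison_attribute(TRIPLES, ":area")
```

Note that `:cityD` is ignored even though its value is the largest, because the query is restricted to resources of type `:State`.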


QA System

    Built a prototype implementing the natural
     language schemata.
    Limited in the types of questions that it can
     answer and the domain.

    However, proof of concept that demonstrates
     a method of marrying natural language with
     the Semantic Web.

Further integrating natural language
technology with the Semantic Web
    RDF triples ≈ System’s ternary expression
                   representation of NL.

    Clipping natural language annotations directly
     into rdf:Property definitions.

    Consider a piece of an ontology modeling an
     address book entry in Haystack:

A natural language-aware software agent could answer questions…

add { :Person
      rdf:type        rdfs:Class
}
# :homeAddress is a property specifying a user’s home address.
add { :homeAddress
      rdf:type        rdf:Property ;
      rdfs:domain     :Person ;
      rdfs:range      xsd:string ;
      # The annotations express this connection concretely in natural
      # language, via the nl:annotation property.
      nl:annotation   @( nl:subject " lives at " nl:object ) ;
      nl:annotation   @( nl:subject "’s home address is " nl:object ) ;
      nl:annotation   @( nl:subject "’s bungalow" ) ;
      nl:generation   @( nl:subject "’s home address is " nl:object )
}


                 The phrase “nl:subject lives at nl:object” is linked to every RDF
                 statement involving the :homeAddress property, where nl:subject is
                 shorthand for indicating the subject (domain) of the relation, and nl:object
                 is shorthand for the object (range) of the relation.

‘Make sense’ with minimal cost!

    The nl:generation property specifies a
     natural language version of the knowledge.
         allows software agents to present meaningful,
          natural responses to users.

         Question: Where does Ram live?
         Reply: Ram’s home address is Tellefsens gate 5.
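
The two directions, reading a sentence through an nl:annotation template and producing a reply through the nl:generation template, can be sketched in Python as below. Templates are modelled as format strings with `{subject}` and `{object}` placeholders; that encoding and the helper names are assumptions for illustration only.

```python
import re

PROPERTY = ":homeAddress"
ANNOTATIONS = [
    "{subject} lives at {object}",
    "{subject}'s home address is {object}",
]
GENERATION = "{subject}'s home address is {object}"


def to_regex(template):
    """Compile a template into a regex with named subject/object groups."""
    pattern = re.escape(template)
    pattern = pattern.replace(re.escape("{subject}"), r"(?P<subject>.+?)")
    pattern = pattern.replace(re.escape("{object}"), r"(?P<object>.+)")
    return re.compile("^" + pattern + "$")


def parse(sentence):
    """Return the (subject, property, object) triple a sentence expresses."""
    for template in ANNOTATIONS:
        m = to_regex(template).match(sentence)
        if m:
            return (m.group("subject"), PROPERTY, m.group("object"))
    return None


def generate(triple):
    """Render a triple as a natural language reply."""
    subject, _, obj = triple
    return GENERATION.format(subject=subject, object=obj)


triple = parse("Ram lives at Tellefsens gate 5")
reply = generate(triple)
```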




Information Access Schemata

    Despite the simplicity of adding NL
     annotations to RDF properties
         Significant restriction: only one RDF statement
          can be queried at once.
         Solution: Create schemata that capture similar
          patterns of information access.




An information access schema is a
quadruple
    Annotations: NL sentences (either declarative or
     interrogative) or phrases that describe the types of
     user questions the schema can answer.
    Pattern: a declarative pattern of RDF triples that
     references a pre-existing ontology.
    Action: a set of operators to further process variables
     bound during the pattern matching process.
    Mapping: mechanism for handling disjunction
     between lexical and ontological terms.
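
The quadruple can be written down as a small Python data structure. The field types below are assumptions chosen for readability (strings for patterns and actions, a nested dictionary for the mapping); the values are taken from the schema shown later in the talk.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InformationAccessSchema:
    annotations: List[str]   # NL sentences or phrases the schema answers
    pattern: List[str]       # declarative RDF triple patterns
    action: str              # operators applied to the bound variables
    mapping: Dict[str, Dict[str, str]] = field(default_factory=dict)


schema = InformationAccessSchema(
    annotations=["what country in $region has the largest $attribute"],
    pattern=[
        "?x a :Country",
        "?x map($attribute) ?val",
        "?x :location $region",
    ],
    action="display(boundto(?x, max(?val)))",
    mapping={"$attribute": {"population": ":population", "area": ":area"}},
)
```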


Example: “family” of questions

         What is the country in Asia with the largest area?
         Tell me what Asian country has the highest
          population density.
         What country in Europe has the lowest infant
          mortality rate?
         What is the most populated American country?




Capture the “pattern” of information
requests in an information access schema
     <nl:InformationAccessSchema>
        <nl:ann>what country in $region has the largest $attribute</nl:ann>
        <nl:pattern>?x a :Country</nl:pattern>
        <nl:pattern>?x map($attribute) ?val</nl:pattern>
        <nl:pattern>?x :location $region</nl:pattern>
        <nl:action>display(boundto(?x, max(?val)))</nl:action>
        <nl:mapping>
           <nl:hash variable="$attribute">
              <nl:map value="population">:population</nl:map>
              <nl:map value="area">:area</nl:map>
              ...
           </nl:hash>
        </nl:mapping>
     </nl:InformationAccessSchema>

    Natural language annotations are employed to describe a pattern of
     RDF statements.
    Because annotations would be processed by linguistically
     sophisticated systems, different adjectives such as “highest” and
     “largest” could be uniformly mapped onto the maximum operation.
    Schema answers questions that involve region-specific superlative
     comparison of countries.


    The pattern binds to the value of the particular attribute for
     countries within the queried geographic region, and the action
     specifies an aggregate operation (maximum) over the values
     bound within the pattern.
          The country corresponding to that maximum value is returned as the
           answer.

    The mapping provides a translation from language attributes to
     RDF properties.
          Information access schemata are written with respect to a particular
           pre-existing ontology;
          In this example, we assume that an appropriate ontology has been
           established (i.e., :Country is defined as a class, and :location is
           defined as a property).
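
How the pattern, mapping, and action fit together can be sketched in Python as below. The countries and values are made-up placeholders, regions are plain strings, and `run_schema` is a hypothetical evaluator; the slides do not specify how a real schema engine matches patterns.

```python
# mapping: lexical attribute -> RDF property
MAPPING = {"population": ":population", "area": ":area"}

# Placeholder facts with made-up values.
TRIPLES = [
    (":countryA", "a", ":Country"), (":countryA", ":location", "Asia"),
    (":countryA", ":area", 300),
    (":countryB", "a", ":Country"), (":countryB", ":location", "Asia"),
    (":countryB", ":area", 500),
    (":countryC", "a", ":Country"), (":countryC", ":location", "Europe"),
    (":countryC", ":area", 900),
]


def run_schema(region, attribute):
    """Bind ?x and ?val per the three patterns, then apply max(?val)."""
    prop = MAPPING[attribute]                      # map($attribute)
    facts = set(TRIPLES)
    bindings = [
        (val, x)
        for x, p, val in TRIPLES
        if p == prop                               # ?x map($attribute) ?val
        and (x, "a", ":Country") in facts          # ?x a :Country
        and (x, ":location", region) in facts      # ?x :location $region
    ]
    # action: the ?x bound to the maximum ?val
    return max(bindings)[1] if bindings else None


largest_in_asia = run_schema("Asia", "area")
```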


    In this vision of the Semantic Web, information access schemata
     grounded in natural language would co-exist alongside RDF
     metadata.


Further extension: Query Plan

    Question: What is the distance from India to
     Norway?
    Solution Plan: To compute the distance
     between their respective capitals.


Could humans “teach” such plans to a computer directly?

Information Planning Schemata

    An extension of Information Access Schemata.
    Simplifies the task of knowledge engineering.

    Example:
                Instead of writing RDF patterns,
                    which would require knowledge of domain-specific ontologies,
                Use natural language itself to describe the process of
                 answering a question.
                    The answer plan (nl:plan) reflects the user’s thought process
                     expressed in natural language: first find the capitals of the
                     countries, and then find the distance between those cities



An information planning schema
   <nl:InformationPlanningSchema>
      <nl:ann>distance between $country1 and $country2</nl:ann>
      <nl:plan>
         <rdf:Seq>
            <rdf:li>what is the capital of $country1 := ?capital1</rdf:li>
            <rdf:li>what is the capital of $country2 := ?capital2</rdf:li>
            <rdf:li>what is the distance between ?capital1 and ?capital2 := ?distance</rdf:li>
         </rdf:Seq>
      </nl:plan>
      <nl:action>display(?distance)</nl:action>
   </nl:InformationPlanningSchema>
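
Executing such a plan amounts to answering each sub-question in order and substituting earlier answers into later questions. The Python sketch below shows this under stated assumptions: `ask` is a stub standing in for the underlying QA system, and the distance answer is a symbolic placeholder rather than a real figure.

```python
import re

# Each step: (question template, variable that receives the answer).
PLAN = [
    ("what is the capital of $country1", "?capital1"),
    ("what is the capital of $country2", "?capital2"),
    ("what is the distance between ?capital1 and ?capital2", "?distance"),
]


def ask(question):
    """Stub for the QA system; a real system would answer from its sources."""
    known = {
        "what is the capital of India": "New Delhi",
        "what is the capital of Norway": "Oslo",
        "what is the distance between New Delhi and Oslo":
            "<distance New Delhi-Oslo>",  # placeholder, not a real value
    }
    return known[question]


def run_plan(plan, bindings):
    """Run the steps in sequence, threading variable bindings through."""
    env = dict(bindings)
    for template, target in plan:
        # Replace $parameters and ?variables with their current values.
        question = re.sub(r"[$?]\w+", lambda m: env[m.group(0)], template)
        env[target] = ask(question)
    return env


env = run_plan(PLAN, {"$country1": "India", "$country2": "Norway"})
```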




Integrating the methods
    The three proposed methods for integrating natural language and
     RDF can be used together to afford greater flexibility.
                Annotating RDF properties is a low-cost (from a knowledge
                 engineering perspective) way of providing natural language access to
                 RDF statements.
                Information access schemata, while being more complex and
                 requiring knowledge of domain-specific ontologies, give experienced
                 knowledge engineers fine-grained tools for manipulating RDF and
                 controlling the output.
                Information planning schemata allow users to describe, in natural
                 language itself, how they would go about answering a particular class
                 of questions.


    These three methods can combine to provide the foundation for
     question answering on the Semantic Web.



Thank You !




March 17, 2009      Akerkar: Sogndal Lecture   42

More Related Content

Similar to Relationship between the Semantic Web and NLP (20)

PPTX
ESWC 2015 Closing and "General Chair's minute of Madness"
Fabien Gandon
 
ODP
RDF and the Semantic Web -- Joanna Pszenicyn
Richard.Sapon-White
 
PDF
Chapter 1 semantic web
R A Akerkar
 
PPT
GATE, HLT and Machine Learning, Sheffield, July 2003
butest
 
PPTX
Ontology mapping for the semantic web
Worawith Sangkatip
 
PPT
Multilingual Access to Cultural Heritage Content on the Semantic Web - Acl2013
Mariana Damova, Ph.D
 
PDF
The Mystery of Natural Language Processing
Mahmood Aijazi, MD
 
PDF
9. the semantic grid and autonomic grid
Dr Sandeep Kumar Poonia
 
PPTX
Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Nikolaos Konstantinou
 
PDF
G9_Cognition-Knowledge Rep and Reas_H6
Marta de la Cruz Martos
 
PPT
The Semantic Web: status and prospects
Guus Schreiber
 
PDF
Application Semantics via Rules in Open Vocabulary English
Adrian Walker
 
PPTX
Lecture semantic dataaccess_presentation
IKS - Project
 
PDF
Ontology development
Stefano Bragaglia
 
ODP
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
The Research Council of Norway, IKTPLUSS
 
PDF
Semtech2006
Adrian Walker
 
PDF
BrainSpa Paper
adina toderas
 
PDF
Semantic Markup
R A Akerkar
 
PPT
Corrib.org - OpenSource and Research
adameq
 
PPT
Neno/Fhat: Semantic Network Programming Language and Virtual Machine Specific...
Marko Rodriguez
 
ESWC 2015 Closing and "General Chair's minute of Madness"
Fabien Gandon
 
RDF and the Semantic Web -- Joanna Pszenicyn
Richard.Sapon-White
 
Chapter 1 semantic web
R A Akerkar
 
GATE, HLT and Machine Learning, Sheffield, July 2003
butest
 
Ontology mapping for the semantic web
Worawith Sangkatip
 
Multilingual Access to Cultural Heritage Content on the Semantic Web - Acl2013
Mariana Damova, Ph.D
 
The Mystery of Natural Language Processing
Mahmood Aijazi, MD
 
9. the semantic grid and autonomic grid
Dr Sandeep Kumar Poonia
 
Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Nikolaos Konstantinou
 
G9_Cognition-Knowledge Rep and Reas_H6
Marta de la Cruz Martos
 
The Semantic Web: status and prospects
Guus Schreiber
 
Application Semantics via Rules in Open Vocabulary English
Adrian Walker
 
Lecture semantic dataaccess_presentation
IKS - Project
 
Ontology development
Stefano Bragaglia
 
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
The Research Council of Norway, IKTPLUSS
 
Semtech2006
Adrian Walker
 
BrainSpa Paper
adina toderas
 
Semantic Markup
R A Akerkar
 
Corrib.org - OpenSource and Research
adameq
 
Neno/Fhat: Semantic Network Programming Language and Virtual Machine Specific...
Marko Rodriguez
 

More from R A Akerkar (20)

PDF
Rajendraakerkar lemoproject
R A Akerkar
 
PDF
Big Data and Harvesting Data from Social Media
R A Akerkar
 
PDF
Can You Really Make Best Use of Big Data?
R A Akerkar
 
PDF
Big data in Business Innovation
R A Akerkar
 
PDF
What is Big Data ?
R A Akerkar
 
PDF
Connecting and Exploiting Big Data
R A Akerkar
 
PDF
Linked open data
R A Akerkar
 
PDF
Semi structure data extraction
R A Akerkar
 
PDF
Big data: analyzing large data sets
R A Akerkar
 
PDF
Description logics
R A Akerkar
 
PDF
Data Mining
R A Akerkar
 
PDF
Link analysis
R A Akerkar
 
PDF
artificial intelligence
R A Akerkar
 
PDF
Case Based Reasoning
R A Akerkar
 
PDF
Intelligent natural language system
R A Akerkar
 
PDF
Data mining
R A Akerkar
 
PDF
Knowledge Organization Systems
R A Akerkar
 
PDF
Rational Unified Process for User Interface Design
R A Akerkar
 
PDF
Unified Modelling Language
R A Akerkar
 
PDF
Statistical Preliminaries
R A Akerkar
 
Rajendraakerkar lemoproject
R A Akerkar
 
Big Data and Harvesting Data from Social Media
R A Akerkar
 
Can You Really Make Best Use of Big Data?
R A Akerkar
 
Big data in Business Innovation
R A Akerkar
 
What is Big Data ?
R A Akerkar
 
Connecting and Exploiting Big Data
R A Akerkar
 
Linked open data
R A Akerkar
 
Semi structure data extraction
R A Akerkar
 
Big data: analyzing large data sets
R A Akerkar
 
Description logics
R A Akerkar
 
Data Mining
R A Akerkar
 
Link analysis
R A Akerkar
 
artificial intelligence
R A Akerkar
 
Case Based Reasoning
R A Akerkar
 
Intelligent natural language system
R A Akerkar
 
Data mining
R A Akerkar
 
Knowledge Organization Systems
R A Akerkar
 
Rational Unified Process for User Interface Design
R A Akerkar
 
Unified Modelling Language
R A Akerkar
 
Statistical Preliminaries
R A Akerkar
 
Ad

Relationship between the Semantic Web and NLP

  • 1. Relationship between the Semantic Web and NLP Rajendra Akerkar Technomathematics Research Foundation, Kolhapur, India March 17, 2009 Akerkar: Sogndal Lecture 1
  • 2. Structure of this talk
      • Relationship between NLP and SW
      • Inspiration: QA system and Haystack
      • RDF Schema & NL Annotations
      • Information Access Schemata
      • Information Planning Schemata
      • Integration
      • Conclusion
  • 3. The sense of the relationship
      • Could the Semantic Web enhance the technical level of NLP technologies?
      • Could NLP technologies help in delivering and using a better Semantic Web?
  • 4. Purpose of the Semantic Web
      • To help users locate, organize, and process information.
      • Belief: it should be grounded in the information access method humans are comfortable with: natural language.
  • 5. Why natural language?
      • It is intuitive,
      • easy to use and rapidly deployable, and
      • requires no specialized training.
  • 6. Vision
      • The Semantic Web equally accessible by computers using specialized languages and interchange formats, and humans using natural language.
          • Ask a computer: “When was the king of Norway born?”
          • “What’s the cheapest flight to Mumbai this month?”
      • Retrieve “exact information”.
  • 7. What synergistic opportunities exist between natural language technology and the Semantic Web?
      • State-of-the-art NL systems are capable of providing users intuitive access to a wealth of textual data using ordinary language.
      • However, such systems are often hampered by
          • the knowledge engineering bottleneck (writing wrappers, integrating new data sources),
          • knowledge integration (from multiple sources), and
          • time-consuming manual work.
      • Here the Semantic Web comes in …
  • 8. Semantic Web research
      • Constructing, integrating, packaging, and exporting segments of knowledge to be usable by the entire world.
      • NL technology can tap into this knowledge framework.
      • In return, it provides natural language information access for the Semantic Web.
  • 9. SW: What is missing?
      • Where in the loop is the human?
      • How will we communicate with our software agents?
      • How will we access information on the Semantic Web?
      Obviously, we cannot expect ordinary Semantic Web users to manually manipulate ontologies, query with formal logic expressions, etc. We would like to communicate with software agents in natural language…
      • What is the role of natural language in the Semantic Web?
  • 10. Mechanism for integrating NL into the RDF
      • Augmenting RDF property definitions
      • Creating Information Access Schemata
          • To bridge the gap between NL & RDF
      • Extension to mirror human question answering behaviour in the form of NL query plans.
  • 11. Inspiration
      • Question-Answering (QA) System
      • Haystack System
          • End-user Semantic Web platform
          • Aggregates all of a user’s information into a unified repository.
  • 12. QA system
      • The use of metadata is a common technique for rendering information fragments more amenable to processing by computer systems.
      • Our approach
          • natural language itself as metadata
          • numerous advantages and opportunities:
          • preserves human readability and
          • encourages non-expert users to engage in metadata creation.
  • 13. QA system
      • Natural language annotations
          • machine-parsable sentences and phrases that describe the content of various information segments.
          • annotations serve as metadata
          • describe the kinds of questions a particular piece of knowledge is capable of answering.
      • Contains natural language annotation technology
  • 14. QA system
      • “For pioneering contributions to the theory and practice of optimizing compiler techniques that laid the foundation for modern optimizing compilers and automatic parallel execution.” Frances E. Allen was selected for the Turing Award 2006.
      • Annotation:
          • Frances E. Allen is selected for the Turing Award in 2006.
          • 2006 Turing Award
  • 15. QA system
      • The annotations allow the system to answer:
          • What award did Allen receive in 2006?
          • Who was selected for the Turing Award in 2006?
          • To whom was the Turing Award given in 2006?
  • 16. QA system
      • Feature of natural language annotations
          • any information segment can be annotated:
          • not only text, but also images, multimedia …
      • To provide uniform access to semi-structured resources on the Web
          • a virtual database system
          • integrates Web sources under a single query interface.
  • 17. Haystack
      • Aggregates a user’s information into a unified repository.
          • e-mail, documents, calendar, and web pages.
      • It is presented using RDF
          • makes it easy for agents to access, filter, and process this information in an automated fashion.
  • 18. Haystack
      • “Present Tim the letter from the secretary I met with last Tuesday from TMRF.”
      • Current IT allows us to store all the information needed to answer the query
      • It is scattered amongst multiple systems
      • An agent needs to communicate with
          • Email client
          • Calendar
          • File system
          • Directory server
  • 19. Haystack
      • Reduce the protocol barriers to information by standardizing on RDF as a common model for information:
          • agents are free to mine the semantics of a user’s various data sources
      • End-user application for managing information
          • serves as a powerful platform for experimenting with various information retrieval and user interface research problems
  • 20. QA System & Haystack
      • By incorporating natural language search capabilities into Haystack
      • Demonstrate
          • the usefulness of natural language search
          • its applicability to the Semantic Web
  • 21. To endow Haystack with the ability to answer
          • What is the state bird of India?
          • Tell me what the vision statement of TMRF is.
          • Do you know Sogndal’s population?
      • Easy on the Web
      • But for this data to be usable by any Semantic Web system, it must be restructured in terms of the RDF model.
  • 22. Adenine
      • Haystack’s programming language, designed to facilitate frequent manipulation of RDF data.
      • Features of Lisp, Python, and Notation3.
      • Basic data unit is the RDF triple.
  • 23. Adenine: the :State class and the :bird property

      @prefix dc: <http://purl.org/dc/elements/1.1/>
      @prefix : <www.tourindia.com/data#>

      add { :State
        rdf:type rdfs:Class ;
        rdfs:label "State"
      }

      add { :bird
        rdf:type rdf:Property ;
        rdfs:label "State bird" ;
        rdfs:domain :State
      }

      # ... more property declarations

      add { :india
        rdf:type :State ;
        dc:title "India" ;
        :bird "Peacock" ;
        :flower "Lotus" ;
        :population "1,147,995,904"
        # ... more information about India and its states
      }

      • Triples are enclosed in curly braces { } and expressed in subject-predicate-object order.
      • A semicolon denotes that the next predicate-object pair is to assume the last used subject.
      • RDF literals are written as strings in double quotes.
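The triples on this slide can be mirrored outside Adenine. The following is a minimal sketch, assuming a plain in-memory set of (subject, predicate, object) tuples; the `TripleStore` class and its `match` method are illustrative names chosen here, not part of Haystack or Adenine.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TripleStore:
    """Hypothetical in-memory store; None in a pattern acts as a wildcard."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self.triples.add((subject, predicate, obj))

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> List[Triple]:
        # Return all triples consistent with the given pattern, sorted for stable output.
        return sorted(
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


store = TripleStore()
# Same statements as the slide, one call per subject-predicate-object triple.
store.add(":State", "rdf:type", "rdfs:Class")
store.add(":State", "rdfs:label", "State")
store.add(":bird", "rdf:type", "rdf:Property")
store.add(":bird", "rdfs:label", "State bird")
store.add(":bird", "rdfs:domain", ":State")
store.add(":india", "rdf:type", ":State")
store.add(":india", "dc:title", "India")
store.add(":india", ":bird", "Peacock")
store.add(":india", ":flower", "Lotus")
store.add(":india", ":population", "1,147,995,904")

print(store.match(s=":india", p=":bird"))
```

The semicolon shorthand in Adenine has no counterpart here: each `add` call repeats the subject explicitly.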
  • 24. Adenine unique feature
      • Every Adenine instruction is encoded as a node in the RDF graph, and a sequence of instructions is expressed by adenine:next arcs between these instruction nodes.
      • As a result, data and procedures can be embedded within the same RDF graph and can be distributed together.
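To make the idea of instructions living in the same graph as data concrete, here is a small sketch. It assumes a list of triples in which instruction nodes are chained by `adenine:next` arcs, as the slide describes; the `ex:op` predicate, the node names, and the operation labels are hypothetical and do not reflect Adenine's real encoding.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# One graph holding both a data triple and a chain of instruction nodes.
GRAPH: List[Triple] = [
    (":india", ":bird", "Peacock"),           # ordinary data
    (":step1", "ex:op", "load"),              # hypothetical instruction payloads
    (":step1", "adenine:next", ":step2"),
    (":step2", "ex:op", "filter"),
    (":step2", "adenine:next", ":step3"),
    (":step3", "ex:op", "return"),
]


def instruction_sequence(graph: List[Triple], first: str) -> List[str]:
    """Follow adenine:next arcs from `first`, collecting each node's operation."""
    ops: List[str] = []
    seen = set()
    node: Optional[str] = first
    while node is not None and node not in seen:
        seen.add(node)  # guard against accidental cycles
        ops.extend(o for s, p, o in graph if s == node and p == "ex:op")
        node = next((o for s, p, o in graph
                     if s == node and p == "adenine:next"), None)
    return ops


print(instruction_sequence(GRAPH, ":step1"))
```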
  • 25. The connection between the RDF schema and the NL annotations in a natural language schema

      @prefix nl: <http://www.tmrfindia.org/sw/projects/enlight#>

      add { :stateAttribute
        rdf:type nl:NaturalLanguageSchema ;
        # This annotation handles cases like "[state bird] of [India]"
        # and "[population] of [India]".
        nl:annotation @( :attribute "of" :state ) ;
        # Code to run to resolve state attribute
        nl:code :stateAttributeCode
      }

      add { :attribute
        rdf:type nl:Parameter ;
        nl:domain rdf:Property ;
        nl:descriptionProperty rdfs:label
      }

      add { :state
        rdf:type nl:Parameter ;
        nl:domain :State ;
        nl:descriptionProperty dc:title
      }

      # The identifier [state] will be bound to the value of the named
      # parameter :state. The identifier [attribute] will be bound to the
      # value of the named parameter :attribute.
      method :stateAttributeCode :state = state :attribute = attribute
        # Ask the system what the [attribute] property of [state] is
        return (ask %{ attribute state ?x })

      • The definition of :attribute restricts the resource representing the attribute to be queried to have type rdf:Property.
      • The rdfs:label property is used to resolve the actual literal, e.g., “State bird” or “population”.
      • :state restricts the resource to have type :State and to be resolved via dc:title.
  • 26. Question Answering
      • What is the state bird of India?
      • The system parses the question and determines that :stateAttribute is the relevant natural language schema to invoke.
      • The system extracts the natural language bindings of :attribute and :state, which are “state bird” and “India”, respectively. These are further resolved into the RDF resources :bird and :india.
      • In response to the question, the method :stateAttributeCode is invoked with named parameter :attribute bound to :bird and named parameter :state bound to :india.
      • The invoked method performs a query into Haystack’s RDF store, which returns “Peacock”, the state bird of India.
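The four steps on this slide can be sketched end to end. This is only an illustration under stated assumptions: the annotation "[attribute] of [state]" is approximated by a regular expression rather than real linguistic parsing, the triples are held in a Python set, and `resolve`, `state_attribute_code`, and `answer` are hypothetical helper names, not Haystack functions.

```python
import re
from typing import Optional

TRIPLES = {
    (":bird", "rdf:type", "rdf:Property"),
    (":bird", "rdfs:label", "State bird"),
    (":population", "rdf:type", "rdf:Property"),
    (":population", "rdfs:label", "Population"),
    (":india", "rdf:type", ":State"),
    (":india", "dc:title", "India"),
    (":india", ":bird", "Peacock"),
    (":india", ":population", "1,147,995,904"),
}

# Stand-in for the annotation @( :attribute "of" :state ); the optional
# "what is the" prefix is an assumption about acceptable phrasings.
ANNOTATION = re.compile(
    r"(?:what is )?(?:the )?(?P<attribute>.+?) of (?P<state>.+?)\??$",
    re.IGNORECASE,
)


def resolve(text: str, required_type: str, description_property: str) -> Optional[str]:
    """Map a phrase to the resource of the required type whose description literal matches."""
    for s, p, o in TRIPLES:
        if p == description_property and o.lower() == text.lower():
            if (s, "rdf:type", required_type) in TRIPLES:
                return s
    return None


def state_attribute_code(state: str, attribute: str) -> Optional[str]:
    """Counterpart of :stateAttributeCode: look up the attribute value for the state."""
    for s, p, o in TRIPLES:
        if s == state and p == attribute:
            return o
    return None


def answer(question: str) -> Optional[str]:
    m = ANNOTATION.match(question.strip())
    if m is None:
        return None
    attribute = resolve(m.group("attribute"), "rdf:Property", "rdfs:label")
    state = resolve(m.group("state"), ":State", "dc:title")
    if attribute is None or state is None:
        return None
    return state_attribute_code(state, attribute)


print(answer("What is the state bird of India?"))
```

The parameter definitions from the schema map onto the arguments of `resolve`: `nl:domain` becomes `required_type` and `nl:descriptionProperty` becomes `description_property`.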
  • 27. User query is parsed by QA System
      • So, a single natural language annotation is capable of answering a question.
      • QA system is capable of normalizing different methods for requesting the same information:
          • imperative (“Tell me...”),
          • interrogative (“What is...”).
  • 28. Natural language schema

      add { :stateAttribute
        rdf:type nl:NaturalLanguageSchema ;
        nl:annotation @( :state " has the largest " :comparisonAttribute ) ;
        nl:code :maxComparisonAttributeCode
      }

      method :maxComparisonAttributeCode :comparisonAttribute = attribute
        return (ask %{
          rdf:type ?x :State ,
          adenine:argMax ?x ?y 1 xsd:int %{
            :attribute ?x ?y
          }
        } @(?x))

      • The method invoked by the NLS queries the RDF store for the resource of type :State that contains the maximal integer value for the property given by :comparisonAttribute.
      • This allows our system to answer the following questions:
          • Which state has the largest population?
          • Do you know what state has the largest area?
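The argMax query can be approximated in a few lines. The sketch below assumes integer literals stored as comma-grouped strings (as in the earlier :population example); the state names and every number are placeholders invented for illustration, not real statistics, and `arg_max` is a hypothetical helper rather than the adenine:argMax operator itself.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Placeholder data: hypothetical resources with made-up values.
TRIPLES: List[Triple] = [
    (":stateA", "rdf:type", ":State"),
    (":stateA", ":population", "1,200"),
    (":stateA", ":area", "50"),
    (":stateB", "rdf:type", ":State"),
    (":stateB", ":population", "900"),
    (":stateB", ":area", "75"),
    (":cityC", "rdf:type", ":City"),      # not a :State, so it must be ignored
    (":cityC", ":population", "5,000"),
]


def arg_max(attribute: str, required_type: str = ":State") -> Optional[str]:
    """Return the resource of `required_type` with the largest integer value for `attribute`."""
    typed = {s for s, p, o in TRIPLES if p == "rdf:type" and o == required_type}
    candidates = [
        (int(o.replace(",", "")), s)       # strip thousands separators before comparing
        for s, p, o in TRIPLES
        if p == attribute and s in typed
    ]
    if not candidates:
        return None
    return max(candidates)[1]


print(arg_max(":population"))
print(arg_max(":area"))
```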
  • 29. QA System
      • Built a prototype implementing the natural language schemata.
      • Limited in the types of questions that it can answer and in its domain.
      • However, it is a proof of concept that demonstrates a method of marrying natural language with the Semantic Web.
  • 30. Further integrating natural language technology with the Semantic Web
      • RDF triples ≈ System’s ternary expression representation of NL.
      • Clipping natural language annotations directly into rdf:Property definitions.
      • Consider a piece of an ontology modeling an address book entry in Haystack:
  • 31. A natural language-aware software agent could answer questions…

      add { :Person
        rdf:type rdfs:Class
      }

      add { :homeAddress
        rdf:type rdf:Property ;
        rdfs:domain :Person ;
        rdfs:range xsd:string ;
        nl:annotation @( nl:subject " lives at " nl:object ) ;
        nl:annotation @( nl:subject "’s home address is " nl:object ) ;
        nl:annotation @( nl:subject "’s bungalow" ) ;
        nl:generation @( nl:subject "’s home address is " nl:object )
      }

      • The :homeAddress is a property specifying a user’s home address.
      • The annotation expresses this connection concretely in natural language, via the nl:annotation property.
      • The phrase “nl:subject lives at nl:object” is linked to every RDF statement involving the :homeAddress property, where nl:subject is shorthand for indicating the subject (domain) of the relation, and nl:object is shorthand for the object (range) of the relation.
  • 32. ‘Make sense’ with minimal cost!
      • The nl:generation property specifies a natural language version of the knowledge.
          • allows software agents to present meaningful, natural responses to users.
      • Question: Where does Ram live?
      • Reply: Ram’s home address is Tellefsens gate 5.
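A generation template of this kind amounts to filling two slots. The sketch below is an assumption-laden illustration: the template is represented as a tuple of tokens mirroring the `@( nl:subject "..." nl:object )` form, and `generate` is a hypothetical function name, not part of any real system.

```python
from typing import Tuple

# Token sequence standing in for: nl:generation @( nl:subject "'s home address is " nl:object )
HOME_ADDRESS_GENERATION: Tuple[str, ...] = (
    "nl:subject", "'s home address is ", "nl:object",
)


def generate(template: Tuple[str, ...], subject: str, obj: str) -> str:
    """Replace the nl:subject and nl:object placeholders; copy literal tokens unchanged."""
    parts = []
    for token in template:
        if token == "nl:subject":
            parts.append(subject)
        elif token == "nl:object":
            parts.append(obj)
        else:
            parts.append(token)
    return "".join(parts)


# The RDF statement (:ram :homeAddress "Tellefsens gate 5") rendered for the user.
print(generate(HOME_ADDRESS_GENERATION, "Ram", "Tellefsens gate 5"))
```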
  • 33. Information Access Schemata
      • Despite the simplicity of adding NL annotations to RDF properties
          • Significant restriction: only one RDF statement can be queried at once.
      • Solution: create schemata that capture similar patterns of information access.
  • 34. An information access schema is a quadruple
      • Annotations: NL sentences (either declarative or interrogative) or phrases that describe the types of user questions the schema can answer.
      • Pattern: a declarative pattern of RDF triples that references a pre-existing ontology.
      • Action: a set of operators to further process variables bound during the pattern matching process.
      • Mapping: mechanism for handling disjunction between lexical and ontological terms.
  • 35. Example: “family” of questions
      • What is the country in Asia with the largest area?
      • Tell me what Asian country has the highest population density.
      • What country in Europe has the lowest infant mortality rate?
      • What is the most populated American country?
  • 36. Capture the “pattern” of information requests in an information access schema

      <nl:InformationAccessSchema>
        <nl:ann>what country in $region has the largest $attribute</nl:ann>
        <nl:pattern>?x a :Country</nl:pattern>
        <nl:pattern>?x map($attribute) ?val</nl:pattern>
        <nl:pattern>?x :location $region</nl:pattern>
        <nl:action>display(boundto(?x, max(?val)))</nl:action>
        <nl:mapping>
          <nl:hash variable="$attribute">
            <nl:map value="population">:population</nl:map>
            <nl:map value="area">:area</nl:map>
            ...
          </nl:hash>
        </nl:mapping>
      </nl:InformationAccessSchema>

      • Natural language annotations are employed to describe a pattern of RDF statements.
      • Because annotations would be processed by linguistically sophisticated systems, different adjectives such as “highest” and “largest” could be uniformly mapped onto the maximum operation.
      • The schema answers questions that involve region-specific superlative comparison of countries.
  • 37. The pattern binds to the value of the particular attribute for countries within the queried geographic region, and the action specifies an aggregate operation (maximum) over the values bound within the pattern.
      • The country corresponding to that maximum value is returned as the answer.
      • The mapping provides a translation from language attributes to RDF properties.
      • Information access schemata are written with respect to a particular pre-existing ontology;
      • In this example, we assume that an appropriate ontology has been established (i.e., :Country is defined as a class, and :location is defined as a property).
      • In this vision of the Semantic Web, information access schemata grounded in natural language would co-exist alongside RDF metadata.
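The interplay of annotation, pattern, action, and mapping described here can be sketched as follows. Everything concrete in the block is an assumption made for illustration: the regular expression stands in for the linguistically sophisticated matching the slides assume, the country resources and their numbers are invented placeholders, and region phrases are mapped to resources simply by lower-casing.

```python
import re
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Placeholder ontology instance data (hypothetical countries, made-up values).
TRIPLES: List[Triple] = [
    (":countryA", "a", ":Country"), (":countryA", ":location", ":asia"),
    (":countryA", ":population", "100"), (":countryA", ":area", "9"),
    (":countryB", "a", ":Country"), (":countryB", ":location", ":asia"),
    (":countryB", ":population", "50"), (":countryB", ":area", "20"),
    (":countryC", "a", ":Country"), (":countryC", ":location", ":europe"),
    (":countryC", ":population", "500"), (":countryC", ":area", "1"),
]

# Mapping: lexical attribute -> RDF property, as in the <nl:mapping> element.
MAPPING = {"population": ":population", "area": ":area"}

# Annotation: "highest" and "largest" both map onto the maximum operation.
ANNOTATION = re.compile(
    r"what country in (?P<region>\w+) has the (?:largest|highest) (?P<attribute>\w+)\??",
    re.IGNORECASE,
)


def answer(question: str) -> Optional[str]:
    m = ANNOTATION.fullmatch(question.strip())
    if m is None:
        return None
    prop = MAPPING.get(m.group("attribute").lower())
    if prop is None:
        return None
    region = ":" + m.group("region").lower()   # naive phrase-to-resource step (assumption)
    # Pattern: ?x a :Country ; ?x :location $region ; ?x map($attribute) ?val
    countries = {s for s, p, o in TRIPLES if p == "a" and o == ":Country"}
    in_region = {s for s, p, o in TRIPLES if p == ":location" and o == region}
    values = {s: int(o) for s, p, o in TRIPLES
              if p == prop and s in countries and s in in_region}
    # Action: return the ?x bound to max(?val).
    return max(values, key=values.get) if values else None


print(answer("What country in Asia has the largest population?"))
```

Unlike a single annotated property, the pattern here joins three statements per candidate country, which is the restriction the schema is meant to lift.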
  • 38. Further extension: Query Plan
      • Question: What is the distance from India to Norway?
      • Solution Plan: To compute the distance between their respective capitals.
      Could humans “teach” such plans to a computer directly?
  • 39. Information Planning Schemata
      • An extension of Information Access Schemata.
      • Simplifies the task of knowledge engineering.
      • Example:
          • Instead of writing RDF patterns,
          • which would require knowledge of domain-specific ontologies,
          • use natural language itself to describe the process of answering a question.
      • The answer plan (nl:plan) reflects the user’s thought process expressed in natural language: first find the capitals of the countries, and then find the distance between those cities.
  • 40. An information planning schema

      <nl:InformationPlanningSchema>
        <nl:ann>distance between $country1 and $country2</nl:ann>
        <nl:plan>
          <rdf:Seq>
            <rdf:li>what is the capital of $country1 := ?capital1</rdf:li>
            <rdf:li>what is the capital of $country2 := ?capital2</rdf:li>
            <rdf:li>what is the distance between ?capital1 and ?capital2 := ?distance</rdf:li>
          </rdf:Seq>
        </nl:plan>
        <nl:action>display(?distance)</nl:action>
      </nl:InformationPlanningSchema>
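A plan of this shape can be executed by answering each step in order and binding its result to the named variable. The sketch below makes several assumptions: `ask` is a hypothetical stub standing in for the underlying QA system, it knows only the two capitals needed here, and it returns a symbolic `distance(...)` string instead of a number so that no measurement is invented.

```python
import re
from typing import Dict, List

# Stub knowledge for the hypothetical sub-question answerer.
CAPITALS = {"India": "New Delhi", "Norway": "Oslo"}


def ask(question: str) -> str:
    """Stand-in for the QA system: answers only the two question forms the plan uses."""
    m = re.fullmatch(r"what is the capital of (.+)", question)
    if m:
        return CAPITALS[m.group(1)]
    m = re.fullmatch(r"what is the distance between (.+) and (.+)", question)
    if m:
        # Symbolic result; a real system would consult a data source here.
        return f"distance({m.group(1)}, {m.group(2)})"
    raise ValueError(f"cannot answer: {question}")


# The three <rdf:li> steps of the schema, in sequence.
PLAN: List[str] = [
    "what is the capital of $country1 := ?capital1",
    "what is the capital of $country2 := ?capital2",
    "what is the distance between ?capital1 and ?capital2 := ?distance",
]


def run_plan(plan: List[str], bindings: Dict[str, str]) -> Dict[str, str]:
    """Execute each step: substitute known variables, ask, bind the answer."""
    env = dict(bindings)
    for step in plan:
        question, target = (part.strip() for part in step.split(":="))
        for name, value in env.items():
            question = question.replace(name, value)
        env[target] = ask(question)
    return env


result = run_plan(PLAN, {"$country1": "India", "$country2": "Norway"})
print(result["?distance"])  # corresponds to the action display(?distance)
```

Plain string substitution is enough for this example; names containing " and " would confuse the stub's second pattern, which is one reason a real system would rely on proper parsing.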
  • 41. Integrating the methods
      • The three proposed methods for integrating natural language and RDF can be used together to afford greater flexibility.
      • Annotating RDF properties is a low-cost (from a knowledge engineering perspective) way of providing natural language access to RDF statements.
      • Information access schemata, while being more complex and requiring knowledge of domain-specific ontologies, give experienced knowledge engineers fine-grained tools for manipulating RDF and controlling the output.
      • Information planning schemata allow users to describe, in natural language itself, how they would go about answering a particular class of questions.
      • These three methods can combine to provide the foundation for question answering on the Semantic Web.
  • 42. Thank You!