Zero-Knowledge
          Query Planning
   for an Iterator Implementation of
Link Traversal Based Query Execution

                                          Olaf Hartig
                          http://olafhartig.de/foaf.rdf#olaf
                                                @olafhartig

    Database and Information Systems Research Group
                       Humboldt-Universität zu Berlin
Iterator Based Execution Plan
                                          tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX> )                     I1




                                          tp2 = ( ?p , ex:interested_in , ?b )                                       I2




                                          tp3 = ( ?b , rdf:type , <http://.../Book> )                                I3
  Query

  ?p ex:affiliated_with <http://.../orgaX>
  ?p ex:interested_in ?b
  ?b rdf:type <http://.../Book>
Olaf Hartig - Zero-Knowledge Query Planning for an Iterator Implementation of Link Traversal Based Query Execution        2
Iterator Based Execution Plan
                                          tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX> )                     I1


                                                                                       Next?
   query-local                            tp2 = ( ?p , ex:interested_in , ?b )                                       I2
    dataset



                                                                                       Next?
                                          tp3 = ( ?b , rdf:type , <http://.../Book> )                                I3


                                                                                       Next?


Iterator Based Execution Plan
                                          tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX> )                     I1


                                                                                       Next?
   query-local                            tp2 = ( ?p , ex:interested_in , ?b )                                       I2
    dataset



                                                                                       Next?
                                          tp3 = ( ?b , rdf:type , <http://.../Book> )                                I3
                                       Descriptor object


                                               :                                       Next?
     <http://.../alice> ex:affiliated_with <http://.../orgaX>
                                               :
Iterator Based Execution Plan
                                          tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX> )                     I1


                                                                                       Next?
   query-local                            tp2 = ( ?p , ex:interested_in , ?b )                                       I2
    dataset



                                                                                       Next?
                                          tp3 = ( ?b , rdf:type , <http://.../Book> )                                I3


                                               :                                       Next?
     <http://.../alice> ex:affiliated_with <http://.../orgaX>
                                               :
Iterator Based Execution Plan
                                          tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX> )                     I1


                                                                 { ?p = <http://.../alice> }

   query-local                            tp2 = ( ?p , ex:interested_in , ?b )                                       I2
    dataset



                                                                                       Next?
                                          tp3 = ( ?b , rdf:type , <http://.../Book> )                                I3


                                               :                                       Next?
     <http://.../alice> ex:affiliated_with <http://.../orgaX>
                                               :
Iterator Based Execution Plan
                                          tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX> )                     I1


                                                                 { ?p = <http://.../alice> }

   query-local                            tp2 = ( ?p , ex:interested_in , ?b )                                       I2
    dataset
                                          tp2' = ( <http://.../alice> , ex:interested_in , ?b )

                                                                                       Next?
                                          tp3 = ( ?b , rdf:type , <http://.../Book> )                                I3


                                                                                       Next?


Iterator Based Execution Plan
                                          tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX> )                     I1


                                                                 { ?p = <http://.../alice> }

   query-local                            tp2 = ( ?p , ex:interested_in , ?b )                                       I2
    dataset
                                          tp2' = ( <http://.../alice> , ex:interested_in , ?b )

                                                                                       Next?
                                          tp3 = ( ?b , rdf:type , <http://.../Book> )                                I3


                                               :                                       Next?
        <http://.../alice> ex:interested_in <http://.../b1>
                                               :
Iterator Based Execution Plan
                                          tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX> )                     I1


                                                                 { ?p = <http://.../alice> }

   query-local                            tp2 = ( ?p , ex:interested_in , ?b )                                       I2
    dataset
                                          tp2' = ( <http://.../alice> , ex:interested_in , ?b )

                                                   { ?p = <http://.../alice> , ?b = <http://.../b1> }

                                          tp3 = ( ?b , rdf:type , <http://.../Book> )                                I3


                                               :                                       Next?
        <http://.../alice> ex:interested_in <http://.../b1>
                                               :
Iterator Based Execution Plan
                                          tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX> )                     I1


                                                                 { ?p = <http://.../alice> }

   query-local                            tp2 = ( ?p , ex:interested_in , ?b )                                       I2
    dataset
                                          tp2' = ( <http://.../alice> , ex:interested_in , ?b )

                                                 { ?p = <http://.../alice> , ?b = <http://.../b1> }

                                          tp3 = ( ?b , rdf:type , <http://.../Book> )                                I3
                                          tp3' = ( <http://.../b1> , rdf:type , <http://.../Book> )

                                                                                       Next?


Iterator Based Execution Plan
                                          tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX> )                     I1


                                                                 { ?p = <http://.../alice> }

   query-local                            tp2 = ( ?p , ex:interested_in , ?b )                                       I2
    dataset
                                          tp2' = ( <http://.../alice> , ex:interested_in , ?b )

                                                 { ?p = <http://.../alice> , ?b = <http://.../b1> }

                                          tp3 = ( ?b , rdf:type , <http://.../Book> )                                I3
                                          tp3' = ( <http://.../b1> , rdf:type , <http://.../Book> )

                                                 :                                     Next?
 <http://.../Book> rdfs:subClassOf <http://.../CreativeWork>
                                                 :
Iterator Based Execution Plan
                                          tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX> )                     I1


                                                                 { ?p = <http://.../alice> }

   query-local                            tp2 = ( ?p , ex:interested_in , ?b )                                       I2
    dataset
                                          tp2' = ( <http://.../alice> , ex:interested_in , ?b )

                                                 { ?p = <http://.../alice> , ?b = <http://.../b1> }

                                          tp3 = ( ?b , rdf:type , <http://.../Book> )                                I3
                                          tp3' = ( <http://.../b1> , rdf:type , <http://.../Book> )

                                     :                                                 Next?
     <http://.../b1> rdf:type <http://.../Book>
                                     :
Iterator Based Execution Plan
                                          tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX> )                     I1


                                                                 { ?p = <http://.../alice> }

   query-local                            tp2 = ( ?p , ex:interested_in , ?b )                                       I2
    dataset
                                          tp2' = ( <http://.../alice> , ex:interested_in , ?b )

                                                 { ?p = <http://.../alice> , ?b = <http://.../b1> }

                                          tp3 = ( ?b , rdf:type , <http://.../Book> )                                I3
                                          tp3' = ( <http://.../b1> , rdf:type , <http://.../Book> )

                                                 { ?p = <http://.../alice> , ?b = <http://.../b1> }
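The walk-through above can be sketched in code. This is a minimal sketch, not the author's implementation: the toy `WEB` (which descriptor object contains which triple), the class names, and the rule that only terms written as full `http` URIs are looked up are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical toy Web (an assumption, not from the slides):
# URI -> triples in the descriptor object obtained by looking up that URI.
WEB: Dict[str, List[Triple]] = {
    "http://.../orgaX": [("http://.../alice", "ex:affiliated_with", "http://.../orgaX")],
    "http://.../alice": [("http://.../alice", "ex:interested_in", "http://.../b1")],
    "http://.../b1": [("http://.../b1", "rdf:type", "http://.../Book")],
    "http://.../Book": [("http://.../Book", "rdfs:subClassOf", "http://.../CreativeWork")],
}


class QueryLocalDataset:
    """Grows while the query runs; each URI is looked up at most once."""

    def __init__(self, web: Dict[str, List[Triple]]):
        self.web = web
        self.triples: Set[Triple] = set()
        self.seen: Set[str] = set()

    def look_up(self, term: str) -> None:
        # Simplification: only terms written as full http URIs are looked up.
        if term.startswith("http") and term not in self.seen:
            self.seen.add(term)
            self.triples.update(self.web.get(term, []))


def match(tp: Triple, triple: Triple) -> Optional[Binding]:
    """Return the bindings that make tp equal to triple, or None."""
    mu: Binding = {}
    for pattern_term, value in zip(tp, triple):
        if pattern_term.startswith("?"):
            if mu.setdefault(pattern_term, value) != value:
                return None
        elif pattern_term != value:
            return None
    return mu


class Root:
    """Feeds a single empty solution into the first iterator (I1)."""

    def __init__(self) -> None:
        self.done = False

    def next(self) -> Optional[Binding]:
        if self.done:
            return None
        self.done = True
        return {}


class TriplePatternIterator:
    def __init__(self, tp: Triple, pred, data: QueryLocalDataset):
        self.tp, self.pred, self.data = tp, pred, data
        self.pending: List[Binding] = []

    def next(self) -> Optional[Binding]:
        while not self.pending:
            mu_in = self.pred.next()              # "Next?" to the predecessor
            if mu_in is None:
                return None                       # "END!"
            bound_tp = tuple(mu_in.get(t, t) for t in self.tp)  # e.g. tp2'
            for term in bound_tp:
                self.data.look_up(term)           # traverse data links
            for triple in self.data.triples:
                mu = match(bound_tp, triple)
                if mu is not None:
                    self.pending.append({**mu_in, **mu})
        return self.pending.pop()


def execute(plan: List[Triple], web: Dict[str, List[Triple]]) -> List[Binding]:
    data = QueryLocalDataset(web)                 # initially empty
    iterator = Root()
    for tp in plan:
        iterator = TriplePatternIterator(tp, iterator, data)
    results: List[Binding] = []
    while True:
        mu = iterator.next()
        if mu is None:
            return results
        results.append(mu)


PLAN = [
    ("?p", "ex:affiliated_with", "http://.../orgaX"),
    ("?p", "ex:interested_in", "?b"),
    ("?b", "rdf:type", "http://.../Book"),
]
print(execute(PLAN, WEB))
```

On this toy Web the chain I1, I2, I3 reports the single solution { ?p = <http://.../alice> , ?b = <http://.../b1> }, matching the last step of the walk-through.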



Example Query Execution Plan
                                          tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX>)                      I1




                                          tp2 = ( ?p , ex:interested_in , ?b )                                       I2




                                          tp3 = ( ?b , rdf:type , <http://.../Book> )                                I3
  Query

  ?p ex:affiliated_with <http://.../orgaX>
  ?p ex:interested_in ?b
  ?b rdf:type <http://.../Book>
An Alternative Execution Plan
                                          tp1 = ( ?b , rdf:type , <http://.../Book> )                                I1




                                          tp2 = ( ?p , ex:interested_in , ?b )                                       I2




                                          tp3 = ( ?p , ex:affiliated_with , <http://.../orgaX>)                      I3
  Query

  ?p ex:affiliated_with <http://.../orgaX>
  ?p ex:interested_in ?b
  ?b rdf:type <http://.../Book>
An Alternative Execution Plan
                                          tp1 = ( ?b , rdf:type , <http://.../Book> )                                I1


                                                                                       Next?
   query-local                            tp2 = ( ?p , ex:interested_in , ?b )                                       I2
    dataset



                                                                                       Next?
                                          tp3 = ( ?p , ex:affiliated_with , <http://.../orgaX>)                      I3


                                                                                       Next?


An Alternative Execution Plan
                                          tp1 = ( ?b , rdf:type , <http://.../Book> )                                I1


                                                                                       Next?
   query-local                            tp2 = ( ?p , ex:interested_in , ?b )                                       I2
    dataset



                                                                                       Next?
                                          tp3 = ( ?p , ex:affiliated_with , <http://.../orgaX>)                      I3


                                                 :                                     Next?
 <http://.../Book> rdfs:subClassOf <http://.../CreativeWork>
                                                 :
An Alternative Execution Plan
                                          tp1 = ( ?b , rdf:type , <http://.../Book> )                                I1


                                                                                END!

   query-local                            tp2 = ( ?p , ex:interested_in , ?b )                                       I2
    dataset



                                                                                       Next?
                                          tp3 = ( ?p , ex:affiliated_with , <http://.../orgaX>)                      I3


                                                 :                                     Next?
 <http://.../Book> rdfs:subClassOf <http://.../CreativeWork>
                                                 :
An Alternative Execution Plan
                                          tp1 = ( ?b , rdf:type , <http://.../Book> )                                I1


                                                                                END!

   query-local                            tp2 = ( ?p , ex:interested_in , ?b )                                       I2
    dataset



                                                                                END!

                                          tp3 = ( ?p , ex:affiliated_with , <http://.../orgaX>)                      I3


                                                                                END!



An Alternative Execution Plan
                                          tp1 = ( ?b , rdf:type , <http://.../Book> )                                I1


                                                                                END!

   query-local                            tp2 = ( ?p , ex:interested_in , ?b )                                       I2
    dataset



                                                                                END!

                                          tp3 = ( ?p , ex:affiliated_with , <http://.../orgaX>)                      I3


                                                                                END!

 Number of results may depend on the order of triple patterns
 (order of triple patterns = logical query execution plan)
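The effect of the order can be reproduced with a compact recursive sketch. Everything here is an assumption for illustration: the toy `WEB`, the function name `run`, and the simplification that only terms written as full `http` URIs are looked up.

```python
from typing import Dict, Iterator, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy Web (an assumption): URI -> triples of its descriptor object.
WEB: Dict[str, List[Triple]] = {
    "http://.../orgaX": [("http://.../alice", "ex:affiliated_with", "http://.../orgaX")],
    "http://.../alice": [("http://.../alice", "ex:interested_in", "http://.../b1")],
    "http://.../b1": [("http://.../b1", "rdf:type", "http://.../Book")],
    "http://.../Book": [("http://.../Book", "rdfs:subClassOf", "http://.../CreativeWork")],
}


def run(plan: List[Triple], web: Dict[str, List[Triple]]) -> List[Dict[str, str]]:
    dataset, seen = set(), set()      # query-local dataset, initially empty

    def solutions(i: int, mu: Dict[str, str]) -> Iterator[Dict[str, str]]:
        if i == len(plan):
            yield mu
            return
        tp = tuple(mu.get(t, t) for t in plan[i])    # substitute bound variables
        for term in tp:
            if term.startswith("http") and term not in seen:
                seen.add(term)
                dataset.update(web.get(term, ()))    # retrieve descriptor object
        for triple in list(dataset):  # snapshot: later additions are not revisited
            new = dict(mu)
            ok = True
            for p, v in zip(tp, triple):
                ok = (new.setdefault(p, v) == v) if p.startswith("?") else (p == v)
                if not ok:
                    break
            if ok:
                yield from solutions(i + 1, new)

    return list(solutions(0, {}))


AFFILIATED = ("?p", "ex:affiliated_with", "http://.../orgaX")
INTERESTED = ("?p", "ex:interested_in", "?b")
IS_BOOK = ("?b", "rdf:type", "http://.../Book")

first_plan = run([AFFILIATED, INTERESTED, IS_BOOK], WEB)
alternative_plan = run([IS_BOOK, INTERESTED, AFFILIATED], WEB)
print(len(first_plan), len(alternative_plan))
```

With this toy Web the first order finds one solution, while the alternative order finds none: looking up <http://.../Book> only yields the rdfs:subClassOf triple, so the first iterator ends immediately.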
Query Plan Selection
 ●   Assessment criteria:
     ●   Cost (query execution time)
     ●   Benefit (number of results)
 ●   Cost and benefit must be estimated without plan execution
 ●   Estimation impossible due to “zero knowledge”
 ●   Heuristic Based Plan Selection
     ●   DEPENDENCY RESPECT RULE
     ●   SEED TP RULE
     ●   NO VOCAB SEED RULE
     ●   FILTERING TP RULE

SEED TP RULE

                       Use a plan with a seed triple pattern

 ●   Potential seed triple pattern
       … is a triple pattern that contains at least one HTTP URI
 ●   Seed triple pattern of a plan
       … is the first triple pattern in the plan and
       … is a potential seed triple pattern

 ●   Rationale: good starting point

  Query

  ?p ex:affiliated_with <http://.../orgaX> √
  ?p ex:interested_in ?b √
  ?b rdf:type <http://.../Book> √
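A minimal sketch of the two definitions on this slide. It assumes a simple term encoding that is not part of the slides: variables start with `?`, literals start with a double quote, and every other term (full URI or prefixed name) stands for an HTTP URI.

```python
from typing import Sequence, Tuple

Triple = Tuple[str, str, str]


def is_uri(term: str) -> bool:
    # Assumed encoding: "?x" is a variable, '"..."' a literal, anything else a URI.
    return not term.startswith("?") and not term.startswith('"')


def is_potential_seed(tp: Triple) -> bool:
    """A potential seed triple pattern contains at least one HTTP URI."""
    return any(is_uri(term) for term in tp)


def has_seed(plan: Sequence[Triple]) -> bool:
    """A plan has a seed triple pattern if its first pattern is a potential seed."""
    return is_potential_seed(plan[0])


QUERY = [
    ("?p", "ex:affiliated_with", "http://.../orgaX"),
    ("?p", "ex:interested_in", "?b"),
    ("?b", "rdf:type", "http://.../Book"),
]
print([is_potential_seed(tp) for tp in QUERY])
```

Under this encoding all three patterns of the example query qualify, in line with the three check marks above, because even the predicate alone is a URI.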
NO VOCAB SEED RULE

         Avoid a seed triple pattern with vocabulary terms

 ●   The seed triple pattern should not contain only vocabulary term URIs
 ●   Patterns to avoid:                       ?s ex:any_property ?o
                                              ?s rdf:type ex:any_class
 ●   Rationale: URIs for vocabulary terms usually resolve to
                vocabulary definitions with little instance data
  Query

  ?p ex:affiliated_with <http://.../orgaX> √
  ?p ex:interested_in ?b
  ?b rdf:type <http://.../Book>
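One way to approximate this rule in code is sketched below. The classification is an assumption, since the slides do not define how vocabulary terms are recognised: any URI in predicate position and the object of an `rdf:type` pattern are treated as vocabulary terms, everything else as instance data.

```python
from typing import Tuple

Triple = Tuple[str, str, str]


def is_uri(term: str) -> bool:
    # Assumed encoding: "?x" is a variable, '"..."' a literal, anything else a URI.
    return not term.startswith("?") and not term.startswith('"')


def only_vocabulary_uris(tp: Triple) -> bool:
    """True if every URI in tp is (heuristically) a vocabulary term."""
    s, p, o = tp
    has_instance_uri = is_uri(s) or (is_uri(o) and p != "rdf:type")
    return not has_instance_uri


def is_acceptable_seed(tp: Triple) -> bool:
    contains_uri = any(is_uri(term) for term in tp)
    return contains_uri and not only_vocabulary_uris(tp)


QUERY = [
    ("?p", "ex:affiliated_with", "http://.../orgaX"),
    ("?p", "ex:interested_in", "?b"),
    ("?b", "rdf:type", "http://.../Book"),
]
print([is_acceptable_seed(tp) for tp in QUERY])
```

For the example query only the first pattern remains acceptable as a seed, which agrees with the single check mark on this slide.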
FILTERING TP RULE

           Use a plan where all filtering triple patterns are
            as close to the seed triple pattern as possible

 ●   Filtering triple pattern: each variable already occurs in one
                               of the preceding triple patterns
 ●   For each result consumed as input, a filtering TP can
     only report 1 or 0 results as output
 ●   Rationale: Reduce cost

                                          tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX> )                     I1

                                                                 { ?p = <http://.../alice> }

                                          tp2 = ( ?p , ex:interested_in , ?b )                                       I2
                                          tp2' = ( <http://.../alice> , ex:interested_in , ?b )

                                                 { ?p = <http://.../alice> , ?b = <http://.../b1> }

                                          tp3 = ( ?b , rdf:type , <http://.../Book> )                                I3
                                          tp3' = ( <http://.../b1> , rdf:type , <http://.../Book> )
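The definition of a filtering triple pattern translates into a short check over an ordered plan. This is a sketch with a hypothetical helper name (`filtering_positions`); positions are 1-based, following the I1, I2, I3 numbering of the iterators.

```python
from typing import List, Sequence, Set, Tuple

Triple = Tuple[str, str, str]


def variables(tp: Triple) -> Set[str]:
    return {term for term in tp if term.startswith("?")}


def filtering_positions(plan: Sequence[Triple]) -> List[int]:
    """1-based positions of patterns whose variables all occur earlier in the plan."""
    bound: Set[str] = set()
    positions: List[int] = []
    for position, tp in enumerate(plan, start=1):
        if variables(tp) <= bound:   # every variable is already bound
            positions.append(position)
        bound |= variables(tp)
    return positions


PLAN = [
    ("?p", "ex:affiliated_with", "http://.../orgaX"),
    ("?p", "ex:interested_in", "?b"),
    ("?b", "rdf:type", "http://.../Book"),
]
print(filtering_positions(PLAN))
```

In the plan above only tp3 is a filtering triple pattern: once ?b is bound, tp3' contains no variables, so it can only confirm or reject the incoming solution.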
Evaluation Procedure
 ●   Generate all possible plans
 ●   Execute each plan:
     ●   5 runs (+ 1 initial warm-up run)
     ●   Use an initially empty query-local dataset for each run
 ●   Measure for each plan:
     ●   Avg. execution time
     ●   Avg. number of descriptor objects retrieved during execution
     ●   Avg. number of query results
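The procedure can be outlined as a small measurement harness. This is a sketch only: `execute` is a hypothetical callable that runs one plan on an initially empty query-local dataset and returns the number of query results and the number of descriptor objects retrieved, and `fake_execute` is a stand-in so that the sketch runs without a real engine.

```python
import itertools
import statistics
import time
from typing import Callable, Dict, Sequence, Tuple

Triple = Tuple[str, str, str]
Plan = Tuple[Triple, ...]
# Hypothetical interface: returns (number of results, descriptor objects retrieved).
Execute = Callable[[Plan], Tuple[int, int]]


def evaluate_all_plans(bgp: Sequence[Triple], execute: Execute,
                       runs: int = 5) -> Dict[Plan, Dict[str, float]]:
    report: Dict[Plan, Dict[str, float]] = {}
    for plan in itertools.permutations(bgp):       # generate all possible plans
        execute(plan)                              # initial warm-up run, not measured
        times, results, retrieved = [], [], []
        for _ in range(runs):
            start = time.perf_counter()
            n_results, n_retrieved = execute(plan)
            times.append(time.perf_counter() - start)
            results.append(n_results)
            retrieved.append(n_retrieved)
        report[plan] = {
            "avg_time": statistics.mean(times),
            "avg_retrieved": statistics.mean(retrieved),
            "avg_results": statistics.mean(results),
        }
    return report


def fake_execute(plan: Plan) -> Tuple[int, int]:
    # Stand-in for a real engine (assumption): invented numbers, no Web access.
    return (1 if plan[0][2] == "http://.../orgaX" else 0, len(plan))


BGP = [
    ("?p", "ex:affiliated_with", "http://.../orgaX"),
    ("?p", "ex:interested_in", "?b"),
    ("?b", "rdf:type", "http://.../Book"),
]
report = evaluate_all_plans(BGP, fake_execute)
print(len(report))
```

A BGP with n triple patterns has n! orderings, so exhaustive evaluation of this kind is only practical for small queries.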




Evaluation Query (Example)

 SELECT ?spec ?genus WHERE {
   geospecies:4qyn7 gs:inFamily ?fam .
   ?fam skos:narrowerTransitive ?spec .
   ?spec skos:closeMatch ?sp2 .
   ?sp2 rdfs:subClassOf ?genus .
   ?spec gs:isExpectedIn ?loc .
   geospecies:4qyn7 gs:isExpectedIn ?loc .
   ?loc rdf:type gs:State . }

 Of what genus are the species that are
 ●   classified in the same family as the American Badger,
 ●   and expected in the same states as the American Badger?

 ●   2 potential seed triple patterns that
     satisfy our NO VOCAB SEED RULE
 ●   56 different plans, each containing
     2 filtering triple patterns

 Picture source: Wikipedia
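The two counts on this slide can be checked by brute force under one reading of the rules. The sketch assumes that the 56 plans are the orderings that are dependency respecting and start with a seed triple pattern accepted by the NO VOCAB SEED RULE, and it uses a heuristic of its own for vocabulary terms (predicates and the object of `rdf:type`); neither assumption is stated explicitly in the slides.

```python
from itertools import permutations
from typing import Sequence, Set, Tuple

Triple = Tuple[str, str, str]

BGP = [
    ("geospecies:4qyn7", "gs:inFamily", "?fam"),
    ("?fam", "skos:narrowerTransitive", "?spec"),
    ("?spec", "skos:closeMatch", "?sp2"),
    ("?sp2", "rdfs:subClassOf", "?genus"),
    ("?spec", "gs:isExpectedIn", "?loc"),
    ("geospecies:4qyn7", "gs:isExpectedIn", "?loc"),
    ("?loc", "rdf:type", "gs:State"),
]


def variables(tp: Triple) -> Set[str]:
    return {term for term in tp if term.startswith("?")}


def acceptable_seed(tp: Triple) -> bool:
    # Assumed heuristic: a URI in subject position, or in object position of a
    # non-rdf:type pattern, counts as instance data rather than vocabulary.
    s, p, o = tp
    return not s.startswith("?") or (not o.startswith("?") and p != "rdf:type")


def dependency_respecting(plan: Sequence[Triple]) -> bool:
    bound = variables(plan[0])
    for tp in plan[1:]:
        if not variables(tp) & bound:
            return False
        bound = bound | variables(tp)
    return True


def count_filtering(plan: Sequence[Triple]) -> int:
    bound: Set[str] = set()
    count = 0
    for tp in plan:
        if variables(tp) <= bound:
            count += 1
        bound = bound | variables(tp)
    return count


seeds = [tp for tp in BGP if acceptable_seed(tp)]
plans = [plan for plan in permutations(BGP)
         if acceptable_seed(plan[0]) and dependency_respecting(plan)]
print(len(seeds), len(plans), {count_filtering(plan) for plan in plans})
```

Under these assumptions the enumeration finds 2 acceptable seeds and 56 plans, and every one of those plans contains exactly 2 filtering triple patterns, which matches the figures stated on the slide.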
Measurements
 [Plots: query results (0 to 30) vs. query exec. times (in seconds, 0 to 180); retrieved descr. objects (0 to 400) vs. query exec. times (in seconds, 0 to 180)]
 [Plots: Percentage of plans in each group with a filtering TP in specific positions; panels: 1st Filtering TP, 2nd Filtering TP; x-axis: TP position in the ordered BGP (1 to 7); y-axis: 0 to 100]
Conclusions
                 ●   Approach that uses iterators to implement
                     Link Traversal Based Query Execution
                         … is sound
                         … guarantees termination
                         … cannot guarantee (reachability-) completeness
                 ●   Degree of completeness depends on
                     the query plan (i.e. the order of triple patterns in the BGP)
                 ●   Heuristic based plan selection

                 ●   Next steps:
                     ●   Algorithm to generate most suitable plans only
                     ●   Integrate adaptive query processing techniques
Backup Slides




Outline


                             1. Link Traversal Based
                                Query Execution
                             2. Characteristics of the
                                Iterator Based
                                Implementation Approach
                             3. Query Plan Selection
                             4. Evaluation


Link Traversal Based Querying
 ●   Semantics defined
     in two phases:
     1. Reachability
     2. Query Results

 [Figure label: Descriptor object]




  Query

  ?p ex:affiliated_with <http://.../orgaX>
  ?p ex:interested_in ?b
  ?b rdf:type <http://.../Book>
Link Traversal Based Querying
 ●   Semantics defined
     in two phases:
     1. Reachability
     2. Query Results




  Query

  ?p ex:affiliated_with <http://.../orgaX>
  ?p ex:interested_in ?b
  ?b rdf:type <http://.../Book>
Link Traversal Based Querying
 ●   Semantics defined
     in two phases:
     1. Reachability
     2. Query Results

 [Figure label: Reachable descriptor object]


                                        The mapping μ : V → U ∪ B ∪ L
                                        is a result for BGP bgp iff:
                                              Condition 1: dom(μ) = vars(bgp)
                                              Condition 2: μ[bgp] ⊆ D_reachable

 ●   2 phases do not reflect idea of actual execution strategy
     ●   Intertwine query evaluation with the traversal of data links
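The two-phase semantics can be written down directly, even though an actual engine interleaves the phases. The sketch below is a simplified reading, not the formal definition: the toy `WEB` is hypothetical, the URIs mentioned in the query serve as starting points, and links are followed only from retrieved triples that match some triple pattern of the query.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

Triple = Tuple[str, str, str]
Mapping = Dict[str, str]

# Hypothetical toy Web (an assumption): URI -> triples of its descriptor object.
WEB: Dict[str, List[Triple]] = {
    "http://.../orgaX": [("http://.../alice", "ex:affiliated_with", "http://.../orgaX")],
    "http://.../alice": [("http://.../alice", "ex:interested_in", "http://.../b1")],
    "http://.../b1": [("http://.../b1", "rdf:type", "http://.../Book")],
    "http://.../Book": [("http://.../Book", "rdfs:subClassOf", "http://.../CreativeWork")],
}


def extend(mu: Mapping, tp: Triple, triple: Triple) -> Optional[Mapping]:
    """Extend mu so that mu[tp] equals triple, or return None."""
    new = dict(mu)
    for term, value in zip(tp, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def reachable_dataset(bgp: Sequence[Triple], web: Dict[str, List[Triple]]) -> Set[Triple]:
    """Phase 1: collect all triples from reachable descriptor objects."""
    todo = [term for tp in bgp for term in tp if term.startswith("http")]
    seen: Set[str] = set()
    data: Set[Triple] = set()
    while todo:
        uri = todo.pop()
        if uri in seen:
            continue
        seen.add(uri)
        for triple in web.get(uri, ()):
            data.add(triple)
            # Follow links only from triples that match a query pattern.
            if any(extend({}, tp, triple) is not None for tp in bgp):
                todo.extend(t for t in triple if t.startswith("http"))
    return data


def results(bgp: Sequence[Triple], data: Set[Triple]) -> List[Mapping]:
    """Phase 2: all mu with dom(mu) = vars(bgp) and mu[bgp] a subset of data."""
    solutions: List[Mapping] = [{}]
    for tp in bgp:
        extended: List[Mapping] = []
        for mu in solutions:
            for triple in data:
                new = extend(mu, tp, triple)
                if new is not None:
                    extended.append(new)
        solutions = extended
    return solutions


BGP = [
    ("?p", "ex:affiliated_with", "http://.../orgaX"),
    ("?p", "ex:interested_in", "?b"),
    ("?b", "rdf:type", "http://.../Book"),
]
D_reachable = reachable_dataset(BGP, WEB)
print(results(BGP, D_reachable))
```

Because phase 2 evaluates the BGP over the complete reachable dataset, the outcome of this two-phase reading does not depend on the order in which the triple patterns are listed.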
Characteristics




 ●   No need to know all data sources in advance
 ●   Never complete w.r.t. all data on the Web
     ●   Reason: reachability based on links that match query patterns
     ●   New concept: reachability-completeness
 ●   No guarantee for termination
     ●   Reason: Web of Data is infinite (at any point in time)
Outline



                             1. Characteristics of the
                                Implementation Approach
                             2. Query Plan Selection
                             3. Evaluation




Characteristics
 ●   Sound
 ●   Number of results may depend on order of triple patterns
 ●   Reachability-completeness not guaranteed
     ●   Main reason: inflexibility due to fixed order
 ●   Termination is guaranteed
     ●   No such guarantee for link traversal based
         query execution in general because the Web
         of Data is infinite (at any point in time)
 ●   Efficient
 ●   Easy to apply in existing query engines


Plan Selection Problem

 ●   Number of results may depend on order of triple patterns

                                                                = logical query execution plan


 ➔   Problem: select a suitable plan




Outline


                             1. Link Traversal Based
                                Query Execution
                             2. Characteristics of the
                                Iterator Based
                                Implementation Approach
                             3. Query Plan Selection
                             4. Evaluation


DEPENDENCY RESPECT RULE

                  Use a dependency respecting query plan

 ●   Dependency respect: each triple pattern (except the first) shares
     a variable with one of the preceding triple patterns

                                            tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX> )                   I1

                                            tp2 = ( ?p , ex:interested_in , ?b )                                 √   I2

                                            tp3 = ( ?b , rdf:type , <http://.../Book> )                              I3
  Query

  ?p ex:affiliated_with <http://.../orgaX>
  ?p ex:interested_in ?b
  ?b rdf:type <http://.../Book>
DEPENDENCY RESPECT RULE

                  Use a dependency respecting query plan

 ●   Dependency respect: each triple pattern (except the first) shares
     a variable with one of the preceding triple patterns

                                            tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX> )                   I1

                                            tp2 = ( ?p , ex:interested_in , ?b )                                     I2

                                            tp3 = ( ?b , rdf:type , <http://.../Book> )                              I3
  Query

  ?p ex:affiliated_with <http://.../orgaX>
  ?p ex:interested_in ?b
  ?b rdf:type <http://.../Book>
DEPENDENCY RESPECT RULE

                  Use a dependency respecting query plan

 ●   Dependency respect: at least one variable of each triple pattern (except the first)
     already occurs in one of the preceding triple patterns
 ●   Rationale: Avoid cartesian products

                                          tp1 = ( ?p , ex:affiliated_with , <http://.../orgaX> )                     I1

                                          tp2 = ( ?b , rdf:type , <http://.../Book> )                                I2

                                          tp3 = ( ?p , ex:interested_in , ?b )                                       I3
  Query

  ?p ex:affiliated_with <http://.../orgaX>
  ?p ex:interested_in ?b
  ?b rdf:type <http://.../Book>
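
The dependency respect check can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the talk: triple patterns are modelled as plain tuples of strings, a term counts as a variable if it starts with `?`, and the helper names `variables` and `is_dependency_respecting` are hypothetical. The second plan below is the ordering shown on this slide, where tp2 shares no variable with tp1 and would therefore force a cartesian product.

```python
from itertools import permutations


def variables(tp):
    """Return the set of variables (terms starting with '?') in a triple pattern."""
    return {term for term in tp if term.startswith("?")}


def is_dependency_respecting(plan):
    """True if every triple pattern after the first shares at least one
    variable with some preceding triple pattern in the plan."""
    seen = set()
    for position, tp in enumerate(plan):
        tp_vars = variables(tp)
        if position > 0 and not (tp_vars & seen):
            return False  # no shared variable: this step would be a cartesian product
        seen |= tp_vars
    return True


# The three triple patterns of the example query (URIs elided as on the slides).
affiliated = ("?p", "ex:affiliated_with", "<http://.../orgaX>")
interested = ("?p", "ex:interested_in", "?b")
is_book = ("?b", "rdf:type", "<http://.../Book>")

respecting_plan = [affiliated, interested, is_book]      # tp2 reuses ?p, tp3 reuses ?b
non_respecting_plan = [affiliated, is_book, interested]  # tp2 shares nothing with tp1

# Enumerate all orderings and keep only the dependency respecting ones.
candidates = [
    plan
    for plan in permutations([affiliated, interested, is_book])
    if is_dependency_respecting(plan)
]
```

Filtering the permutations this way only prunes the candidate set; choosing among the remaining plans would still rely on the other heuristics (SEED TP, NO VOCAB SEED, FILTERING TP).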
These slides have been created by
                                       Olaf Hartig

                                              http://olafhartig.de


                     This work is licensed under a
       Creative Commons Attribution-Share Alike 3.0 License
           (http://creativecommons.org/licenses/by-sa/3.0/)