The Web of Data for E-Commerce in Brief

A Hands-on Introduction to the GoodRelations Ontology,
         RDFa, and Yahoo! SearchMonkey

                   October 25, 2009
Westfields Conference Center near Washington, DC, USA

                        Martin Hepp
      Universität der Bundeswehr München, Munich, Germany


                    Richard Cyganiak
        Digital Enterprise Research Institute (DERI), Ireland
About the Organizers

Martin Hepp
    Professor, Head of Group
    Universität der Bundeswehr München, Munich, Germany
    mhepp@computer.org
    http://www.heppnetz.de
    Previous affiliations: Universität Würzburg (Germany), Florida Gulf
    Coast University, IBM Zurich Research Lab, DERI/STI Innsbruck

Richard Cyganiak
    PhD Researcher
    Digital Enterprise Research Institute (DERI), Galway, Ireland
    richard.cyganiak@deri.org
    http://www.deri.ie
    Previous affiliations: FU Berlin, Germany

Learning Goals
Participants will learn
• to use
     – the GoodRelations conceptual structures and
     – the RDFa syntax
  to augment static and dynamic Web sites with the relevant
  details of a commercial Web presence;
• RDFa modeling patterns for more complex RDF structures;
• to publish data on the Semantic Web and make it available to
  indexing services, repositories, Yahoo SearchMonkey, and
  applications;
• to query the Web of Data using SPARQL; and
• to develop simple Yahoo SearchMonkey and Yahoo BOSS
  applications.
Logistics
08:30-10:30   Overview and Motivation: Why the Web of Data is Now 30’
              Quick Review of Prerequisites 15’
              The GoodRelations Ontology: E-Commerce on the Web of Data 75’
10:30-10:45   Coffee Break
10:45-12:30   RDFa: Bridging the Web of Documents with the Web of Data 45’
              Expressing GoodRelations in RDFa: A Running Example 30’
              GoodRelations – Advanced Topics 30’
12:30-13:30   Lunch Break
13:30-16:00   Hands-on Exercise: Annotating a Web Shop 60’
              Querying the Web of Data for Offerings – SPARQL 15’
              Querying the Web of Data – Exercises 15’
16:00-16:30   Coffee Break
16:30-18:00   Publishing Semantic Web Data: Make Your RDF Available 30’
              Yahoo SearchMonkey and Yahoo BOSS 45’
              Discussion, Conclusion, Feedback Round 15’
Resources: Information
•   Wiki page
    http://tr.im/srGx
    http://www.ebusiness-unibw.org/wiki/Web_of_Data_for_E-Commerce_Tutorial_ISWC2009
•   GoodRelations Primer
    http://www.heppnetz.de/projects/goodrelations/primer/
•   GoodRelations Documentation
    http://purl.org/goodrelations/v1
•   RDFa
    http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014/
•   SPARQL
    http://www.w3.org/TR/rdf-sparql-query/
•   Yahoo SearchMonkey
    http://developer.yahoo.com/searchmonkey/smguide/

Resources: Tools
• RDF Validator (and Visualizer)
   http://www.w3.org/RDF/Validator/
• GoodRelations Annotator
   http://www.ebusiness-unibw.org/tools/goodrelations-annotator/
• PyRDFa
   http://www.w3.org/2007/08/pyRdfa/
• Twinkle
   http://www.ldodds.com/projects/twinkle/
      Custom configuration file available from the Wiki page
• RDF2dataRSS
   http://www.ebusiness-unibw.org/tools/rdf2datarss/

Overview and Motivation: Why the Web of Data is Now

Martin Hepp

Limitations of the Web, 2009

Specificity vs. Keyword-based Search
•   Synonyms
•   Homonyms
•   Multiple languages
•   No parametric search

No Unified View: Jumping Back and Forth Across Data Silos
[Figure: search engine results pointing to pages 1-4 on site 1, page 5 on site 2, and pages 6-8 on site 3]

We know the best hits only when done.
[Figure: search engine results pointing to pages 1-4 on site 1, page 5 on site 2, and pages 6-8 on site 3]

Limited Ability to Reuse Data

The Web: A Bottleneck for Sharing Product Data

Web of Data (“Semantic Web”)

E-Commerce on the Web of Data

Goal: A Unified View on Commerce Data on the Web
[Figure: commerce data on the Web, open to arbitrary queries and to extraction and reuse, covering:]
• Manufacturers
• Retailers
• Payment
• Delivery
• Product model master data
• Warranty
• Shop offerings
• Auctions
• Spare parts & consumables

On the Shoulders of Giants

  A Unified View of Commerce Data on the Web

Deep Comparison Shopping
[Figure: search engine results that reach directly into pages 1-4 on site 1, page 5 on site 2, and pages 6-8 on site 3]

Use Case 1: Product Search
• Find all MP3 players that have a USB interface and a color
  display, and sort them by weight (lightest first).

  ...on a Web scale!
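
The query on this slide can be sketched over structured product descriptions. The sketch below uses plain Python dictionaries as a stand-in for data collected from the Web of Data. The product records, the field names (`category`, `interfaces`, `display`, `weight_g`), and the helper `find_players` are hypothetical and are not GoodRelations vocabulary; a real application would more likely pose the same question as a SPARQL query.

```python
# Hypothetical product records; in practice these would be derived from RDF data.
products = [
    {"name": "Player A", "category": "MP3 player",
     "interfaces": {"USB"}, "display": "color", "weight_g": 120},
    {"name": "Player B", "category": "MP3 player",
     "interfaces": {"USB", "Bluetooth"}, "display": "color", "weight_g": 85},
    {"name": "Player C", "category": "MP3 player",
     "interfaces": {"USB"}, "display": "monochrome", "weight_g": 60},
    {"name": "Camera D", "category": "Digital camera",
     "interfaces": {"USB"}, "display": "color", "weight_g": 300},
]


def find_players(items):
    """Return MP3 players with a USB interface and a color display, lightest first."""
    matches = [
        p for p in items
        if p["category"] == "MP3 player"
        and "USB" in p["interfaces"]
        and p["display"] == "color"
    ]
    # Parametric sort on a numeric property, which keyword search cannot express.
    return sorted(matches, key=lambda p: p["weight_g"])


if __name__ == "__main__":
    for player in find_players(products):
        print(player["name"], player["weight_g"])
```

The point of the sketch is that filtering and sorting become trivial once the features are explicit properties rather than words on a page.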
Use Case 2: Product Model Data Reuse
[Figure: a manufacturer and a retailer / Web shop each publish structured data on products and services on the World Wide Web; product specifications (type of product, features, etc.) are reused between the two]

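
One way to picture this reuse is a retailer enriching its own offers with specifications published by the manufacturer. The sketch below assumes that both sides refer to the same product model by a shared identifier; all URIs, field names, and the helper `enrich_offer` are hypothetical and serve only as an illustration.

```python
# Hypothetical manufacturer data, keyed by the identifier of the product model.
manufacturer_models = {
    "http://manufacturer.example/models#m100": {
        "type": "MP3 player",
        "features": ["USB interface", "color display"],
    },
}

# Hypothetical retailer offers that point to a product model by its identifier.
retailer_offers = [
    {"offer": "http://shop.example/offers#1",
     "model": "http://manufacturer.example/models#m100",
     "price_usd": 79.0},
    {"offer": "http://shop.example/offers#2",
     "model": "http://manufacturer.example/models#unknown",
     "price_usd": 19.0},
]


def enrich_offer(offer, models):
    """Return a copy of the offer with the manufacturer's specification attached.

    If the manufacturer has published nothing for the model, the
    specification is left empty instead of being guessed.
    """
    spec = models.get(offer["model"], {})
    return {**offer, "spec": spec}


if __name__ == "__main__":
    for offer in retailer_offers:
        print(enrich_offer(offer, manufacturer_models))
```

The retailer maintains only price and availability, while the product description is looked up from the manufacturer's data.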
Use Case 3: Fine-grained Affiliate Marketing
• Offers of computer add-ons that have a USB interface

Screenshot from http://en.wikipedia.org/wiki/USB

The Web of Linked Data, Essentially:
1.   Cluster Web links by what they mean
2.   Use URIs to indicate the type of links
3.   Use HTTP URIs so that it is quick and easy to explore
     what this URI means.
4.   Make clear whether you are referring to something or
     its representation.
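
The four points above can be illustrated with a minimal sketch in which every link is a triple whose type is itself a URI. The vocabulary URIs under `http://example.org/vocab#` and the shop URIs are hypothetical. Using a fragment identifier to distinguish a thing from the document that describes it is one common convention and is an assumption here, not something the slide prescribes.

```python
from urllib.parse import urldefrag

# Hypothetical Web document that describes a company and its offer.
DOC = "http://shop.example/about"

# Each link is a (subject, link type, object) triple; the link type is a URI too.
triples = [
    (DOC + "#company", "http://example.org/vocab#offers", DOC + "#offer1"),
    (DOC + "#offer1", "http://example.org/vocab#includes", DOC + "#product1"),
]


def links_of_type(data, link_type):
    """Group links by meaning: return all (subject, object) pairs of one link type."""
    return [(s, o) for s, p, o in data if p == link_type]


def describing_document(thing_uri):
    """Return the URI of the document obtained by dropping the fragment.

    The thing (for example a company) and its representation (the Web
    page) thereby get two different URIs.
    """
    document, _fragment = urldefrag(thing_uri)
    return document


if __name__ == "__main__":
    print(links_of_type(triples, "http://example.org/vocab#offers"))
    print(describing_document(DOC + "#company"))
```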




Technical Effects & Working Assumption
• Following the Linked Data principles will reduce the
  computational complexity of processing, combining, and
  reusing data on a Web scale

Both Sides Can Help Build a Bridge




The Web of Linked Data is NOW and HERE
•   RDFa has become a W3C Recommendation
     – HTML5+RDFa specification well underway, too
•   Yahoo SearchMonkey and BOSS
•   Google adopts RDFa
•   GoodRelations ontology
•   SPARQL query language and endpoint interface
•   Scalable, commercial repositories
•   Linked Data guidelines: best practices for the coexistence
    of the Web of Data and existing Web content

NOW and HERE: Yahoo & GoodRelations

NOW and HERE: Google (Mock-up)

NOW and HERE: OpenLink Virtuoso Spongers

GoodRelations #2 of all Web Ontologies
         …and this does not yet include the more than 10 million
         offers from Amazon and eBay!

NOW and HERE: BestBuy
• Details on all 1000+ stores in the US using
  GoodRelations
     – http://stores.bestbuy.com/sitemap.xml
     – http://lod.openlinksw.com/sparql
• Full catalog: >432,000 item descriptions
     – http://products.semweb.bestbuy.com/sitemap.xml
     – updated on a daily basis
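
A consumer of such data would typically start from the sitemap and collect the page URLs to fetch. The sketch below parses a small inline sitemap with Python's standard library, so it needs no network access. The sample URLs under `shop.example` and the helper `page_urls` are hypothetical; the `urlset`, `url`, and `loc` elements and the namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemap protocol (sitemaps.org).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content; a real one would be downloaded from the site.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://shop.example/stores/1.html</loc>
    <changefreq>daily</changefreq>
  </url>
  <url>
    <loc>http://shop.example/stores/2.html</loc>
    <changefreq>daily</changefreq>
  </url>
</urlset>"""


def page_urls(sitemap_xml):
    """Return the page URLs listed in a sitemap document, in document order."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]


if __name__ == "__main__":
    for url in page_urls(sample_sitemap):
        print(url)
```

Each listed page could then be passed to an RDFa parser to extract the embedded data.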



Thank you.



