A Linked Data Cookbook
                          for Government Agencies
                                Semantic Technology Conference, Washington DC
                                            01-Dec-2011 8:30AM




                                          Bernadette Hyland
                                        CEO, 3 Round Stones &
                          co-chair W3C Government Linked Data Working Group
                                      bhyland@3roundstones.com
                                         Twitter @BernHyland




Monday, November 28, 11
  • Linked Data is about publishing and consuming data using
       international data standards
  • Based on a 20-year-old idea
  • Goal is to solve organizational issues related to data silos,
       requirements for faster data integration, and an
       environment of reduced IT budgets


Linking Government Data
            •      42 contributors
            •      ...from 8 countries
            •      10 chapters
            •      Publication date:
                   November 2011





Agenda
                          • Why publishing Linked Open Data matters

                          • What governments are doing today

                          • How government use of Open Standards &
                            Open Source Software saves lives and money

                          • Social contract as a government publisher

                          • Next steps



Two sides of the Open Government Coin
                          Short- and long-term public interests

                          Increasing transparency

                          Helping with informed civic engagement

               #2 Data sharing for informed research, policy & regulation

                          My talk today focuses on #2


Why should we care?
             • Reducing data silos has long been discussed ...

                  • Linked Data, based on international data exchange
                     standards, avoids vendor lock-in

                  • Reduces the need to create & maintain data silos

             • Encourages private and public partnerships

                  • Sows the seeds for economic growth from the top down
                     and bottom up


ACCEPTABLE ROI FOR IT
[Pie chart. Legend: 6 months, 12 months, 18 months, 24 months, More than 24 months. Slice labels: 4%, 17%, 13%, 16%, 49%.]


Governments
           Goals: Governmental transparency and/or improved
                  internal efficiencies (data warehouses)




Publishers
        Goals: Improve internal manuscript pipelines, expose
            additional ways of finding and using content




Countries with Open Government Sites




Open Government Data Camp 2011




Open Government Data in 2011




Government LOD on CKAN




Largest Publisher of Government LOD




Where is Open Source deployed?
             International Standards and Open Source are the reason

                   • The Web has become the most extensible, robust
                          information network ever created

                   • US Dept of Defense is a big customer of commercially
                          supported Open Source software

                   • US Army cites Open Source as saving lives and hundreds of
                          millions of dollars.

                           • 100k instances deployed in missile defense systems &
                               armored personnel carriers

In 3 brief years ...
       • Starting in 2008, a few heads of state directed open
                 government data to be published on the Web ...

       • Three months ago (September 2011), Presidents
                 Obama (USA) and Rousseff (Brazil) endorsed the
                 Open Government Partnership, along with
                 7 other nations

       • Each launched their government’s National Plans
                 during the meeting of the UN General Assembly

World-changing phenomenon
                •     Using a Linked Data approach, we can begin to
                      address data silos and interoperability using
                      data exchange standards
                •     We can combine information sources
                •     The W3C has defined standards that enable
                      interoperability and allow us to move data
                      freely
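
A minimal sketch of what "combining information sources" can look like in practice, assuming two hypothetical agencies that both describe the same facility under one shared URI. The example.gov URIs, property names and values are all invented for illustration; the only real identifier is the RDF Schema label property. Because both sources use the same name for the same thing, merging is a plain set union of (subject, predicate, object) statements.

```python
# Hypothetical example data: two agencies describe the same facility URI.
FACILITY = "http://example.gov/id/facility/42"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

source_a = {
    (FACILITY, RDFS_LABEL, "Riverside Treatment Plant"),
    (FACILITY, "http://example.gov/def/locatedIn", "http://example.gov/id/county/7"),
}
source_b = {
    (FACILITY, RDFS_LABEL, "Riverside Treatment Plant"),
    (FACILITY, "http://example.gov/def/permitStatus", "active"),
}


def merge(*graphs):
    """Union any number of triple sets; identical statements collapse."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Return the sorted (predicate, object) pairs known about a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


combined = merge(source_a, source_b)
facts = describe(combined, FACILITY)
```

After the merge, `facts` holds the label, the county link and the permit status together, even though no single source had all three.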



What is next?
                • We’re already seeing signs of things to
                          come.

                • Structured data on the Web is becoming
                          mainstream.




Government Linked Data Working Group
           • Started June 2011; runs to May 2013
           • Chartered to provide standards & develop standards-track
                     documents to help all governments share
                     their data as high-quality (“5 star”) Linked
                     Data
           • 39 participants from 25 organizations
           • 50% in non-US locations

http://www.w3.org/2011/gld/charter

Deliverables
     Community Directory

     Best Practices for Publishing Linked Data

           • Procurement, vocabulary selection, URI construction,
                    versioning, stability, legacy data issues

           • Cookbook for Linked Open Data

     Standard Vocabularies

           • Metadata, Statistical “Cube” Data, People,
                    Organizational structures

Beta: http://dir.w3.org
                          email support@3roundstones.com for login to
                                    add your organization’s details

A pragmatic approach to publishing & consuming Linked Data

There is a Process


          Identify        Model   Name    Describe   Convert   Publish




                                         Maintain




Preparation
                 1. Leverage what exists
                 • Request a copy of the logical and physical model of the
                    database(s)
                 • Obtain data extracts (i.e., databases and/or spreadsheets)
                    or create data in a way that can be replicated.




Model the data
         2. Model data without context to allow for
            reuse and easier merging of data sets

             • Traditional DBAs organize data for specified
                 Web services or applications.

             • With LD, application logic does not drive the
                 data schema, concepts, etc.

Model the data
              3. Look for real-world objects of interest (e.g.,
                people, places, things, locations, etc.) and
                model them.
              • Investigate how others are already modeling
                similar or related data.
              • Look for duplication and normalize the data
              • Use common sense to decide whether or
                not to make a link

Model the data ...
              4. Connect data from different sources and
                authoritative vocabularies (see list of popular
                vocabularies below).
              • Use URIs as names for your
                objects
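
One way to give objects URI names can be sketched as follows. This is an illustration under stated assumptions, not a recommended scheme: the base http://example.gov/id and the helpers `slugify` and `mint_uri` are hypothetical, and deriving names from labels is only one option (a stable record key from the source database is often a safer input than a label that may be renamed).

```python
import re
import unicodedata
from urllib.parse import quote

# Hypothetical base for identifiers; a real agency would pick its own.
BASE = "http://example.gov/id"


def slugify(label):
    """Lower-case a label and reduce it to ASCII letters, digits and hyphens."""
    ascii_label = (
        unicodedata.normalize("NFKD", label)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_label.lower()).strip("-")


def mint_uri(kind, key):
    """Build a URI name for an object of a given kind from a key or label."""
    slug = slugify(str(key))
    if not slug:
        raise ValueError(f"cannot derive a URI from {key!r}")
    return f"{BASE}/{quote(kind)}/{slug}"


agency_uri = mint_uri("agency", "Environmental Protection Agency")
city_uri = mint_uri("city", "São Paulo")
```

The same input always yields the same URI, which is what lets independently converted data sets end up pointing at the same name.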




Model the data ...

              • Put aside immediate needs of any
               application
              • Don’t think about how an application will
               use your data
              • Do think about time and how the data will
               change over time.


Convert, Publish & Maintain
                 5. Write a script or process to convert the
                   data set repeatedly
                 6. Publish to the Web and announce it! (more
                   details shortly)
                 7. Define a maintenance strategy (more details in the
                   social contract at the end)
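
Step 5 might be sketched as a small script like the one below. It is a minimal illustration, assuming a CSV extract with hypothetical columns id, name and state, hypothetical example.gov base URIs, and ids that are plain numbers or other URI-safe tokens. It writes N-Triples, one of the W3C RDF serializations, and sorts the output so that re-running the conversion on the same extract gives byte-identical results.

```python
import csv
import io

# Hypothetical namespaces; only the rdfs:label URI is a real W3C term.
BASE = "http://example.gov/id/facility/"
VOCAB = "http://example.gov/def/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def literal(value):
    """Quote a string as an N-Triples literal, escaping special characters."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def convert(csv_text):
    """Convert a CSV extract to sorted N-Triples text (repeatable output)."""
    lines = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumes row["id"] needs no escaping inside a URI.
        subject = f"<{BASE}{row['id'].strip()}>"
        lines.add(f"{subject} <{RDFS_LABEL}> {literal(row['name'].strip())} .")
        lines.add(f"{subject} <{VOCAB}state> {literal(row['state'].strip())} .")
    return "\n".join(sorted(lines)) + "\n"


# Invented sample extract for illustration.
SAMPLE = "id,name,state\n42,Riverside Plant,VA\n7,Hilltop Depot,MD\n"
ntriples = convert(SAMPLE)
```

Keeping the conversion in a script, rather than doing it by hand once, is what makes it practical to republish whenever the source data changes.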


Take the plunge ... Be forgiving

                • Simplistic data models can still be useful
                • Better to make progress with something
                    rather than do nothing because we cannot
                    be comprehensive and complete




Take an iterative approach
                  1. Review modeling decisions

                  2. Review vocabularies chosen and developed

                  3. Modify/update data conversion scripts

                  4. Do a maintenance walk-through with real use cases

                  5. Show how to explore data with SPARQL and
                     visualizations

                  6. Discuss a persistent identifier strategy (think PURLs)
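
The idea behind a persistent identifier strategy such as PURLs can be sketched as a lookup table that redirects a stable public path to wherever the data currently lives. The paths, hosts and function names below are hypothetical, and a 302 redirect is assumed here purely for illustration; a real PURL service may use other redirect types.

```python
# Hypothetical PURL table: persistent path -> current location of the data.
purl_table = {
    "/id/facility/42": "http://data.example.gov/2011/facilities/42",
}


def resolve(table, path):
    """Return (HTTP status, redirect target) for a persistent path."""
    target = table.get(path)
    if target is None:
        return 404, None
    return 302, target


def retarget(table, path, new_target):
    """Point an existing persistent path at a new location."""
    if path not in table:
        raise KeyError(f"unknown persistent identifier: {path}")
    table[path] = new_target


before = resolve(purl_table, "/id/facility/42")
# The data moves to a new host; the published identifier does not change.
retarget(purl_table, "/id/facility/42", "http://archive.example.gov/facilities/42")
after = resolve(purl_table, "/id/facility/42")
```

Consumers keep using the same identifier throughout; only the table entry changes when systems are reorganized.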

Content Management Systems




Linked Data Management System
           Callimachus (kəlĭm'əkəs) is a framework for data-driven
           applications based on Linked Data principles.

           Callimachus allows Web authors to quickly and easily create
           semantically-enabled Web applications.




Web 2.0 developers can create data-driven applications
              with templates in hours
              Triples up & down (no MySQL under the covers)
              Wiki editing of content
              Access control
              Collaboration via Web
              Change tracking (history)
              Page/form templates




Join the Community
      Callimachus has benefited from 2+ years of corporate support

      We’re using it for real world Web applications in environmental
      protection, finance and publishing

      Open Source project

      Visit callimachusproject.org




What we covered today
          • Why government authorities are publishing information as
            Linked Open Data

          • The process for converting data into RDF

          • Using Open Standards and Open Source to publish
            Open Data

                   • Note: Commercial support & products are
                     critical for government publishing & consumption of Open
                     Data

          • Announcing agency Open Data & your social contract

Further Reading




                          http://linkeddatabook.com/editions/1.0/

                          http://3roundstones.com/linking-enterprise-data/
                          http://3roundstones.com/linking-government-data/

                          http://www.linkeddatadeveloper.com/

Recommended talk
                          Thursday, 1-Dec 2011 @ 9:30
                            by Michael Pendleton &
                            David G. Smith, US EPA

                   LINKED GOVERNMENT DATA:
                   ENVIRONMENTAL PROTECTION PERSPECTIVES

This talk http://slideshare.net/3roundstones
                                          @BernHyland
                                  bhyland@3roundstones.com

This work is Copyright © 2011 3 Round Stones Inc.
                          It is licensed under the Creative Commons Attribution 3.0 Unported License
                          Full details at: http://creativecommons.org/licenses/by/3.0/

                          You are free:

                                  to Share — to copy, distribute and transmit the work



                                  to Remix — to adapt the work



                          Under the following conditions:
                                  Attribution. You must attribute the work in the manner specified by the
                                  author or licensor (but not in any way that suggests that they endorse
                                  you or your use of the work).

                                  Share Alike. If you alter, transform, or build upon this work, you may
                                  distribute the resulting work only under the same or similar license to this
                                  one.



