“Plans are worthless, but planning is essential” 
Creating the culture and technology for an international data infrastructure 
Mark A. Parsons 
Secretary General 
CASRAI Canada ReConnect14 
Ottawa, Canada 
20 November 2014 
Unless otherwise noted, the slides in this presentation are licensed by Mark A. Parsons under a Creative Commons Attribution-Share Alike 3.0 License
All of society’s grand challenges require diverse 
(often large) data to be shared and integrated 
across cultures, scales, and technologies.
Research Data Alliance 
Vision 
Researchers and innovators openly share data across 
technologies, disciplines, and countries to address the 
grand challenges of society. 
Mission 
RDA builds the social and technical bridges that enable 
open sharing of data.
Dynamics of Infrastructure 
Edwards, et al. 2007 Understanding Infrastructure: Dynamics, 
Tensions, and Design. 
• Infrastructures become “ubiquitous, accessible, reliable, and 
transparent” as they mature. 
• Systems → Networks → Inter-networks 
• “system-building, characterized by the deliberate and successful 
design of technology-based services.” 
• “technology transfer across domains and locations results in 
variations on the original design, as well as the emergence of 
competing systems.” 
• Finally, “a process of consolidation characterized by gateways 
that allow dissimilar systems to be linked into networks.”
Not what, but 
When is infrastructure?
Not what, but 
When and 
Who is infrastructure?
Bridges and 
Gateways 
Gateways are often wrongly 
understood as “technologies,” 
i.e. hardware or software 
alone. A more accurate 
approach conceives them as 
combining a technical solution 
with a social choice, i.e. a 
standard, both of which must 
be integrated into existing 
users’ communities of 
practice. Because of this, 
gateways rarely perform 
perfectly. 
— Edwards et al. 2007
Infrastructure is 
Relationships, interactions, and connections 
between people, technologies, and institutions
From Interregional Highways: Message from the President of the United States Transmitting a Report of the National 
Interregional Highway Committee, Outlining and Recommending a National System of Interregional Highways, 12 Jan. 1944. 
CC-BY Eric Fischer http://www.flickr.com/photos/walkingsf/8270270785/
http://www.shockblast.net/aerial-photographs/urban-sprawl-by-christoph-gielen-arizona/
Interchange 
CC-BY-SA Steven Vance http://www.flickr.com/photos/jamesbondsv/8475376363/
Ranch Exit 
CC-BY-SA Ken Lund http://www.flickr.com/photos/kenlund/2381991900/
Themes from A. Tsing on Collaboration 
Friction—An ethnography of global connection 
• “Actual existing universalisms are 
hybrid, transient, and involved in 
constant reformulation through 
dialogue.” They work out through 
friction. 
• “There is no reason to think 
collaborators have common goals.” 
• Unity and diversity cover each 
other up. Need to remember the 
local.
"Data Deluge," Brett Ryder, The Economist, Feb. 2010
Data Blizzard? 
© Mindy Veissid | Mindy Veissid Photography.
Diverse snow crystal photos by Kenneth G. Libbrecht 
snowcrystals.com
Distribution of NSF Awards by Dollar Value 
© 2009 The Board of Trustees, University of Illinois 
The long tail of science Heidorn 2008
Ashby’s Law of 
Requisite Variety Only variety absorbs variety
Map of the internet by the Opte Project [CC-BY] via Wikimedia 
Commons
Networks or ecosystems often rely on “weak” links, so partner and 
build relationships. (See Barabási A-L and R Albert. 1999 and others)
But what does this all have to do with 
RDA? 
1. RDA focusses on developing “gateways” 
2. RDA doesn’t do “architecture,” but it does provide a level of unity.
Deliverables that make data work 
“Create - Adopt - Use” 
• Adopted code, policy, specifications, standards, or practices that 
enable data sharing 
• “Harvestable” efforts for which 12-18 months of work can eliminate 
a roadblock 
• Efforts that have substantive applicability to 
groups within the data community but may 
not apply to all 
• Efforts that can start today 
RDA Principles 
Openness 
Consensus 
Balance 
Harmonization 
Community Driven 
Non-profit
RDA Organisational Framework
RDA Working Groups 
1. Brokering Governance* 
2. Data Citation WG 
3. Data Description Registry 
Interoperability 
4. Data Foundation and Terminology 
WG 
5. Data Type Registries WG 
6. Metadata Standards Directory 
Working Group 
7. PID Information Types WG 
8. Practical Policy WG 
9. RDA/CODATA Summer Schools in 
Data Science and Cloud Computing 
in the Developing World* 
10. RDA/WDS Publishing Data 
Bibliometrics WG 
11. RDA/WDS Publishing Data Services 
WG 
12. RDA/WDS Publishing Data 
Workflows WG 
13. Repository Audit and Certification 
DSA–WDS Partnership WG 
14. Standardisation of Data Categories 
and Codes WG 
15. The BioSharing Registry: 
connecting data policies, standards 
& databases in life sciences* 
16. Urban Quality of Life Indicators* 
17. Wheat Data Interoperability WG 
* in review
Initial Products—adopt one today! 
• A basic vocabulary of foundational terminology and query tool to make sure we know what 
we’re talking about. 
• A data type model and registry (“MIME-types” for data) to help tools interpret, display, and 
process data. 
• A persistent identifier type registry to help search engines understand what they are pointing to 
and retrieving. 
• Coming soon: 
• A basic set of machine actionable rules to enhance trust 
• A metadata standards directory so we can describe similar things consistently 
• A dynamic-data citation methodology so we can reference precise subsets of changing 
data. 
• Semantically linked terms describing wheat data so we can share harvest and related 
information around the world 
• A unified repository certification scheme to reduce confusion and improve trust.
But what does this all have to do with 
RDA? 
1. RDA focusses on developing “gateways” 
2. RDA doesn’t do “architecture,” but it does provide a level of unity. 
3. RDA plays both globally and locally—Think “glocal”.
Distribution of 2,353 Individual RDA Members in 96 Countries 
12 September 2014 
By sector: Academia 63%, Government 18%, Private 13%, Other 6% 
By region: Europe 50%, North America 36%, Asia 5%, Austral-Pacific 5%, Africa 3%, South America 1% 
Map courtesy traveltip.org
Regional RDAs 
• Australian National Data Service, RDA/United States, RDA/Europe, 
• Implement RDA deliverables locally and enhance adoption. 
• Ensure regional or national issues are addressed globally. 
• Support plenaries and support attendance at plenaries.
But what does this all have to do with 
RDA? 
1. RDA focusses on developing “gateways” 
2. RDA doesn’t do “architecture,” but it does provide a level of unity. 
3. RDA plays both globally and locally—Think glocal. 
4. RDA fosters relationships, interfaces, and connections. 
5. RDA provides a “neutral place” to identify and work through friction.
RDA Interest Groups 
1. Agricultural Data Interoperability IG 
2. Big Data Analytics IG 
3. Biodiversity Data Integration IG 
4. Brokering IG 
5. Community Capability Model IG 
6. Data Fabric IG 
7. Data for Development 
8. Data in Context IG 
9. Defining Urban Data Exchange for Science IG* 
10. Development of cloud computing capacity and 
education in developing world research 
11. Digital Practices in History and Ethnography IG 
12. Domain Repositories Interest Group 
13. Education and Training on handling of research 
data 
14. ELIXIR Bridging Force IG* 
15. Engagement IG 
16. Federated Identity Management 
17. Geospatial IG* 
18. Libraries for Research Data* 
19. Long tail of research data IG 
20. Marine Data Harmonization IG 
21. Metabolomics 
22. Metadata IG 
23. PID Interest Group 
24. Preservation e-Infrastructure IG 
25. RDA/CODATA Legal Interoperability IG 
26. RDA/CODATA Materials Data, Infrastructure & 
Interoperability IG 
27. RDA/WDS Certification of Digital Repositories IG 
28. RDA/WDS Publishing Data Cost Recovery for 
Data Centres 
29. RDA/WDS Publishing Data IG 
30. Reproducibility IG* 
31. Research data needs of the Photon and Neutron 
Science community 
32. Research Data Provenance 
33. Service Management IG 
34. Structural Biology IG 
35. Toxicogenomics Interoperability IG 
* in review
Plenary 5 San Diego, California 
9 - 11 March 2015 
©2013 Pecoff Studios Inc
RDA Organisational Framework
Get involved! 
• Join RDA as an individual member supporting our principles at 
http://rd-alliance.org 
• Join as an Organisational Member (nominal fee) or an 
Organisational Affiliate (jointly sponsored efforts). 
• Initiate or join an Interest Group 
• Propose or join a Working Group 
• Attend the RDA Plenaries 
Coming together is a beginning; 
keeping together is progress; 
working together is success. 
—Henry Ford
Summary 
• Infrastructure is created in phases with the final consolidation phase relying 
on gateways and bridges. 
• Diversity is a central problem, but only diversity absorbs diversity. 
• Networking and interconnection are the way to solve complex problems. 
• Need to be constantly, but lightly, managing tension between bottom-up 
chaos and stifling, top-down control. 
• We are in a more global and democratic world, but also a more local world. 
Coalition politics with new kinds of coalitions because there are new kinds of 
identity. 
• Data science needs to focus on relationships, connections, interfaces. 
• You must participate “glocally” to succeed. 
• Responding to change is more important than following a plan. 
• RDA provides mechanisms to address all of the above!
Info: 
enquiries@rd-alliance.org 
@resdatall