23 Research Data Things
Research Data Coordinator
Katina Toufexis
Data Sharing
• Research data may be shared in many ways.
• Getting Started looks at sharing data via access methods: Open, Shared
and Closed Data
Thing 5
https://vimeo.com/125783029
Open / Shared / Closed: The world of data
1. What is 'open data'?
2. Who benefits from open data?
3. Overcoming barriers to opening data
4. Making data open
5. Open data in Research Data Australia
What is 'open data'?
1. freely available to download in a reusable form. Large or complex data
may be accessible via a service or facility that enables access in-situ or
the compilation of sub-sets
2. licensed with minimal restrictions to reuse
3. well described with provenance and reuse information provided
4. available in convenient, modifiable and open formats
5. managed by the provider on an ongoing basis.
The Open Data Handbook provides an introduction to the legal, social and
technical aspects of open data. It discusses what open data is as well as why
and how to make data open.
Open Data Handbook
Who benefits from open data?
Everyone! According to the Royal Society, open data supports:
• new research and new types of research
• the application of automated knowledge discovery tools online
• the verification of previous results
• a broader base set of data than any one researcher can hope to collect
• the exploration of topics not envisioned by the initial investigators
• the creation of new data sets, information and knowledge when data from
multiple sources are combined
• the transfer of factual information to promote development and capacity
building in developing countries
• interdisciplinary, inter-sectoral, inter-institutional and international research.
Figure: the many ways open data benefits researchers, research organisations, funders, policy makers and the broader community.
Overcoming barriers to opening data
Someone might use my data to 'scoop' me
1. Timing?
You may choose to restrict access to your data until a key paper is published.
You decide the appropriate time for making your data open.
2. What is the real risk of ‘scooping’?
Little risk according to:
Nature, Professor Isaac Kohane, Harvard Medical School: "[we] need to
convince people that the likelihood of being scooped if they put their data out
there [is] not going to be high ... we need to do away with a culture of sitting on
data until we have mined every useful scientific grain out of it".
In a similar vein, some researchers report that any possible loss of future
potential papers is well offset by the more immediate rewards of data citations
and collaborative opportunities.
In fact, many researchers find that opening up their data has greatly benefited
their research.
Report - Professor Tim Gowers, Royal Society Research Professor, University
of Cambridge
• opened up his data to crowd-source an unsolved mathematical problem.
• 27 people made 800 substantive contributions to solve the problem in a
matter of days.
• Professor Gowers commented that this approach to research was "like driving
a car whilst normal research is like pushing it".
My data are sensitive due to cultural, ethical, ecological or security
considerations
There are circumstances where it may not be appropriate to make data open.
e.g.
• where individuals may be identified;
• threatened species located; or
• information affecting national security revealed.
However, there may be ways to make sensitive data at least
partially open.
This comprehensive 26-page Publishing and Sharing
Sensitive Data - ANDS Guide (PDF, 0.73 MB) outlines
best practice for the publication and sharing of sensitive
research data in the Australian context. It should be read
in conjunction with the ANDS Introduction to Sensitive
Data.
ANDS Publishing and Sharing Sensitive Data decision tree:
http://www.ands.org.au/__data/assets/pdf_file/0010/385309/sensitive-decision-tree.pdf
I won't get any recognition or reward for making my data open
Tools such as the Thomson Reuters Data Citation Index enable citation metrics to
be captured for data.
There are contractual or commercial interests associated with my data
• Research data may underpin a commercialisation opportunity such as a
patent.
• Or it may be that, contractually, IP arising from a project is owned by a third
party.
• In other cases, though, data is not shared because of the uncertainty arising
from data not being explicitly addressed in contracts and project plans.
Making Data Open
Open data is ... / Which ideally means ... / So preferably not ...

1. Freely available to download
Which ideally means:
a) There is no cost to access the data;
b) Access is via an internet accessible download;
c) Data is in a form that can be readily downloaded. Large or complex data is located close to high performance computing or specialised services that enable access to the data in situ or the compilation of sub-sets.
So preferably not:
a) Costed at more than reproduction cost;
b) Burned to a DVD and posted via 'snail mail';
c) Only available in huge packages that are difficult to reuse and/or take days to download.

2. Licensed
Which ideally means: An open license such as CC-BY is applied.
So preferably not: A restrictive license, or worse, no license at all. If no license is applied, no reuse is permitted.

3. Well described
Which ideally means: Standards based metadata is used with details of data elements and inclusion of data dictionaries. Describe the purpose of the collection, the characteristics of the sample and the method of data collection.
So preferably not: Metadata descriptions that are very brief or will not be widely understood. Avoid jargon and abbreviations and don't assume prior knowledge of the data or subject domain.

4. Provided in an open format
Which ideally means: The data is in a convenient, modifiable and open format that can be readily retrieved, downloaded, indexed and searched. Where possible, formats should be machine-readable and non-proprietary formats are preferred. For example, prefer netCDF over .xls.
So preferably not: Obscure formats or formats that require proprietary software to open and reuse.

5. Well managed
Which ideally means: The data is managed on an ongoing basis with a point of contact designated to assist with data use.
So preferably not: Data that is loaded on to a server and forgotten.
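The five criteria above can be read as a simple checklist. The following is only a rough sketch in Python of how such a check might look. The record layout, the field names, and the short lists of "open" licenses and formats are assumptions invented for illustration (only CC-BY and netCDF are named on the slide), so treat it as a starting point rather than an agreed standard.

```python
from dataclasses import dataclass

# Hypothetical record layout, invented for this illustration only.
@dataclass
class DatasetRecord:
    cost_to_access: float      # 0 means there is no cost to access the data
    download_url: str          # empty string means no internet download
    license: str               # empty string means no license applied
    has_data_dictionary: bool  # stands in for "well described"
    file_format: str           # e.g. "netCDF", "csv", "xls"
    contact_email: str         # designated point of contact for data use

# Assumed example lists; CC-BY and netCDF come from the slide,
# CC0 and csv are added here as illustrative assumptions.
OPEN_LICENSES = {"CC-BY", "CC0"}
OPEN_FORMATS = {"netcdf", "csv"}

def openness_gaps(record):
    """Return the names of the open-data criteria the record does not meet."""
    gaps = []
    if record.cost_to_access > 0 or not record.download_url:
        gaps.append("freely available to download")
    if record.license not in OPEN_LICENSES:
        gaps.append("licensed")
    if not record.has_data_dictionary:
        gaps.append("well described")
    if record.file_format.lower() not in OPEN_FORMATS:
        gaps.append("provided in an open format")
    if not record.contact_email:
        gaps.append("well managed")
    return gaps

good = DatasetRecord(0.0, "https://example.org/data.nc", "CC-BY", True,
                     "netCDF", "data@example.org")
forgotten = DatasetRecord(0.0, "", "", False, "xls", "")

print(openness_gaps(good))       # []
print(openness_gaps(forgotten))  # all five criteria are listed
```

A real assessment would need human judgement (for example, whether metadata is genuinely understandable), so a script like this can only flag the obvious gaps.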
Open data in Research Data Australia
1. New interface highlights the openness of data
2. Licenses can be applied
Wiley Survey
http://www.acscinf.org/PDF/Giffi-%20Researcher%20Data%20Insights%20--%20Infographic%20FINAL%20REVISED.pdf
Long-lived data: curation & preservation
US Library of Congress (LoC): https://youtu.be/qEmmeFFafUs
Thing 6
What key advice would you give someone (e.g. a family historian, a researcher,
yourself) about preserving their born-digital objects?
Hint: look for ideas on the Library of Congress Digital Preservation website.
Video - http://www.clir.org/initiatives-partnerships/data-curation
Sayeed Choudhury, Associate Dean for Research Data Management at Johns
Hopkins University, talks about the Stack Model for data management (a long
video, summarised here):
• Storage: disk, tape, cloud etc.
• Archiving: identifiers for sharing and references
• Preservation: policy, metadata, long-term reuse
• Curation: adding value to data for reuse
Data citation for access & attribution
• Data citation continues the tradition of acknowledging other people’s work
and ideas.
• Along with books, journals and other scholarly works, it is now possible to
formally cite research datasets and even the software that was used to
create or analyse the data.
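To make the idea of formally citing a dataset concrete, here is a minimal sketch in Python that assembles a citation string from a few descriptive fields. The element order (creator, year, title, publisher, identifier) is an assumption modelled loosely on DataCite-style citations, and every value in the example is a made-up placeholder. Check your repository's or publisher's guidance for the exact form they expect.

```python
def format_data_citation(creators, year, title, publisher, identifier):
    """Build a citation of the form 'Creators (Year): Title. Publisher. Identifier'."""
    names = "; ".join(creators)
    return f"{names} ({year}): {title}. {publisher}. {identifier}"

# All values below are hypothetical placeholders, not a real dataset.
citation = format_data_citation(
    creators=["Researcher, A.", "Colleague, B."],
    year=2016,
    title="Example survey dataset",
    publisher="Example University",
    identifier="https://doi.org/10.xxxx/example",
)
print(citation)
```

Keeping the identifier as the last element means a reader can follow the citation straight to the data.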
Thing 7
https://researchdata.ands.org.au/monthly-drought-australia-drought-index/61872
http://ands.org.au/working-with-data/citation-and-identifiers/data-citation
Force11 Joint Declaration of Data Citation Principles
• a set of principles for citing data.
• based on the premise that data citation, like the citation of other evidence
and sources, is good research practice and is part of the scholarly
ecosystem supporting data reuse.
Since they were published 2 years ago, the Principles have been endorsed by
numerous individuals and more than 100 data centres, publishers and societies.
Force11 is endorsed by…
https://www.force11.org/datacitation/endorsements
Given such support and clear direction,
why do you think data citation has not been uniformly adopted, so far, across all
disciplines?
Citation Metrics for Data
Thing 8
What are Digital Object Identifiers (DOIs)
and how do they support data citation and
metrics for data and related research
objects?
DOIs:
• are unique identifiers
• provide persistent access to published articles, datasets, software versions
and a range of other research inputs and outputs
• number over 120 million in use
• were “resolved” (clicked on) over 5 billion times last year
• typically look like this:
http://doi.org/10.4225/08/50F62E0D359D5
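The structure of a DOI can be illustrated with a short Python sketch. A DOI has a prefix beginning with "10." that identifies the registrant, followed by a slash and a suffix chosen by that registrant. The helper names `split_doi` and `resolver_url` are made up for this example, and the validation is deliberately minimal.

```python
from urllib.parse import urlparse

def split_doi(doi_or_url):
    """Return (prefix, suffix) for a DOI given either bare or as a resolver URL."""
    text = doi_or_url.strip()
    if "://" in text:
        # Drop the resolver host and keep only the DOI carried in the path.
        text = urlparse(text).path.lstrip("/")
    prefix, _, suffix = text.partition("/")
    if not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi_or_url!r}")
    return prefix, suffix

def resolver_url(doi):
    """Build a resolvable link by prepending the doi.org resolver address."""
    return "https://doi.org/" + doi

prefix, suffix = split_doi("http://doi.org/10.4225/08/50F62E0D359D5")
print(prefix)  # 10.4225
print(suffix)  # 08/50F62E0D359D5
print(resolver_url(prefix + "/" + suffix))
```

Because the identifier stays the same even if the dataset moves, citations should use the DOI link rather than the landing page's current web address.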
Google “The compendium of crop Proteins with Annotated Locations (cropPAL) version 1”
Remember in Thing 7…
http://ands.org.au/working-with-data/citation-and-identifiers/data-citation
Here’s a controversial question to discuss:
Should DOIs be routinely applied to all research outputs?
Remember that DOIs carry an expectation of persistence
(maintenance costs etc.) but can be used to collect
metrics as well as link articles and data (evidence of impact).
• Alternative metrics or altmetrics count the number of views, number of
downloads, social media "likes" and recommendations associated with a
dataset.
• Because of their immediacy, altmetrics can be an early indicator of the
impact or reach of a dataset, long before formal citation metrics can be
assessed.
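As a rough illustration of what "counting" means here, the Python sketch below tallies a list of usage events for a single dataset. The event names and the event log itself are hypothetical. Real altmetrics services gather these signals from many platforms and apply their own rules.

```python
from collections import Counter

# Hypothetical event log for one dataset; not real usage data.
events = ["view", "view", "download", "like", "view", "recommendation", "download"]

def tally_altmetrics(event_log):
    """Count how many times each kind of event appears in the log."""
    return Counter(event_log)

counts = tally_altmetrics(events)
print(counts["view"])      # 3
print(counts["download"])  # 2
```

Counts like these accumulate from the day a dataset is published, which is why they are available well before formal citations appear.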
Start by looking at the altmetrics for this Phylogenomics article published in Science:
http://classic.sciencemag.org/articleusage?gca=sci%3B346/6210/763
Look also at the associated data in Dryad, noting that the data has been assigned a DOI.
By way of comparison, as of early April 2016:
• the same dataset had been cited once in Thomson Reuters Data Citation Index
• the article had been cited 143 times in Web of Science
Share your thoughts
Do you think altmetrics for data have value in academic settings? Why?
Contacts
Contact UWA 23 Things Coordinators:
Caroline Clark
caroline.clark@uwa.edu.au
Nola Steiner
nola.steiner@uwa.edu.au
Katina Toufexis
katina.toufexis@uwa.edu.au

20160523 23 Research Data Things

  • 1. 23 Research Data Things Research Data Coordinator Katina Toufexis
  • 2. Data Sharing • Research data may be shared in many ways. • Getting Started looks at sharing data via access methods: Open, Shared and Closed Data Thing 5
  • 4. Data Sharing Thing 5 1. What is 'open data'? 2. Who benefits from open data? 3. Overcoming barriers to opening data 4. Making data open 5. Open data in Research Data Australia
  • 5. Data Sharing Thing 5 1. What is 'open data'? 2. Who benefits from open data? 3. Overcoming barriers to opening data 4. Making data open 5. Open data in Research Data Australia
  • 6. Thing 5 What is 'open data'? 1. freely available to download in a reusable form. Large or complex data may be accessible via a service or facility that enables access in-situ or the compilation of sub-sets 2. licensed with minimal restrictions to reuse 3. well described with provenance and reuse information provided 4. available in convenient, modifiable and open formats 5. managed by the provider on an ongoing basis. The Open Data Handbook provides an introduction to the legal, social and technical aspects of open data. It discusses what open data is as well as why and how to make data open.
  • 7. Thing 5 Open Data Handbook
  • 8. Data Sharing Thing 5 1. What is 'open data'? 2. Who benefits from open data? 3. Overcoming barriers to opening data 4. Making data open 5. Open data in Research Data Australia
  • 9. Thing 5 Who benefits from open data? Everyone! According to the Royal Society, open data supports: • new research and new types of research • the application of automated knowledge discovery tools online • the verification of previous results • a broader base set of data than any one researcher can hope to collect • the exploration of topics not envisioned by the initial investigators • the creation of new data sets, information and knowledge when data from multiple sources are combined • the transfer of factual information to promote development and capacity building in developing countries • interdisciplinary, inter-sectoral, inter-institutional and international research.
  • 10. Thing 5 Who benefits from open data? The many ways open data benefits researchers, research organisations, funders, policy makers and the broader community:
  • 11. Data Sharing Thing 5 1. What is 'open data'? 2. Who benefits from open data? 3. Overcoming barriers to opening data 4. Making data open 5. Open data in Research Data Australia
  • 12. Thing 5 Overcoming barriers to opening data Someone might use my data to 'scoop' me
  • 13. Thing 5 Overcoming barriers to opening data Someone might use my data to 'scoop' me 1. Timing? You may choose to restrict access to your data until a key paper is published. You decide the appropriate time for making your data open.
  • 14. Thing 5 Overcoming barriers to opening data: Someone might use my data to 'scoop' me 2. What is the real risk of ‘scooping’? Little risk according to: Nature, Professor Issac Kohone, Harvard Medical School: "[we] need to convince people that the likelihood of being scooped if they put their data out there [is] not going to be high ... we need to do away with a culture of sitting on data until we have mined every useful scientific grain out of it". In a similar vein, some researchers report that any possible loss of future potential papers is well offset by the more immediate rewards of data citations and collaborative opportunities.
  • 15. Thing 5 Overcoming barriers to opening data: Someone might use my data to 'scoop' me 3. What is the real risk of ‘scooping’? In fact, many researchers find that opening up their data has greatly benefited their research. Report - Professor Tim Gowers, Royal Society Research Professor, University of Cambridge • opened up his data to crowd-source an unsolved mathematical problem. • 27 people made 800 substantive contributions to solve the problem in a matter of days. • Professor Gowers commented that this approach to research was "like driving a car whilst normal research is like pushing it".
  • 16. Thing 5 Overcoming barriers to opening data: My data are sensitive due to cultural, ethical, ecological or security considerations There are circumstances where it may not be appropriate to make data open. e.g. • where individuals may be identified; • threatened species located; or • information affecting national security revealed.
  • 17. Thing 5 Overcoming barriers to opening data: My data are sensitive due to cultural, ethical, ecological or security considerations However, there may be ways to make sensitive data at least partially open. This comprehensive 26 page Publishing and Sharing Sensitive Data - ANDS Guide (PDF, 0.73 MB) outlines best practice for the publication and sharing of sensitive research data in the Australian context. It should be read in conjunction with the ANDS Introduction to Sensitive Data.
  • 18. Thing 5 Overcoming barriers to opening data: My data are sensitive due to cultural, ethical, ecological or security considerations http://www.ands.org.au/__data/assets/pdf_file/0010/385309/sensitive- decision-tree.pdf ANDS Publishing and Sharing Sensitive Data DECISION TREE
  • 20. Thing 5 Overcoming barriers to opening data: I won't get any recognition or reward for making my data open Tools such as Thomson Reuters Data Citation Index, enable citation metrics to be captured for data
  • 21. Thing 5 Overcoming barriers to opening data: There are contractual or commercial interests associated with my data • research data may underpin a commercialisation opportunity such as a patent. • Or it may be that contractually, IP arising from a project is owned by a third party. • In others cases though, data is not shared because of the uncertainty arising from data not being explicitly addressed in contracts and project plans.
  • 22. Data Sharing Thing 5 1. What is 'open data'? 2. Who benefits from open data? 3. Overcoming barriers to opening data 4. Making data open 5. Open data in Research Data Australia
  • 23. Thing 5 Making Data Open Open data is Which ideally means ... So preferably not ... Freely available to download a) There is no cost to access the data; b) Access is via an internet accessible download; c) Data is in a form that can be readily downloaded. Large or complex data is located close to high performance computing or specialised services that enable access to the data in situ or the compilation of sub-sets. a) Costed at more than reproduction cost; b) Burned to a DVD and posted via 'snail mail'; c) Only available in huge packages that are difficult to reuse and/or take days to download. Licensed An open license such as CC-BY is applied. A restrictive license, or worse, no license at all. If no license is applied, no reuse is permitted. Well described Standards based metadata is used with details of data elements and inclusion of data dictionaries. Describe the purpose of the collection, the characteristics of the sample and the method of data collection. Metadata descriptions that are very brief or will not be widely understood. Avoid jargon and abbreviations and don't assume prior knowledge of the data or subject domain. Provided in an open format The data is in a convenient, modifiable and open format that can be readily retrieved, downloaded, indexed and searched. Where possible, formats should be machine-readable and non-proprietary formats are preferred. For example, prefer netCDF over .xls. Obscure formats or formats that require proprietary software to open and reuse. Well managed The data is managed on an ongoing basis with a point of contact designated to assist with data use. Data that is loaded on to a server and forgotten.
  • 24. Data Sharing Thing 5 1. What is 'open data'? 2. Who benefits from open data? 3. Overcoming barriers to opening data 4. Making data open 5. Open data in Research Data Australia
  • 25. Data Sharing Thing 5 Open data in Research Data Australia 1. New Interface highlights the openness of data 2. Licenses can be applied
  • 26. Data Sharing Wiley Survey Thing 5 http://www.acscinf.org/PDF/Giffi-%20Researcher%20Data%20Insights%20--%20Infographic%20FINAL%20REVISED.pdf
  • 27. Long-lived data: curation & preservation https://youtu.be/qEmme FFafUs US Library of Congress (LoC) Thing 6
  • 28. Long-lived data: curation & preservation Thing 6
  • 29. Long-lived data: curation & preservation Thing 6 What key advice would you give someone about preserving their born digital objects e.g. the family historian, a researcher, yourself? ….Hint: look for ideas on the Library of Congress Digital Preservation website.
  • 30. Long-lived data: curation & preservation Thing 6
  • 31. Long-lived data: curation & preservation Video - http://www.clir.org/initiatives-partnerships/data-curation Sayeed Choudhury, Associate Dean for Research Data Management at Johns Hopkins University (long video… to summarise)  Talks about the Stack Model for Data Mgt Thing 6 Storage •Disk, tape, cloud etc. Archiving •Identifiers for sharing and references Preservation •Policy, metadata, long-term reuse Curation •Adding value to data for reuse
  • 32. Data citation for access & attribution • Data citation continues the tradition of acknowledging other people’s work and ideas. • Along with books, journals and other scholarly works, it is now possible to formally cite research datasets and even the software that was used to create or analyse the data. Thing 7
  • 33. Data citation for access & attribution Thing 7
  • 34. Data citation for access & attribution https://resear chdata.ands. org.au/mont hly-drought- australia- drought- index/61872 Thing 7
  • 35. Data citation for access & attribution http://ands.org.au /working-with- data/citation-and- identifiers/data- citation Thing 7
  • 36. Thing 7 Data citation for access & attribution Force11 Joint Declaration of Data Citation Principles • a set of principles for citing data. • based on the premise that data citation, like the citation of other evidence and sources, is good research practice and is part of the scholarly ecosystem supporting data reuse. Since they were published 2 years ago, the Principles have been endorsed by numerous individuals and more than 100 data centres, publishers and societies.
  • 37. Thing 7 Data citation for access & attribution
  • 38. Thing 7 Data citation for access & attribution Force11 is endorsed by… https://www.force11.org/datacitation/endorsements
  • 39. Thing 7 Data citation for access & attribution Given such support and clear direction, why do you think data citation has not been uniformly adopted, so far, across all disciplines?
  • 40. Citation Metrics for Data Thing 8 What are Digital Object Identifiers (DOIs) and how do they support data citation and metrics for data and related research objects?
  • 41. Citation Metrics for Data Thing 8 DOIs: • are unique identifiers • provide persistent access to published articles, datasets, software versions and a range of other research inputs and outputs • number over 120 million in use • were “resolved” (clicked on) over 5 billion times last year • typically look like this: http://doi.org/10.4225/08/50F62E0D359D5
  • 42. Citation Metrics for Data Thing 8 Google “The compendium of crop Proteins with Annotated Locations (cropPAL) version 1”
  • 43. Data citation for access & attribution http://ands.org.au/working-with-data/citation-and-identifiers/data-citation Remember in Thing 7…
  • 44. Citation Metrics for Data Here’s a controversial question to discuss: Should DOIs be routinely applied to all research outputs? Remember that DOIs carry an expectation of persistence (maintenance costs etc.) but can be used to collect metrics as well as to link articles and data (evidence of impact). Thing 8
  • 45. Thing 8 Citation Metrics for Data • Alternative metrics or altmetrics count the number of views, number of downloads, social media "likes" and recommendations associated with a dataset. • Because of their immediacy, altmetrics can be an early indicator of the impact or reach of a dataset, long before formal citation metrics can be assessed.
  • 46. Thing 8 Citation Metrics for Data http://classic.sciencemag.org/articleusage?gca=sci%3B346/6210/763 Start by looking at the altmetrics for this Phylogenomics article published in Science.
  • 48. Thing 8 Citation Metrics for Data Look also at the associated data in Dryad noting that the data has been assigned a DOI.
  • 49. Thing 8 Citation Metrics for Data By way of comparison, as of early April 2016: • the same dataset had been cited once in Thomson Reuters Data Citation Index • the article had been cited 143 times in Web of Science Share your thoughts Do you think altmetrics for data have value in academic settings? Why?
  • 50. Contacts Contact UWA 23 Things Coordinators: Caroline Clark [email protected] Nola Steiner [email protected] Katina Toufexis [email protected]

Editor's Notes

  • #21: It could be argued that few tangible rewards currently exist for those who make their data open. However, things are starting to change. Tools such as the Thomson Reuters Data Citation Index (DCI) enable citation metrics to be captured for data, in much the same way as they are for publications. This provides the opportunity for data citation metrics to be included in project proposals, promotion cases and CVs. Records in the Data Citation Index are intended to: • provide attribution for a data object to the person(s) and institution(s) creating the data • provide a standard form of citation for each data object to encourage citation (the format of the data citation recommended by Thomson Reuters follows the DataCite guidelines) • track citations and reuse of data in the scientific literature and provide bidirectional links between research articles and the data they use or generate • provide a means to discover data associated with research publications. Broadly, in order to be accepted into the Data Citation Index the records in a data source: • must be able to provide the minimum metadata required to validate against the DCI schema. ANDS have developed a mapping from RIF-CS to the DCI schema and a guide to optimising records for DCI compliance. Elements needed to create a data citation must be present in the metadata. • should describe data objects held in repositories under the control of the ANDS partner or data provider. Records should not point to institutional web pages or replicate metadata descriptions for data held in other repositories, e.g. PANGAEA. More information (PDF, 0.5 MB). If your data source does contain such records, they can be "tagged" for exclusion from the harvest to DCI. • should describe data collections, datasets or repositories (see RIF-CS Type) • must meet the Thomson Reuters repository evaluation, selection and coverage policies.
Establishing a harvest from your Data Source to the Data Citation Index: The high-level workflow for including an RDA data source in the DCI harvest involves: 1. The RDA provider contacts their Outreach Officer or [email protected] to express interest in establishing a DCI harvest. 2. ANDS and the provider review and discuss record quality and transform as well as the proposed business processes and agree to proceed. 3. ANDS provides an initial harvest from the Data Source to DCI and advises Thomson Reuters of the nominated contact for the Data Source. 4. Thomson Reuters assess a sample of records in the DCI output against their criteria for inclusion as described above. They also check quality of content, compliance with the DCI metadata schema and the richness of the record as assessed against the content available in the source repository. 5. Thomson Reuters staff liaise directly with the nominated contact for the Data Source to discuss the metadata assessment and to create a Repository Record for the Data Source in DCI. This record provides the Repository Name in each DCI record. All collection records for the Data Source will be linked to this record in DCI. The screenshot below shows an example. 6. The production harvest from the Data Source to DCI is established. 7. Thomson Reuters provide a DCI admin log-in for use by the nominated Data Source contact. 8. Records are reharvested from RDA to DCI on a regular basis.
  • #22: Research data may underpin a commercialisation opportunity such as a patent. Or it may be that, contractually, IP arising from a project is owned by a third party. In other cases, though, data is not shared because of the uncertainty that arises when data is not explicitly addressed in contracts and project plans. Ideally, discussions around data ownership, ongoing management and access should start at the project proposal stage. Start from a position of "why not make the data open?" and consider how any perceived risks associated with making the data open can be addressed.
  • #24: Let's look at the 5 characteristics of open data in a little more detail.   It's worth noting that even if you can't meet all the criteria for 'open data' there are benefits in making data as open as possible.   Fewer barriers = more opportunities for data to be reused and cited.
  • #26: Open data in Research Data Australia: In December 2014 it became possible for collection descriptions in Research Data Australia (RDA) to include information that highlights the 'openness' of the data being described. Collection records can be encoded as being openly accessible and openly licensed and include a link to download the data or access the data via a service. See Figure 1 below. The new RDA interface released in April 2015 significantly raised the profile of data that has open characteristics. The interface provides strong visual indicators for data that is publicly accessible online and offers search and browse options that enable users to easily discover and access open data. See Figure 2. The goal is to maximise the reuse and citation of data described in RDA. Take advantage of the opportunities these enhancements offer by ensuring your records provide the relevant RIF-CS encoding. Also, be sure to apply an open licence where possible. A CC-BY licence is an open licence that also requires the data provider to be attributed when the data is reused.
  • #27: Take a look at this infographic from Wiley titled Research Data Sharing Insights [PDF, 2.08MB]. It provides a succinct overview of current data sharing practice and perceptions. Now look closely at the sections titled 'Global Data Sharing Trends' and 'Data Sharing By Discipline'.
  • #29: What key advice would you give someone about preserving their born-digital objects, e.g. the family historian, a researcher, yourself? Hint: look for ideas on the Library of Congress Digital Preservation website. Points to cover: long-term preservation; succession planning. ANDS suggests looking at the Library of Congress (LoC) website, which gives information on how to preserve your digital materials.
  • #30: What key advice would you give someone about preserving their born-digital objects, e.g. the family historian, a researcher, yourself? Hint: look for ideas on the Library of Congress Digital Preservation website. Points to cover: long-term preservation; succession planning. ANDS suggests looking at the Library of Congress (LoC) website, which gives information on how to preserve your digital materials.
  • #31: What key advice would you give someone about preserving their born-digital objects, e.g. the family historian, a researcher, yourself? Hint: look for ideas on the Library of Congress Digital Preservation website. Points to cover: long-term preservation; succession planning.
  • #32: These terms (storage, archiving, preservation and curation) are often used interchangeably, which creates ambiguity. Storage is necessary but not sufficient for archiving. Each layer is required for the next.
  • #34: Start by looking back to the Weddell Seal dataset we explored in Thing 4. Check out how many times it has been cited. This citation count is measured by the Thomson Reuters Data Citation Index. Click on the ‘Cite’ button to see the similarities between the formats for citation of data and other scholarly publications. At eResearch we can mint a DOI for a dataset once it has been added to RDO.
  • #35: Now look at the Hutchinson Drought Index data record in Research Data Australia. This research data makes cross disciplinary connections between episodes of drought and correlated increases in rural mental health issues. The beauty of this record is that it shows the entirety of the research outputs - publications, software, related datasets and more - all of which are citable.
  • #36: ANDS have a good introduction to data citation on their website. At eResearch we can mint a DOI for a dataset once it has been added to RDO.
  • #40: Awareness and support are not the same thing. Endorsing these principles is great, but if you don't actively do anything at your institution to promote and support data citation practices, then what is the point? I think there is a growing awareness of data citation principles, but actual practice is lagging behind because it will take a lot of time and effort to change the traditional research culture of keeping your data hidden away so no one else can scoop your work. Libraries are well placed to work with researchers to try to change this culture by demonstrating the benefits of preserving, sharing, and citing data.
  • #43: Demonstration steps: Google the dataset title. Go to B2FIND. Resolve the DOI. Go back to the search results. Go to RDA. Show the DOI at the bottom of the record. Show the DOI in the “Cite This” pop-up, as it appears in a structured citation.
  • #44: ANDS have a good introduction to data citation on their website. At eResearch we can mint a DOI for a dataset once it has been added to RDO.
  • #47: Note the number and pattern of downloads for this article since it was published in November 2014. Now click on the “donut” or the link to ‘More Details’ to see the wealth of information available.
  • #49: Look also at the associated data in Dryad noting that the data has been assigned a DOI.   Can you see how many times the data has been downloaded and the record viewed (scroll down to the bottom of the record)?
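Several notes above refer to a "structured citation" for data (the ‘Cite’ button in note #34, the “Cite This” pop-up in note #43) and to the DataCite guidelines (note #21). The sketch below shows how such a citation string might be assembled from metadata elements. It is loosely modelled on the commonly quoted DataCite pattern of Creator (PublicationYear): Title. Version. Publisher. Identifier, and is an assumption rather than a statement of what RDA or the Data Citation Index actually produce. The function name and all creator, title and publisher values are hypothetical placeholders. Only the DOI is taken from slide 41, and it is used here purely as an example identifier.

```python
from typing import Optional, Sequence


def format_data_citation(
    creators: Sequence[str],
    year: int,
    title: str,
    publisher: str,
    doi: str,
    version: Optional[str] = None,
) -> str:
    """Assemble a DataCite-style citation string from metadata elements."""
    names = "; ".join(creators)
    elements = [f"{names} ({year}): {title}"]
    if version:
        # Version is optional; include it only when the record supplies one.
        elements.append(f"Version {version}")
    elements.append(publisher)
    # The DOI is shown as a resolvable link so readers can reach the data.
    return ". ".join(elements) + f". https://doi.org/{doi}"


if __name__ == "__main__":
    # Placeholder metadata, not a real dataset record.
    print(
        format_data_citation(
            creators=["Example, A.", "Sample, B."],
            year=2016,
            title="Placeholder dataset title",
            publisher="Example Repository",
            doi="10.4225/08/50F62E0D359D5",
        )
    )
```

If any of these elements is missing from a record, a citation cannot be generated, which is why note #21 stresses that the elements needed to create a data citation must be present in the metadata.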