Edinburgh DataShare: Tackling research data in a DSpace institutional repository
Robin Rice, EDINA and Data Library, Information Services, University of Edinburgh, Scotland
DSpace User Group Meeting, Gothenburg, Sweden, 14-16 October 2009
Storyboard
- About EDINA & Data Library at UoE
- About the DISC-UK DataShare project
- What’s different about data?
- Enter the Data Audit Framework
- Lessons learned from both projects
EDINA is the JISC national academic data centre based at the University of Edinburgh. Our mission and purpose is to ‘enhance the productivity of research, learning and teaching’ across all universities, research institutes and colleges in the UK. We do this by delivering first-rate online services, by working with support staff in universities and colleges and with other partners, and by carrying out successful R&D projects.
Data Library: History
- Established out of the Program Library Unit in the early 1980s to provide access to data on mainframes, e.g. 1981 population census data.
- Part of a long tradition of sharing machine-readable data for secondary analysis in the social sciences.
- Formed the EDINA national data centre in 1996; the Data Library continues its University remit.
- Celebrated its 25th anniversary in 2008.
What is a data library?
A data library refers to both the content and the services that foster use of collections of numeric, audio-visual, textual or geospatial data sets for secondary use in research. A data library is normally part of a larger institution (academic, corporate, scientific, medical, governmental, etc.) established to serve the data users of that organisation. The data library tends to house local data collections and provides access to them through various means.
Edinburgh Data Library services … distilled
- Finding: “I need to analyse some data for a project, but all I can find are published papers with tables and graphs, not the original data source.”
- Accessing: “I’ve found the data I need, but I’m not sure how to gain access to it.”
- Using: “I’ve got the data I need, but I’m having problems analysing it in my chosen software.”
- Managing: “I have collected my own data and I’d like to document and preserve it and make it available to others.”
A forum for data professionals working in UK Higher Education who specialise in supporting staff and students in the use of numeric and geo-spatial data.
DISC-UK’s aims are:
- Foster understanding between data users and providers
- Raise awareness of the value of data support in Universities
- Share information and resources among local data support staff
We are also members of IASSIST, an international organisation of professionals working in and with information technology and data services to support research and teaching in the social sciences.
DISC-UK has completed a JISC-funded repository enhancement project (March 07 - March 09) with the aim of “exploring new pathways to assist academics wishing to share their data over the Internet”.  With three institutions taking part – the Universities of Edinburgh, Oxford and Southampton – a range of institutional data repositories and related services have been established. The project was led by the JISC-funded national data centre, EDINA, at the University of Edinburgh, which also runs the University’s Data Library service.
DISC-UK DataShare Partners’ Repository Experience (March, 2007)
Project Keywords: “Live” tag cloud at http://www.disc-uk.org/collective.html, based on social bookmarks
Project Briefing Papers
- Gibbs, H. (2007). DISC-UK DataShare: State-of-the-Art Review
- Martinez, L. (2008). The Data Documentation Initiative (DDI) and Institutional Repositories
- Macdonald, S. (2008). Data Visualisation Tools: Part 1 - Numeric Data in a Web 2.0 Environment; Part 2 - Spatial Data in a Web 2.0 Environment and Beyond
- Green, A., et al. (2009). Policy-making for Research Data in Repositories: A Guide
What’s different about data?
- Research data are collected, not authored. Data may be shared, but are they published? In a data repository, is the repository the publisher?
- There are no explicit rewards for sharing data.
- Size, type, complexity, update frequency: DSpace is an improvement on informal sharing methods. Other solutions may work better for intensive data curation (see Data Sharing Continuum).
- Who ‘owns’ the data? Who is the rights-holder (individual/dept/institution/funder/subjects/nobody)? But minimal IPR exist in data. Issues about licensing.
- Is Dublin Core sufficient? Edinburgh DataShare has set up a Dublin Core metadata schema for datasets.
Edinburgh DataShare: Dublin Core-compliant metadata fields
- Depositor (contributor)
- Data Creator
- Title
- Alternative Title
- Dataset Description (abstract)
- Type
- Subject Classification (JACS)
- Subject Keywords
- Funder (contributor)
- Data Publisher
- Spatial Coverage
- Time Period (temporal coverage)
- Language
- Source
- Dataset Description (TOC)
- Relation (Is Version Of)
- Supersedes
- Relation (Is Referenced By)
- Rights
- Date Accessioned
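
As a rough illustration of how the field labels above could sit on top of Dublin Core, the sketch below maps each label to a qualified Dublin Core element name and converts a deposit record accordingly. The element names (for example `dc.coverage.spatial`) are assumptions based on common qualified Dublin Core usage, not the actual Edinburgh DataShare schema configuration, and `to_dublin_core` is a hypothetical helper.

```python
# Hypothetical mapping from the field labels on the slide to qualified
# Dublin Core element names. The right-hand names are assumptions.
FIELD_TO_DC = {
    "Depositor": "dc.contributor",
    "Data Creator": "dc.creator",
    "Title": "dc.title",
    "Alternative Title": "dc.title.alternative",
    "Dataset Description (abstract)": "dc.description.abstract",
    "Type": "dc.type",
    "Subject Classification (JACS)": "dc.subject.classification",
    "Subject Keywords": "dc.subject",
    "Funder": "dc.contributor.other",
    "Data Publisher": "dc.publisher",
    "Spatial Coverage": "dc.coverage.spatial",
    "Time Period": "dc.coverage.temporal",
    "Language": "dc.language",
    "Source": "dc.source",
    "Dataset Description (TOC)": "dc.description.tableofcontents",
    "Relation (Is Version Of)": "dc.relation.isversionof",
    "Supersedes": "dc.relation.replaces",
    "Relation (Is Referenced By)": "dc.relation.isreferencedby",
    "Rights": "dc.rights",
    "Date Accessioned": "dc.date.accessioned",
}


def to_dublin_core(record):
    """Translate a {label: value} deposit record into {dc_element: [values]}."""
    out = {}
    for label, value in record.items():
        element = FIELD_TO_DC[label]  # a KeyError flags an unknown field label
        values = value if isinstance(value, list) else [value]
        out.setdefault(element, []).extend(values)
    return out
```

Keeping the mapping in one table means a repeatable field such as Subject Keywords can carry a list of values while single-valued fields stay simple strings on input.
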
Tech development: Edinburgh DataShare
- Upgrade to DSpace ver 1.5.1 with new theme aligned with University corporate style
- Registration streamlined using the University’s single sign-on
- Embargo option, coded to restrict full data download while keeping metadata open
- Open Data Commons license option (PDDL); else Rights field mandatory
- Date range enabled to allow Time Period (dc:coverage)
- Dynamically queries Geonames, a community-generated database, to find matching places & ensure consistency in metadata entry for the Spatial Coverage field
- Extension to DSpace to record bitstream downloads in usage statistics
- Anti-virus checking upon upload
- Download All option (zip file of all item components)
- Citation field automatically generated based on specified metadata values
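
The automatically generated citation field could be assembled along the following lines. This is only a minimal sketch: the function name `build_citation`, the citation layout and the sample values are assumptions for illustration, not the code or format used in Edinburgh DataShare.

```python
def build_citation(creators, year, title, publisher, identifier=None):
    """Assemble a dataset citation string from metadata values.

    Assumed layout: Creator(s). (Year). Title [dataset]. Publisher. Identifier
    """
    if not creators:
        raise ValueError("at least one data creator is required")
    # Join multiple creators and avoid a doubled full stop after initials.
    names = "; ".join(creators).rstrip(".")
    parts = [f"{names}. ({year}). {title} [dataset]. {publisher}."]
    if identifier:
        parts.append(identifier)  # e.g. a persistent identifier, if one exists
    return " ".join(parts)
```

Generating the citation from the same metadata values the depositor has already entered keeps the two from drifting apart, which is presumably the point of automating the field.
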
Partnerships in the Data & Research Lifecycle (Ann Green, Digital Lifecycle Computing)
- Data creation, collection, repurposing: Partnerships between researchers & support services with subject expertise; informed by domain standards and guidelines relating to formats, metadata, version control, etc.
- Data processing, management and curation: Data are transformed, cleaned, derived as part of the research process; curators identify ‘partnering moments’ to capture content for documentation and description. Staging repositories offer curatorial workspaces.
- Data sharing and distribution: Repositories ingest and manage research outputs; offer federated searching, redundant storage, access controls; scholarly publications linked to data.
- Data preservation, dissemination & long term stewardship: Repositories and data archives provide preservation services such as format migration and media refreshment; a dataset may survive a period of dis-interest before being re-discovered.
[Figure labels: Discovery and Planning; Data Analysis; Publication and Sharing; Long term access; Repositories; Curation services; Researchers; Partnerships]
Enter Data Audit Framework
Recommendation to JISC: “JISC should develop a Data Audit Framework to enable all universities and colleges to carry out an audit of departmental data collections, awareness, policies and practice for data curation and preservation.” Liz Lyon (2007). Dealing with Data: Roles, Rights, Responsibilities and Relationships
Data Audit Framework (DAF) Projects 2008
JISC funded five six-month projects:
- DAF Development (DAFD) Project, led by Seamus Ross (Director) and Sarah Jones (Project Manager), HATII/DCC, University of Glasgow
- Four pilot implementation projects: King’s College London; University of Edinburgh; University College London; Imperial College London
Two more were conducted by DataShare partners, the Universities of Oxford and Southampton, as added deliverables.
See www.data-audit.eu
- DAF project reports available (findings)
- Appendices with questionnaires, interview schedules, etc.
- Methodology document
- Online tool ready for others to conduct data audits
Methodology
Based on Records Management Audit methodology. Five stages:
1. Planning the audit
2. Identifying data assets
3. Classifying and appraising data assets
4. Assessing the management of data assets
5. Reporting findings and recommending change
Lessons Learned Overall (1)
- Top-down drivers are important for overcoming barriers to data sharing (e.g. funders’ requirements for data mgmt and sharing plans), as they are for open access publishing.
- Data management motivation is a better bottom-up driver for researchers than data sharing but is not sufficient to create culture change.
- Institutional repositories can play a part in overall infrastructure for data sharing (see Data Sharing Continuum).
- Data librarians, data managers and data scientists can help bridge communication between repository managers & researchers (see Data Skills/Career study, Swan & Brown 2008).
Swan & Brown 2008 …
The report calls for a ‘repositioning’ of the role of the library in data-intensive research. The authors of the report, Alma Swan and Sheridan Brown, write: ‘We see three main potential roles for the library... Increasing data-awareness amongst researchers; providing archiving and data preservation services through institutional repositories; and developing a new professional strand of practice in the form of data librarianship.’
Lessons Learned Overall (2)
- Institutions should consider developing research data policy, to clarify rights & responsibilities.
- Institutions create a broad range of data in the course of research, not just numeric datasets. So for institutional data repositories, the self-archiving model is probably the best for ensuring data quality. (Repository is a host, not a publisher.) Nevertheless, researchers need guidance.
- IRs can improve impact of sharing data over the internet (permanent identifiers, citations, links with publications, discoverable metadata, long-term access and stewardship).
- Don’t conduct institutional data audits unless you’re prepared to open a can of data management worms!
Finally
- And don’t go it alone. Get buy-in from other institutional stake-holders (computing staff, librarians, department heads, principal investigators, records managers, archivists, research office staff).
- Collaborate.
- Have fun.
http://datashare.edina.ac.uk/dspace
http://www.disc-uk.org/datashare.html

Editor's Notes

  • #16: Title: Partnerships in the data and research life cycle. Research life cycle: blue arrows. Data life cycle: yellow boxes.