Introduction to Data Management
Cunera Buys
RRF 2015
https://www.flickr.com/photos/hellocatfood/7957989238/ (CC BY-NC-SA 2.0)
• Background on data management
• Why data management is important
• Intro to best practices for data management
• Library resources for data management.
Photo by Carl Vogtmann
Data Snafu
Data Sharing and Management Snafu in 3 Short Acts
https://www.youtube.com/watch?v=N2zK3sAtr-4
What are data?
://www.flickr.com/photos/rh2ox/9990024683/ (CC BY-SA 2.0)
Data: Some Definitions
Digital Curation Centre (UK): “Data, any information in binary digital form, is at
the centre of the Curation Lifecycle.”
Office of Management and Budget: “Research data means the recorded factual
material commonly accepted in the scientific community as necessary to
validate research findings”
The Oxford English Dictionary (OED) defines “data” as:
Related items of (chiefly numerical) information considered collectively,
typically obtained by scientific work and used for reference, analysis, or
calculation.
Data can be both analogue and digital materials.
Data in the Sciences and Humanities
BICEP2 (South Pole telescope) Performativity, Place, Space
Burgess and Hamming, 2011; BICEP2 Collaboration, 2014
Every discipline has data!
Types of data include:
• observational data
• laboratory experimental data
• computer simulation
• textual analysis
• physical artifacts or relics
Examples of data include:
• Audio and video files
• Code or scripts
• Digital text
• Lab notebooks
• Geospatial images
• Instrumental data
• Photographs
• Rock samples
• Survey results
• Scanned documents
• Spreadsheets
• Video games
https://www.flickr.com/photos/23165290@N00/9338136777/ (CC BY-SA 2.0)
Why do funders and the broader science
community want to share and preserve
data?
https://www.flickr.com/photos/joyvanb/11111295964/ (CC BY-NC-ND 2.0)
Brief History of Data Sharing Requirements
• February 26, 2003 - NIH requires a Data Sharing Policy for projects
above $500K.
• January 18, 2011 - NSF requires Data Management Plans (DMPs) to
be submitted with all new grant proposals.
• February 22, 2013 - Memo issued by White House Office of Science
and Technology Policy (OSTP).
http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf
Responding Agencies to OSTP Memo
Agency for Healthcare Research and Quality (AHRQ)
HHS Office of the Assistant Secretary for Preparedness and Response (ASPR)
Centers for Disease Control and Prevention (CDC)
Department of Commerce (DOC)
Department of Defense (DOD)
Department of Energy (DOE)
Department of the Interior (DOI)
Department of Health and Human Services (HHS)
Department of Homeland Security (DHS)
Department of Transportation (DOT)
Department of Education (ED)
Environmental Protection Agency (EPA)
Food and Drug Administration (FDA)
National Aeronautics and Space Administration (NASA)
National Institutes of Health (NIH)
National Institute of Standards and Technology (NIST)
National Oceanic and Atmospheric Administration (NOAA)
National Science Foundation (NSF)
Office of the Director of National Intelligence (ODNI)
Smithsonian Institution (SI)
United States Agency for International Development (USAID)
United States Department of Agriculture (USDA)
United States Department of Veterans Affairs (VA)
Journal Requirements
PLOS journals require authors to make all data underlying the findings
described in their manuscript fully available without restriction, with rare
exception.
Prevent Data Loss
Reproducibility
Recognition
Chapter II.C.2.f(i)(c), Biographical Sketch(es), has been revised to rename the
“Publications” section to “Products” and amend terminology and instructions
accordingly. This change makes clear that products may include, but are not
limited to, publications, data sets, software, patents, and copyrights.
Benefits of Sharing Data
• Clearly documents and provides evidence for research in conjunction with
published results.
• Meets copyright and ethical compliance requirements (e.g., HIPAA).
• Increases the impact of research through data citation.
• Preserves data for long-term access and prevents loss of data.
• Describes and shares data with others to further new discoveries and research.
• Prevents duplication of research.
• Accelerates the pace of research.
• Promotes reproducibility of research.
Data reuse success story # 1
Data reuse success story # 2
• Background on data management
• Why data management is important
• Intro to best practices for data management
• Library resources for data management.
Photo by Carl Vogtmann
Data Management
• Managing data effectively across the data lifecycle is critical for the
success of a research project
– Make a data management plan
• Data management refers to all aspects of creating, housing,
delivering, maintaining, and archiving and preserving data
• It is one of the essential areas of responsible conduct of research
• All subject areas (humanities, social science, and hard sciences)
engage with data in many formats.
• Absence of data documentation and management will limit the
potential use of that data.
From: Fary, Michael and Owen, Kim, Developing an
Institutional Research Data Management Plan Service,
Educause ACTI white paper, January 2013,
http://net.educause.edu/ir/library/pdf/ACTI1301.pdf
Common Data
Lifecycle Stages
http://data.library.virginia.edu/data-management/lifecycle/
Aspects of Research Data
Management
•DMPs/Planning
•File organization & naming
•Documentation & metadata
•Storage & backup
•Legal/ethical considerations
•Sharing & reuse
•Preservation & Archiving
Start with a plan…
• Types of data to be produced.
• Standards or descriptions that would be used with the data
(metadata).
• How these data will be accessed and shared.
• Policies and provisions for data sharing and reuse.
• Provisions for archiving and preservation.
https://flickr.com/photos/inl/5097547405 (CC BY 2.0)
Points to address in your Data Management Plan (DMP)
Aspects of Research Data
Management
•DMPs/Planning
•File organization & naming
•Documentation & metadata
•Storage & backup
•Legal/ethical considerations
•Sharing & reuse
•Preservation & Archiving
Thoughts on naming stuff and why you
should care…
• Makes your files easier to find
• Creates uniformity
• Allows for sorting
• Helps you understand what is “under the hood”
• Allows for versioning
Directories
• Folders should be major functions/activities
• Subfolders by year
• Make folder names explanatory
• Avoid personal names
• Avoid duplication
• Keep the structure simple
Source: http://bentley.umich.edu/dchome/resources/filenaming.php
Some good data practices
File organization and naming
• Label and define the content of your data files in a systematic way
• Use descriptive file names
– For example, not FIAGC (Fluffy is a great cat) but age, blood pressure,
etc.
• Use consistent date formatting (e.g., YYYYMMDD)
• Keep file names short (no more than 25 characters)
• Don’t use special characters
• Use underscores instead of blank spaces
• Keep track of versions
• Don’t use confusing labels (e.g., Pete’s data, final, final2, really final,
really really final)
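As a rough illustration, several of the naming practices above can be checked automatically. The Python sketch below is an assumption-laden example, not an official tool: the function name is hypothetical, the 25-character limit is assumed to include the extension, and "special characters" is taken to mean anything other than letters, digits, underscores, hyphens, and periods.

```python
import re
from datetime import datetime

MAX_LENGTH = 25                                # limit suggested on the slide
ALLOWED = re.compile(r"[A-Za-z0-9_.-]+")       # assumed set of "safe" characters
DATE = re.compile(r"(?<!\d)(\d{8})(?!\d)")     # a standalone 8-digit run


def check_file_name(name):
    """Return a list of naming problems; an empty list means none were found."""
    problems = []
    if len(name) > MAX_LENGTH:
        problems.append(f"longer than {MAX_LENGTH} characters")
    if " " in name:
        problems.append("contains blank spaces")
    # Spaces are reported separately, so ignore them in the character check.
    if not ALLOWED.fullmatch(name.replace(" ", "")):
        problems.append("contains special characters")
    match = DATE.search(name)
    if match is None:
        problems.append("no YYYYMMDD date")
    else:
        try:
            datetime.strptime(match.group(1), "%Y%m%d")
        except ValueError:
            problems.append("date is not a valid YYYYMMDD value")
    return problems
```

Under these assumptions, a name such as bp_20150312_v02.csv passes every check, while "Pete's data really final.xlsx" is flagged on all four.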
Aspects of Research Data
Management
•DMPs/Planning
•File organization & naming
•Documentation & metadata
•Storage & backup
•Legal/ethical considerations
•Sharing & reuse
•Preservation & Archiving
Description and Documentation
(Metadata)
• Commonly defined as “data about data”
• It is information that describes the data
• It gives you the ability to explain your research to somebody who knows
nothing about it
• Provides information about one or more aspects of the data, such as:
– Means of creation of the data
– Purpose of the data
– Time and date of creation
– Creator or author of the data
– Location on a computer network where the data
were created
– Standards used
https://www.flickr.com/photos/musebrarian/3289649684/ (CC BY-NC-SA 2.0)
Metadata according to ICPSR…
• A number of elements should be included in metadata, including, but not
limited to:
• Principal investigator
• Funding sources
• Data collector/producer
• Project description
• Sample and sampling procedures
• Weighting
• Substantive, temporal, and geographic coverage of the data collection
• Data source(s)
• Unit(s) of analysis/observation
• Variables
• Technical information on files
• Data collection instruments
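One way to put a list like this into practice is to keep a small machine-readable metadata record next to the data files. The Python sketch below is only an illustration: the key names paraphrase the elements above and are not an ICPSR schema, and both function names are hypothetical.

```python
import json

# Illustrative keys paraphrasing the elements listed above (not an official schema).
ICPSR_ELEMENTS = [
    "principal_investigator",
    "funding_sources",
    "data_collector",
    "project_description",
    "sample_and_sampling_procedures",
    "weighting",
    "coverage",
    "data_sources",
    "units_of_analysis",
    "variables",
    "technical_file_information",
    "data_collection_instruments",
]


def missing_elements(record):
    """List the elements that are absent or empty in a metadata record."""
    return [key for key in ICPSR_ELEMENTS if not record.get(key)]


def to_sidecar_json(record):
    """Serialize the record as readable JSON, keeping non-ASCII text intact."""
    return json.dumps(record, indent=2, ensure_ascii=False, sort_keys=True)
```

Calling missing_elements on a draft record gives a quick to-do list of what still needs documenting before the data are shared.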
Aspects of Research Data
Management
•DMPs/Planning
•File organization & naming
•Documentation & metadata
•Storage & backup
•Legal/ethical considerations
•Sharing & reuse
•Preservation & Archiving
Data nightmares
Data nightmares
Tweeted in 2012 by Gail Steinhart, Head of
Research Services, Mann Library, Cornell
University
Data nightmares
Toy Story 2
How Toy Story 2 Almost Got Deleted: Stories From Pixar Animation: ENTV
https://www.youtube.com/watch?v=8dhp_20j0Ys
Storage, backup, and securing data
• Have at least 3 copies of your data: 2 local and
1 distant if possible
• Don’t use your personal computer, data sticks
or CDs if you can avoid it
– They break, get lost, lose data over time
• Use a hard drive if you can
• Use cloud storage if you can (but be aware of
sensitive data)
flickr.com/photos/s_w_ellis/3877534599 (CC BY 2.0)
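The "several copies" rule is easier to trust when each copy is verified. The Python sketch below copies a file into a list of backup folders and compares SHA-256 checksums. The function names and the choice of SHA-256 are assumptions made for illustration; the destination folders stand in for whatever local and distant storage you actually use.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Compute a file's SHA-256 checksum, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source, destinations):
    """Copy source into each destination folder and verify every copy."""
    source = Path(source)
    expected = sha256_of(source)
    copies = []
    for folder in destinations:
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / source.name
        shutil.copy2(source, target)           # copy2 also keeps timestamps
        if sha256_of(target) != expected:      # detect a corrupted copy
            raise OSError(f"checksum mismatch for {target}")
        copies.append(target)
    return copies
```

Passing two local folders and one mounted remote folder as destinations would approximate the 2 local, 1 distant arrangement.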
Northwestern Box
http://www.it.northwestern.edu/file-sharing/box.html
Aspects of Research Data
Management
•DMPs/Planning
•File organization & naming
•Documentation & metadata
•Storage & backup
•Legal/ethical considerations
•Sharing & reuse
•Preservation & Archiving
Legal Concerns
• Intellectual property rights
• Copyright: see the new NU policy on copyright
http://invo.northwestern.edu/policies/copyright-policy
• Patents
• Trade secrets
• Licensing
• Creative Commons
• Monetary charges for data usage
• Open source versus proprietary software
• Data retention
Ethical Considerations
• Have you obtained IRB approval (or gotten
a waiver)?
• Have you obtained the appropriate
consent from the subject?
• Have all personal identifiers been removed
from the data set?
• Are you using someone else’s data?
• Have you obtained appropriate permissions?
• Are you responsibly using and citing others’ data?
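For the question about personal identifiers, one narrow first step is dropping columns that identify people directly. The Python sketch below shows only that step, under stated assumptions: the column names are hypothetical, the function name is illustrative, and removing columns alone may not be enough to de-identify a data set.

```python
import csv
import io

# Hypothetical column names; your own data set will differ.
DIRECT_IDENTIFIERS = {"name", "email", "date_of_birth"}


def drop_identifier_columns(csv_text, identifiers=DIRECT_IDENTIFIERS):
    """Return the CSV text with the listed identifier columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep every column whose lower-cased name is not a listed identifier.
    kept = [col for col in (reader.fieldnames or []) if col.lower() not in identifiers]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)   # extrasaction="ignore" skips the dropped columns
    return out.getvalue()
```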
Aspects of Research Data
Management
•DMPs/Planning
•File organization & naming
•Documentation & metadata
•Storage & backup
•Legal/ethical considerations
•Sharing & reuse
•Preservation & Archiving
Preserving and Sharing Data
• Some options for preserving and sharing data
– Self-archive
– Institutional repository
– Open data repository
– National or international data archive or
repository
By Florian Hirzinger - www.fh-ap.com (Own work (Florian Hirzinger)) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons
• Background on data management
• Why data management is important
• Intro to best practices for data management
• Library resources for data management.
Photo by Carl Vogtmann
RESOURCES:
Northwestern University Library Data Management LibGuide:
http://libguides.northwestern.edu/datamanagement
DMPTool: https://dmp.org/
Northwestern University's Research Data: Ownership, Retention and Access Policy:
http://www.research.northwestern.edu/policies/documents/research_data.pdf
Cunera Buys- e-science librarian: c-buys@northwestern.edu
