Before depositing

Great that you want to make your data FAIR using the DANS Data Stations! The following aspects should be considered before you deposit your data with us.

Consult our selection criteria 

Before depositing data, consult the selection policy of our Data Stations to check whether your dataset meets the conditions for acceptance.

Note that depositing to our Data Stations is free of charge up to 50 GB per account.  More information can be found on this page.

Consult our terms of use

When you create an account for the Data Stations, you agree to be bound by the Terms of Use. These terms of use apply to the general use of the Data Stations, such as requesting and downloading datasets, and to depositing datasets. Additional information about the policies that govern our Data Stations can be found in our Data Stations Policy.  

Preparing FAIR data

The DANS Data Stations are domain-specific trustworthy digital repositories that help to make your data FAIR – Findable, Accessible, Interoperable and Reusable. Before depositing, it helps to understand what FAIR means. These guidelines give an overview of the FAIR principles and the steps data owners and repositories can take to increase the FAIRness of data. If you are interested in the work DANS does around FAIR in various projects, you can find more information in our data-expertise section.

The good news is that when you deposit in a Data Station, DANS takes care of many aspects of the FAIR principles for you. We provide a Persistent Identifier (PID) for your dataset, make your metadata findable in standardized formats, and allow others to download your data for reuse, either directly (open access) or after permission has been granted (restricted access). Your main responsibility in making your data FAIR is to provide detailed metadata and documentation so that others can understand your data and know how it can be reused.

Considerations around personal data

Personal data within the meaning of the General Data Protection Regulation (GDPR) is data by which a living person can be identified, either directly or indirectly. Examples include names, identification numbers, location data, online identifiers, or elements that are characteristic of a person’s physical, physiological, genetic, mental, economic, cultural or social identity. Anonymous data is not considered personal data. Pseudonymized data, however, is considered personal data.

It is important to consider the following aspects when you want to deposit personal data in the Data Stations:

  • You, or your organisation, are the data controller and DANS is the data processor. In addition to the general terms of use, you agree to be bound by the processing addendum.
    • As a data controller under the GDPR, you or your organisation are responsible for the correct processing of personal data in your dataset. Depending on your project, this may include de-identifying and minimising personal data, or encrypting the data. If you encrypt data, it is important that you store the key securely and permanently elsewhere. Your organisation may have local policies and additional guidance on how to handle personal data in scientific research.
    • You may need to archive files containing personal data as restricted files. This way, you can manage who accesses these files and impose appropriate conditions for reuse. As data controller you are solely responsible for appropriate access and reuse restrictions. More information about depositing files under restricted access can be found in this guidebook.
  • As the data controller you need to ensure that you have a legal basis to deposit the personal data at DANS. In many cases you will have to obtain informed consent from your participants for archiving and publishing the personal data. It can be useful to add a blank copy of the consent form to your documentation when you deposit your dataset in the Data Stations. Please note that signed informed consent forms should not be included in the deposit. They contain personal details of the participants, and the data owner should keep them separate from the deposit.

Preparing documentation and files

Selecting files and formats

  • Which files are you going to deposit? Not all data needs to be preserved for the long term. More information can be found in this report.
  • Not all file formats ensure long-term usability, accessibility and preservation of data. DANS works with preferred formats. For more information, see the File formats page.
  • For the purpose of sustainable archiving, DANS may convert non-preferred formats to preferred formats and publish a curated version of your dataset as a new version. 
  • One of the transformations that Dataverse – the software of the Data Stations – performs automatically is the conversion of a tabular data file, such as an SPSS file or another statistical file, into a TAB-delimited file. This only works for files whose content can be interpreted as a table. Multi-sheet spreadsheets and CSV files with a different number of entries per row are two examples where the transformation cannot be executed. Please consider this when uploading tabular data files.
  • To keep the content of your dataset clear, DANS recommends not using compressed archives such as ZIP, RAR, 7z, etc. Storing data as ZIP or TAR files with the dataset can, however, be a solution for very large datasets (dozens of GBs) or datasets containing very large numbers of files (thousands of data files). You can, for instance, use ZIP files to easily upload many data files at once (see ‘During deposit’). ZIP files containing up to 2000 files will then be extracted automatically. Please contact DANS if you want to deposit large datasets.
  • Are there many files in your dataset? If you deliver many files at once, please provide a file list, i.e. a list of file names, descriptions of the content and of any connections between the files.
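
The two automatic steps described above (conversion of tabular files and extraction of ZIP uploads) can be checked locally before you deposit. The following is a minimal sketch in Python, not a DANS or Dataverse tool. The function names are hypothetical, the 2000-file limit is simply copied from the text above, and the CSV check assumes a comma-delimited, UTF-8 encoded file whose first row is a header.

```python
import csv
import zipfile

# Limit quoted in the guidance above; it is not read from any DANS or Dataverse API.
MAX_AUTO_EXTRACTED_FILES = 2000


def ragged_rows(csv_path, delimiter=","):
    """Return (record_number, field_count) for records whose length differs from the header's.

    Record numbers count CSV records, with the header as record 1.
    """
    problems = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, None)
        if header is None:
            return problems  # empty file: nothing to compare
        expected = len(header)
        for record_number, row in enumerate(reader, start=2):
            if len(row) != expected:
                problems.append((record_number, len(row)))
    return problems


def zip_file_count(zip_path):
    """Count the files (directories excluded) stored in a ZIP archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(1 for info in archive.infolist() if not info.is_dir())
```

An empty result from `ragged_rows` means every record has as many entries as the header. It does not guarantee that the file will be converted successfully. Comparing `zip_file_count(path)` with `MAX_AUTO_EXTRACTED_FILES` indicates whether an archive stays within the limit mentioned above.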

Preparing documentation

  • Providing relevant documentation is crucial for making your dataset understandable and reusable. This guide from the Consortium of European Social Science Data Archives offers guidance on creating good documentation and understandable data structures. Alongside your data, it is important to describe how the data were collected and what the different variables mean, and to explain abbreviations and terminology. Relevant information also includes codebooks and the dataset structure.
  • Does your dataset include personal data? Please make sure that file names do not include personal data that should not be publicly available. This is because file names can be viewed by anyone. 
  • If you plan to deposit data under restricted access, please consult our guide and prepare a Data Access Protocol which outlines the access conditions for the restricted access data. 

Discipline-specific deposit requirements

Specific deposit requirements apply to the following disciplines:

  • Qualitative data (e.g. Oral history):
    • Making qualitative data reusable can be particularly challenging. You can find tips and tricks in our qualitative data guide.
    • For interview data, adding a transcription of the interview, which includes the interview metadata, is highly desirable with a view to reusing the data. Together with the University of Amsterdam, DANS has developed a Metadata Transcription Template that you can use for this purpose.
    • If you add an audio or video file to your dataset, you may want to upload a subtitle file. Subtitles are only supported if the subtitle file name has the following structure: <basename audio video>.<language code>.vtt
      Three languages are available:
      <language code> = nl: <basename audio video>.nl.vtt
      <language code> = de: <basename audio video>.de.vtt
      <language code> = en: <basename audio video>.en.vtt
  • Historical sciences:
    • Describe the (archival) sources.
    • Describe the selection procedure used.
    • Describe the way in which the sources were used.
    • Refer to the standards or classification systems (such as HISCO) which were applied.
  • Social and behavioural sciences:
    • Describe the variable labels and value labels.
    • Describe the questionnaires and/or other research tools.
    • Include the fieldwork report (if available).
    • Include a codebook: a description of variables and information about population, types of data (units of observation/analysis), sample procedure, response/non-response, data collection method, weighting variables, constructed and/or derived variables.
    • Ensure that the language of the variable labels and value labels corresponds to the language of the rest of the dataset. The metadata should be provided in Dutch or English.
  • Language and literature studies:
  • Archaeology:
    • Would you like to know more about the E-depot for Dutch Archaeology? Please visit this page.
    • Projects that have been described using the archaeological exchange protocol (SIKB0102 standard) must be submitted via the ArcheoDepot. The dataset files deposited with the provinces must be supplied in Preferred Formats.
    • Via the ArcheoDepot, datasets are automatically sent from the provincial depot to the DANS archive.
    • During the startup phase of the ArcheoDepot, datasets can still be deposited directly at DANS if the province is not yet connected to the ArcheoDepot.
    • Check whether your data includes contact details of field staff or other parties involved, for example in the overview of personnel in planning documentation or in the administrative data of the final report. Refer to the personal data section above if personal contact details are included.
  • Life Sciences – Medical:
    • For handling your data, please follow the guidelines provided by the Data Stewardship Handbook (HANDS) of Health-RI/NFU.
    • Include relevant data from your Electronic Lab Notebook.
    • Try to include IDs associated with your research data, such as the IDs of your related publication, but also IDs from CTR, CCMO and preclinical trial registries, and GenBank accession numbers. Use the Relation Metadata section for this information.
  • Life Sciences – Health:
    • For handling your data, please follow the guidelines provided by the Data Stewardship Handbook (HANDS) of Health-RI/NFU.
    • For cohort and population studies, describe the variable labels and value labels.
    • Describe the questionnaires and/or other research tools. Please take care not to share details of questionnaires that are protected by copyright or otherwise, such as those available from https://euroqol.org/.
    • Include a codebook: a description of variables and information about population, types of data (units of observation/analysis), sample procedure, response/non-response, data collection method, weighting variables, constructed and/or derived variables.
    • For health research within restricted geographical areas, try to include the appropriate geographical metadata (descriptive locality data, bounding boxes, polygons).
  • Life Sciences – Biology, Ecology, Biodiversity and Agriculture:
    • Please indicate the organisational level(s) on which your research focussed (Molecular, Macro-molecular, Organelles, Cells, Tissues, Organs, Organ systems, Organisms, Populations, Communities, Ecosystem(s), Biomes, Biosphere).
    • Include, when appropriate, a list of the species that were the subject of your research, including a reference for the taxonomy you used (Catalogue of Life, Dutch Species Register, etc.).
    • For field-based research within restricted geographical areas, please include the appropriate geographical metadata (descriptive locality data, point data, bounding boxes, polygons). You can include multiple localities for a single dataset.
  • Physical and Technical Sciences:
    • For strictly technical datasets, such as engineering, material studies, physics, computing, etc., the 4TU.ResearchData repository is recommended.
    • If you only want to deposit software, DANS recommends the Software Heritage Archive rather than the DANS Data Station for Physical and Technical Sciences.
    • For chemical datasets, IUPAC terminology should be used.
    • For geographic and environmental datasets, please include the appropriate geographical metadata (descriptive locality data, bounding boxes, polygons).
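
The subtitle naming rule listed under qualitative data above can be applied or checked with a few lines of Python. This is a minimal sketch, not part of any DANS tooling. The function names are hypothetical, and it assumes that "basename" means the name of the audio or video file without its extension.

```python
from pathlib import Path

# Language codes listed in the guidance above.
SUPPORTED_LANGUAGES = ("nl", "de", "en")


def subtitle_name(media_file, language):
    """Build '<basename>.<language code>.vtt' for an audio or video file."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    # Assumption: the basename is the media file name without its extension.
    return f"{Path(media_file).stem}.{language}.vtt"


def is_valid_subtitle_name(subtitle_file, media_file):
    """Check that a subtitle file name follows the pattern for the given media file."""
    name = Path(subtitle_file).name
    stem = Path(media_file).stem
    return any(name == f"{stem}.{code}.vtt" for code in SUPPORTED_LANGUAGES)
```

For example, `subtitle_name("interview01.mp4", "nl")` gives `interview01.nl.vtt`, and `is_valid_subtitle_name("interview01.fr.vtt", "interview01.mp4")` is false because `fr` is not one of the three listed codes.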

    Additional resources

    General Research Data Management

    GDPR and personal data 


    © DANS R.5.2 Version 1.4, August 1, 2025