Prepare your data

Before you upload your data to the EUR Data Repository, you need to prepare it. Doing so helps you use the repository effectively and get your data published without unnecessary delay.

Below you will find a list of things to keep in mind when preparing your data. Ideally these have already been addressed at the start of the research project in your data management plan. If so, preparing your data will be relatively quick. If not, getting your data ready to share may take some time.

EU privacy law, the General Data Protection Regulation (GDPR), applies to all personal data. Personal data is any piece of information, or any combination of pieces of information, that can directly or indirectly identify your research participant. Examples are name, e-mail address, IP address, and geospatial location.

The GDPR sets requirements that must be met before you can publish personal data. These include removing personal data that is not required for reusing the data, making your data as difficult as possible to trace back to a research participant, and, if applicable, correctly interpreting the informed consent that has been given.
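As an illustration of removing personal data that is not needed for reuse, the sketch below drops selected columns from a CSV file before sharing. The function name `strip_columns` and the column names in `DIRECT_IDENTIFIERS` are hypothetical and must be adapted to your own dataset. Dropping direct identifiers is only one step: as noted above, a combination of the remaining variables may still identify a participant indirectly.

```python
import csv
import io

# Hypothetical column names; replace them with the identifiers in your own data.
DIRECT_IDENTIFIERS = {"name", "email", "ip_address"}


def strip_columns(data_csv, columns_to_remove):
    """Return the CSV text without the listed columns."""
    reader = csv.DictReader(io.StringIO(data_csv))
    kept = [c for c in (reader.fieldnames or []) if c not in columns_to_remove]

    out = io.StringIO()
    # extrasaction="ignore" silently skips the columns that are not in `kept`.
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


sample = "name,email,score\nAnn,a@example.org,7\n"
cleaned = strip_columns(sample, DIRECT_IDENTIFIERS)
```

Here `cleaned` contains only the `score` column. Whether the result is sufficiently de-identified remains a question for your data steward or privacy officer.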

Choose a file-naming convention that reflects what each file contains, so that you and others can easily identify the files needed. Elements to consider including are the date of creation, a description, the location, the project number, and the version number. In addition, name files consistently, keep file names short but descriptive (fewer than 25 characters), avoid special characters and spaces, use capitals and underscores instead of periods, spaces, or slashes, use a fixed date format (e.g. the ISO 8601 basic format: YYYYMMDD), and include version numbers.

Examples: 20200125_DMP_V3.pdf, 20200211_IC_Template.pdf, 20190719_Image_Cropped.jpg, 20210628_Data_Processed.sav.
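The naming guidelines above can be sketched as a small helper that assembles a file name from a date, a description, and an optional version number. The function name `build_file_name` is hypothetical, and the sketch assumes that the 25-character limit applies to the name without its extension, which is consistent with the examples above.

```python
import re
from datetime import date
from typing import Optional

MAX_STEM_LENGTH = 25  # assumption: the limit excludes the file extension


def build_file_name(created: date, description: str, extension: str,
                    version: Optional[int] = None) -> str:
    """Build a name such as 20200125_DMP_V3.pdf."""
    # Split on anything that is not a letter or digit, which removes
    # spaces, periods, slashes, and other special characters.
    words = [w for w in re.split(r"[^A-Za-z0-9]+", description) if w]

    parts = [created.strftime("%Y%m%d")]             # fixed date format
    parts += [w[0].upper() + w[1:] for w in words]   # capitalise each word
    if version is not None:
        parts.append(f"V{version}")

    stem = "_".join(parts)
    if len(stem) >= MAX_STEM_LENGTH:
        raise ValueError(f"'{stem}' has {len(stem)} characters; shorten the description")
    return f"{stem}.{extension.lstrip('.').lower()}"


plan_name = build_file_name(date(2020, 1, 25), "DMP", "pdf", version=3)
data_name = build_file_name(date(2021, 6, 28), "data processed", ".sav")
```

With these inputs, `plan_name` is `20200125_DMP_V3.pdf` and `data_name` is `20210628_Data_Processed.sav`, matching two of the examples above.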

Data that is shared should be in a file format that ensures long-term access. It is recommended to use formats that are frequently used, have open specifications, and are independent of specific software, developers, or vendors.

Although frequently used, formats such as Word and Excel are not preferred. Instead, consider converting Word files to PDF and Excel files to CSV. You can find a list of preferred formats here.

Data that is available for reuse needs to include enough documentation for others to understand what it represents and to use it correctly. This can include general descriptive elements such as the topic, main analysis method, location and timeframe of data collection, and inclusion and exclusion criteria, as well as more specific documents such as a codebook with variable names and labels, and the syntax or code used to run the analysis.
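As a starting point for a codebook, the sketch below lists every variable in a CSV data file together with a few observed values, leaving the label for you to fill in by hand. The function name `draft_codebook` and the sample variables are hypothetical; a complete codebook would also describe units, category meanings, and missing-value codes.

```python
import csv
import io


def draft_codebook(data_csv, max_examples=3):
    """Return one codebook entry per variable (column) in the CSV text."""
    reader = csv.DictReader(io.StringIO(data_csv))
    rows = list(reader)

    codebook = []
    for variable in reader.fieldnames or []:
        # Collect distinct non-empty values in order of first appearance.
        seen = []
        for row in rows:
            value = row[variable]
            if value and value not in seen:
                seen.append(value)
        codebook.append({
            "variable": variable,
            "label": "",  # to be written by the researcher
            "observed_values": "; ".join(seen[:max_examples]),
        })
    return codebook


sample = "age,condition\n34,control\n29,treatment\n34,control\n"
entries = draft_codebook(sample)
```

Each entry in `entries` is a dictionary that can be written out with `csv.DictWriter` and then completed manually.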

Open data is not just about the data file itself. Equally important is the accompanying documentation that provides the context in which the data was collected and describes how the data was collected and analyzed. This collection of files is often called the ‘publication package’ and it includes everything that is needed to reproduce the research or reuse the data.

Examples of files to include in a publication package are: the [raw] data file, a codebook listing the variables and categories, the syntax or code used to analyze the data, a copy of the actual survey or questionnaire that was used, a list of interview questions, the [transcripts of] audio or video recordings, and a readme file describing the method and steps you used to analyze the data.

If you have any questions or need help preparing your data, contact your faculty data steward. If needed, your faculty data steward will involve other research support staff such as a privacy or legal officer.
