
Research Data: Documentation and metadata

Tips and support for data management for researchers at ÅAU

Why documentation and metadata?

Documentation means describing what happens to the data during the research process. Researchers are the experts on their own research data and material, and are therefore the ones with the expertise to create the documentation needed. Data without documentation and metadata are meaningless, since they cannot be understood or re-used. If it is difficult to assess what documentation is needed, imagine what an outsider would need to know in order to understand what your data is about and how it can be used.

The researcher documents the work throughout the project on several levels: 1) project level (background information, methods, etc.), 2) file level (relations between files), 3) variable level (descriptions of variables and their origin, use etc.).

Examples of documentation are:

  • code books/schemas, lab books, field diaries, notes,
  • descriptions of settings and calibrations for instruments and equipment,
  • descriptions of methods used,
  • readme file: a .txt file which describes the origin of the data and its contents,
  • administrative documents related to the research project, such as research plans, data management plans, contracts and agreements, research permits, scientific publications, permissions for data use, licenses, etc.
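As an illustration of the readme file mentioned in the list above, the following is a minimal sketch of how such a file could be generated as plain text. The function name `render_readme`, the section headings and the layout are assumptions made for this example, not a required format.

```python
def render_readme(title: str, origin: str, contents: dict[str, str], naming: str) -> str:
    """Build the text of a simple readme file (hypothetical layout)."""
    lines = [
        title,
        "=" * len(title),  # underline the title for readability
        "",
        "Origin of the data",
        origin,
        "",
        "Contents",
    ]
    # One line per data file: file name followed by a short description
    for file_name, description in contents.items():
        lines.append(f"- {file_name}: {description}")
    lines += ["", "Naming conventions", naming, ""]
    return "\n".join(lines)
```

The returned text could then be saved next to the data files, for example with `pathlib.Path("README.txt").write_text(text, encoding="utf-8")`.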

 

In general, documentation practices vary between disciplines and depend on the needs of the project.

The advantages of good documentation are:

  • The contents of the research project and its data are made understandable for the researcher and for others. Without documentation it is difficult to remember afterwards what was done, when and how.
  • The risk of misinterpretation and misunderstanding is minimized.
  • Documentation is needed by the end of the project at the latest, when the research data is archived and published/opened. Good documentation practices from the start of the project make the archiving process smoother.
  • Detailed documentation is necessary for validation of results and potential replications of the study.
 

Metadata means "data about data" and refers to the information needed for understanding, interpreting and using the data: for example, the origin of the data, who collected or generated it, the time, place and methods, and the subject terms that describe the main content. Consequently, metadata is a crucial part of the documentation. A central aspect of the FAIR principles is that the metadata is structured and machine-readable, which means that it can be transferred between different data services.
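To make "structured and machine-readable" concrete, the sketch below stores the kinds of information listed above as a JSON record. The field names in `REQUIRED_FIELDS` are assumptions chosen for this example and do not follow any particular metadata standard; a real project should use the standard recommended for its discipline or data service.

```python
import json

# Hypothetical field names, loosely based on the examples in the text above
REQUIRED_FIELDS = ("title", "creator", "date_collected", "place", "methods", "keywords")


def build_metadata(**fields) -> dict:
    """Return a metadata record, refusing records with missing or empty fields."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError(f"missing metadata fields: {', '.join(missing)}")
    return {name: fields[name] for name in REQUIRED_FIELDS}


def to_json(record: dict) -> str:
    """Serialize the record as JSON so that other services can read it."""
    return json.dumps(record, indent=2, ensure_ascii=False, sort_keys=True)
```

Because the record is plain JSON, any service that reads JSON can parse it back without knowing how it was created.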

 

Guides about documentation

CSC - data documentation https://research.csc.fi/metadata-and-documentation

DCC - disciplinary metadata https://www.dcc.ac.uk/guidance/standards/metadata

Finnish Social Science Data Archive's guide on data description and metadata  https://www.fsd.tuni.fi/en/services/data-management-guidelines/data-description-and-metadata/#metadata-standards

Siiri Fuchs, & Mari Elisa Kuusniemi. Making a research project understandable - Guide for data documentation (Version 1.2). Zenodo. http://doi.org/10.5281/zenodo.1914401

Organize your data files

Tired of not finding what you are looking for? It is a good idea to have a clear system for file management. Some best practices when organizing your files:

  • Create a simple, consistent and meaningful system for file names from the beginning of the project. Do not use the same file name more than once.
  • Create a logical folder structure to make files easier to search for and find. When needed, a hierarchical structure can be used (main folders and subfolders).
  • Tag files to find them more easily. A file can be located in only one folder, but it can have many tags.
  • Use version control to manage old and new versions of the files, either manually or by using software for automatic version control (e.g. in GitLab). Manual version control is usually enough in projects which are not data intensive. Indicate the version at the end of the file name, for example: V02-03.
  • Write a readme file which contains all the information needed for interpreting the data, for example the origin of the data, its contents and the naming conventions. Place the readme file in a logical location in the folder, together with the data files.
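The manual version naming described above can be sketched in a few lines of code. This example assumes that a tag such as V02-03 consists of two two-digit numbers (here called major and minor) and that the tag is joined to the file name with an underscore; both are assumptions, and the function names are hypothetical.

```python
import re
from pathlib import Path

# Assumed pattern: <name>_V<two digits>-<two digits>, e.g. survey_results_V02-03
VERSION_PATTERN = re.compile(r"^(?P<stem>.+)_V(?P<major>\d{2})-(?P<minor>\d{2})$")


def versioned_name(stem: str, major: int, minor: int, suffix: str) -> str:
    """Return a file name that ends with a version tag such as V02-03."""
    if major < 0 or minor < 0:
        raise ValueError("version numbers must be non-negative")
    clean_stem = "_".join(stem.split())  # replace whitespace with underscores
    return f"{clean_stem}_V{major:02d}-{minor:02d}{suffix}"


def bump_minor(file_name: str) -> str:
    """Return the name of the next minor version of an already versioned file."""
    path = Path(file_name)
    match = VERSION_PATTERN.match(path.stem)
    if match is None:
        raise ValueError(f"no version tag found in {file_name!r}")
    return versioned_name(
        match["stem"], int(match["major"]), int(match["minor"]) + 1, path.suffix
    )
```

For example, `versioned_name("survey results", 2, 3, ".csv")` gives `survey_results_V02-03.csv`, and `bump_minor` applied to that name gives `survey_results_V02-04.csv`.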