Documentation means describing what happens to the data during the research process. Researchers are the experts on their own research data and material and are therefore the ones with the expertise to create the documentation needed. Data without documentation and metadata are meaningless, since they cannot be understood or reused. If it is difficult to assess what documentation is needed, imagine what an outsider would need to know in order to understand what your data is about and how it can be used.
The researcher documents the work throughout the project on several levels: 1) project level (background information, methods, etc.), 2) file level (relations between files), and 3) variable level (descriptions of variables, their origin, use, etc.). The FAIR principles (findable, accessible, interoperable and reusable) must guide the entire research process, with reproducibility as the criterion for how the research process and data are documented and which data are opened in the long term.
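Variable-level documentation is often kept as a codebook. The following is a minimal sketch in Python, assuming a hypothetical survey dataset; the variable names, the column headings and the helper functions `write_codebook` and `undocumented` are illustrative assumptions, not part of any standard.

```python
import csv
import io

# Hypothetical variables from an imaginary survey dataset (illustration only).
codebook = [
    {
        "variable": "resp_id",
        "description": "Anonymised respondent identifier",
        "type": "integer",
        "unit": "",
        "origin": "assigned at data collection",
    },
    {
        "variable": "age",
        "description": "Age of respondent at the time of the interview",
        "type": "integer",
        "unit": "years",
        "origin": "self-reported",
    },
]


def write_codebook(rows, handle):
    """Write the codebook rows as CSV to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


def undocumented(columns, rows):
    """Return the data columns that have no entry in the codebook."""
    documented = {row["variable"] for row in rows}
    return [column for column in columns if column not in documented]


# Write to an in-memory buffer here; a real project would write to a file
# stored next to the data.
buffer = io.StringIO()
write_codebook(codebook, buffer)
print(buffer.getvalue())

# Check a (hypothetical) list of dataset columns against the codebook.
print(undocumented(["resp_id", "age", "income"], codebook))
```

Running a check like `undocumented` whenever the dataset changes is one way to notice variables that were added without a description.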
Examples of documentation are:
In general, documentation practices vary across disciplines and depend on the needs of the project.
The advantages of good documentation are:
Metadata means "data about data" and refers to the information needed to understand, interpret and use the data: for example, the origin of the data, who collected or generated it, the time, place and methods, and keywords that describe the main content. Consequently, metadata is a crucial part of the documentation. A central aspect of the FAIR principles is that the metadata is structured and machine-readable, which means that the data can be transferred between different data services.
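To illustrate what "structured and machine-readable" can mean in practice, here is a minimal sketch in Python that stores a metadata record as JSON. The field names and values are assumptions chosen to mirror the elements listed above; they do not follow any particular metadata standard, and a real project should use the schema required by its discipline or data service.

```python
import json

# Hypothetical set of fields a project might require (an assumption,
# not a formal metadata standard).
REQUIRED_FIELDS = ("title", "creator", "date_collected", "place", "methods", "keywords")

# Illustrative record; all values are made up.
metadata = {
    "title": "Example interview dataset",
    "creator": "Example Researcher",
    "date_collected": "2023-05-01",
    "place": "Example City",
    "methods": "Semi-structured interviews",
    "keywords": ["example", "interviews"],
}


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in required if not record.get(field)]


# Serialise to JSON text, which other programs and services can parse.
serialized = json.dumps(metadata, indent=2, ensure_ascii=False)
print(serialized)

# Reading the JSON back yields the same structure.
restored = json.loads(serialized)
print(missing_fields(restored))
```

Because the record is structured, a program can check it for completeness and pass it on to another service without a person having to re-enter the information.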
1. Is your work steered by a data management plan throughout the entire data lifecycle so that all the data processing procedures are open and sufficiently documented?
2. How have you taken into account the openness of data and usage restrictions throughout the process?
3. Have you utilised shared practices, such as standards and glossaries, in the metadata and the actual data?
4. Have you systematically documented the lifecycle of the research data and is the description accurate? Are as many data processing stages as possible automated and is the code stored? Are the software and settings used documented (technical documentation)?
5. Have you versioned the data and other outputs?
6. Have you stored the data and its documentation in a referenceable form (persistent identifiers and metadata)?
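Questions 4 and 5 above can partly be answered by generating technical documentation automatically. The sketch below, in Python, records a checksum of the data together with the software version and settings used. The function `processing_record`, its field names and the sample data are hypothetical and only show the idea.

```python
import hashlib
import json
import platform


def sha256_of_bytes(data: bytes) -> str:
    """Return the SHA-256 checksum of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def processing_record(data: bytes, settings: dict, data_version: str) -> dict:
    """Collect technical documentation for one data processing step."""
    return {
        "data_version": data_version,          # version label chosen by the project
        "sha256": sha256_of_bytes(data),       # identifies the exact file content
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "settings": settings,                  # parameters used in the processing
    }


# Made-up sample data; a real project would read the bytes of the data file.
raw = b"id,value\n1,3.2\n2,4.7\n"
record = processing_record(raw, {"drop_missing": True}, "v1.0")
print(json.dumps(record, indent=2))
```

Storing such a record next to each version of the data makes it possible to verify later that a file is unchanged and to see which software and settings produced it.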
Lehtisalo, A. et al. (2023). Improve the quality and impact of your research through data management - A guide for making your data FAIR. Zenodo. https://doi.org/10.5281/zenodo.8012377
CSC - data documentation https://research.csc.fi/metadata-and-documentation
DCC - disciplinary metadata https://www.dcc.ac.uk/guidance/standards/metadata
Finnish Social Science Data Archive's guide on data description and metadata https://www.fsd.tuni.fi/en/services/data-management-guidelines/data-description-and-metadata/#metadata-standards
Fuchs, S. & Kuusniemi, M. E. Making a research project understandable - Guide for data documentation (Version 1.2). Zenodo. http://doi.org/10.5281/zenodo.1914401
Improve the quality and impact of your research through data management - A guide for making your data FAIR - AVOTT working group (2023)
Tired of not finding what you are looking for? It is a good idea to have a clear system for file management. Some best practices when organizing your files: