Your data are valuable! Preserve, archive, publish, or open your research data and the metadata (descriptions of the data) in a responsible way, at the latest by the end of the project. Note that large research funders may require FAIR and open data unless there are reasons for keeping the data closed or destroying them.
In a research project, large amounts of data are collected and generated. It is therefore a good idea to consider, already at the planning stage, which datasets may be useful or valuable in the future.
The researcher needs to assess which data (everything or a part of it) should be preserved, for which purposes access can be provided (for example, research only or also teaching), and for whom. Research data may be deposited in a data archive, which curates the data and provides access. In addition, legal, contractual and research-ethical reasons may affect the extent to which the data can be published and opened. Disposal of data also requires planning in advance.
ÅAU's open science policy promotes the openness, transparency and reuse of research data following the FAIR principles, according to which research data should be findable, accessible, interoperable and reusable. According to the national policy on open research data and methods (2021-2025), research data and methods should be made as open as possible and as closed as necessary. In addition, data should be managed properly so that they meet the FAIR principles.
FAIR data and open data are not synonymous, although the terms often appear together. Research data may be FAIR without being (completely) open, and data may be open without following the FAIR principles. The aim is to make research data as open and as FAIR as possible while respecting the relevant legal and research-ethical constraints.
For example, the metadata (description of a dataset) may meet the FAIR principles even when it is not possible to open the data completely (as with sensitive data or data related to patents or innovations). In many cases, anonymized data (from which personal, sensitive or confidential data have been deleted) can be archived, published or opened for future use. Embargoes can be applied when the data cannot be made available immediately.
When archiving data, it is important that
Discipline-specific data archives/repositories are recommended in the first place, provided they adhere to the FAIR principles. As long as the criteria above are met, the archive/repository can be considered appropriate for your data. If there is no suitable discipline-specific archive for your data, a generalist data archive/repository is a good option.
Humanities, social sciences, health sciences etc.:
Language research:
Resources for finding data archives/repositories in natural sciences (also according to data types):
Service for publishing code:
The following are free to use for researchers and maintain a good archiving policy:
Opening data makes them available for reuse by other researchers and by society as a whole. Open data also promote the transparency and reliability of research.
CC BY Danny Kingsley & Sarah Brown
The Finnish FAIR data services, provided by CSC, consist of IDA for data storage, the Qvain tool for describing and publishing datasets, and the data finder Etsin for exploring available datasets.
The recommended minimum effort is to make a description (=metadata) of your dataset available in Etsin. Enter the metadata by using the Qvain tool, which provides your dataset with a landing page. After the dataset has been published, other researchers and research funders may find the dataset in Etsin.
Applying an open license is a way of informing others of what rights they have to share and reuse one's research data. Without a license, potentially valuable reuse may be unintentionally restricted.
It is possible to legally protect and restrict the reuse of one's data referring to:
Publishing data under a restrictive license (CC-BY-NC-ND) is preferable to keeping them on your own hard drive.
Opening the data under the CC-BY license (or under CC0 accompanied by a request to cite the data) explicitly gives others the right to reuse them. This may prove beneficial in the long run, since those who want to use the data will not have to track down every single participant in the creation of the data to get permission for reuse.