What are the concrete requirements of funders or publishers regarding long-term data management and accessibility?
Persistent identifiers are names given to publications of any kind (research results, data, software). Each name is linked to the publication's location in a lookup table. If the location changes, only the entry in the table has to be updated, while the identifier stays the same. This usually happens automatically, without authors or users noticing.
This keeps literature searches, citations and links between publications and their accompanying data possible in the long term. Examples of such identifiers are the Digital Object Identifier (DOI), the Uniform Resource Name (URN), ePIC and Handle.
Further information can be found in chapter 13.2 of the nestor handbook “Eine kleine Enzyklopädie der digitalen Langzeitarchivierung”.
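The table lookup described above can be illustrated with a minimal Python sketch. The class name, the DOI-like identifier and the example.org addresses are hypothetical and chosen only for illustration; real resolvers such as the DOI or Handle systems are far more elaborate.

```python
class IdentifierRegistry:
    """Toy lookup table that maps persistent identifiers to current locations."""

    def __init__(self):
        self._table = {}

    def register(self, identifier, location):
        # An identifier is assigned once and never reused.
        if identifier in self._table:
            raise ValueError(f"{identifier} is already registered")
        self._table[identifier] = location

    def move(self, identifier, new_location):
        # Only the table entry changes; the identifier itself stays the same.
        if identifier not in self._table:
            raise KeyError(identifier)
        self._table[identifier] = new_location

    def resolve(self, identifier):
        # Return the location currently stored for the identifier.
        return self._table[identifier]


registry = IdentifierRegistry()
pid = "10.1234/example-dataset"  # hypothetical DOI-like name
registry.register(pid, "https://old-repository.example.org/data/42")
registry.move(pid, "https://new-repository.example.org/records/42")
print(registry.resolve(pid))
```

A citation that uses the identifier keeps working after the move, because readers resolve the name rather than the old address.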
Metadata are structured data that contain additional information on a specific resource. This can be a description of contents, a technical description, the context of its creation, relations to other sources and works …
Because different disciplines have different requirements for metadata, different standards have emerged (e.g. the "Ecological Metadata Language" and the "Gene Ontology" for biology). Metadata provide a standardized, machine-readable description that makes it possible to find, reference and use research data later on.
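As a rough illustration of what "structured and machine-readable" means, here is a minimal Python sketch of a metadata record. The field names, the list of required fields and all values are hypothetical; they do not follow the Ecological Metadata Language or any other real standard.

```python
import json

# Hypothetical metadata record; field names and values are illustrative only.
metadata = {
    "identifier": "10.1234/example-dataset",
    "title": "Example Survey Data",
    "creators": ["Doe, J.", "Roe, R."],
    "publication_year": 2020,
    "description": "Responses to an example questionnaire.",
    "file_format": "text/csv",
    "license": "CC-BY-4.0",
    "related_works": [],
}

# Fields this sketch assumes a repository would insist on.
REQUIRED_FIELDS = ("identifier", "title", "creators", "publication_year", "license")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in the record."""
    return [name for name in required if not record.get(name)]


print(json.dumps(metadata, indent=2))
print(missing_fields(metadata))
```

Because the record is structured, software can check it for completeness or index it for searching without a human having to read it.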
Re-use means that published research data can be cited and/or used for other scientific research, within the bounds of the license you have chosen for your data.
The best formats are open, non-proprietary ones. The RADAR project offers a good and specific overview of the available options.
Research data should be archived properly to guarantee that future research can profit from it and to ensure that your research can be reproduced by the scientific community.
Collecting research data is time-consuming and costly in both money and labour. Archiving your data is therefore often cheaper than collecting it anew, especially since some data, once collected, cannot be reproduced (for example weather data).
Archiving your data is recommended by third-party research funders and scientific publishers to ensure good scientific standards. Sometimes it is even compulsory. There are also benefits for you.
Citation styles differ depending on subject and publisher.
Ask your colleagues or your publisher in advance, or consult a guide to academic writing that covers the conventions of your subject.
An example of how to cite data in a bibliography (following a recommendation by FORCE11):
- Author(s) (Year of Publication): Title of Research Data. Data Repository or Archive. Version. Global persistent identifier (preferably as a URL)
Software can be cited analogously:
- Author(s) (Year of Publication): Title of Software (Version). [Form, e.g. Computer Software] Source as a URL and/or DOI (Date of last retrieval)
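The two patterns above can be turned into small formatting helpers. The following Python sketch is one possible reading of the patterns; the function names and all example values (authors, titles, DOI, URL) are hypothetical, and the punctuation may need adjusting to your publisher's style.

```python
def cite_dataset(authors, year, title, repository, version, identifier):
    """Format a data citation following the pattern given above."""
    names = "; ".join(authors)
    return f"{names} ({year}): {title}. {repository}. Version {version}. {identifier}"


def cite_software(authors, year, title, version, form, source, retrieved):
    """Format a software citation following the pattern given above."""
    names = "; ".join(authors)
    return f"{names} ({year}): {title} ({version}). [{form}] {source} ({retrieved})"


# Hypothetical example values.
print(cite_dataset(
    ["Doe, J.", "Roe, R."], 2020, "Example Survey Data",
    "Example Repository", "1.2", "https://doi.org/10.1234/example",
))
print(cite_software(
    ["Doe, J."], 2021, "ExampleTool", "2.0", "Computer Software",
    "https://example.org/exampletool", "2021-05-01",
))
```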
Further information on this topic can be found here:
- The reference management program EndNote (version X5 onwards) provides citation templates for research data (dataset) and software (computer program).
- If you know the DOI of the data you want to cite, you can use Crosscite to convert it into a full reference in different citation styles. (http://software.ac.uk/so-exactly-what-software-did-you-use)
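A formatted reference can also be requested programmatically through DOI content negotiation, that is, by asking https://doi.org/ for the `text/x-bibliography` media type. The Python sketch below assumes that the registration agency behind the DOI supports this media type and the requested style; the function names and the DOI are hypothetical placeholders. Only the request is built here; `fetch_citation` needs network access and a real DOI.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def build_citation_request(doi, style="apa"):
    """Build a request that asks the DOI resolver for a formatted reference."""
    url = "https://doi.org/" + quote(doi, safe="/")
    # The Accept header selects the output format and the citation style.
    return Request(url, headers={"Accept": f"text/x-bibliography; style={style}"})


def fetch_citation(doi, style="apa", timeout=10):
    """Send the request and return the reference as text (requires network access)."""
    with urlopen(build_citation_request(doi, style), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


# Hypothetical DOI, used only to show what the request looks like.
request = build_citation_request("10.1234/example")
print(request.full_url)
print(request.get_header("Accept"))
```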
Does research data management mean that anybody has unrestricted access to my data, and if so, what can I do if I want to analyse my data first?
You decide who has access to your data, and you set the terms through licenses. In general, you can publish your data after a delay (a so-called embargo) or publish only the metadata.
Please observe the specific conditions and policies of your funders and publishers.
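The difference between an embargo and a metadata-only publication can be sketched as a simple access rule. The Python example below is hypothetical; the class, its fields and the dates are illustrative and do not describe how any particular repository implements embargoes.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Deposit:
    """Hypothetical repository entry; the metadata are always visible."""
    title: str
    embargo_until: Optional[date] = None  # None means no embargo
    metadata_only: bool = False           # True means the data are never released

    def data_accessible(self, today: date) -> bool:
        # The data become accessible once the embargo date has been reached.
        if self.metadata_only:
            return False
        return self.embargo_until is None or today >= self.embargo_until


deposit = Deposit("Example Survey Data", embargo_until=date(2025, 1, 1))
print(deposit.data_accessible(date(2024, 6, 1)))
print(deposit.data_accessible(date(2025, 1, 1)))
```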
Depending on where your data is stored (repository, data journal, …) and on your individual goals, different licensing models are available.
The most widely used models at present are Creative Commons and Open Data Commons.
For a specific recommendation on this issue, please contact Petra Heermann, the library’s contact person for legal issues.
There are legitimate reasons not to publish your data or grant the general public access to it:
- You want to apply for a patent.
- Your data contains confidential or personal information (for example from questionnaires or interviews) that could not be anonymized and/or that your participants have not agreed in writing to have published.
- Your research is funded by a commercial sponsor who did not agree to publication.