Our guide to sharing and citing data explores what research data are and the many benefits of sharing data. The following steps outline how you can effectively share your data and what to consider along the way.
1. Plan to share: write a data management plan
It’s best practice, and often required by funders, to draw up a data management plan (DMP) right at the start of your research project. The format of these plans varies, but a plan will typically describe how you will collect and securely handle data throughout the project and beyond, as well as who is responsible for them. For an illustration, see this example from the Digital Curation Centre.
Please also think about how you will share your data and detail that in the plan. In particular, you should consider:
- Consent for data sharing: participant consent forms should include permission not only to use participants’ data in your research but also to publish the (anonymized) results.
- Format for sharing: if you decide early on which format you will use to share your data, you can build it into the data collection or processing stages and save time later.
2. Prepare your data for sharing, in line with ethical best practice
Firstly, you will need to ensure that there aren’t any ethical, legal, or security reasons why you shouldn’t make your data available. Even if you can share your data, you may need to anonymize it to protect the privacy of research participants. See our introduction to data sharing ethics.
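As a rough illustration of one preparation step, the sketch below drops direct identifiers from a set of records and replaces the participant ID with a salted hash. The column names (`name`, `email`, `participant_id`) and the function `pseudonymize` are hypothetical, chosen only for this example. Note that this is pseudonymization rather than full anonymization: indirect identifiers left in the data may still allow re-identification, so it is not a substitute for an ethical review of your dataset.

```python
import hashlib
import secrets

# Hypothetical column names, chosen only for illustration.
DIRECT_IDENTIFIERS = {"name", "email"}


def pseudonymize(records, salt, id_field="participant_id"):
    """Drop direct identifiers and replace the participant ID with a salted hash."""
    cleaned = []
    for record in records:
        # Keep every column except the direct identifiers.
        row = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        # Hash the salted ID so the original value is not stored in the output.
        digest = hashlib.sha256((salt + str(record[id_field])).encode("utf-8")).hexdigest()
        row[id_field] = digest[:12]
        cleaned.append(row)
    return cleaned


if __name__ == "__main__":
    # The salt must be kept secret and never deposited alongside the data.
    salt = secrets.token_hex(16)
    raw = [
        {"participant_id": 1, "name": "A. Example", "email": "a@example.org", "score": 7},
        {"participant_id": 2, "name": "B. Example", "email": "b@example.org", "score": 9},
    ]
    for row in pseudonymize(raw, salt):
        print(row)
```

Using the same salt for the whole dataset keeps each participant's records linked to one another, while keeping the salt private makes it harder to recover the original IDs.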
There may be discipline standards about how to format your data and label files for sharing. Please also check whether your chosen data repository has any specific requirements.
As well as the data themselves, you also need to prepare any resources others will need to understand your data. This might include a data dictionary and the details of any software required to view the data.
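A data dictionary can be as simple as a table with one row per variable. The sketch below writes one as CSV; the variables listed, the column headings, and the function name `write_data_dictionary` are all hypothetical, and your discipline or repository may expect a different layout.

```python
import csv
import io

# Hypothetical variables, used only to illustrate the layout.
DATA_DICTIONARY = [
    {"variable": "participant_id", "type": "string", "units": "",
     "description": "Pseudonymous participant code"},
    {"variable": "age_years", "type": "integer", "units": "years",
     "description": "Age at enrolment"},
    {"variable": "score", "type": "float", "units": "points",
     "description": "Total questionnaire score"},
]

FIELDNAMES = ["variable", "type", "units", "description"]


def write_data_dictionary(entries, handle):
    """Write one CSV row per variable, preceded by a header row."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(entries)


if __name__ == "__main__":
    # Write to an in-memory buffer here; pass an open file to save to disk.
    buffer = io.StringIO()
    write_data_dictionary(DATA_DICTIONARY, buffer)
    print(buffer.getvalue())
```

Depositing a file like this next to the dataset lets others interpret each column without having to contact you.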
3. Deposit your data in a repository
A data repository is an online platform for researchers to deposit datasets associated with their work. There is a wide range of data repositories to choose from, so we recommend speaking to your institutional librarian, funder, or colleagues for guidance on choosing a repository that is relevant to your discipline. You can also use FAIRsharing and re3data.org to search for a suitable repository – both provide a list of certified data repositories. Read our full guidance on choosing a data repository.
As well as depositing your data, you will most likely need to supply information about the dataset; this is known as metadata. Metadata can include the date the dataset was created, file type and format, creator of the data, keywords, location, a description of the data and how they were generated, version information, relationships to other digital objects (such as the DOI of the article in which the data are described and analyzed), and other information that might be relevant to your subject area.
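To make this concrete, here is a minimal sketch of a metadata record held as a Python dictionary, with a small check for fields that are still empty. The field names, the `REQUIRED_FIELDS` list, and the placeholder values are assumptions for illustration only; each repository defines its own metadata schema and required fields.

```python
import json

# Hypothetical field names; a real repository defines its own metadata schema.
REQUIRED_FIELDS = (
    "title",
    "creators",
    "date_created",
    "file_format",
    "keywords",
    "description",
    "version",
)


def missing_fields(record):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


if __name__ == "__main__":
    # Placeholder values, not a real dataset.
    record = {
        "title": "Example survey dataset",
        "creators": ["A. Example"],
        "date_created": "2024-01-15",
        "file_format": "CSV",
        "keywords": ["survey", "example"],
        "description": "Placeholder description of the data and how they were generated.",
        "version": "1.0",
    }
    print(json.dumps(record, indent=2))
    print("Missing fields:", missing_fields(record))
```

Running a check like this before deposit is a quick way to catch gaps, although the repository's own submission form remains the authority on what is required.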
We encourage all researchers to consider the FAIR Data Principles when depositing data, to maximize their use. FAIR data are:
- Findable: includes rich metadata and a persistent identifier (e.g. DOI) so that others can easily find and discover it.
- Accessible: the data and metadata are understandable, so that machines and humans can read and process them. The data also need to be held in a trustworthy repository that preserves them in perpetuity.
- Interoperable: the metadata are expressed in a formal, accessible, shared, and broadly applicable language, allowing sharing between different systems.
- Reusable: data have a clear and accessible usage license that specifies the conditions for reuse.
For more information on FAIR data read the FAIR data principles.
An increasing number of funders and publishers have policies which require you to make your data open. You can do this by depositing your data in a public repository under a CC BY, CC0 or equivalent license. This means that not only is the dataset freely available for anyone to access but, crucially, it is also available for anyone to reuse for any lawful purpose. Find out why you should choose open data.
4. Consider publishing a data note
Data notes are a short, peer-reviewed article type that concisely describes research data stored in a repository. They provide information on how the data were collected and validated, and on any conditions of use. They increase the discoverability and transparency of your research, helping you to comply with funder mandates on data sharing and to make your data FAIR.
5. Add a Data Availability Statement (DAS) to your research article
When you’re submitting your paper to a journal with a data sharing policy, you’ll be prompted to include a data availability statement (DAS) in your submission. These statements provide information on where and under what conditions the data directly supporting the publication can be accessed. The aim of such statements is to make data more findable and discoverable.
Please include your data availability statement within the text of your manuscript, before your ‘References’ section. So that readers can easily find it, please give it the heading ‘Data availability statement’.
Before submitting your article, please read our guide to data availability statements, which includes template statements you can use.
6. Share your research code
You may also have created new code during your research, perhaps as a direct output of your work or as a tool to help you analyze the data you’ve collected. You should also consider including code in your data sharing plan, especially if the code you’ve created is required for others to validate your results.
As with sharing other forms of data, sharing your code can make your work more discoverable and reproducible, and help ensure you get credit for a type of research output that often remains behind the scenes. Find out more about sharing your code.
7. Link your data to your article
A Data Availability Statement aids discoverability by connecting your journal article to its associated data. In the same way, you should link your dataset in the repository back to any publications based on it. Once your research has been published, you can add the article’s Digital Object Identifier (DOI) to the metadata of your dataset to establish a permanent link between these two outputs of your research.
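In practice you will usually add the article DOI through your repository's web form, but the idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: the function `link_article`, the `related_identifiers` field, the relation label, and the placeholder DOI `10.1234/example` do not come from any particular repository.

```python
def link_article(record, article_doi):
    """Return a copy of the dataset metadata with the article DOI added."""
    doi = article_doi.strip()
    # Accept a bare DOI, a "doi:" prefix, or a full doi.org URL.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # DOIs begin with the "10." directory indicator and contain a "/".
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"Not a DOI: {article_doi!r}")

    # Hypothetical field and relation names, not a repository schema.
    entry = {"relation": "described_in_article", "identifier": f"https://doi.org/{doi}"}
    related = list(record.get("related_identifiers", []))
    if entry not in related:  # avoid adding the same link twice
        related.append(entry)

    updated = dict(record)
    updated["related_identifiers"] = related
    return updated


if __name__ == "__main__":
    dataset = {"title": "Example survey dataset", "version": "1.0"}
    # Placeholder DOI, not a real article.
    print(link_article(dataset, "doi:10.1234/example"))
```

Storing the DOI as a full `https://doi.org/` URL means that anyone reading the metadata can follow the link directly to the article.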