Top 7 reasons why some researchers hold back from sharing their data
1. I don’t have any data to share
It is a common misconception that the term ‘research data’ only refers to spreadsheets of numbers. In fact, data are any information collected or generated during your research that supports your results. This can include anything from methodologies and protocols to lab books, sequences, or code.
In areas such as the Humanities or Social Sciences, what constitutes research data may not be so immediately obvious. However, think about the information you have created and saved during the process of your research, such as interview recordings and transcripts, field notes, annotations, and sketches. All of this is data and can potentially be shared, as long as there are no ethical or legal reasons why you shouldn’t. For more, see What are research data?
Some researchers worry that if they share their research data before they’ve had a chance to publish any analysis, other researchers may use those data to publish first and steal the credit. If this is a big concern in your area of research, then you may wish to restrict or embargo access to your data for a period.
However, sharing your data early does allow you to plant your flag in a research area. Even if you’re not ready to publish conclusions based on a dataset, you can further strengthen your association with it by publishing a data note that describes how the data were collected and the conditions of use.
“It is also good to remember that if someone uses the data, they need to cite that prior work. Citations are the metric of impact in science; more citations promote the research.”
If your dataset includes sensitive information, especially if it involves human subjects, it is legitimate to be concerned about the ethical, legal, or security issues that may prevent you from sharing it.
In some instances, research data cannot be shared publicly due to risk of violating privacy. However, even highly sensitive information might be shared if you follow the right steps. These include:
4. If I deposit my dataset in a repository, I’ll lose control of it
While we’re great supporters of making data as open as possible, there will be times when it’s right to share data on a more restricted basis. This might be because of the sensitive nature of the data or for commercial reasons. However, hosting your dataset in a repository needn’t mean that you no longer have control over who can access it or what they do with it.
Many repositories will allow you to deposit your data but limit access to them, either permanently or following an embargo period. Some of the generalist repositories offering this type of functionality include Figshare, Zenodo, and OSF. Read our guide to data repositories for more details.
Repositories also often give you the option to select from a range of licenses, which allow you to control the extent to which others can reuse your data and whether they need to credit you as the creator of those data. So, for example, if you wish to prevent others from using your data for commercial purposes, you can do so by selecting a Creative Commons Attribution-NonCommercial license.
If you do decide to restrict the access or reuse terms for your data, please note that this may conflict with the data policy of your funder or publisher, unless there are legitimate reasons why you are unable to make your data more available. Please also explain any restrictions in the Data Availability Statement included in your article.
5. Sharing my data would disclose my identity during anonymous peer review
Some journals operate a double-anonymous peer review process. This means that reviewers shouldn’t know the identity of a paper’s author. Researchers submitting to one of these journals are required to make sure that their paper is fully anonymized. But what if the data cited in the article reveal the researcher’s identity?
Fortunately, there are options that will allow you to share your data in a repository and then submit to a journal with double-anonymous peer review. For example, you can use the repository Figshare to generate a ‘private sharing link’ for free. This feature is designed especially for anonymous peer review and doesn’t include the author field or any non-Figshare branding. Dryad is another (paid-for) alternative which allows you to make your data temporarily “private for peer review.”
The main steps that may take time are preparing your data for sharing, selecting a suitable repository, and depositing your data in your chosen repository. However, there are ways of doing each of these efficiently:
Preparing your data for sharing
How long this takes will depend on the size and nature of your data. If, for example, your data need to be anonymized first, this is a process you should carry out very carefully. But whatever your data, preparation time will be much reduced if you plan ahead. Create a data management plan at the beginning of your project that includes details about the data format you’ll want to deposit. That way you’ll be able to arrange your data accordingly while they are being collected or processed. Some repositories will also provide curation services to get your data into the best shape for discoverability and reusability.
While some of these actions to prepare data for sharing can seem time-consuming, they may save time in the long run if you receive requests from readers to access the data in the future. Even if you cannot share your data publicly, having clearly marked files and folders with the final versions of your data will be valuable in answering questions in the years to follow, especially as you move on to other research projects.
Choosing a repository
There are many different repositories to choose from, so it may take a little time to select the one that’s right for you, especially if you’ve not used repositories before. You can, though, save time by talking to your library, data specialists at your institution, or your peers to get their views on the repositories in your field, and by reading our guide to data repositories.
Depositing your data in your chosen repository
Make this part of the process as smooth as possible by gathering everything you need before getting started. As well as depositing your data, you may be required to supply information about the dataset (metadata), such as when it was created, keywords, and any associated publications. For more details see How to share your research data.
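If you are comfortable with a little scripting, one way to gather this information ahead of time is to keep it in a small, structured file alongside your data. The Python sketch below is only an illustration: the function name, field names, and example values are hypothetical and do not correspond to any particular repository’s requirements, so check what your chosen repository actually asks for.

```python
import json
from datetime import date


def build_metadata(title, created, keywords, related_publications):
    """Assemble a simple description of a dataset.

    The field names here are illustrative assumptions, not a
    repository standard.
    """
    if not title.strip():
        raise ValueError("title must not be empty")
    return {
        "title": title.strip(),
        # Store the creation date in ISO 8601 form (YYYY-MM-DD).
        "created": created.isoformat(),
        # Tidy keywords: trim spaces, lowercase, drop duplicates, sort.
        "keywords": sorted({k.strip().lower() for k in keywords if k.strip()}),
        "related_publications": list(related_publications),
    }


# Hypothetical example values, for illustration only.
record = build_metadata(
    title="Interview transcripts, pilot study",
    created=date(2023, 5, 14),
    keywords=["Interviews", "transcripts", "interviews "],
    related_publications=[],
)

print(json.dumps(record, indent=2))
```

Keeping a record like this up to date as the project progresses means the details are ready to copy into the repository’s deposit form when the time comes.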
7. Sharing my data will be expensive
Aside from the cost of your own time used to prepare and deposit your data, it doesn’t need to be an expensive process. While some data repositories do charge, especially if they curate or validate the data, there are also plenty of repositories which are free to use. Read our guide to data repositories for more details.
If you’re working on large datasets, you may need to set a budget aside for data preservation and storage while the project is running. Be sure to consider this as part of your data management plan and discuss options at your institution with your library or data specialists.
Although sharing your data is another thing to add to the researcher’s already very long to-do list and may use up some of your budget, it’s undoubtedly a worthwhile use of your time and money. As our list of the benefits of data sharing demonstrates, it’s a process that can bring many returns.
What to read next
We hope this run-down of some common data sharing myths has addressed any concerns you may have had. Now it’s time to find out how to share your research data with our step-by-step guide.