How to anonymize data
Take care to anonymize any data that may otherwise identify study participants. The UK Data Service best practice guide to managing and sharing data has lots of good advice about how to anonymize your data, including:
Remove anything that identifies the subject – this might include names, addresses, workplaces, occupations, or salaries.
Take out unnecessarily precise information – for example, you can replace people’s dates of birth with their ages.
Generalize where you can – for example, you can replace people’s specific areas of expertise with more general categories.
Use pseudonyms – you can use fictitious names in place of people’s real names.
Avoid listing the exact upper or lower extremes of variables – grouping extreme values into a range disguises outliers, such as an unusually high salary.
Pay special attention to relational data where relationships between variables in datasets could reveal identities and where geo-referenced data and spatial references may reveal location.
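As a rough illustration of how several of the steps above might be applied to a single tabular record, here is a minimal Python sketch. The field names, the expertise mapping, and the salary cap are all hypothetical and chosen only for this example; a real dataset needs its own review of which fields identify participants.

```python
from datetime import date

# Hypothetical field names for direct identifiers (illustration only).
DIRECT_IDENTIFIERS = {"name", "address", "workplace"}

# Hypothetical mapping from a specific area of expertise to a broader category.
BROAD_EXPERTISE = {
    "paediatric cardiology": "medicine",
    "contract law": "law",
}


def age_on(date_of_birth, reference_date):
    """Return the age in whole years on reference_date."""
    years = reference_date.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (reference_date.month, reference_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def anonymize_record(record, pseudonym, reference_date, salary_cap):
    """Return an anonymized copy of one participant record."""
    # Remove direct identifiers.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Use a pseudonym in place of the real name.
    out["pseudonym"] = pseudonym
    # Replace the precise date of birth with an age.
    out["age"] = age_on(out.pop("date_of_birth"), reference_date)
    # Generalize expertise; anything unmapped becomes "other".
    out["expertise"] = BROAD_EXPERTISE.get(out["expertise"], "other")
    # Top-code salary: values above the cap are reported as the cap.
    out["salary"] = min(out["salary"], salary_cap)
    return out


record = {
    "name": "Jane Doe",
    "address": "1 High Street",
    "workplace": "Acme Ltd",
    "date_of_birth": date(1980, 6, 15),
    "expertise": "contract law",
    "salary": 250000,
}
print(anonymize_record(record, "Participant 01", date(2024, 1, 1), 100000))
```

This sketch does not address relational or geo-referenced data, which usually need case-by-case judgment rather than a fixed rule.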
How to manage data anonymization
These 3 tips can help you manage the process of anonymizing your data.
Plan how you will anonymize your data early in the research process, ideally while you are still collecting it – leaving it until later can prove time-consuming and costly.
Keep the original data separate and secure
It is essential to keep a copy of the original data for your own use and make a record of all the information that has been removed in the process of anonymization. Always store this information separately from the final anonymized data files and ensure that it is secure.
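One possible way to keep a record of what was removed is to write a pseudonym key to a different location from the shareable file. The sketch below assumes records are simple dictionaries with a hypothetical `name` field, and the function names, file names, and directories are invented for illustration. It only separates the files; securing the key (for example with encryption or access controls) is outside its scope.

```python
import json
from pathlib import Path


def pseudonymize(records, name_field="name"):
    """Split records into anonymized rows and a separate re-identification key."""
    key = {}  # pseudonym -> original name: the record of what was removed
    anonymized = []
    for index, record in enumerate(records, start=1):
        pseudonym = f"P{index:03d}"
        key[pseudonym] = record[name_field]
        row = {k: v for k, v in record.items() if k != name_field}
        row["pseudonym"] = pseudonym
        anonymized.append(row)
    return anonymized, key


def save_separately(anonymized, key, shareable_dir, secure_dir):
    """Write the anonymized data and the key to two different directories."""
    shareable_dir, secure_dir = Path(shareable_dir), Path(secure_dir)
    # Refuse to store the key next to the anonymized data.
    if shareable_dir.resolve() == secure_dir.resolve():
        raise ValueError("Store the key apart from the anonymized data")
    shareable_dir.mkdir(parents=True, exist_ok=True)
    secure_dir.mkdir(parents=True, exist_ok=True)
    (shareable_dir / "anonymized.json").write_text(json.dumps(anonymized, indent=2))
    (secure_dir / "key.json").write_text(json.dumps(key, indent=2))
```

Only the contents of the shareable directory would then be deposited or passed on; the key stays with the original data.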
Be transparent about where you’ve anonymized data
When you remove content and replace it with generalized information, mark this in an obvious way. For example, show that you have edited interview text with brackets or use markup tags.
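For interview text, the bracket convention could be applied along the following lines. This is a minimal sketch: the function name and the example terms are hypothetical, and simple string matching will miss misspellings or indirect references, so the output still needs a manual read-through.

```python
import re


def mark_redactions(text, replacements):
    """Replace each identifying term with a bracketed, generalized description."""
    for original, general in replacements.items():
        # Match the term as a whole word or phrase, ignoring case.
        pattern = re.compile(r"\b" + re.escape(original) + r"\b", flags=re.IGNORECASE)
        # A function replacement avoids backslash handling in the replacement text.
        text = pattern.sub(lambda _match, g=general: f"[{g}]", text)
    return text


edited = mark_redactions(
    "I joined Acme Ltd in Leeds.",
    {"Acme Ltd": "employer", "Leeds": "city in northern England"},
)
print(edited)  # I joined [employer] in [city in northern England].
```

The brackets make it obvious to a later reader which parts of the transcript are the participant's own words and which were substituted during anonymization.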
Control access to your data
We support the principle that research data should be as open as possible but as closed as necessary.
You may want to make sensitive data available only to third parties who have a legitimate reason to access it and who you are confident will treat it carefully.
In these instances, it is still possible to deposit your data in a repository but restrict access to it. This might mean that the files are private, but you can share access with others if certain requirements are met. You may also want to set different privacy settings for different components of your data. Some of the generalist repositories offering this type of functionality include Figshare, Zenodo, and OSF.
If you want more details about this, read our guide to data repositories.
Important: in some cases you should not share your data with third parties – read on to find out more about this.