Research data repositories

A publisher may ask you to publish your research data or code when you submit a journal article, or you may wish to preserve your research data or research process beyond the lifetime of your project. In both cases, repositories provide the best option for publishing, preserving, and sharing your research data.

The UKRI Common principles on research data recommend that research data should be made openly available with as few restrictions as possible in a timely and responsible manner. Typically, this means that research data should be deposited in an appropriate repository, as open as possible and as restricted as necessary, no later than the time of publication of the associated research output or the end of the project.

Preserving your research data in a repository is a necessary first step towards data sharing, but it is still important to preserve your research data even if you do not plan to or cannot share them with others.

On this page you’ll find guidance on how to publish, preserve and share your research data via a data repository, how to find a data repository, and how to assess the suitability of a repository. 

You will also find information on publishing your data via a data journal.

To find out how to publish and share your code, read GitHub’s guidance on using the Zenodo repository to make your code citable.

How to use a repository

When depositing research data for publication, preservation, and sharing, you will need to:

  1. Choose which research data or research materials you need to publish or preserve (see our guidance on appraising and selecting data for preservation)
  2. Choose an appropriate repository to preserve, publish and share your research data (see our guidance on finding and assessing research data repositories below)
  3. Prepare your research data by organising your files in a meaningful structure and converting your files to open or common formats, to maximise the longevity and reusability of your research data (see our guidance on preparing data for preservation)
  4. Create metadata to accompany your deposited datasets in line with any repository metadata requirements, to enable your research data to be discoverable, understandable and reusable (see our guidance on describing your data)
  5. Decide how open or closed your data will be. You will need to check that the data are shared in compliance with funders’ data sharing requirements and any terms of participants’ consent, and that permission to deposit and share has been granted by all stakeholders in the IPR of the data, as well as by the rights holders of any third-party material. You may need to split your data into several deposits with different levels of access, or publish a metadata-only record (see our guidance on restricting access to data)
  6. Upload your data to your chosen repository and generate a Digital Object Identifier (DOI)
  7. Select an appropriate reuse licence (see our guidance on licensing your work)
  8. Add an embargo on open publication of the data until the publication date of any associated research output if your data support a publication, or if you have exclusive use of the data for a specified period
  9. Link the data deposit to its associated publication (if applicable) using the publication’s DOI, and include the dataset’s DOI in your article’s data access statement (see our guidance on writing a data access statement)
  10. Register your dataset as an output in the Westminster Virtual Research Environment
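
As an illustration only, some of the steps above can be checked with a short script before you deposit. The sketch below is a minimal example in Python, not a tool provided by any repository: the list of required metadata fields, the `Deposit` class, the helper functions and the DOI shown are all hypothetical placeholders, and you should substitute your chosen repository's actual metadata requirements.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical list of required descriptive fields (step 4).
# Replace it with the metadata requirements of your chosen repository.
REQUIRED_FIELDS = ("title", "creators", "publication_year", "licence")


@dataclass
class Deposit:
    """Illustrative record of a planned deposit; not a real repository API."""
    metadata: dict
    dataset_doi: Optional[str] = None     # issued by the repository on upload (step 6)
    embargo_until: Optional[date] = None  # end date of any embargo (step 8)


def missing_fields(deposit: Deposit) -> list:
    """Return the required metadata fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not deposit.metadata.get(name)]


def is_open(deposit: Deposit, today: date) -> bool:
    """True if there is no embargo or the embargo has ended."""
    return deposit.embargo_until is None or today >= deposit.embargo_until


def doi_url(doi: str) -> str:
    """Turn a DOI into a resolvable link for a data access statement (step 9)."""
    return f"https://doi.org/{doi}"


# Example with placeholder values; the DOI below is invented for illustration.
deposit = Deposit(
    metadata={
        "title": "Interview transcripts",
        "creators": ["A. Researcher"],
        "publication_year": 2024,
    },
    dataset_doi="10.1234/example-dataset",
    embargo_until=date(2025, 1, 1),
)

print(missing_fields(deposit))             # fields still to complete
print(is_open(deposit, date(2024, 6, 1)))  # is the data openly available on this date?
print(doi_url(deposit.dataset_doi))        # link to cite in the data access statement
```

In this example no reuse licence has been selected yet (step 7), so `licence` is reported as missing, and the dataset stays under embargo until the date given.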

How to find a suitable data repository

The best repository for your research data will usually be a national data centre or a discipline-specific repository, because these have the expertise and resources to deal with particular types and sizes of data.

If you are depositing and publishing your research data to accompany a journal article, you should check whether your publisher recommends a particular repository. For example, Scientific Data (Nature journals) maintains a list of recommended data repositories, and PLOS hosts a list of recommended repositories by discipline. A number of journals support the use of the cross-disciplinary repositories Dryad, Figshare and Zenodo (a repository that specialises in preserving software and code).

You should also check whether your discipline recommends or mandates the use of specific repositories. For some data types, such as genetic sequences and protein structures, you must deposit the data in GenBank and Protein Data Bank, respectively.

Find a data repository relevant to your discipline at the Registry of Research Data Repositories (re3data).

If your research has been funded by a research council or funding body, you should check whether your funder mandates or recommends the use of a specific repository.

If you wish to deposit your research data or research materials in the University of Westminster’s institutional repository, please contact [email protected].

How to assess the suitability of an external data repository

There are a number of things to consider when selecting a suitable repository to preserve and publish your research data:

  • What type of data does the repository accept and what is its subject focus?
  • Does the data repository have a good reputation in your field, and is it recommended by your funder or journal?
  • Will the repository provide enough metadata to enable your data to be discovered and cited by other researchers?
  • Will the repository issue your data with a persistent identifier, such as a Digital Object Identifier (DOI) or an accession number, that you can include in your data access statement? A search for repositories in re3data allows you to tick a box restricting results to those that provide persistent identifiers.
  • Are access restrictions or embargoes permitted? Will the repository ensure that confidential or personal data are secured if that is required?
  • Do the repository's terms and conditions fit with the University's Intellectual Property policy (intranet link; login required)? For example, does the repository require that you assign any copyright in the data to the repository? We recommend avoiding repositories that require a transfer of rights.
  • What licences are available and do they comply with the University's Research Data Management Policy?
  • Is the repository established and well funded, so that you can rely on it still preserving your data in 10 years' time (the minimum retention/preservation period specified by most research funders)?

If you are considering using an external data archive and need advice on its suitability, please contact [email protected].

Publishing your data or metadata in a data journal

Open data journals publish peer-reviewed datasets and descriptions of datasets (metadata records). Publishing a description of your dataset (a metadata-only record) is a good way to demonstrate transparency and integrity in research when the dataset itself is too sensitive for open publication.

There are subject-specific data journals, such as the Journal of Open Psychology Data and Open Health Data, and multidisciplinary data journals, such as GigaScience (Oxford University Press) and Scientific Data (Springer Nature).

You can find guidance on how to write and publish a description of your dataset as a data note on Wellcome Open Research.

To find an open data journal relevant to your discipline, search the list of peer-reviewed open data journals hosted by the University of Edinburgh.

More information

For further guidance on choosing and using repositories, see the Digital Curation Centre's checklist on where to keep your research data.

We acknowledge the work of the University of Bath, the University of Sheffield and the University of Southampton in the development of this guidance.

Contact us

For further guidance and support, please contact the research data management officer at [email protected].