Get and sync all Crossref metadata

Would you like to have all Crossref metadata? No problem! You can. There are a few questions you might want to ask yourself first, and below we look at how you can get your hands on it.

First, why do we give our data away like this? Couldn’t we get more value from limiting access? To put it simply, it’s part of our mission. We want the metadata we collect to be used as widely as possible. The value of Crossref lies in our community, not in the ability to collect and hoard data. The larger our community is, the closer we move towards fulfilling our mission. That community includes the people and organisations that use metadata, as well as those who provide it.

Do I need all of the data?

There are some good reasons to access all of Crossref’s metadata:

  1. To combine it with metadata from other sources to build products and services, or to carry out research.
  2. To interrogate the data in complex or unusual ways that aren’t possible using the feature set of Crossref APIs.
  3. To use metadata at very high frequency and/or to protect your product or tool from depending on the stability of Crossref’s services.

There are also a few reasons you won’t need all of the data:

  1. For a one-off research task that uses a subset of the data. You can often get what you need from APIs.
  2. Testing and prototyping new products and services. Having all of the data is likely to be overkill.
  3. Live products with a low or medium request rate. You might find that Metadata Plus provides enough stability and volume if you outgrow the polite pool.
  4. You’re bored. Do you really need that much data? Our latest snapshot is 208 GB in a compressed format. It takes some commitment, planning, and resources to handle the whole dataset.

Where can I find the data?

If you decide you would like a complete copy, you can start a local cache with a snapshot or public data file. The public data file is free, and we’ve just released the 2026 edition. Monthly snapshots are available to Metadata Plus subscribers.

How can I sync my data?

To keep your copy of the metadata updated, there are a few options depending on your needs.

To get literally everything, use the from-index-date and until-index-date filters in the REST API. This will get you all changes and, despite what the name suggests, can be run using timestamps down to a resolution of one second. In contrast with the other date filters, the index date is updated whenever a record’s metadata changes, whether the change comes from members, Crossref, or external sources.
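As a rough illustration of the index-date approach, here is a minimal Python sketch that builds the filter for a time window and pages through the results with a cursor. The helper names (index_date_filter, works_url, changed_works) are made up for this example, and the timestamp format and the use of UTC are assumptions, so check them against the REST API documentation before relying on them.

```python
import json
from datetime import datetime
from typing import Iterator, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://api.crossref.org/works"
# Assumption: one-second timestamps are written in this format, in UTC.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def index_date_filter(start: datetime, end: datetime) -> str:
    """Build the filter value for records indexed between start and end."""
    return (
        f"from-index-date:{start.strftime(TIMESTAMP_FORMAT)},"
        f"until-index-date:{end.strftime(TIMESTAMP_FORMAT)}"
    )


def works_url(filter_value: str, cursor: str = "*", rows: int = 1000,
              mailto: Optional[str] = None) -> str:
    """Build a /works URL; cursor="*" asks for the first page of a deep-paging run."""
    params = {"filter": filter_value, "rows": rows, "cursor": cursor}
    if mailto:
        # Supplying a contact address puts requests in the polite pool.
        params["mailto"] = mailto
    return f"{API_URL}?{urlencode(params)}"


def changed_works(start: datetime, end: datetime,
                  mailto: Optional[str] = None) -> Iterator[dict]:
    """Yield every record indexed in the window, following next-cursor until a page is empty."""
    cursor = "*"
    while True:
        url = works_url(index_date_filter(start, end), cursor=cursor, mailto=mailto)
        with urlopen(url) as response:
            message = json.load(response)["message"]
        if not message["items"]:
            return
        yield from message["items"]
        cursor = message["next-cursor"]
```

A sync job might call changed_works with the end of its previous window as the new start, then overwrite each local record by DOI. Retries, rate limiting, and error handling are left out to keep the sketch short.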

Most reindexing is triggered by changes to citation counts, so if you don’t need those, the from-update-date and until-update-date filters will get you all works redeposited by members. Note that the timestamp here is the deposit time, and a deposit can take up to a few hours to appear in the REST API. We recommend a 24-hour delay to be sure you’ve captured everything.

If you’re only interested in knowing what’s new, use the from-created-date and until-created-date filters. Here too, we recommend a 24-hour delay to ensure you have all of the records.
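One way to apply the 24-hour delay is to only ever request whole days that ended at least 24 hours ago. The sketch below computes such a window for either the update-date or created-date filters. The function name delayed_filter and its parameters are hypothetical, and it assumes that day-level date filters are inclusive at both ends and that dates are interpreted in UTC.

```python
from datetime import date, datetime, timedelta, timezone
from typing import Optional


def delayed_filter(kind: str, last_synced: date,
                   now: Optional[datetime] = None,
                   delay_hours: int = 24) -> Optional[str]:
    """Return a filter covering the days after last_synced that ended at
    least delay_hours ago, or None if no such day exists yet."""
    if kind not in ("update", "created"):
        raise ValueError("kind must be 'update' or 'created'")
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=delay_hours)
    # The day containing the cutoff is not finished relative to it,
    # so the last safe day is the one before.
    until = cutoff.date() - timedelta(days=1)
    start = last_synced + timedelta(days=1)
    if start > until:
        return None  # nothing new is old enough to fetch safely
    return (
        f"from-{kind}-date:{start.isoformat()},"
        f"until-{kind}-date:{until.isoformat()}"
    )
```

The returned string can be passed as the filter parameter of a /works request. Storing the until date after each successful run gives the next run its last_synced value.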