ELI5: What’s a DOI? - Membership Ticket of the Month - October 2024

Hello again!

A publisher recently asked if we had a super basic explanation of what DOIs are and what they do. They wanted to include this with their instructions for authors, to encourage them to include DOIs with their references when they submitted manuscripts for peer review. (Side note: authors can get help with that using our very user-friendly Simple Text Query tool.) After polling the Crossref team, we realized that we didn’t have exactly this sort of explanation and a lot of what can be found in our documentation assumes an existing level of familiarity with the concept of the DOI. Here, I’ll try to offer a simple primer on the what, why, and how of DOIs. I’ve also dropped a tl;dr at the bottom of this post if you don’t have a lot of time.

First, let’s think about the problem: around the year 2000, scholarly publishing of peer-reviewed articles and books was increasingly going online. To make access to web-based resources easier and quicker, publishers started to include cited articles’ URLs with their references, like in this citation of an article in the Electronic Journal of Communication from an article in the journal First Monday:

S. C. Herring, 1993. “Gender and Democracy in Computer Mediated Communication,” Electronic Journal of Communication, volume 3, number 2, at http://www.cios.org/www/ejc/v3n293.htm

The presence of the URL (http://www.cios.org/www/ejc/v3n293.htm) in the reference makes it easier for the reader who may wish to jump from the First Monday article they’re accessing to the Electronic Journal of Communication article it cites. If the link were not there, the reader would need to copy the article details and paste them into a search engine to try to locate the article.

There is a problem with this: links rot. Maybe a publisher updates their hosting platform, which results in new URLs for all its articles, or maybe the journal is sold to another publisher entirely, resulting in the articles moving to a whole new website! Because content moves around online, links change. If an article moves to a new location, then the old URL may simply stop working. Whose responsibility, then, would it be to update the URL in the First Monday reference section if the article in the Electronic Journal of Communication changed its URL? Would the Electronic Journal of Communication need to email the editor of every journal citing this article, or would the First Monday editor need to proactively check every URL in every one of their articles every so often? Or would First Monday rely on readers spotting broken links and asking them to update the URL in the reference section every time? And what if the URL has been published in the print version of a journal that’s published both online and on paper? Does every updated URL mean that a reprint of the text will be required?

DOIs solve this problem.

A DOI is a digital object identifier. Each DOI is a unique string of characters that identifies some digital object, like a journal article, book, report, or dataset. For example, the First Monday article mentioned above has been assigned the DOI 10.5210/fm.v3i9.617. Each digital object should have only one DOI assigned to it, and only the digital object’s publisher or an organization designated by the publisher has the right to register a DOI for that object. Publishers register DOIs with Registration Agencies (also called RAs), of which Crossref is one.

However, a DOI isn’t only that string of characters. By itself 10.5210/fm.v3i9.617 tells us almost nothing. The other part of the DOI, the powerful but invisible part, is the metadata.

When a publisher registers a DOI for an article in one of their journals, they send their RA a lot of additional information – metadata – about the digital object, including (but not limited to): its title, who wrote it, when it was published, which volume/issue it was part of, and its page numbers. Perhaps most importantly, publishers also include the current URL where the article can be found.

Once a DOI has been registered, anyone can access the URL by resolving the DOI: https://doi.org/10.5210/fm.v3i9.617. If you click this link, it takes you to the article’s current URL: https://firstmonday.org/ojs/index.php/fm/article/view/617. Your browser is sent to this URL because the Handle System, which underpins DOIs, looks up the current location that the publisher provided in the metadata for this paper.
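The resolver link above follows a simple pattern: https://doi.org/ followed by the DOI string. Here is a minimal Python sketch of that pattern. The function name doi_to_url and the list of prefixes it strips are my own choices for illustration, not part of any Crossref tool.

```python
from urllib.parse import quote

RESOLVER = "https://doi.org/"

# Prefixes people commonly paste along with a DOI (an assumption for this sketch).
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Turn a DOI string into a resolvable https://doi.org/ link."""
    doi = doi.strip()
    # Accept input that already carries a resolver or "doi:" prefix.
    for prefix in KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return RESOLVER + quote(doi, safe="/")


print(doi_to_url("10.5210/fm.v3i9.617"))
```

A publisher's reference-formatting script could use something like this to turn the bare DOIs authors supply into clickable links.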

Once a DOI string is registered it cannot be changed; DOIs are intentionally persistent identifiers. 10.5210/fm.v3i9.617 will forever be the only DOI for the article “Predicting E-mail Effects in Organisations” from volume 3 of First Monday. However, all of the metadata associated with a DOI can be updated. The publisher can always make corrections, add more details to the metadata and, critically, change the article’s URL. For example, when 10.5210/fm.v3i9.617 was first registered with Crossref, it led readers to the article’s old home, http://journals.uic.edu/ojs/index.php/fm/article/view/617. Some time later, the publisher updated the URL to the current location: https://firstmonday.org/ojs/index.php/fm/article/view/617. If the link ever changes in the future, the publisher can just update the metadata again! But notice, even as the URL associated with the DOI changes, the DOI itself (10.5210/fm.v3i9.617) has not changed.
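If it helps to see that idea in code, here is a toy Python model of it. This is only a sketch: the ToyRegistry class and its method names are invented for illustration and are not how Crossref or the Handle System are actually implemented. The point is just that the DOI is a fixed key, while the URL stored under it can be replaced.

```python
class ToyRegistry:
    """Toy stand-in for a registration agency's DOI-to-URL lookup table."""

    def __init__(self):
        self._records = {}

    def register(self, doi, url):
        # A DOI can be registered only once; the string itself never changes.
        if doi in self._records:
            raise ValueError(f"{doi} is already registered")
        self._records[doi] = {"url": url}

    def update_url(self, doi, new_url):
        # Only the metadata changes; the DOI key is never rewritten.
        self._records[doi]["url"] = new_url

    def resolve(self, doi):
        return self._records[doi]["url"]


registry = ToyRegistry()
doi = "10.5210/fm.v3i9.617"

# First registration points at the article's old home.
registry.register(doi, "http://journals.uic.edu/ojs/index.php/fm/article/view/617")
old_location = registry.resolve(doi)

# Later, the publisher updates the URL. The DOI stays exactly the same.
registry.update_url(doi, "https://firstmonday.org/ojs/index.php/fm/article/view/617")
new_location = registry.resolve(doi)

print(doi, "->", new_location)
```

Anyone who cited the DOI before the move still lands on the article afterwards, because the citation holds the key and not the URL.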

As you can see, the ability to update metadata for a DOI allows publishers to include DOIs in their references instead of more easily broken URLs, lessening the problem of broken links and making access to cited references smoother.

DOIs are a solution to the problem of broken links, but they are also much more than that. Here are a few more powerful features of and uses for Crossref DOIs and their metadata:

  • References: when publishers submit an article’s or book’s list of references as part of their metadata, our systems match the cited and citing articles, creating a linked relationship between the article and each of the works it cites.
  • Author ORCID iDs: ORCID iDs are persistent identifiers for researchers, similar to how DOIs are persistent identifiers for research works (here’s mine: https://orcid.org/0000-0002-4032-3117). Like a DOI, an ORCID iD doesn’t change. If you change your name or use variations in your name across different publications (like C. Knopp-Schwyn versus Collin Knopp-Schwyn versus Lin Knopp-Schwyn), you will always have the same ORCID iD. When authors include these IDs in their manuscripts and publishers submit those ORCID iDs with their DOI metadata deposits, the published article information can automatically be sent to your central ORCID profile (with your permission, of course), saving you the time of updating it manually for every work you publish.
  • Affiliations and ROR IDs: ROR IDs are persistent identifiers for organizations involved in the research process, such as universities, archives, companies, research funders, institutes, and nonprofits (here’s Crossref’s: https://ror.org/02twcfp32). When publishers include ROR IDs with a DOI’s metadata it can help make tracking the research outputs for a specific entity much easier. For instance: the Université Djilali de Sidi Bel Abbès in Algeria is also called Djillali Liabes University and Djillali Liabes University Sidi Bel Abbès in English, or Université Djillali Liabes and Université de Sidi Bel Abbés in French, or جامعة جيلالي ليابس-سيدي بلعباس in Arabic. An administrator working at the Université Djilali de Sidi Bel Abbès might have a lot of work ahead of them trying to track all the publications by scholars affiliated with the university under so many different names! But when the institution’s ROR ID is included alongside the affiliated author’s name in the Crossref metadata, it can make tracking research outputs by institutions much simpler.
  • Grant Linking System: funding organizations such as foundations and government agencies are able to register DOIs for the grants they award, so researchers can pass these identifiers along to their publishers. Those publishers can then, in turn, include them in the metadata for the related research output, making it really easy for funders to track the impact of the research they fund.
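The ORCID iD and ROR ID items in the list above rest on the same idea: group by the identifier, not by the name. Here is a small Python sketch of why that matters, using the name variants and ORCID iD mentioned above. The three records and their titles are placeholders I made up for illustration, not real publications, and the group_by helper is hypothetical.

```python
from collections import defaultdict

ORCID = "https://orcid.org/0000-0002-4032-3117"

# Hypothetical records: the titles are placeholders, not real publications.
records = [
    {"title": "Placeholder work A", "author": "C. Knopp-Schwyn", "orcid": ORCID},
    {"title": "Placeholder work B", "author": "Collin Knopp-Schwyn", "orcid": ORCID},
    {"title": "Placeholder work C", "author": "Lin Knopp-Schwyn", "orcid": ORCID},
]


def group_by(records, key):
    """Collect work titles under each distinct value of the given field."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["title"])
    return dict(groups)


by_name = group_by(records, "author")   # splits one person into three
by_orcid = group_by(records, "orcid")   # keeps all three works together

print(len(by_name), "groups when matching on name")
print(len(by_orcid), "group when matching on ORCID iD")
```

The same approach works for institutions: swap the author name for the affiliation string and the ORCID iD for the ROR ID.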

These are just a few of the features and uses of Crossref DOIs, but there are many more. Plus, the fact that all of this metadata is freely and openly available for anyone to access and analyze themselves means that people are finding new ways of studying, tracking, and building upon scholarly outputs all the time. And the fact that we collect this metadata in a standard, machine-readable way means that it can be used by hundreds of tools and services in the scholarly ecosystem too - you can read more here if you’re interested.
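For the curious, here is a rough Python sketch of what "machine-readable" looks like in practice, using Crossref's public REST API at api.crossref.org. The function names, the User-Agent string, and the timeout are my own assumptions for this example, and the code assumes the API's JSON response wraps the record in a "message" field.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

API_BASE = "https://api.crossref.org/works/"


def metadata_url(doi: str) -> str:
    """Build the REST API address for a single DOI's metadata record."""
    return API_BASE + quote(doi, safe="/")


def fetch_metadata(doi: str, timeout: float = 10.0) -> dict:
    """Download and return the metadata record for one DOI (needs network access)."""
    request = Request(
        metadata_url(doi),
        # Placeholder User-Agent for this sketch; use one that identifies your tool.
        headers={"User-Agent": "doi-primer-example/0.1"},
    )
    with urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # Assumption: the record itself sits under the "message" key.
    return payload["message"]


print(metadata_url("10.5210/fm.v3i9.617"))
```

Calling fetch_metadata("10.5210/fm.v3i9.617") should return a dictionary with fields such as the title, authors, and URL registered for that article.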

tl;dr: DOIs are persistent identifiers for research works like articles and books. The metadata associated with DOIs can be updated without updating the DOI itself, which helps DOIs solve the problem of broken links between cited works and the materials that cite them. The metadata associated with DOIs can also be used for other helpful purposes, like tracking citations, author outputs, and research outputs by university!

Want to know more? Check out this DOI primer from the DOI Foundation and this document about the process of submitting metadata to Crossref.

If you have any questions, please comment here or send us a message at member@crossref.org!

Kindly,
—Collin
