Looking at the 5.4.0 schema for citations, I see no way to add a URL for a blog or software citation. Is this data not needed for a citation?
That’s correct. There’s no ‘url’ tag for structured citations. URLs can be included in unstructured citations, though they’re not used for citation matching.
URLs are not persistent (that’s why DOIs exist, after all), so they can’t reliably be used to identify cited works.
Please let me know if you have any further questions.
A more accurate statement is that some URLs are not permanent. As a publisher, we have been maintaining permanent URLs since 2000, and those URLs are used as permanent identifiers on our website. An example is https://eprint.iacr.org/2000/010, which is coming up on 25 years old. It is common for people to cite preprints from this server, and they are referred to by their permanent URL. This Springer article cited some, and the permalinks were reported in their Crossref metadata. Furthermore, a study in 2024 found that over half of the URLs archived by the Internet Archive between 1996 and 2000 are still active. APA style recommends including a URL in a citation in some specific cases when a DOI does not exist (a DOI would almost always be preferred, but some things don’t have DOIs). MLA also recommends including a URL in some cases. One problem with URLs is that the content at a URL may change over time, so what the author saw may not be what a reader sees later. That’s why MLA and APA recommend including the date on which a URL was accessed.
Unfortunately, DOIs are also vulnerable to various forms of “link rot”. A study published in 2022, with an earlier (readable) preprint, found a significant number of inconsistencies in the content returned from a redirected DOI, which casts doubt on the value of the DOI as an identifier of scholarly literature. DOIs suffer from other problems that were reported by Martin Eve. Around 3% of DOIs fail to resolve at any given time.
One application of this citation data is to provide the backwards lookup of citations to a reference. Crossref is probably only interested in doing that for things that have a DOI, but that’s only one reason to collect citation information. Crossref should strive to collect the best metadata possible, and the schema needs improvement in this case. We started reporting URLs in the elocation_id field, since the definition of that field is vague and it was the closest match to the missing URL field. That’s perhaps an abuse of the schema, but that’s what happens when schemas are deficient. The JATS <element-citation> handles this better. Publishers now have access to machine learning tools like Grobid to break down unstructured references into a structured form, and STEM publishers often already collect structured references in BibTeX. We should expect publishers to improve their reporting of citation information in the future, and OJS has announced plans for this. For these reasons, it definitely makes sense to have the option to supply a URL in the structured <citation> element.
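To make the elocation_id workaround concrete, here is a minimal sketch of how such a citation might be generated. It is not an official Crossref example. The helper build_citation, the key "ref1", and the author and title strings are hypothetical placeholders. The child element names other than elocation_id (author, cYear, article_title) are written from memory of the Crossref citation schema, and their order may matter, so the output should be validated against the 5.4.0 XSD before depositing.

```python
import xml.etree.ElementTree as ET


def build_citation(key, author, title, year, url):
    """Build a Crossref-style <citation> element with the URL in elocation_id.

    Hypothetical helper for illustration only; element names and order
    are assumptions to be checked against the schema.
    """
    citation = ET.Element("citation", {"key": key})
    ET.SubElement(citation, "author").text = author
    ET.SubElement(citation, "cYear").text = year
    ET.SubElement(citation, "article_title").text = title
    # Workaround described in this post: there is no <url> element,
    # so the permanent URL is carried in elocation_id instead.
    ET.SubElement(citation, "elocation_id").text = url
    return citation


# Placeholder author and title; only the URL comes from the discussion above.
example = build_citation(
    key="ref1",
    author="Placeholder",
    title="Placeholder title of a cited preprint",
    year="2000",
    url="https://eprint.iacr.org/2000/010",
)
xml_text = ET.tostring(example, encoding="unicode")
print(xml_text)
```

A dedicated URL element would make this unnecessary: a consumer of the metadata currently has no way to tell whether elocation_id holds an article number or a URL without inspecting the value.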