Ticket of the month - March 2022 - Getting started with REST API queries

Our REST API is largely used by machines to retrieve metadata. But there are also manual queries that you or I can perform to retrieve metadata. Let’s take a look at a common question our support team receives, this one from a member looking for information about her colleagues:

I work at the Science State University and I am trying to learn how to use the Crossref API in order to retrieve data about the publications of the researchers of my institution.

In particular, I would like to retrieve the list of the publications of at least one author affiliated to Science State University. Could you please help me with this?

I always like starting with a straightforward query like this: https://api.crossref.org/works?query.affiliation=Science+State+University&select=DOI,title,author&rows=500&mailto=support@crossref.org

In this query, I am searching our whole corpus for any affiliation that includes the word Science or State or University; I am limiting the metadata that is returned to me for each DOI to:

  • DOI
  • title of the work
  • the contributor/author list for that DOI

The results are in order of relevance to the query affiliation, so the top results will have a higher match or relevance score for the words Science+State+University than the results down the page. By adding my email address, I can identify myself and thus access the Polite pool of the REST API (you can omit it, if you need to stay anonymous, which will result in querying our Public pool).
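If you would rather assemble this query in a script than type it by hand, here is a minimal Python sketch using only the standard library. The function name build_affiliation_query and its arguments are my own illustration, not part of the Crossref API; the parameter names (query.affiliation, select, rows, mailto) are the ones used in the query above.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.crossref.org/works"


def build_affiliation_query(affiliation, fields, rows=20, mailto=None):
    """Return a /works URL that searches affiliations for the given words.

    Hypothetical helper for illustration only.
    """
    params = {
        "query.affiliation": affiliation,  # words to look for in affiliations
        "select": ",".join(fields),        # limit the metadata returned per DOI
        "rows": rows,                      # number of results per page
    }
    if mailto:
        params["mailto"] = mailto          # identifies you (Polite pool)
    # Keep "," and "@" unescaped so the URL reads like the hand-written one.
    return BASE_URL + "?" + urlencode(params, safe=",@")


url = build_affiliation_query(
    "Science State University",
    ["DOI", "title", "author"],
    rows=500,
    mailto="support@crossref.org",
)
print(url)
```

Leaving out the mailto argument produces the anonymous (Public pool) version of the same query.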

Let’s say that I also want the cited-by counts of those same DOIs that I received in my results above. I’d simply add the is-referenced-by-count element to the select parameter of my query:

https://api.crossref.org/works?query.affiliation=Science+State+University&select=DOI,title,author,is-referenced-by-count&rows=500&mailto=support@crossref.org
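Once the JSON comes back, the counts can be pulled out with a few lines of Python. This is only a sketch: the function name citation_counts is hypothetical, and the sample data is made up to mirror the shape of a /works response (a message object holding an items list, with title given as a list).

```python
def citation_counts(response):
    """Return (DOI, first title, cited-by count) tuples, most-cited first."""
    rows = []
    for item in response["message"]["items"]:
        titles = item.get("title") or [""]  # title is a list and may be absent
        count = item.get("is-referenced-by-count", 0)
        rows.append((item["DOI"], titles[0], count))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Made-up sample shaped like an API response (not real records).
sample = {
    "message": {
        "items": [
            {"DOI": "10.5555/example.1", "title": ["First example"],
             "is-referenced-by-count": 3},
            {"DOI": "10.5555/example.2", "title": ["Second example"],
             "is-referenced-by-count": 7},
            {"DOI": "10.5555/example.3"},
        ]
    }
}

for doi, title, count in citation_counts(sample):
    print(doi, title, count)
```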

We, Crossref staff, keep a list of some of the more useful queries that members and metadata users have sent to or requested of us. I thought it worthwhile to share these with you. You’ll see those below with a brief explanation of the query. It’s also important to note that, if you’re interested in getting started with constructing your own REST API queries, you can use the functionality with our REST API documentation (Swagger) to assist.


Some of the more common REST API query requests we see:

All works on a particular prefix: https://api.crossref.org/prefixes/10.35195/works

Article titles on a particular prefix: https://api.crossref.org/prefixes/10.35195/works?select=DOI,title

If you want the journal title too, it’s: https://api.crossref.org/prefixes/10.35195/works?select=DOI,container-title,title

Which DOIs for a specific prefix have license information deposited for them?
https://api.crossref.org/prefixes/10.1098/works?filter=has-license:true&rows=300 (300 results at a time)

All works by title for a prefix:
https://api.crossref.org/prefixes/10.21240/works?select=DOI&rows=1000

All works for this ISSN, with results limited to these elements: DOI, title, volume, issue, and page number (first 20 results returned):
https://api.crossref.org/journals/1527-2095/works?select=DOI,title,volume,issue,page

All works for this ISSN, with results limited to these elements: DOI, title, volume, issue, and page number (first 1000 results returned):
https://api.crossref.org/journals/1527-2095/works?select=DOI,title,volume,issue,page&rows=1000

All works registered with a specific ORCID iD:
https://api.crossref.org/works?filter=orcid:0000-0002-9117-4510

All works registered with a specific ORCID iD, with results limited to the following metadata: DOI, title, and citation count:
https://api.crossref.org/works?filter=orcid:0000-0002-9117-4510&select=DOI,title,is-referenced-by-count

All chapter-level DOIs registered against an ISBN, with the results limited to these elements: DOI, title, type, container (book-level) title, links, and ISBN:
https://api.crossref.org/works?filter=isbn:9783030814649&select=DOI,title,ISBN,type,container-title,link&rows=100

List of DOI counts registered against a prefix and the type of content registered for those DOI counts:
https://api.crossref.org/prefixes/10.37670/works?rows=0&facet=type-name:*

What is the most recently deposited DOI that has been indexed in the REST API on a particular prefix?
Today: https://api.crossref.org/prefixes/10.1021/works?filter=from-created-date:2021-05-14,until-created-date:2021-05-14&sort=deposited&order=desc

Overall:
https://api.crossref.org/prefixes/10.1021/works?sort=deposited&order=desc

Show me who is registering grants:
https://api.crossref.org/types/grant/works?rows=0&facet=funder-name:*

List of top 10 (and, 1000) most-cited DOIs for a prefix (results limited to DOI, title, and citation count):
https://api.crossref.org/prefixes/10.1001/works?sort=is-referenced-by-count&select=DOI,title,is-referenced-by-count&order=desc&rows=10

https://api.crossref.org/prefixes/10.1001/works?sort=is-referenced-by-count&select=DOI,title,is-referenced-by-count&order=desc&rows=1000

Works created between two dates:
https://api.crossref.org/prefixes/10.31356/works?rows=100&filter=from-created-date:2020-01-01,until-created-date:2020-12-31

All DOIs registered with the isTranslationOf relation: https://api.crossref.org/works?filter=relation.type:is-translation-of
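Many of the queries above differ only in their filter parameter, which is a comma-separated list of name:value pairs. As a rough sketch (the helper build_filter is my own invention, not an API function), the string can be assembled from a Python dictionary like this:

```python
def build_filter(conditions):
    """Join name/value pairs into a Crossref-style filter string.

    Hypothetical helper: booleans become "true"/"false", everything else
    is used as written.
    """
    parts = []
    for name, value in conditions.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        parts.append(f"{name}:{value}")
    return ",".join(parts)


date_filter = build_filter({
    "from-created-date": "2020-01-01",
    "until-created-date": "2020-12-31",
})
print("https://api.crossref.org/prefixes/10.31356/works?rows=100&filter=" + date_filter)
```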

I should also note, if you are eager to get started with manual queries and you do not have a JSON formatter installed in the browser of your choice, now might be the time to reconsider that. I use this extension for Google Chrome, but there are other extensions and other browsers, so I encourage you to find one that is right for you: JSON Formatter - Chrome Web Store

Let us know what you think. Thanks for reading!

-Isaac

Thanks for this excellent post. A few examples of query.bibliographic, and the concept of paths (e.g., other than the “works” path), would be nice for a future post.

Regards

Hi @psmku,

Thank you!

I’ve included a few examples of query.bibliographic queries and some information about the paths (or, resource components) below. Let me know what you think, or if you have any follow-up questions or comments.

Bibliographic queries
I am looking for all grants including the award number 7196:
https://api.crossref.org/works?query.bibliographic=7196&filter=type:grant

Also grants-related. I’m searching for all grants that include the project title RIZ1:
https://api.crossref.org/works?query.bibliographic=RIZ1&filter=type:grant

In this query, I’m searching for all works that include island, biogeography, or Wilson. Ideally, I’m hoping to find content by E.O. Wilson on island biogeography. My results will be sorted in order of relevance with the top results including all three words in the bibliographic metadata (i.e., query.bibliographic information, useful for citation look up, includes titles, authors, ISSNs and publication years):
https://api.crossref.org/works?query.bibliographic=island+biogeography+Wilson

Paths (or, resource components)
Major resource components supported by the Crossref API are:

  • works
  • funders
  • members
  • prefixes
  • types
  • journals

These can be used alone, like this:

resource     description
/works       returns a list of all works (journal articles, conference proceedings, books, components, etc.), 20 per page
/funders     returns a list of all funders in the Funder Registry
/members     returns a list of all Crossref members (mostly publishers)
/types       returns a list of valid work types
/licenses    returns a list of licenses applied to works in Crossref metadata
/journals    returns a list of journals in the Crossref database

Resource components and identifiers

Resource components can be used in conjunction with identifiers to retrieve the metadata for that identifier.

resource                   description
/works/{doi}               returns metadata for the specified Crossref DOI
/funders/{funder_id}       returns metadata for the specified funder and its suborganizations
/prefixes/{owner_prefix}   returns metadata for the DOI owner prefix
/members/{member_id}       returns metadata for a Crossref member
/types/{type_id}           returns information about a metadata work type
/journals/{issn}           returns information about a journal with the given ISSN

Combining resource components

The works component can be appended to other resources.

resource                         description
/works/{doi}                     returns information about the specified Crossref DOI
/funders/{funder_id}/works       returns a list of works associated with the specified funder_id
/types/{type_id}/works           returns a list of works of the specified type
/prefixes/{owner_prefix}/works   returns a list of works associated with the specified owner_prefix
/members/{member_id}/works       returns a list of works associated with a Crossref member (deposited by a Crossref member)
/journals/{issn}/works           returns a list of works in the given journal
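To make the pattern concrete, here is a small Python sketch that builds these combined routes. The function works_url and the PARENT_RESOURCES set are hypothetical names of mine; the routes they produce are the ones listed in the table above.

```python
API_ROOT = "https://api.crossref.org"

# Resources that the works component can be appended to (from the table above).
PARENT_RESOURCES = {"funders", "types", "prefixes", "members", "journals"}


def works_url(resource=None, identifier=None):
    """Return the /works route, optionally scoped to a parent resource."""
    if resource is None:
        return f"{API_ROOT}/works"
    if resource not in PARENT_RESOURCES:
        raise ValueError(f"works cannot be appended to {resource!r}")
    return f"{API_ROOT}/{resource}/{identifier}/works"


print(works_url())                        # all works
print(works_url("prefixes", "10.35195"))  # works on a prefix
print(works_url("journals", "1527-2095")) # works in a journal, by ISSN
print(works_url("types", "grant"))        # works of a given type
```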

Warm regards,
Isaac

Congratulations, Isaac!

Another excellent follow-up post. I appreciate this effort from the bottom of my heart.
The section on Combining resource components is extremely helpful for me and my students here.

Another confusing concept in Crossref (as reported by students) is the application/combination of rows and offset.

For example,

https://api.crossref.org/works?query.bibliographic=LGBT+LGBTQ+LGBTQI+LGBTQIA&filter=type:book-chapter,type:journal-article&select=DOI,title,subtitle,container-title,type,ISBN,ISSN,publisher,publisher-location,published-print,published-online,is-referenced-by-count,references-count,abstract,subject&rows=1000&offset=0

will retrieve the first set (keeping in view the limitation of 1000 results per call).

I find that students struggle to understand that, to retrieve the second set, we need to change the values of rows and offset to rows=1000&offset=1000.

A few examples on this issue will be very helpful for users like us.

Best regards

Hi @psmku,

For the query:

https://api.crossref.org/works?query.bibliographic=LGBT+LGBTQ+LGBTQI+LGBTQIA&filter=type:book-chapter,type:journal-article&select=DOI,title,subtitle,container-title,type,ISBN,ISSN,publisher,publisher-location,published-print,published-online,is-referenced-by-count,references-count,abstract,subject&rows=1000&offset=0

there are over 10,000 results. Offsets for /works are limited to 10,000, so using offsets here is not recommended. Instead, you should be using cursor. See more details on this distinction below.

Offset

The number of returned items is controlled by the rows parameter, but you can select the offset into the result list by using the offset parameter. So, for example, to select the second set of 5 results (i.e. results 6 through 10), you would do the following:

https://api.crossref.org/works?query=allen+renear&rows=5&offset=5

Offsets for /works are limited to 10K. Use cursor (see below) for larger /works results sets.
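The relationship between rows and offset may be easier to see as a small calculation: rows stays the same on every call, while offset advances by rows each time. The sketch below is illustrative only; offset_pages is a made-up helper, and treating an offset of exactly 10,000 as still allowed is my assumption.

```python
MAX_OFFSET = 10_000  # limit for /works; the exact boundary is an assumption


def offset_pages(total_results, rows=1000):
    """Return the rows/offset pair for each page needed to cover total_results."""
    pages = []
    for offset in range(0, total_results, rows):
        if offset > MAX_OFFSET:
            raise ValueError("offset would exceed 10,000; use cursor instead")
        pages.append({"rows": rows, "offset": offset})
    return pages


# 2,500 results at 1,000 per call: offsets 0, 1000, 2000.
for page in offset_pages(2500):
    print(f"&rows={page['rows']}&offset={page['offset']}")
```

So the first set is rows=1000&offset=0, the second is rows=1000&offset=1000, the third is rows=1000&offset=2000, and so on.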

Deep paging with cursors

Using large offset values can result in extremely long response times. Offsets in the 100,000s and beyond will likely cause a timeout before the API is able to respond. An alternative to paging through very large result sets (like a corpus used for text and data mining) is to use the API’s exposure of Solr’s deep paging cursors. Any combination of query, filters and facets may be used with deep paging cursors. While rows may be specified along with cursor, offset and sample cannot be used. To use deep paging make a query as normal, but include the cursor parameter with a value of *. In this example we will page through all journal-article works from member 311:

https://api.crossref.org/members/311/works?filter=type:journal-article&cursor=*

A next-cursor field will be provided in the JSON response. To get the next page of results, pass the value of next-cursor as the cursor parameter. For example:

https://api.crossref.org/members/311/works?filter=type:journal-article&cursor=AoE/CGh0dHA6Ly9keC5kb2kub3JnLzEwLjEwMDIvdGRtX2xpY2Vuc2VfMQ==

Note that the actual cursor value will be different from this illustration.

Clients should check the number of returned items. If the number of returned items is fewer than the number of expected rows then the end of the result set has been reached. Using next-cursor beyond this point will result in responses with an empty items list.

The cursor parameter is available on all /works resources.
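For anyone scripting this, the loop described above might look roughly like the following Python sketch. The names fetch_json and iter_works are my own; the fetch argument is there so the loop can be tried with a stand-in function before pointing it at the live API. It starts with cursor=*, follows next-cursor, and stops when a page comes back with fewer items than rows.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_json(url):
    """Default fetcher: GET the URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def iter_works(base_url, params, rows=100, fetch=fetch_json):
    """Yield every item from a /works route by following next-cursor."""
    cursor = "*"  # the first request always starts with cursor=*
    while True:
        query = dict(params, rows=rows, cursor=cursor)
        message = fetch(base_url + "?" + urlencode(query))["message"]
        items = message["items"]
        yield from items
        if len(items) < rows:  # a short (or empty) page means we are done
            break
        cursor = message["next-cursor"]


# Example call (not executed here because it would contact the live API):
# for work in iter_works("https://api.crossref.org/members/311/works",
#                        {"filter": "type:journal-article"}):
#     print(work["DOI"])
```

Note that urlencode percent-encodes the cursor value, which matters because cursor strings can contain characters such as / and =.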

My best,
Isaac

Hi @ifarley
I wonder if it is possible to get access to all the metadata of articles published in a specific journal. I would like to do a bibliometric analysis of a journal (Journal of Applied Psychology, ISSN: 1939-1854). I manage to get the journal info and sometimes a few of the articles published in it, but I don’t get the whole set of publications. Could you help me, or at least guide me on how to build a relevant query for my research goal? Note that I am a relative novice at coding and working with APIs… and I am sorry if my request is irrelevant to you :relaxed:
Thanks.
Alex

Hi @AlRen ,

If the question is relevant to you, then it is relevant to me and others in our community.

Here is the query you’d use to retrieve all the full metadata records that have been registered that include the ISSN 1939-1854:

https://api.crossref.org/journals/1939-1854/works?mailto=support@crossref.org

There are 10,175 DOIs registered with us that include that ISSN in the metadata record.

Additionally, you may review the depositor report for the Journal of Applied Psychology here:
https://data.crossref.org/depositorreport?pubid=J3371

This report includes all of the DOIs that have been registered for this journal, the DOI prefix that owns each DOI, the date of the last metadata update, and citation counts for each individual DOI.

My best,
Isaac

Hi @ifarley,
Thank you for your quick answer and help.
I’ve tried the request you suggested and it works (I added &rows=1000&offset=0 because otherwise I retrieved only 56 docs). I managed to download the metadata of the articles published in this journal, except the cited references, which is the category we are working on with our bibliometric methods.
Do you know why it is possible to download the cited references when I focus on one specific article’s metadata, but not when I want to download a full set of metadata?
Thanks a lot for your help.
Alex

Hi @AlRen ,

Thanks for following up. I think this query might give you everything you’re looking for. I’ve tried to reduce some of the noise and only select for elements of interest. So, at the time of this writing, there are only 667 DOIs registered for this journal that include reference metadata. The remaining DOIs have not had reference metadata registered for them.

https://api.crossref.org/journals/1939-1854/works?mailto=support@crossref.org&filter=has-references:true&select=DOI,title,type,reference,references-count,is-referenced-by-count&rows=700

My best,
Isaac

Thanks again @ifarley. I downloaded the 667 metadata entries.
May I ask you two more questions and a last request.

1/ Could you break down the request you wrote, I mean explain its logic? Then I can reproduce it later, fully aware of what I am doing.

Q1: Is it normal that the cited references don’t all have the same form? Indeed, some are cited this way:

{"doi-asserted-by":"crossref","unstructured":"Austin. Another view of dynamic criteria: A critical reanalysis of Barrett, Caldwell, and Alexander. 42 583 1989","key":"r2_10.1037/0021-9010.86.3.446","DOI":"10.1111/j.1744-6570.1989.tb00670.x"}

while others are cited this way:

{"doi-asserted-by":"publisher","key":"r5_10.1037/0021-9010.86.3.446","DOI":"10.2307/256190"},

I am used to using Scopus data, where we collect raw data, exactly as they are cited by authors in their publications. Does Crossref clean the cited references?

Q2: Do you have any idea why only 667 have references? Is it because the journal is poorly referenced and doesn’t provide full data to Crossref? Or is it because most of its publications don’t cite any references? In both cases, the difference between 10K documents and 667 is striking…

Thanks a lot.
Sincerely
Alex

Hey @AlRen ,

See my answers below.

Sure thing.

https://api.crossref.org/journals/1939-1854/works?mailto=support@crossref.org&filter=has-references:true&select=DOI,title,type,reference,references-count,is-referenced-by-count&rows=700

I’ll go in order of the query, so you can follow the logic:

  1. I’m querying the journals route for any works with ISSN 1939-1854.

  2. By including my email address as a mailto parameter, my query is being performed by our Polite pool of the API.

  3. I am then filtering the responses so that I only get results for DOIs with ISSN 1939-1854 in the metadata that have references registered with us.

  4. Next, using the select parameter, I am telling the API to reduce some of the noise of the response, so I don’t get the full metadata record. Instead, I only want: DOI, title of the article registered for the DOI, content type of the DOI (these are all journal articles, so moot), the references registered for this DOI, the count of references registered for this DOI, and then the number of other Crossref DOIs that have cited this DOI (using the is-referenced-by-count parameter).

  5. And, finally, since I know there are 667 results and I want to see all of the results in the response, I have told the API to give me the first 700 rows (or, resulting records) back.
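The same breakdown can be reproduced mechanically, which may help when you want to read someone else's query. This is a minimal Python sketch using only the standard library; the variable names are mine.

```python
from urllib.parse import parse_qs, urlsplit

url = (
    "https://api.crossref.org/journals/1939-1854/works"
    "?mailto=support@crossref.org"
    "&filter=has-references:true"
    "&select=DOI,title,type,reference,references-count,is-referenced-by-count"
    "&rows=700"
)

parts = urlsplit(url)
# parse_qs returns a list per parameter; each appears once here, so take the first.
params = {name: values[0] for name, values in parse_qs(parts.query).items()}

route = parts.path                                  # step 1: the journals route
mailto = params["mailto"]                           # step 2: Polite pool
filters = dict(f.split(":", 1) for f in params["filter"].split(","))  # step 3
fields = params["select"].split(",")                # step 4: elements to return
rows = int(params["rows"])                          # step 5: results per page

print(route, mailto, filters, fields, rows, sep="\n")
```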

Q1: Is it normal that the cited references don’t all have the same form? Indeed, some are cited this way:

{"doi-asserted-by":"crossref","unstructured":"Austin. Another view of dynamic criteria: A critical reanalysis of Barrett, Caldwell, and Alexander. 42 583 1989","key":"r2_10.1037/0021-9010.86.3.446","DOI":"10.1111/j.1744-6570.1989.tb00670.x"}

while others are cited this way:

{"doi-asserted-by":"publisher","key":"r5_10.1037/0021-9010.86.3.446","DOI":"10.2307/256190"},

This is somewhat normal. I suspect that whoever registered this content for the American Psychological Association (APA), the Crossref member that stewards the Journal of Applied Psychology (ISSN 1939-1854), had the citation for the work with DOI 10.1111/j.1744-6570.1989.tb00670.x but did not know the DOI. They submitted the citation text and we matched it to the DOI, which is why the reference metadata is presented in this way (note the doi-asserted-by value of crossref).

For the reference metadata where only the DOI is present, American Psychological Association (APA) only gave us the DOI to establish the cited-by match. They didn’t need to provide us with the citation because they had the DOI. That’s the most definitive way to establish a cited-by match between the citing and cited DOIs.
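If you want to see how many references of each kind a record contains, a short Python sketch like this one can tally them. The function summarize_references and the labels it uses are hypothetical; the two sample references are the ones quoted above.

```python
from collections import Counter


def summarize_references(references):
    """Count references by who asserted the DOI and what form they take."""
    summary = Counter()
    for ref in references:
        asserted_by = ref.get("doi-asserted-by", "none")  # "none": no DOI asserted
        if "unstructured" in ref:
            form = "citation text"
        elif "DOI" in ref:
            form = "DOI only"
        else:
            form = "other"  # e.g. structured fields without a DOI
        summary[(asserted_by, form)] += 1
    return summary


references = [
    {"doi-asserted-by": "crossref",
     "unstructured": "Austin. Another view of dynamic criteria: A critical "
                     "reanalysis of Barrett, Caldwell, and Alexander. 42 583 1989",
     "key": "r2_10.1037/0021-9010.86.3.446",
     "DOI": "10.1111/j.1744-6570.1989.tb00670.x"},
    {"doi-asserted-by": "publisher",
     "key": "r5_10.1037/0021-9010.86.3.446",
     "DOI": "10.2307/256190"},
]

for (asserted_by, form), count in summarize_references(references).items():
    print(asserted_by, form, count)
```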

Does Crossref clean the cited references?

No.

Q2: Do you have any idea why only 667 have references? Is it because the journal is poorly referenced and doesn’t provide full data to Crossref? Or is it because most of its publications don’t cite any references? In both cases, the difference between 10K documents and 667 is striking…

We have 667 DOIs with reference metadata for this ISSN because this is what American Psychological Association (APA) has registered with us. Unfortunately, the questions about the content and why only 667 DOIs have reference metadata registered are best answered by APA. I can tell you that we do not require our members to register reference metadata, so it is likely that there is reference metadata for DOIs of this journal that just have not (yet) been registered with us. Adding reference metadata to any existing Crossref DOI’s metadata record is free for our members, but they’d need to register the references.

Warm regards,
Isaac

Hello @ifarley
Thank you for all your answers and advice. They were very informative and helped me a lot. I have a broader question; I’m not sure if this is the right place to ask it, but anyway. I wanted to explore Crossref’s coverage of journals referenced in a scientific ranking. I found the right command to explore a single journal, but is it possible to get the list of journals present in Crossref, or at least the ISSN list? And if so, what information can be added: for example, would it be possible to have the years covered and the number of articles included in each journal?
Thank you for your help, which is always valuable.
regards
Alex

Hi Alex,

We don’t have queries that are going to give you exactly what you are requesting, but here are a few that get close/are a starting point:

This will give you all journal articles registered with us (I have selected for only the article DOI, article title, journal title, and ISSN):
https://api.crossref.org/works?filter=type:journal-article&select=DOI,title,container-title,ISSN&rows=1000&mailto=support@crossref.org

Similar to the first query, this gives you journal articles sorted by date created (or, registered with Crossref). The newest DOIs are atop the results:
https://api.crossref.org/works?sort=created&filter=type:journal-article&select=DOI,ISSN,container-title,created&rows=1000&mailto=support@crossref.org

If you know the ISSN you’re eager to see works for, you can include it in your query, like the one below. These results are sorted in order of most recently created (registered with Crossref):
https://api.crossref.org/works?sort=created&filter=issn:1939-1854&select=DOI,title,container-title,created&rows=1000&mailto=support@crossref.org

My best,
Isaac

Hi Isaac,
I am back here with another question ^^
Do you know if it is possible to get the list of metadata available for a specific journal or a specific publisher?
I have 2 specific concerns:

  • does the journal/publisher provide: (i) the abstracts; (ii) the cited references

I am interested in this journal: Accounting, Organizations and Society (Print ISSN: 1873-6289 and e-ISSN: 0361-3682); and this publisher: Springer, with the prefix 10.1016

Thanks a lot.
Regards,
Alexandre

Hi @AlRen ,

Good questions. You can retrieve this information in our REST API:

There are no works registered with us that include ISSN 1873-6289 in the metadata record, as you can see with this query: https://api.crossref.org/journals/18736289/works?mailto=support@crossref.org

As for works registered with us for ISSN 0361-3682, some have references registered:
https://api.crossref.org/journals/03613682/works?filter=has-references:true,&mailto=support@crossref.org

But, none have abstracts registered with us:
https://api.crossref.org/journals/03613682/works?filter=has-abstract:true,&mailto=support@crossref.org

You can confirm that here (all works are returned when including the parameter: filter=has-abstract:false)
https://api.crossref.org/journals/03613682/works?filter=has-abstract:false,&mailto=support@crossref.org

Which works have an abstract registered with us on prefix 10.1016:
https://api.crossref.org/prefixes/10.1016/works?filter=has-abstract:true&mailto=support@crossref.org

Which works have references registered with us on prefix 10.1016:
https://api.crossref.org/prefixes/10.1016/works?filter=has-references:true&mailto=support@crossref.org
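If all you need is the count rather than the records themselves, one option is to ask for rows=0 and read total-results from the response, then compare it with the unfiltered count. The Python sketch below is only an illustration: count_url and coverage are names I made up, and the figures in the example are the ones quoted earlier in this thread for ISSN 1939-1854.

```python
from urllib.parse import urlencode


def count_url(route, filter_expr=None, mailto=None):
    """Build a rows=0 query; the count is in message.total-results."""
    params = {"rows": 0}
    if filter_expr:
        params["filter"] = filter_expr
    if mailto:
        params["mailto"] = mailto
    # Keep ":", "," and "@" readable, as in the hand-written queries.
    return f"https://api.crossref.org{route}?" + urlencode(params, safe=":,@")


def coverage(total, with_feature):
    """Percentage of works that have the feature, rounded to one decimal."""
    return 0.0 if total == 0 else round(100 * with_feature / total, 1)


print(count_url("/journals/0361-3682/works", "has-references:true",
                "support@crossref.org"))
print(count_url("/prefixes/10.1016/works", "has-abstract:true"))

# Figures quoted earlier in this thread: 667 of 10,175 DOIs have references.
print(coverage(10175, 667))
```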

My best,
Isaac

Thank you so much for your precise and so helpful answers. :raised_hands:
I think I have everything I need. Thank you so much again :trophy:

You’re welcome! We’re always happy to help :slight_smile:

Hello, Isaac,

in your first message in this thread you wrote an example of a query, which processes the query tokens with OR:

My question is: how can I make a query with an AND operator between several tokens?

Hi @edgolovin ,

Thanks for following up.

You’re right. With the query https://api.crossref.org/works?query.affiliation=Science+State+University&select=DOI,title,author&rows=500&mailto=support@crossref.org I get back all results that include the word Science OR State OR University (that’s why I get back more than 12 million results), but the results are returned in order of their relevance for Science AND State AND University. That relevance is expressed as a score, which you can see by including score in the select parameter, like this:

https://api.crossref.org/works?query.affiliation=Science+State+University&select=DOI,title,author,score&rows=500&mailto=support@crossref.org
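If you need strict AND behaviour, one workaround is to filter the returned items yourself and keep only those where a single affiliation contains every word. To be clear, this is client-side post-processing and not an API feature. In the Python sketch below, affiliation_has_all is a hypothetical helper and the sample items are made up to mirror the author/affiliation structure of the response.

```python
import re


def affiliation_has_all(item, words):
    """True if any one author affiliation contains every word (case-insensitive)."""
    wanted = {word.lower() for word in words}
    for author in item.get("author", []):
        for affiliation in author.get("affiliation", []):
            tokens = set(re.findall(r"\w+", affiliation.get("name", "").lower()))
            if wanted <= tokens:  # every wanted word is present
                return True
    return False


# Made-up items shaped like /works results (not real records).
items = [
    {"DOI": "10.5555/a", "author": [
        {"family": "Doe", "affiliation": [
            {"name": "Science State University, Dept. of Physics"}]}]},
    {"DOI": "10.5555/b", "author": [
        {"family": "Roe", "affiliation": [{"name": "Example State University"}]}]},
    {"DOI": "10.5555/c", "author": [{"family": "Poe", "affiliation": []}]},
]

matches = [item["DOI"] for item in items
           if affiliation_has_all(item, ["Science", "State", "University"])]
print(matches)
```

Because the API already ranks the closest matches first, this filter is usually only needed on the first pages of results.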

My best,
Isaac

Thank you, @ifarley.

Your answer is very clear and helpful. I will make use of this relevance score, then.

Could you point out if there is any documentation on how this score is calculated, or maybe the particular name of the relevance model?