
Query.affiliation

When I retrieve results with a query like

https://api.crossref.org/works?query.affiliation=Kalyani+University&filter=from-pub-date:2020-01-01,until-pub-date:2020-12-31,type:journal-article&select=DOI,title,container-title&rows=300&offset=0

… it returns an abnormally large number of documents, possibly because the query is treated as Kalyani OR University.

The result set for

https://api.crossref.org/works?query.affiliation=Kalyani&filter=from-pub-date:2020-01-01,until-pub-date:2020-12-31,type:journal-article&select=DOI,title,container-title&rows=300&offset=0

… is more reasonable in size, but it includes every institution with the place name Kalyani, not only Kalyani University.

Is there a way to retrieve only the documents whose affiliation is Kalyani University?

Regards


Hi @psmku. Thanks for your question and welcome to the community forum.

Our REST API does not support Boolean operators (i.e., OR, AND). Instead, we score the results and sort them by relevance. So, the highest-ranked results in the API for your query https://api.crossref.org/works?query.affiliation=Kalyani+University&filter=from-pub-date:2020-01-01,until-pub-date:2020-12-31,type:journal-article&select=DOI,title,container-title&rows=300&offset=0 are matches whose affiliations include ALL of the words (Kalyani and University). By comparison, if you were to page through more of the results, you would eventually find matches whose affiliation element includes ONLY one of the words, Kalyani or University.

It might also be helpful to include the score in your results, and then eliminate results with a score below a certain threshold: https://api.crossref.org/works?query.affiliation=Kalyani+University&filter=from-pub-date:2020-01-01,until-pub-date:2020-12-31,type:journal-article&select=DOI,title,container-title,score&rows=300&offset=0
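As a rough sketch of that idea, the Python below builds the same request with score added to select and then drops low-scoring items on the client side. The function names (build_url, fetch_items, keep_above) are made up for illustration and are not part of any Crossref library, and the threshold is something you would have to choose yourself after looking at the results.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://api.crossref.org/works"


def build_url(affiliation, rows=300, offset=0):
    """Build the works query from this thread, with score added to select."""
    params = {
        "query.affiliation": affiliation,
        "filter": "from-pub-date:2020-01-01,until-pub-date:2020-12-31,"
                  "type:journal-article",
        "select": "DOI,title,container-title,score",
        "rows": rows,
        "offset": offset,
    }
    # Keep ',' and ':' unescaped so the URL reads like the ones in the thread.
    return API_URL + "?" + urlencode(params, safe=",:")


def fetch_items(url):
    """Fetch one page of results and return the list under message.items."""
    with urlopen(url) as response:
        return json.load(response)["message"]["items"]


def keep_above(items, min_score):
    """Keep only items whose relevance score is strictly above min_score.

    min_score is an assumption you have to tune per query; the API does not
    define a cut-off for you.
    """
    return [item for item in items if item.get("score", 0.0) > min_score]
```

For example, keep_above(fetch_items(build_url("Kalyani University")), some_threshold) would return only the items scoring above whatever threshold you settle on.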

For instance, this DOI 10.1386/dtr_00025_1 has a relevance score of 0.81349975 and the affiliation metadata registered for all of the contributors of that DOI is: 0000000092155771Lesley University. Thus, I would suggest eliminating this as a viable result (for this specific query), and I would ignore anything with a score below that as well. Note: this is simply an arbitrary example; I do not mean to suggest that this score is the threshold for all queries (or even for this one). I assume a higher relevance score might be a better fit for this query, but I’ll defer to you.
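To pick a threshold, it may help to look at the affiliation strings of the returned items, as was done by hand for the DOI above. The sketch below is only an illustration: it assumes you also add author to the select list, and that each item carries its affiliations under author → affiliation → name, which is how the works JSON is usually laid out. The helper names are hypothetical, and the phrase check is a plain client-side substring test, not something the API does for you.

```python
def affiliation_names(item):
    """Collect every affiliation name registered for an item's contributors."""
    names = []
    for author in item.get("author", []):
        for affiliation in author.get("affiliation", []):
            name = affiliation.get("name")
            if name:
                names.append(name)
    return names


def mentions_phrase(item, phrase):
    """Return True if any affiliation name contains the phrase.

    The comparison is case-insensitive but otherwise literal, so a variant
    such as "University of Kalyani" would not match "Kalyani University".
    """
    needle = phrase.casefold()
    return any(needle in name.casefold() for name in affiliation_names(item))
```

Running mentions_phrase over a page of results sorted by score should give a sense of the score at which affiliations like the Lesley University one start to appear.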

Please let me know if you have any additional questions.

Kind regards,
Isaac