For the second affiliation, “Instituut voor Kern- en Stralingsfysica”, which I think is properly meant to be “Instituut voor Kernen Stralingsfysica,” or the Institute for Nuclear and Radiation Physics at KU Leuven, APS is most likely still working on parsing out the correct part of the affiliation to send to Crossref, which in this case would be KU Leuven, https://ror.org/05f950310.
There isn’t any way to get the raw affiliation strings from the Crossref API unless APS sends them to Crossref, but I’ll pass on your message and see what they can do.
You could also get in touch with them directly to ask that they send full affiliation strings, at least until they have added more ROR IDs to their metadata so that they can identify affiliations like the one for KU Leuven in the example you give.
Hi! I work for the APS journals and have been somewhat involved with our affiliation work. I believe that before May 2024 we weren’t sending any affiliation data; when we implemented ROR, we also started sending the associated affiliation strings to Crossref, whether or not the affiliation was identified with a ROR ID. However, the data being sent may be incomplete. Thanks for pointing out this issue; I’m asking some of our IT staff to look into it.
Hi Amanda - is there a standard field in the Crossref upload format that represents the “raw affiliation string” somewhere? It looks like all we’re sending (aside from ROR ids) is the “name” field, which is whatever our system thinks the “name” is?
@apsmith No, there isn’t a place for that “raw affiliation string” – that’s a good point. It’s broken down into three separate fields: institution_name, institution_place, and institution_department. You can also add institution_acronym and of course institution_id.
That said, I wouldn’t be surprised if there are a lot of existing records with a raw affiliation string in institution_name. Here’s the documentation: Affiliations and ROR - Crossref
In the example @Ren gives for KU Leuven, the ideal example would look like this:
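A minimal sketch of how the fields listed above might be filled in for the KU Leuven case, built with Python's standard library. It is an illustration under assumptions rather than an official Crossref example: the element order, the `type="ror"` attribute, and the place value "Leuven, Belgium" should all be checked against the Crossref affiliation documentation, and `build_affiliation` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def build_affiliation(name, ror_id=None, acronym=None, place=None, department=None):
    """Build an <affiliations> element using the field names mentioned in this thread.

    Hypothetical helper: element order and attributes are assumptions and
    should be verified against the Crossref deposit schema before use.
    """
    affiliations = ET.Element("affiliations")
    institution = ET.SubElement(affiliations, "institution")
    ET.SubElement(institution, "institution_name").text = name
    if ror_id:
        id_element = ET.SubElement(institution, "institution_id", {"type": "ror"})
        id_element.text = ror_id
    if acronym:
        ET.SubElement(institution, "institution_acronym").text = acronym
    if place:
        ET.SubElement(institution, "institution_place").text = place
    if department:
        ET.SubElement(institution, "institution_department").text = department
    return affiliations


ku_leuven = build_affiliation(
    name="KU Leuven",
    ror_id="https://ror.org/05f950310",
    place="Leuven, Belgium",  # assumed value, not stated in the thread
    department="Instituut voor Kern- en Stralingsfysica",
)
print(ET.tostring(ku_leuven, encoding="unicode"))
```

With this split, the institute name survives as the department while the ROR ID identifies the parent university.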
Hmm, that’s not really a good match for our JATS affiliation data. All our vendors are tagging right now is <aff> and then within that <institution-wrap> which holds the <institution-id> (ROR) and <institution> (name) elements. So no “department” or “place” designation. Are you recommending just using the full string (without tags) as the name value if that’s all we have?
No, I definitely wouldn’t recommend putting the full string as the institution name value. I’d be curious to hear from @Ren what parts of the raw string are most important for their needs.
This change impacts OpenAlex, a very important emerging service (I don’t work for them). I see the effect on recent APS publications:
https://api.openalex.org/works/doi:10.1103/PhysRevC.110.034315
It is now very hard for the OpenAlex affiliation-to-institution resolver to resolve institutions. The APS record fails to match even a basic university such as “KU Leuven”, because that name is no longer present, and neither are the address, country, and city (all of which are fed into the matcher).
Example:
{
  "author_position": "middle",
  "author": {
    "id": "https://openalex.org/A5081842888",
    "display_name": "Cyril Bernerd",
    "orcid": "https://orcid.org/0000-0002-2183-9695"
  },
  "institutions": [],   => OpenAlex cannot match KU Leuven (the array is empty)
  "countries": [],
  "is_corresponding": false,
  "raw_author_name": "C. Bernerd",
  "raw_affiliation_strings": [
    "Instituut voor Kern- en Stralingsfysica"   => the rest of the raw affiliation is missing in OpenAlex
  ],
  "affiliations": [
    {
      "raw_affiliation_string": "Instituut voor Kern- en Stralingsfysica",
      "institution_ids": []
    }
  ]
},
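A rough sketch of how one might spot this problem programmatically: it scans an OpenAlex work record for authorships that carry raw affiliation text but no matched institution. The field names are the ones shown in the record above; the function names `fetch_work` and `unmatched_authorships` are hypothetical, and the sample dictionary is a trimmed copy of that record wrapped in an `authorships` list.

```python
import json
import urllib.request


def fetch_work(doi):
    """Fetch a work from the OpenAlex API by DOI (requires network access)."""
    url = "https://api.openalex.org/works/doi:" + doi
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def unmatched_authorships(work):
    """Return (author name, raw strings) pairs where no institution was matched."""
    results = []
    for authorship in work.get("authorships", []):
        raw_strings = authorship.get("raw_affiliation_strings") or []
        institutions = authorship.get("institutions") or []
        if raw_strings and not institutions:
            name = (authorship.get("author") or {}).get("display_name")
            results.append((name, raw_strings))
    return results


# Trimmed copy of the record quoted in this thread.
sample_work = {
    "authorships": [
        {
            "author": {"display_name": "Cyril Bernerd"},
            "institutions": [],
            "raw_affiliation_strings": ["Instituut voor Kern- en Stralingsfysica"],
        }
    ]
}

for name, raw_strings in unmatched_authorships(sample_work):
    print(name, "->", raw_strings)
```

To run it against live data, `unmatched_authorships(fetch_work("10.1103/PhysRevC.110.034315"))` should work, though the result depends on the current state of the OpenAlex record.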
The raw affiliation does seem to be available in NASA ADS:
https://ui.adsabs.harvard.edu/abs/2024PhRvC.110c4315Y/abstract
(I don’t know how they get the raw affiliation from APS. Arthur, do you know?)
Conclusion (you may disagree):
It is important to keep the raw affiliation concept; scientists have been using it for many years to express their affiliation. We need the complete raw affiliation string. There is no standard way to split it (lab, department, university, city, country describes a perfect world that does not exist). I hope Crossref can clarify where to put this raw affiliation information.
Matching a raw affiliation to a ROR ID is only an imperfect second step; we need to keep the possibility for external systems (like OpenAlex) to match institutions themselves. We also need to let users search within raw affiliations for specific terms (e.g. an acronym or a specific name); OpenAlex provides this feature.
I say “imperfect step” because ROR may even be incomplete at the time you enter your affiliation in APS.
Thanks @Ren for that explanation! And apologies for not having remembered earlier that the Crossref schema doesn’t support raw affiliation strings. I can pass your request on to the metadata team.
Another example: https://journals.aps.org/prb/abstract/10.1103/PhysRevB.110.174434
The affiliation “Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100190, China”
is exported to Crossref (http://api.crossref.org/works/10.1103/PhysRevB.110.174434, accessed 2024/11/21) as:
“Institute of Theoretical Physics” (no ROR)
=> This is not yet loaded in OpenAlex (checked today), but I don’t think it will match correctly.
OpenAlex can match this affiliation perfectly when it is complete, e.g. https://api.openalex.org/works/doi:10.1103/physrevd.65.084014
=> For “Institute of Theoretical Physics, Chinese Academy of Sciences, P.O. Box 2735, Beijing 100080, China”, OpenAlex gives two RORs: one for “Institute of Theoretical Physics” (in Beijing) and one for “Chinese Academy of Sciences”.
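To check what a publisher actually deposited, one can list the affiliation entries in the Crossref REST API response. The sketch below assumes the usual layout of the `message` object (an `author` list whose entries carry an `affiliation` list with a `name` and, when present, an `id` list tagged with `"id-type": "ROR"`); treat those key names as assumptions to verify against a live response. The author name in the sample is made up for illustration, and `crossref_affiliations` is a hypothetical helper.

```python
def crossref_affiliations(message):
    """List (author, affiliation name, ROR IDs) from a Crossref works 'message' dict."""
    rows = []
    for author in message.get("author", []):
        parts = (author.get("given"), author.get("family"))
        who = " ".join(part for part in parts if part)
        for affiliation in author.get("affiliation", []):
            ror_ids = [
                entry.get("id")
                for entry in affiliation.get("id", [])
                if entry.get("id-type") == "ROR"
            ]
            rows.append((who, affiliation.get("name"), ror_ids))
    return rows


# Hypothetical message mirroring the truncated export described in this thread.
sample_message = {
    "author": [
        {
            "given": "A.",
            "family": "Author",  # placeholder name, not from the real record
            "affiliation": [{"name": "Institute of Theoretical Physics"}],
        }
    ]
}

for who, name, ror_ids in crossref_affiliations(sample_message):
    print(who, "|", name, "|", ror_ids or "no ROR")
```

An entry with a short name and an empty ROR list, as in the sample, is exactly the case that downstream matchers struggle with.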
After I started looking at what authors submit for ROR, I’ve noticed that they sometimes submit incorrect information. Part of the problem was that the structure of the ROR database can be confusing with parent/child relationships and various other oddities. As it turns out, the name is crucial in resolving differences and what people put in their articles for the name of an institution can vary somewhat (for example, if they write in English but the primary name is actually in another language). It’s also problematic for people who work for global companies. Walmart has subsidiary research labs in various countries, but they aren’t represented in ROR. IBM has a research lab in Cambridge, but it isn’t represented in ROR. Brown University has a campus in Bologna that isn’t represented. I guess it depends on whether you take the name from the author or the ROR database. The ROR database has multiple names for each institution, and there are complicated relationships between institutions.
Thanks for your contribution on that. A few comments to link your statements to the “raw affiliation” question:
ROR historically got its data from GRID, which started from an initial export of Wikidata (in Wikidata, most companies have a separate entry for each local headquarters,
unlike other, non-company entries). This leads to current ROR company names that resemble Wikidata names, such as “GlaxoSmithKline (Germany)” (Wikidata: Q29123139, ROR: 05gedqb32).
ROR is used in different ways to model organisations, and this is closely related to how research is financed and organised. The Italian national funder CNR recorded ROR institutions at headquarters level (there are no ROR records for the city branches; they all sit under the same organisation, acronym, and name). The Spanish national funder CSIC has research labs in ROR that are quite city-centric and often attached to a university.
ROR has labels in multiple languages, e.g. 02k8cbn47 (so I don’t think this is an issue that needs resolving).
You are free to represent the Brown University campus in Bologna. “Georgetown University in Qatar” exists in ROR: 029e47x73. It seems unfair to more or less force an author to affiliate a work with a ROR record based in the US when they are actually working in Italy. Proper management of the raw affiliation (without ROR) in the publication chain mitigates this risk.
The raw affiliation contains the real place and city where people actually live and work. This allows creating maps that do not rely (only) on imperfect ROR locations.
Renaud,
PS: regarding ROR search for end users, i.e. the scientists. I remember using GRID search. I think they were indexing the text of Wikipedia pages. This means that if a company had an old name mentioned in the Wikipedia text (even if it was not in the GRID labels or anywhere else), GRID could push that record up in the results. I can’t confirm this 100%, since GRID search is no longer active. ROR search does not behave this way: for example, https://ror.org/search?query=Nitta%20Gelatin (a name that appears in the French Wikipedia page for Arkema) does not return Arkema.
Brown does not seem to have a campus in Bologna; it’s just a study abroad program (see “Brown in Bologna | Office of Global Engagement | Brown University”). However, we’re willing to evaluate that and the other organizations you mention for addition to ROR if you submit requests for them with the relevant information.
There are many global companies in ROR, as @Ren helpfully points out, and we did inherit the way those are managed from GRID, and that emulates Wikidata. Most global companies should have their local headquarters represented in ROR. See for instance Nokia, 3M, and Google. Again, we’re happy to accept requests to update our data.
By the way, I have also since learned, @Ren and @apsmith, that many publishers do send full affiliation strings to Crossref and store them in the name field in affiliation, so perhaps that’s something APS could do as well, though I can’t think it optimal in terms of structured metadata.
Using the name field for the full affiliation is not optimal, yes. But if, for now, this is the only way to keep the raw affiliation available as open data, could this be added to the Crossref documentation?
Do we know if Elsevier plans to implement ROR in Crossref soon?
Note: I am starting to see labs that have an existing ROR record but are not associated with it by the user in APS (the user links only the university ROR). The lab then disappears from Crossref and from downstream databases such as OpenAlex.
OpenAlex is starting to work on tools to help the community fix and curate ROR associations in their database: https://www.youtube.com/watch?v=OIFHhz2OQPg (they use Crossref data, and Crossref is the main trusted source; if the lab name is no longer there, curators won’t be able to do much).