https://doi.org/10.13003/c23rw1d9
This is a companion discussion topic for the original entry at https://www.crossref.org/blog/news-crossref-and-retraction-watch
The blog mentions “a community call on 27th September at 1 p.m. UTC to discuss this new development in the pursuit of research integrity.” How is that accessed? Have details gone out already?
Hello! Thanks for reaching out. You can register here: Webinar Registration - Zoom
We are looking forward to seeing you online!
Rosa
Will the retraction dataset be fully integrated into Crossref's API? For example, will we be able to search for retractions by journal, author name, institution, etc.?
Yes, that's our plan. Geoffrey, our Director of Technology & Research, mentioned that very thing in a thread earlier today:
'Our recently announced opening of the Retraction Watch data will only ever be made available via the REST API.'
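To give a sense of what querying the REST API looks like in practice, here is a minimal sketch against the existing works endpoint (the DOI below is just a placeholder, and this only shows the Crossmark "update-to" metadata that records already carry; the fuller Retraction Watch integration is what's still in progress):

```python
import requests

# Fetch the metadata record for a single work (placeholder DOI).
resp = requests.get("https://api.crossref.org/works/10.5555/12345678")
resp.raise_for_status()
message = resp.json()["message"]

# Records that update another work (e.g. retraction notices) carry an
# "update-to" list; entries of type "retraction" point at the retracted DOI.
for update in message.get("update-to", []):
    print(update.get("type"), update.get("DOI"))
```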
-Isaac
Will the Retraction Watch Hijacked Journals Checker also be implemented somehow? That would be marvelous!!!
Hey, Crossref, I wanted to let you know that the CSV version of the database that can be directly downloaded from this article has mixed character encoding. It's mostly UTF-8, but has embedded Windows-1252, which frequently happens when text is copy-pasted from different sources. This unfortunately makes importing that file into any program a real PITA to figure out.
I strongly encourage you to correct the encoding and replace the version at that link (and anywhere else it lives, if there are multiple endpoints). Heck, email me, and I will send you a version in pure UTF-8. You will save many people the time and frustration it takes to figure out why none of the normal encodings are working and to find an appropriate conversion tool.
In the meantime, a tip for anyone using Python: UnicodeDammit.detwingle() can be used to fix this file.
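For anyone who wants to try that, here is a minimal sketch (the filenames are hypothetical; UnicodeDammit ships with the beautifulsoup4 package):

```python
from bs4 import UnicodeDammit  # pip install beautifulsoup4

# Read the raw bytes; the file mixes UTF-8 with embedded Windows-1252.
with open("retraction_watch.csv", "rb") as f:
    raw = f.read()

# detwingle() rewrites the embedded Windows-1252 byte runs as UTF-8,
# so the result is consistently UTF-8 throughout.
fixed = UnicodeDammit.detwingle(raw)

with open("retraction_watch_utf8.csv", "wb") as f:
    f.write(fixed)
```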
Thanks for that tip!
The context provided by my colleagues on our technical team who are working most closely on the Retraction Watch data is that the character encoding of the file that is given to us by Retraction Watch is somewhat broken. We can either decode it as UTF-8, ignoring errors like the ones you've pointed out, or decode it as Latin1, which incorrectly displays Russian names (among other things).
In January, we switched the encoding from Latin1 to UTF-8, just ignoring the errors. Neither solution is ideal, but thatâs the trade-off we have for the moment.
Another workaround in Python is to pass errors="ignore" to the decode function. There will still be a few problematic entries, but it should make working with the file go more smoothly. For example:
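A minimal sketch, assuming the raw bytes have been downloaded to a local file (the filename is hypothetical):

```python
# Read the raw bytes of the downloaded CSV.
with open("retraction_watch.csv", "rb") as f:
    raw = f.read()

# errors="ignore" silently drops byte sequences that are not valid UTF-8
# instead of raising UnicodeDecodeError; a handful of entries will still
# look slightly mangled, but the file as a whole becomes usable.
text = raw.decode("utf-8", errors="ignore")
```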
We are working with Retraction Watch to eventually build a system that will produce the data without those encoding errors.