"split" citations?

Hi,

In some cases, when sending a metadata deposit containing citations, the result contains multiple citation keys for one of the source citations, prefixed with #cr-split.

For example, we have this citation in the doi_batch XML:

<citation key="rb44">
<unstructured_citation>
<scp>Saussure</scp>
, Ferdinand de (1916) :
<i>Cours de linguistique générale</i>
, édition critique préparée par Tullio de Mauro (1972). Paris : Payot.
</unstructured_citation>
</citation>

And in the doi batch result:

        <citation key="#cr-split#-rb44.1" status="stored_query"></citation>
        <citation key="#cr-split#-rb44.2" status="stored_query"></citation>

What does it mean, and what is the cause?

Thanks,
Béranger

Hi Béranger,

Thanks for your question.

I’m not sure what that “cr-split” in the citation diagnostic means. It’s not something that comes up very often.

I’ll need to check with some of my colleagues on that, and get back to you when I have more information. There are a lot of people out on vacation this week, so it may take a bit more time than usual.

-Shayn

Hi Béranger,

I was able to get some more information on the split citations. Our citation matching system has a feature where it checks for instances where an <unstructured_citation> could inadvertently contain two references, rather than just one. That happens often enough that we built in a mechanism to split them apart.

A reference can be formatted in a number of ways that trigger a split, including containing a colon or semicolon.
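If it helps to check a deposit before sending it, here is a minimal Python sketch that flags unstructured citations containing a colon or semicolon. It assumes those two characters are the only triggers worth checking, which is just what is mentioned here; the matching system may split on other patterns too. The function name `find_split_prone_citations` is made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: only the colon and semicolon mentioned above are checked;
# the matching system may split on other patterns as well.
SPLIT_PRONE = re.compile(r"[:;]")


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def find_split_prone_citations(xml_text):
    """Return (key, flattened text) for citations containing ':' or ';'."""
    root = ET.fromstring(xml_text)
    flagged = []
    for element in root.iter():
        if local_name(element.tag) != "citation":
            continue
        for child in element.iter():
            if local_name(child.tag) != "unstructured_citation":
                continue
            # Flatten inline markup such as <scp> and <i>, then
            # collapse runs of whitespace and line breaks.
            text = " ".join("".join(child.itertext()).split())
            if SPLIT_PRONE.search(text):
                flagged.append((element.get("key"), text))
    return flagged
```

Calling `find_split_prone_citations(deposit_xml)` on the deposit text returns the keys to review by hand. Not every colon will cause a split, so treat the output as a list of candidates rather than a list of errors.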

So, try resubmitting that reference with the colons replaced:

<citation key="rb44">
<unstructured_citation>
<scp>Saussure</scp>, Ferdinand de (1916). <i>Cours de linguistique générale</i>, édition critique préparée par Tullio de Mauro (1972). Paris, Payot.
</unstructured_citation>
</citation>

That should take care of the problem.
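To confirm after resubmitting that nothing was split, one option is to scan the result for keys carrying the `#cr-split#-` prefix. The sketch below assumes split keys always follow the `#cr-split#-<original key>.<n>` pattern seen in the two entries quoted in this thread; `find_split_keys` is a hypothetical helper name, not part of any Crossref tool.

```python
import re
from collections import defaultdict

# Assumption: split keys always look like "#cr-split#-<original key>.<n>",
# as in the two entries quoted in this thread.
SPLIT_KEY = re.compile(r'key="#cr-split#-(?P<key>[^"]+)\.(?P<part>\d+)"')


def find_split_keys(result_text):
    """Map each original citation key to the sorted split part numbers found."""
    parts = defaultdict(list)
    for match in SPLIT_KEY.finditer(result_text):
        parts[match.group("key")].append(int(match.group("part")))
    return {key: sorted(numbers) for key, numbers in parts.items()}
```

An empty dictionary means no split keys were found in the result text; otherwise each key in the dictionary is a source citation worth reformatting.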

Best,
Shayn
