Hello CrossRef community,
I have a question about date fields - the motivation for this is a continuation of a question I asked previously about unexpectedly low volumes of results from my search query. My end users are still unhappy - we now get more results but apparently only a low proportion are truly relevant.
Our use case is to do a trawl of new academic literature being released each week on poverty in developing countries. Part of the solution last time was switching from publication dates to creation dates, and I'm wondering if altering the date field again could help.
I've reviewed the documentation on date fields and I'd like to understand a little more about some of the main categories:
pub-date - this is the publication date according to the publisher. We seem to have issues with these being a long way in the past (or sometimes the future!) even when a record was "created" in the last week. What could cause that? What's the process, and are there any validation checks, for publishers depositing this data? Also, can this change, e.g. could it be set for a preprint and then updated at final publication? If a work is published both in print and online, which value would this take?
created-date - this is the date of first deposit. What drives that? Is it normal for publishers to start registering very old work? Conversely, do they create deposits for anticipated releases well in advance of publication?
update-date - what events cause this to be updated? It's a deposit or redeposit, but what real-world events lead to this?
deposit-date - just checking: is this identical to update-date?
index-date - how does this differ from update-date and deposit-date?
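In case it helps anyone reproduce what we are seeing, here is a rough Python sketch of how I line the dates up for a single record. It assumes the JSON returned for a work carries the dates under the keys issued, published-print, published-online, created, deposited and indexed, each with a date-parts list that may be partial (year only). The helper names and the sample record are made up for illustration, and I may be wrong about which response key corresponds to which filter.

```python
from typing import Any, Dict, Optional

# Response keys I am assuming hold the dates discussed above.
DATE_FIELDS = (
    "issued",
    "published-print",
    "published-online",
    "created",
    "deposited",
    "indexed",
)


def date_parts_to_str(field: Optional[Dict[str, Any]]) -> Optional[str]:
    """Turn {"date-parts": [[2024, 5, 2]]} into "2024-05-02".

    Partial dates are kept partial ("1998" or "1998-07");
    a missing or empty date gives None.
    """
    if not field:
        return None
    parts = (field.get("date-parts") or [[]])[0]
    if not parts or parts[0] is None:
        return None
    return "-".join(
        f"{p:04d}" if i == 0 else f"{p:02d}" for i, p in enumerate(parts)
    )


def summarize_dates(work: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Collect every date field of one work record side by side."""
    return {name: date_parts_to_str(work.get(name)) for name in DATE_FIELDS}


# Made-up record showing the pattern that confuses us:
# an old publication date on a record created recently.
sample_work = {
    "DOI": "10.1234/example",
    "issued": {"date-parts": [[1998]]},
    "created": {"date-parts": [[2024, 5, 2]]},
    "deposited": {"date-parts": [[2024, 5, 2]]},
    "indexed": {"date-parts": [[2024, 5, 3]]},
}

for name, value in summarize_dates(sample_work).items():
    print(f"{name:18} {value}")
```

Printing the fields together like this is how we spotted records whose publication date is decades earlier than their created date.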
The solution I'm considering is switching from created date to update date. It seems to me this should catch publications at all significant steps in their publication journey, and the only downside is introducing duplicates into our dataset. Any comments on that strategy would also be welcome.
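To make the proposal concrete, this is roughly what I have in mind, as a Python sketch rather than our production code. It assumes the works endpoint accepts the from-update-date and until-update-date filters and the query.bibliographic parameter; the function names, the seven-day window and the search text are placeholders of mine. Duplicates would be handled by remembering the DOIs already seen in earlier weeks.

```python
from datetime import date, timedelta
from typing import Any, Dict, List, Set
from urllib.parse import urlencode

API_URL = "https://api.crossref.org/works"


def weekly_update_query(query_text: str, week_end: date, rows: int = 100) -> str:
    """Build a works URL for records updated in the 7 days ending on week_end."""
    week_start = week_end - timedelta(days=6)
    params = {
        "query.bibliographic": query_text,
        "filter": (
            f"from-update-date:{week_start.isoformat()},"
            f"until-update-date:{week_end.isoformat()}"
        ),
        "rows": rows,
    }
    return f"{API_URL}?{urlencode(params)}"


def drop_seen(works: List[Dict[str, Any]], seen_dois: Set[str]) -> List[Dict[str, Any]]:
    """Keep only works whose DOI has not been seen before.

    DOIs are compared case-insensitively; seen_dois is updated in place
    so the same set can be reused from one weekly run to the next.
    """
    fresh = []
    for work in works:
        doi = (work.get("DOI") or "").lower()
        if doi and doi not in seen_dois:
            seen_dois.add(doi)
            fresh.append(work)
    return fresh


# Example: the URL we would request for the week ending 2024-05-05.
print(weekly_update_query("poverty developing countries", date(2024, 5, 5)))
```

The catch with dropping every previously seen DOI is that we would also discard a record whose update is the one we care about (for example, a move from preprint to final publication), so I'd welcome views on whether that trade-off is sensible.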
Please let me know if more context or clarification would be useful.