DataciteXML changes Plus RelationType field #10632
trying to avoid a separate tx boundary
…itativeDataRepository/dataverse.git into datacite_plus_relPubRelType
OK - I think I addressed all the comments.
Jenkins is failing but I pushed a minor doc tweak to force another run.
More thoughts on the docs.
Additional metadata, including metadata about Related Publications, is now sent to DataCite when DOIs are registered and published, and is available in the DataCite XML export. For existing datasets where no "Relation Type" has been specified, "IsSupplementTo" is assumed. The additions are in rough alignment with the OpenAIRE XML export, but there are some minor differences in addition to the Relation Type addition, including an update to the DataCite 4.5 schema.

For details see https://github.com/IQSS/dataverse/pull/10632 and https://github.com/IQSS/dataverse/pull/10615 and the [design document](https://docs.google.com/document/d/1JzDo9UOIy9dVvaHvtIbOI8tFU6bWdfDfuQvWWpC0tkA/edit?usp=sharing) referenced there.
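To make the fallback in the release note concrete, here is a minimal sketch of how an export step might pick the relation type for a related publication. The class and method names (`RelationTypeDefaults`, `effectiveRelationType`) are hypothetical and are not taken from this PR; only the `IsSupplementTo` default comes from the release note text.

```java
// Hypothetical illustration only; not the code in this PR.
final class RelationTypeDefaults {

    // Default named in the release note for entries with no Relation Type.
    static final String DEFAULT_RELATION_TYPE = "IsSupplementTo";

    // Returns the stored relation type, or the default when none was specified.
    // Treating a blank string like a missing value is an assumption of this sketch.
    static String effectiveRelationType(String stored) {
        if (stored == null || stored.isBlank()) {
            return DEFAULT_RELATION_TYPE;
        }
        return stored.trim();
    }

    public static void main(String[] args) {
        // An existing dataset with no Relation Type falls back to the default.
        System.out.println(effectiveRelationType(null));
        // An explicitly chosen value is passed through unchanged.
        System.out.println(effectiveRelationType("IsCitedBy"));
    }
}
```

Applying the default at export time, rather than rewriting stored metadata, would leave existing datasets untouched; whether the PR does it this way is not stated here.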
The real meat of what's changing is squirreled away in a Google doc. This doesn't quite sit right with me. 🤔 This is the stuff users might like to know. And the stuff QA will test against.
However, our release notes tend to get long and I'm not sure the details should be here either.
The more I think about it... I'd prefer to have the Google doc copied and pasted here into the release notes. Git is a much better way to preserve this information. And it keeps the info with the pull request.
I'm open to other ideas, of course. Perhaps a new changelog in the guides? Or throw it in the API changelog? A separate text file linked from the release notes and/or the guides?
I'm hesitant to focus on the 40+ changes, many of which are only relevant if you're comparing old, new, and OpenAIRE closely (and are really closer to per-commit changes we usually make). I've added some additional detail to the release note to try and give more of a sense of the scope of the change (v4.5 schema, files, license/terms info, PIDs, ...).
I still think this information should be in git but I give up.
Thanks for the additional information in the release note. It does help.
A couple more comments on code.
src/main/java/edu/harvard/iq/dataverse/pidproviders/PidProviderFactoryBean.java
src/main/java/edu/harvard/iq/dataverse/pidproviders/doi/XmlMetadataTemplate.java
…itativeDataRepository/dataverse.git into datacite_plus_relPubRelType
API tests are passing. This is a lot to review and QA but I'm happy enough with how the code and docs look. This will be a great feature. People have been asking us to send more metadata to DataCite for years. Approved.
I re-tested some more today, since the last changes were made this morning. I am satisfied with the PR and ready to merge. But please stop me if you're thinking of making more changes.
@scolapasta and @jggautier please review and update/close linked issues in PR body. Thanks!
What this PR does / why we need it: This PR adds a RelationType child field to the Related Publication parent field and uses it to provide a RelationType in the OpenAIRE and DataCite XML exports and in the DataCite XML sent to DataCite (and in the JSON and OAI_ORE exports, which include all fields). It builds upon #10615 and should be reviewed/QA'd after that (or we can create a PR against that branch to more easily see the changes that just add a RelationType).
Which issue(s) this PR closes:
Relates to:
Special notes for your reviewer:
Suggestions on how to test this: Nominally the new XMLTemplateTest (and all others) should pass, and it should be possible to publish datasets with any/all metadata using a DataCite test account. The log shouldn't contain any entries where DataCite responds with a 422 and indicates that the XML doesn't comply with their 4.5 schema. There should be lots of additional metadata for related publications; author entries should include ORCID info if provided; and affiliations and GrantNumberAgency should have ROR info if a ROR rather than plain text was entered. Typos, such as a related publication with ID type doi and either no or non-DOI entries for the identifier and URL, should result in a log message and that particular related publication being left out of the XML, but should not otherwise cause a failure to update the XML. Etc.
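For anyone testing the typo handling described above, the expected behavior can be sketched roughly as follows. This is an illustration under stated assumptions, not the logic in XmlMetadataTemplate.java: the `RelatedPublicationFilter` class, its nested `RelatedPublication` type, and the DOI pattern are all hypothetical, and the sketch reads "either no or non-DOI entries" as "neither the identifier nor the URL contains a DOI".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the code in this PR.
final class RelatedPublicationFilter {

    private static final Logger logger =
            Logger.getLogger(RelatedPublicationFilter.class.getName());

    // Assumed DOI shape: optional "doi:" or doi.org URL prefix, then "10.<registrant>/<suffix>".
    private static final Pattern DOI = Pattern.compile(
            "^(doi:|https?://(dx\\.)?doi\\.org/)?10\\.\\d{4,9}/\\S+$",
            Pattern.CASE_INSENSITIVE);

    // Simplified stand-in for one Related Publication entry.
    static final class RelatedPublication {
        final String idType;
        final String idNumber;
        final String url;

        RelatedPublication(String idType, String idNumber, String url) {
            this.idType = idType;
            this.idNumber = idNumber;
            this.url = url;
        }
    }

    static boolean looksLikeDoi(String value) {
        return value != null && DOI.matcher(value.trim()).matches();
    }

    // Keeps every entry except those that claim ID type "doi" but carry no DOI
    // in either the identifier or the URL. Skipped entries are logged, and the
    // rest of the list is still returned, so one typo does not block the update.
    static List<RelatedPublication> usableEntries(List<RelatedPublication> pubs) {
        List<RelatedPublication> kept = new ArrayList<>();
        for (RelatedPublication p : pubs) {
            boolean claimsDoi = "doi".equalsIgnoreCase(p.idType);
            if (claimsDoi && !looksLikeDoi(p.idNumber) && !looksLikeDoi(p.url)) {
                logger.warning("Skipping related publication with ID type doi but no DOI: id="
                        + p.idNumber + ", url=" + p.url);
                continue;
            }
            kept.add(p);
        }
        return kept;
    }
}
```

With this reading, an entry with ID type doi and identifier `not-a-doi` would be logged and dropped, while a valid DOI entry and a plain URL entry in the same dataset would still be exported.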
FWIW: I have been able to run this on all the QDR production data and have everything update OK (though we have a few typos in the metadata to fix).
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Yes, it adds "Relation Type" to "Related Publication":
Is there a release notes update needed for this change?: included.
Additional documentation: As noted in the release note, there's a long doc listing ~all of the intended changes from the previous version - see https://docs.google.com/document/d/1JzDo9UOIy9dVvaHvtIbOI8tFU6bWdfDfuQvWWpC0tkA/edit?usp=sharing.
Changes to the guides can be previewed at https://dataverse-guide--10632.org.readthedocs.build/en/10632/admin/dataverses-datasets.html#send-metadata-to-pid-provider