OAI PMH: Invalid Identify-Response #4597

Closed
lmaylein opened this issue Apr 18, 2018 · 11 comments · Fixed by #7062

@lmaylein
Contributor

There seems to be a problem with the OAI-PMH Identify response.

Several OAI-PMH validators report that the Dataverse OAI Identify response is invalid.

For example:

OAI-PMH URL: https://heidata.uni-heidelberg.de/oai

ERROR: Identify response well-formed but invalid: Element 'XOAIDescription': This element is not expected. Expected is ( ##other{http://www.openarchives.org/OAI/2.0/}* )., line 1

Our Dataverse version is 4.8.2
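
For readers trying to reproduce the error: the validator message says that a child of `description` must match the `##other` wildcard, which means it must be in a namespace that is neither absent nor the OAI-PMH namespace itself. The sketch below checks only that one rule with the JDK's DOM parser. It is not a full schema validation, and the class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Dataverse or XOAI.
final class DescriptionNamespaceCheck {
    static final String OAI_NS = "http://www.openarchives.org/OAI/2.0/";

    // Returns true if every element child of every OAI <description> element
    // is in a namespace that is present and differs from the OAI-PMH namespace.
    static boolean descriptionsAreValid(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required so getNamespaceURI() is meaningful
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList descriptions = doc.getElementsByTagNameNS(OAI_NS, "description");
            for (int i = 0; i < descriptions.getLength(); i++) {
                NodeList children = descriptions.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    if (child.getNodeType() != Node.ELEMENT_NODE) {
                        continue; // skip whitespace and comments
                    }
                    String ns = child.getNamespaceURI();
                    if (ns == null || ns.isEmpty() || ns.equals(OAI_NS)) {
                        return false; // no namespace, or the OAI namespace: rejected by ##other
                    }
                }
            }
            return true;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

Passing the body of an Identify response to `DescriptionNamespaceCheck.descriptionsAreValid(...)` should return `false` whenever a `description` contains an element declared with `xmlns=""`.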

@youssefOuahalou
Member

Hello, is this problem still relevant? I am seeing the same error.
[Screenshot: ErrorOaiPmh]

@pdurbin
Member

pdurbin commented Jun 11, 2020

@youssefOuahalou yes, this does still appear to be an issue. I just tried it with Harvard Dataverse running Dataverse 4.20 and got the same error you did:

[Screenshot: Screen Shot 2020-06-11 at 10 31 22 AM]

As always, pull requests are welcome! 😄

@youssefOuahalou
Member

youssefOuahalou commented Jun 11, 2020

@pdurbin Do you think that errors can appear elsewhere due to this?

@pdurbin
Member

pdurbin commented Jun 11, 2020

@youssefOuahalou it's possible, but I'm not aware of any other errors or problems. That said, it would be nice to get Dataverse to pass these tests completely.

@youssefOuahalou
Member

@pdurbin Okay thank you so much :D

@JingMa87
Contributor

JingMa87 commented Jul 7, 2020

@pdurbin As I suspected, the Identify XML response is created by the third party library XOAI. If you don't supply your own description for the Identify XML, the library automatically adds the <XOAIDescription xmlns="">XOAI: OAI-PMH Java Toolkit</XOAIDescription> tag in the response, which is faulty. I developed a fix on my computer which turns the XML into a presumably correct one. I'm saying presumably because I modeled the XML based on the example response in the official documentation: http://www.openarchives.org/OAI/openarchivesprotocol.html#Identify. I also used the Identify response from https://easy.dans.knaw.nl/oai/?verb=Identify, since it passes the validation tests on the websites mentioned above.
However, there are still two open issues.

  1. I can't validate the XML response that I currently have online. I did run an XSD validator against the official XSD (http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd), but it reports issues both for my XML and for the https://easy.dans.knaw.nl/oai/?verb=Identify response mentioned above, which passes the validation websites. So I don't think XSD validation alone is conclusive.
  2. In the XML response there's info that's repository dependent, like the repositoryIdentifier and the delimiter. Which default values do you think would be good? I think my current values are ok, which can be seen in the screenshot. Otherwise the dataverse admin or devs would have to configure the values somewhere.

[Screenshot]

I could make a Pull Request for the fix, let me know what your thoughts are.
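
To make the repository-dependent values concrete: `repositoryIdentifier` and `delimiter` are elements of the `oai-identifier` description documented by the Open Archives Initiative, so a replacement for the XOAI default could plausibly look like the output of the sketch below. This is an assumption about the shape of the fix, not the code from the pull request; the class name, method names, and sample values are hypothetical.

```java
// Hypothetical sketch; not the implementation from the pull request.
final class OaiIdentifierDescription {
    static final String NS = "http://www.openarchives.org/OAI/2.0/oai-identifier";
    static final String XSD = "http://www.openarchives.org/OAI/2.0/oai-identifier.xsd";

    // Minimal escaping for text placed inside element content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Builds the XML that goes inside <description> in the Identify response.
    static String build(String repositoryIdentifier, String delimiter, String sampleLocalId) {
        String sample = "oai" + delimiter + repositoryIdentifier + delimiter + sampleLocalId;
        return "<oai-identifier xmlns=\"" + NS + "\""
                + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
                + " xsi:schemaLocation=\"" + NS + " " + XSD + "\">"
                + "<scheme>oai</scheme>"
                + "<repositoryIdentifier>" + escape(repositoryIdentifier) + "</repositoryIdentifier>"
                + "<delimiter>" + escape(delimiter) + "</delimiter>"
                + "<sampleIdentifier>" + escape(sample) + "</sampleIdentifier>"
                + "</oai-identifier>";
    }
}
```

For example, `OaiIdentifierDescription.build("example.org", ":", "doi:10.5072/FK2/ABC")` yields a description whose root element lives in its own namespace, which is what the `description` wildcard in the validator message requires.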

@pdurbin
Member

pdurbin commented Jul 7, 2020

The default values seem fine but it would be nice if they could be configurable. If it's not too hard to make a pull request, please go ahead. Sometimes it's easier to talk about code than words. 😄

@JingMa87
Contributor

JingMa87 commented Jul 7, 2020

@pdurbin I made a pull request for the current fix and I'll look into making it configurable. Somewhere in the UI I suppose, no?

@pdurbin
Member

pdurbin commented Jul 8, 2020

@JingMa87 for now it's better to have it configurable with a database setting without worrying about the UI. I would suggest looking at this example of a configurable setting:

InternetAddress internetAddress = MailUtil.parseSystemAddress(settingsService.getValueForKey(SettingsServiceBean.Key.SystemEmail));

That's from the file you touched in your pull request (#7062).

@JingMa87
Contributor

JingMa87 commented Jul 9, 2020

@pdurbin I'm adding the new settings in the setup_all.sh script for new installations and also in one of those upgrade_v4.10.1_to_v4.11.sql files. What should I name the new upgrade file? I tested both the SQL file and reinstalling Dataverse, which triggers the setup file. Both are working.

@pdurbin
Member

pdurbin commented Jul 9, 2020

@JingMa87 well, for a database setting, you may not need a SQL script, and you may not need to change the setup script either. We use a convention of having a "sane default" for database settings. Here's an example:

String saneDefault = "http://guides.dataverse.org";
String guidesBaseUrl = settingsService.getValueForKey(SettingsServiceBean.Key.GuidesBaseUrl, saneDefault);
return guidesBaseUrl + "/" + getGuidesLanguage();

So if you hard code some sane defaults (maybe the ones you're using now), that's fine. And if people want to change them, they can do this on their own after installation is complete.
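
Applied to the OAI values discussed earlier in this thread, the same "sane default" convention might look like the sketch below. The in-memory map stands in for the settings table so the example is self-contained; it is not Dataverse's `SettingsServiceBean`, and the setting names `:OaiRepositoryIdentifier` and `:OaiDelimiter` as well as the default values are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a settings service; not Dataverse code.
final class OaiSettingsSketch {
    private final Map<String, String> settings = new HashMap<>();

    // Simulates an admin storing a value after installation.
    void set(String key, String value) {
        settings.put(key, value);
    }

    // Returns the stored value, or the sane default when nothing is stored.
    String getValueForKey(String key, String saneDefault) {
        return settings.getOrDefault(key, saneDefault);
    }

    // Assumed setting name and default, for illustration only.
    String repositoryIdentifier() {
        return getValueForKey(":OaiRepositoryIdentifier", "localhost");
    }

    // Assumed setting name and default, for illustration only.
    String delimiter() {
        return getValueForKey(":OaiDelimiter", ":");
    }
}
```

With this shape, a fresh installation gets the hard-coded defaults without any SQL or setup script changes, and a stored value takes over as soon as one exists.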

The important thing, of course, is to document the database options. We would also appreciate it if you added a release note to your pull request. The process is here: http://guides.dataverse.org/en/4.20/developers/making-releases.html#write-release-notes . You could just write "added the following database options: [one], [two], [three]".

p.s. Since you asked, here's our process for SQL script naming, etc: http://guides.dataverse.org/en/4.20/developers/sql-upgrade-scripts.html
