Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

739 html codebook #6081

Merged
merged 14 commits into from
Aug 14, 2019
Merged

739 html codebook #6081

merged 14 commits into from
Aug 14, 2019

Conversation

lubitchv
Copy link
Contributor

@lubitchv lubitchv commented Aug 9, 2019

It creates an html codebook during publishing. Html codebook can be accessed from ExportMetadata->DDI html codebook

Related Issues

Pull Request Checklist

@djbrooke
Copy link
Contributor

djbrooke commented Aug 9, 2019

Thanks @lubitchv! I've added this to Code Review and we'll take a look soon.

Can you add a few notes here about how we'd test this? I think much of this is self explanatory, but I want to make sure we're not missing any details.

Tagging @jggautier for his thoughts as well.

@coveralls
Copy link

coveralls commented Aug 9, 2019

Coverage Status

Coverage decreased (-0.009%) to 19.488% when pulling 2163dc6 on lubitchv:739-html-codebook into 271dbd0 on IQSS:develop.

@djbrooke
Copy link
Contributor

djbrooke commented Aug 9, 2019

@lubitchv, I added some docs about the new export format, feel free to check if these are in the correct spot and feel free to expand on them if necessary. Thanks again for this contribution!!

@lubitchv
Copy link
Contributor Author

lubitchv commented Aug 9, 2019

@djbrooke This can be tested by comparing corresponding XML file from ExportMetadata->DDI with html file from ExportMetadata->DDI html codebook
It should have the same sections, variables etc, such as Document Description (docDscr), Study Description (stdyDscr), File Description (fileDscr), Variable Description (dataDscr)

Copy link
Member

@pdurbin pdurbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very cool feature and might make it easier to apply stylesheets other XML output. I emailed @jggautier about #5960 (comment) but I'll probably add some screenshots and thoughts on that issue, even though it's a little off topic. In short https://easy.dans.knaw.nl/oai/?verb=ListMetadataFormats is much prettier than https://demo.dataverse.org/oai?verb=ListMetadataFormats

The code seems fine (some tests would be nice) so I guess I'll go ahead and click "Approve". If we care to, one of us can change "html" to "HTML" in the bundle.

@@ -0,0 +1,2538 @@
<?xml version="1.0" encoding="UTF-8"?>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just wanted to remind everyone that there's a related codebook file at src/test/java/edu/harvard/iq/dataverse/export/ddi/codebook.xsd

@@ -1209,6 +1209,7 @@ dataset.exportBtn.itemLabel.datacite=DataCite
dataset.exportBtn.itemLabel.json=JSON
dataset.exportBtn.itemLabel.oai_ore=OAI_ORE
dataset.exportBtn.itemLabel.dataciteOpenAIRE=OpenAIRE
dataset.exportBtn.itemLabel.html=DDI html codebook
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest capitalizing this to "HTML".

@@ -1610,4 +1632,47 @@ private static boolean checkParentElement(XMLStreamWriter xmlw, String elementNa
return true;
}

public static void datasetHtmlDDI(JsonObject datasetDtoAsJson, DatasetVersion version, OutputStream outputStream) throws XMLStreamException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much effort would it be to write a test for this "datasetHtmlDDI" method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To write a test to see that conversion from xml to html is successful would not be difficult. It can be added to DdiExportUtilTest. But to parse the resulting html file to see that the fields of xml converted properly to html would be quite cumbersome and time consuming.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lubitchv ok, if you'd like to write the easier test, please let us know if you'd like us to pull this out of QA until you're done. Or we can leave it in QA and tests can be added in a future pull request. It's up to you. 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I will try to write the easier test on Monday. I think this PR can wait till I finish that test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lubitchv sounds good. Please ping us when you're ready for us to move it back to code review. Have a nice weekend!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @lubitchv - thanks for the commit adding the test. Is this ready for another round of code review and then QA?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djbrooke Not yet, I need to check some things out. I will let you know when its ready.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djbrooke It is now ready for code review and QA.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @lubitchv! We'll take another look.

@pdurbin
Copy link
Member

pdurbin commented Aug 14, 2019

I was curious how "DDI HTML codebook" looks and works. Here are some screenshots as of 2163dc6 showing how a new tab is opened:

Screen Shot 2019-08-14 at 7 18 11 AM

Screen Shot 2019-08-14 at 7 18 20 AM

Copy link
Member

@pdurbin pdurbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great but does it work on S3 or Swift? Do you know @lubitchv or can you tell @landreev ?

@@ -1209,6 +1209,7 @@ dataset.exportBtn.itemLabel.datacite=DataCite
dataset.exportBtn.itemLabel.json=JSON
dataset.exportBtn.itemLabel.oai_ore=OAI_ORE
dataset.exportBtn.itemLabel.dataciteOpenAIRE=OpenAIRE
dataset.exportBtn.itemLabel.html=DDI HTML codebook
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Title case is used for buttons, like this:

Suggested change
dataset.exportBtn.itemLabel.html=DDI HTML codebook
dataset.exportBtn.itemLabel.html=DDI HTML Codebook

@Override
public void exportDataset(DatasetVersion version, JsonObject json, OutputStream outputStream) throws ExportException {
try {
Path cachedMetadataFilePath = Paths.get(version.getDataset().getFileSystemDirectory().toString(), "export_ddi" + ".cached");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getFileSystemDirectory makes me suspect that this might not work on S3 or Swift.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pdurbin You are probably right. I did not test it for S3 or Swift, it probably will not work with it. That is why I just replaced getFileSystemDirectory with getExport from ExportService. As I understand it should work for S3 and Swift.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lubitchv perfect, thanks! I just moved this to QA.

@pdurbin pdurbin removed their assignment Aug 14, 2019
@pdurbin
Copy link
Member

pdurbin commented Aug 14, 2019

I feel like this issue is close (screenshots above show it working, even) so I'm leaving it in code review but here's what I'm thinking.

  • If @lubitchv could merge my suggestion for the button label I'd appreciate it. I made it "DDI HTML Codebook" for consistency and to match the docs.
  • There a call to the filesystem that I'm not sure will work on S3 or Swift. This should be tested. Maybe @landreev can take a look. It's in HtmlCodeBookExporter.java.

@pdurbin
Copy link
Member

pdurbin commented Aug 14, 2019

@landreev nevermind, @lubitchv just made a change to support S3 and Swift too so I'm approved this pull request, which moved it to QA.

@kcondon kcondon self-assigned this Aug 14, 2019
@kcondon kcondon merged commit 842197d into IQSS:develop Aug 14, 2019
@djbrooke
Copy link
Contributor

@lubitchv @kcondon @pdurbin - do I need to put anything in the release notes about re-running exports or will existing datasets get this new export automatically? Note we are already advising ReExportall for a citation metadata block change, so we may be covered anyway. Thanks for any guidance!

@pdurbin
Copy link
Member

pdurbin commented Aug 16, 2019

@djbrooke I think just talking about the new feature in the release note is enough. The first time someone tries to export metadata as an HTML codebook for a given dataset, it will be created on the fly and cached. That's my understanding anyway. In addition, according to this, missing exports will be created automatically nightly by a timer: http://guides.dataverse.org/en/4.15.1/admin/metadataexport.html#automatic-exports . Apologies if I'm getting any of this wrong.

@djbrooke
Copy link
Contributor

@pdurbin perfect, thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make a HTML version of the DDI Codebook XML
5 participants