MyData metadata validity facet violates Don't Make Me Think #9836

Closed
pdurbin opened this issue Aug 28, 2023 · 4 comments · Fixed by #9964

@pdurbin
Member

pdurbin commented Aug 28, 2023

I was just playing around with the "Creating datasets with incomplete metadata through API" feature developed in this PR:

I'm on Dataverse 5.14.

It's a cool feature, but I think MyData suffers a bit from it.

Imagine the following scenario:

  • a user creates a collection (dataverse), a dataset, and uploads a file
  • time passes
  • the user is looking for their dataset and files

Below is what might happen.

User clicks MyData

Screen Shot 2023-08-28 at 12 37 05 PM

User sees "Sorry, no results were found."

Screen Shot 2023-08-28 at 12 37 19 PM

Now what? 🤔

Realistically, at this point the user probably contacts support. Where's my data? Help!

A user with time on their hands and a penchant for clicking things might explore further.

What if I uncheck "incomplete metadata"?

Nope, nothing.

Screen Shot 2023-08-28 at 12 39 04 PM

What if I uncheck "incomplete metadata" AND "valid"?

Finally, we are getting somewhere! There's my dataset and my collection (dataverse)! But where's my file?

Screen Shot 2023-08-28 at 12 39 14 PM

Check files to see files

As usual, files are not selected by default but if you check "files" you can see them. This is out of scope for this issue.

Screen Shot 2023-08-28 at 12 39 19 PM

In conclusion

When a user visits MyData they should see their data, not "Sorry, no results were found." They shouldn't have to uncheck boxes to see their data.

@ErykKul
Collaborator

ErykKul commented Sep 14, 2023

@pdurbin, it looks like you are missing the Solr field for metadata validity. Updating the schema and reindexing should fix it. I hope the documentation mentions this; at the very least, the release notes should. I have verified this on our pilot, which is on 5.14:

image

@ErykKul
Collaborator

ErykKul commented Sep 22, 2023

@pdurbin
I was revisiting the code for that feature, and I might have been too defensive with the programming:
IndexServiceBean.java#L814-L816

        if (JvmSettings.API_ALLOW_INCOMPLETE_METADATA.lookupOptional(Boolean.class).orElse(false)) {
            solrInputDocument.addField(SearchFields.DATASET_VALID, valid);
        }

This has a small performance advantage when you are not using the feature. It also makes the schema update and reindexing unnecessary in that case. However, you need to reindex all datasets after enabling the feature. If you disable it, add some datasets, and enable it again, you also need to reindex, because the newly added datasets do not have the necessary Solr field. The field is only added to datasets that are indexed while the feature is enabled.

It might make more sense to remove the check and always add the validity field to the Solr document. Indexing would then throw exceptions if your schema is not up to date, so you would know you need to update the schema (and reindex), and it would remove the complexity of always reindexing all datasets after enabling the feature. What do you think? Alternatively, I can simply add this information to the documentation for the feature.
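To make the consequence of the guarded `addField` call concrete, here is a minimal, self-contained sketch. It is not Dataverse code: the class `ValidityFacetSketch`, the in-memory "index", the field name string, and the assumed facet-filter semantics are all hypothetical stand-ins. It only illustrates why a dataset indexed while the feature was disabled matches neither the "valid" nor the "incomplete metadata" checkbox, and only shows up once both are unchecked.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ValidityFacetSketch {
    // Hypothetical field name; the real value of SearchFields.DATASET_VALID
    // is not shown in this thread.
    static final String DATASET_VALID = "datasetValid";

    // Mirrors the guarded behavior quoted above: the validity field is only
    // written while the feature flag is on.
    static Map<String, Object> buildDocument(String id, boolean valid, boolean featureEnabled) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("id", id);
        if (featureEnabled) {
            doc.put(DATASET_VALID, valid);
        }
        return doc;
    }

    // Assumed facet semantics: no boxes checked means no filter; otherwise the
    // document's validity value must be one of the checked values.
    static List<String> filterByValidity(List<Map<String, Object>> index, List<Boolean> checked) {
        List<String> hits = new ArrayList<>();
        for (Map<String, Object> doc : index) {
            Object value = doc.get(DATASET_VALID);
            // A document without the field can never match a checked box.
            boolean matches = value != null && checked.contains(value);
            if (checked.isEmpty() || matches) {
                hits.add((String) doc.get("id"));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> index = new ArrayList<>();
        index.add(buildDocument("indexed-while-disabled", true, false));
        index.add(buildDocument("indexed-while-enabled", true, true));

        // Both boxes checked (the default view described in the issue).
        System.out.println(filterByValidity(index, List.of(true, false)));
        // Only "valid" checked.
        System.out.println(filterByValidity(index, List.of(true)));
        // Both boxes unchecked: every document is returned.
        System.out.println(filterByValidity(index, List.of()));
    }
}
```

Under these assumptions, the first two queries return only `indexed-while-enabled`, and the third returns both datasets, which matches the sequence of clicks described at the top of the issue. Always writing the field, as proposed above, would remove the first case entirely at the cost of requiring an up-to-date schema.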

@ErykKul
Collaborator

ErykKul commented Sep 28, 2023

@pdurbin The PR should have simplified it. Can you check the description of the PR to see if it is clear what it does?

@pdurbin
Member Author

pdurbin commented Sep 28, 2023

@ErykKul I just took a quick look at the PR you opened:

This helps explain the behavior I was seeing. Thanks!

@pdurbin pdurbin added this to the 6.2 milestone Jan 22, 2024