Create Dataset: Allow data set to be created even if not able to index. Manage index failures for later correction. #229

Closed
eaquigley opened this issue Jul 9, 2014 · 7 comments
Assignees
Labels
Type: Feature a feature request

Comments

@eaquigley
Contributor


Author Name: Kevin Condon (@kcondon)
Original Redmine Issue: 3643, https://redmine.hmdc.harvard.edu/issues/3643
Original Date: 2014-03-05
Original Assignee: Gustavo Durand


Currently, if a dataset cannot be indexed on save, the dataset is not saved or created at all, and an index error is written to the log.

Ideally these should be separate actions: create should succeed as long as the data passes validation, while indexing may or may not fail. Index failures can then be managed the way we do in 3.x, with a last index time column and an index sweeper / timer job.
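The separation described above might look roughly like the following sketch. It is not Dataverse code: `Dataset`, `try_index`, `create_dataset`, `sweep_stale`, and the `index_fn` callback are hypothetical names, and a plain dict stands in for the database. The only point is that saving succeeds whenever validation passes, an index failure is logged rather than raised, and a timer job later retries anything whose last index time is missing or older than its last modification.

```python
import logging
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, List, Optional

log = logging.getLogger("indexing-sketch")


@dataclass
class Dataset:
    id: int
    title: str
    modified: datetime
    last_index_time: Optional[datetime] = None  # None means never indexed


def try_index(ds: Dataset, index_fn: Callable[[Dataset], None]) -> bool:
    """Attempt to index; record the time on success, log on failure."""
    try:
        index_fn(ds)
    except Exception as exc:  # an index error must not abort the save
        log.warning("indexing dataset %s failed: %s", ds.id, exc)
        return False
    ds.last_index_time = datetime.now(timezone.utc)
    return True


def create_dataset(db: Dict[int, Dataset], ds: Dataset,
                   index_fn: Callable[[Dataset], None]) -> Dataset:
    """Validate and save first; indexing is a separate, best-effort step."""
    if not ds.title:  # stand-in for real validation (assumption)
        raise ValueError("title is required")
    db[ds.id] = ds
    try_index(ds, index_fn)
    return ds


def sweep_stale(db: Dict[int, Dataset],
                index_fn: Callable[[Dataset], None]) -> List[int]:
    """Timer job: retry datasets never indexed or modified since last index."""
    retried = []
    for ds in db.values():
        if ds.last_index_time is None or ds.last_index_time < ds.modified:
            if try_index(ds, index_fn):
                retried.append(ds.id)
    return retried
```

With an `index_fn` that raises (search index down), `create_dataset` still stores the dataset and leaves `last_index_time` unset; a later `sweep_stale` call with a working `index_fn` picks it up.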

@eaquigley eaquigley added this to the Dataverse 4.0: In Review milestone Jul 9, 2014
@eaquigley eaquigley modified the milestones: Dataverse 4.0: Beta 4, Dataverse 4.0: In Review Jul 15, 2014
@eaquigley
Contributor Author

@scolapasta and @kcondon, is this still an issue?

@pdurbin
Member

pdurbin commented Jul 15, 2014

Please note that I'd like to re-implement indexing with recovery from failure in mind in #702.

@pdurbin
Member

pdurbin commented Jan 7, 2015

Passing to QA for the part about how you should now be able to create a dataset when Solr is down (no exceptions thrown, etc.). #702 is the main ticket about keeping the database and Solr in sync once Solr is back up.

@pdurbin pdurbin removed their assignment Jan 7, 2015
@esotiri esotiri self-assigned this Jan 8, 2015
@pdurbin
Member

pdurbin commented Jan 8, 2015

In addition, I'd be happy to get some feedback on the new "status" option for looking at the Solr index. @esotiri and I looked at a scenario where we created a dataset while Solr was down and noted that, once we brought Solr back up, Solr was out of sync with the database (for both the content and the permissions of the newly created dataset):

$ curl http://localhost:8080/api/index/status | jq .
{
  "data": {
    "permissionsInIndexButNotDatabase": {
      "dvobjects": []
    },
    "permissionsInDatabaseButMissingFromSolr": {
      "dvobjects": [
        13
      ]
    },
    "contentInIndexButNotDatabase": {
      "datasets": [],
      "dataverses": []
    },
    "contentInDatabaseButStaleInOrMissingFromIndex": {
      "datasets": [
        13
      ],
      "dataverses": []
    }
  },
  "status": "OK"
}

Please see also remarks by @kcondon under "Admin Stale Index Notification" in a Google Doc about Solr: https://docs.google.com/a/harvard.edu/document/d/1EGXOdvudBL3xhRpFaqAq1s9Zizm1jgVXwtO9FhAQKxc/edit?usp=sharing

I'm definitely willing to change how the status looks (and of course we'll build a GUI some day). @scolapasta this is the "status" I was telling you about the other day.
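Until there is a GUI, a small script can condense the status output. The sketch below assumes only the JSON shape shown above; `out_of_sync` is a hypothetical helper, not part of the API. It returns just the non-empty ID lists, so an empty result means the index and database agree.

```python
import json
from typing import Any, Dict, List


def out_of_sync(status: Dict[str, Any]) -> Dict[str, List[int]]:
    """Return only the non-empty ID lists, keyed as 'section.kind'."""
    problems: Dict[str, List[int]] = {}
    for section, kinds in status.get("data", {}).items():
        for kind, ids in kinds.items():
            if ids:
                problems[f"{section}.{kind}"] = list(ids)
    return problems


# Sample payload copied from the status output in this comment.
sample = json.loads("""
{
  "data": {
    "permissionsInIndexButNotDatabase": {"dvobjects": []},
    "permissionsInDatabaseButMissingFromSolr": {"dvobjects": [13]},
    "contentInIndexButNotDatabase": {"datasets": [], "dataverses": []},
    "contentInDatabaseButStaleInOrMissingFromIndex": {
      "datasets": [13],
      "dataverses": []
    }
  },
  "status": "OK"
}
""")

for key, ids in out_of_sync(sample).items():
    print(key, ids)
```

For the sample above, this reports dvobject 13 as missing permissions in Solr and dataset 13 as stale or missing content.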

@pdurbin
Member

pdurbin commented Jan 8, 2015

Also, as @esotiri and I talked about, this individual dataset can be indexed into Solr (once it's back up) with curl http://localhost:8080/api/index/datasets/13. Then the status should be good/clean.
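To repeat that for every stale dataset, something like the following sketch could work. `reindex_urls` and `reindex_stale` are hypothetical helpers; the only things taken from this thread are the `/api/index/datasets/{id}` path, the `localhost:8080` host, and the `contentInDatabaseButStaleInOrMissingFromIndex` key. Dataverses are left out because no per-dataverse endpoint is mentioned here.

```python
import urllib.request
from typing import Any, Dict, List

BASE_URL = "http://localhost:8080"  # host and port as used in this thread


def reindex_urls(status: Dict[str, Any], base_url: str = BASE_URL) -> List[str]:
    """Build one reindex URL per dataset the status reports as stale."""
    stale = status["data"]["contentInDatabaseButStaleInOrMissingFromIndex"]["datasets"]
    return [f"{base_url}/api/index/datasets/{ds_id}" for ds_id in stale]


def reindex_stale(status: Dict[str, Any], base_url: str = BASE_URL) -> List[int]:
    """Issue a GET for each URL (as curl does) and collect HTTP status codes."""
    codes = []
    for url in reindex_urls(status, base_url):
        with urllib.request.urlopen(url) as resp:
            codes.append(resp.status)
    return codes
```

After `reindex_stale` runs, fetching the status again should show empty lists if everything went through.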

@esotiri
Contributor

esotiri commented Jan 9, 2015

Solr down > add dataset or dataverse > verified:
{ "status": "OK", "data": { "contentInDatabaseButStaleInOrMissingFromIndex": { "dataverses": [], "datasets": [ 12 ] }, "contentInIndexButNotDatabase": { "dataverses": [], "datasets": [] }, "permissionsInDatabaseButMissingFromSolr": { "dvobjects": [ 12 ] }, "permissionsInIndexButNotDatabase": { "dvobjects": [] } } }
After running the index, the status comes back clean (status: OK).

@esotiri
Contributor

esotiri commented Jan 12, 2015

In the future, it would be desirable to have Solr detect the object that was not indexed and index it.

Issue resolved.
