Merge pull request #10651 from IQSS/10611-harvested-datasets-display
Fixes an issue with search cards for harvested records
sekmiller authored Jun 26, 2024
2 parents 94b15e2 + c37b0f4 commit a41f61f
Showing 5 changed files with 40 additions and 5 deletions.
10 changes: 10 additions & 0 deletions doc/release-notes/10611-harvested-origin-facet.md
@@ -0,0 +1,10 @@
+NOTE that this release note supersedes the 10464-add-name-harvesting-client-facet.md note from PR 10464.
+
+An option has been added to index the name of the Harvesting Client as the "Metadata Source" of harvested datasets and files. If enabled, the Metadata Source facet will show separate entries for the content harvested from different sources, instead of the current default behavior, where there is one "Harvested" facet for all such content.
+
+
+TODO: for the v6.3 release note:
+If you choose to enable the extended "Metadata Source" facet for harvested content, set the optional feature flag (JVM option) `dataverse.feature.index-harvested-metadata-source=true` before reindexing.
+
+[Please note that the upgrade instructions in 6.3 will contain a suggestion to run a full reindex as part of the Solr upgrade, so the sentence above will need to be added to that section.]
+
3 changes: 3 additions & 0 deletions doc/sphinx-guides/source/installation/config.rst
@@ -3277,6 +3277,9 @@ please find all known feature flags below. Any of these flags can be activated u
 * - reduce-solr-deletes
   - Avoids deleting and recreating solr documents for dataset files when reindexing.
   - ``Off``
+* - index-harvested-metadata-source
+  - If enabled, this will index the name of the Harvesting Client as the "Metadata Source" of harvested datasets and files, so that the Metadata Source facet on the collection page shows separate entries for the content harvested from different sources/via different clients, instead of the current default behavior, where there is one "Harvested" facet for all such content. Requires a reindex.
+  - ``Off``

 **Note:** Feature flags can be set via any `supported MicroProfile Config API source`_, e.g. the environment variable
 ``DATAVERSE_FEATURE_XXX`` (e.g. ``DATAVERSE_FEATURE_API_SESSION_AUTH=1``). These environment variables can be set in your shell before starting Payara. If you are using :doc:`Docker for development </container/dev-usage>`, you can set them in the `docker compose <https://docs.docker.com/compose/environment-variables/set-environment-variables/>`_ file.
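
The note above relies on the MicroProfile Config convention for deriving an environment variable name from a property key: every character that is not alphanumeric or `_` is replaced with `_`, and the result is upper-cased. A minimal Java sketch of that mapping applied to the new flag follows. The class `FeatureFlagEnvName` and its methods are hypothetical illustrations, not part of Dataverse.

```java
import java.util.Locale;

// Hypothetical helper, not part of Dataverse: mirrors the MicroProfile Config
// convention for deriving an environment variable name from a property key.
final class FeatureFlagEnvName {

    private FeatureFlagEnvName() {
    }

    // Builds the property key for a flag name, using the
    // "dataverse.feature." prefix shown in the release note.
    static String propertyKey(String flagName) {
        return "dataverse.feature." + flagName;
    }

    // Replaces every character that is not a letter, digit, or underscore
    // with '_' and upper-cases the result.
    static String toEnvVar(String propertyKey) {
        return propertyKey.replaceAll("[^A-Za-z0-9_]", "_").toUpperCase(Locale.ROOT);
    }
}
```

Under this rule, `toEnvVar(propertyKey("index-harvested-metadata-source"))` should yield `DATAVERSE_FEATURE_INDEX_HARVESTED_METADATA_SOURCE`, the same shape as the `DATAVERSE_FEATURE_API_SESSION_AUTH` example in the note.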
@@ -987,8 +987,14 @@ public SolrInputDocuments toSolrDocs(IndexableDataset indexableDataset, Set<Long

 if (dataset.isHarvested()) {
     solrInputDocument.addField(SearchFields.IS_HARVESTED, true);
-    solrInputDocument.addField(SearchFields.METADATA_SOURCE,
+    if (FeatureFlags.INDEX_HARVESTED_METADATA_SOURCE.enabled()) {
+        // New - as of 6.3 - option of indexing the actual origin of
+        // harvested objects as the metadata source:
+        solrInputDocument.addField(SearchFields.METADATA_SOURCE,
                 dataset.getHarvestedFrom() != null ? dataset.getHarvestedFrom().getName() : HARVESTED);
+    } else {
+        solrInputDocument.addField(SearchFields.METADATA_SOURCE, HARVESTED);
+    }
 } else {
     solrInputDocument.addField(SearchFields.IS_HARVESTED, false);
     solrInputDocument.addField(SearchFields.METADATA_SOURCE, rdvName); //rootDataverseName);
@@ -1495,7 +1501,14 @@ public SolrInputDocuments toSolrDocs(IndexableDataset indexableDataset, Set<Long
 }
 if (datafile.isHarvested()) {
     datafileSolrInputDocument.addField(SearchFields.IS_HARVESTED, true);
-    datafileSolrInputDocument.addField(SearchFields.METADATA_SOURCE, HARVESTED);
+    if (FeatureFlags.INDEX_HARVESTED_METADATA_SOURCE.enabled()) {
+        // New - as of 6.3 - option of indexing the actual origin of
+        // harvested objects as the metadata source:
+        datafileSolrInputDocument.addField(SearchFields.METADATA_SOURCE,
+                dataset.getHarvestedFrom() != null ? dataset.getHarvestedFrom().getName() : HARVESTED);
+    } else {
+        datafileSolrInputDocument.addField(SearchFields.METADATA_SOURCE, HARVESTED);
+    }
 } else {
     datafileSolrInputDocument.addField(SearchFields.IS_HARVESTED, false);
     datafileSolrInputDocument.addField(SearchFields.METADATA_SOURCE, rdvName);
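
Both indexing hunks above apply the same decision: a harvested object gets the harvesting client's name as its metadata source only when the flag is on and the client is known; otherwise it falls back to the generic "Harvested" value, and non-harvested objects keep the root collection name. A simplified, self-contained sketch of that decision follows. `MetadataSourceChooser` and its parameters are hypothetical stand-ins; the real code writes to a `SolrInputDocument` and reads the flag through `FeatureFlags`.

```java
// Hypothetical, simplified stand-in for the indexing branch in the diff;
// it returns the value instead of writing it to a Solr document.
final class MetadataSourceChooser {

    // The generic value indexed for harvested content by default.
    static final String HARVESTED = "Harvested";

    private MetadataSourceChooser() {
    }

    static String metadataSource(boolean harvested,
                                 boolean flagEnabled,
                                 String harvestingClientName,
                                 String rootCollectionName) {
        if (!harvested) {
            // Local content: the metadata source is the root collection name.
            return rootCollectionName;
        }
        if (flagEnabled && harvestingClientName != null) {
            // Flag on and origin known: index the client's name.
            return harvestingClientName;
        }
        // Flag off, or no harvesting client recorded: generic value.
        return HARVESTED;
    }
}
```

Because the chosen value is stored in the index, changing the flag only affects documents indexed afterwards, which is why the documentation asks for a reindex.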
@@ -577,8 +577,7 @@ public SolrQueryResponse search(
 solrSearchResult.setDvTree(dvTree);
 solrSearchResult.setDatasetValid(datasetValid);

-String originSource = (String) solrDocument.getFieldValue(SearchFields.METADATA_SOURCE);
-if (IndexServiceBean.HARVESTED.equals(originSource)) {
+if (Boolean.TRUE.equals((Boolean) solrDocument.getFieldValue(SearchFields.IS_HARVESTED))) {
     solrSearchResult.setHarvested(true);
 }
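
A plausible reading of why this check changed: once the metadata source can hold a harvesting client's name, comparing it against the "Harvested" string no longer identifies harvested records, while the boolean harvested field is set regardless of the flag. The sketch below illustrates the difference with a plain `Map` standing in for the Solr document. The class `HarvestedCheck` and the keys `metadataSource` and `isHarvested` are assumptions made for illustration, not the actual Solr field names.

```java
import java.util.Map;

// Hypothetical illustration: a Map stands in for the Solr document, and the
// keys "metadataSource" and "isHarvested" are assumed names.
final class HarvestedCheck {

    static final String HARVESTED = "Harvested";

    private HarvestedCheck() {
    }

    // Old approach: infer harvested status from the metadata source string.
    static boolean byMetadataSource(Map<String, Object> doc) {
        return HARVESTED.equals(doc.get("metadataSource"));
    }

    // New approach: read the dedicated boolean field. Boolean.TRUE.equals
    // is null-safe, so a missing field counts as not harvested.
    static boolean byHarvestedField(Map<String, Object> doc) {
        return Boolean.TRUE.equals(doc.get("isHarvested"));
    }
}
```

For a document carrying a client name as its metadata source and `true` in the harvested field, `byMetadataSource` returns false while `byHarvestedField` returns true.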

@@ -1110,7 +1109,7 @@ private String getPermissionFilterQuery(DataverseRequest dataverseRequest, SolrQ
     }

     String ret = sb.toString();
-    logger.info("Returning experimental query: " + ret);
+    logger.fine("Returning experimental query: " + ret);
     return ret;
 }

10 changes: 10 additions & 0 deletions src/main/java/edu/harvard/iq/dataverse/settings/FeatureFlags.java
@@ -58,6 +58,16 @@ public enum FeatureFlags {
      * @since Dataverse 6.3
      */
     ADD_PUBLICOBJECT_SOLR_FIELD("add-publicobject-solr-field"),
+    /**
+     * With this flag set, Dataverse will index the actual origin of harvested
+     * metadata records, instead of the "Harvested" string in all cases.
+     *
+     * @apiNote Raise flag by setting
+     * "dataverse.feature.index-harvested-metadata-source"
+     * @since Dataverse 6.3
+     */
+    INDEX_HARVESTED_METADATA_SOURCE("index-harvested-metadata-source"),
+
     /**
      * Dataverse normally deletes all solr documents related to a dataset's files
      * when the dataset is reindexed. With this flag enabled, additional logic is