Indexing / Add the capability to index the catalogue content in Elasticsearch. #1966

Closed
wants to merge 4 commits into from

@fxprunayre fxprunayre commented Apr 26, 2017

Overview

The main goal of this proposal is to push GeoNetwork content to a remote index in order to build dashboards on the content of the catalogue. This work will also help with the future migration from Lucene to Elasticsearch.

Proposal

Add the capability to index the catalogue content in a remote Elasticsearch index. Indexing is similar to the current indexing process, i.e.:

  • collect information in database about the record
  • extract record fields using XSLT
  • send document to the index
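
The three steps above can be sketched roughly as follows. This is only an illustration and not the code in this PR: the `IndexSink` interface, the toy stylesheet, the `isValid` parameter and the flat `title=...;valid=...` document are all hypothetical stand-ins. It only uses the JDK's built-in XSLT support to show how database-side information and XSLT-extracted fields could be combined before being sent to the index.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class IndexingSketch {

    /** Hypothetical sink standing in for the remote Elasticsearch index. */
    public interface IndexSink {
        void send(String id, String document);
    }

    // Hypothetical stylesheet: extracts the title of a toy record and
    // appends a value passed in as a parameter (information from the database).
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:param name='isValid'/>"
        + "<xsl:template match='/record'>"
        + "title=<xsl:value-of select='title'/>;valid=<xsl:value-of select='$isValid'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    public static void index(String id, String recordXml, String validationStatus,
                             IndexSink sink) throws Exception {
        // Step 1: information collected from the database (here just a validation
        // status) is handed to the stylesheet as a parameter.
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        transformer.setParameter("isValid", validationStatus);

        // Step 2: extract the record fields using XSLT.
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(recordXml)),
                              new StreamResult(out));

        // Step 3: send the resulting document to the index.
        sink.send(id, out.toString());
    }

    public static void main(String[] args) throws Exception {
        List<String> sent = new ArrayList<>();
        // In-memory sink used instead of a real Elasticsearch client.
        index("42", "<record><title>Roads</title></record>", "1",
              (id, doc) -> sent.add(id + ":" + doc));
        System.out.println(sent);
    }
}
```

Keeping the sink behind an interface means the same extraction step could feed either the existing index or a remote one, which is the kind of flexibility the proposal is after.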

This will expose all metadata records in the Elasticsearch index (you have to take care of restricting access to non-public records if needed).

From the admin console, trigger the indexing process:

image

The API allows removing the index content and indexing the full catalogue or a selection:

image

Usages

Creating dashboards using Kibana

Dashboards can be created on metadata content and on "internal" fields like validation status, publication groups, ... (this is a limitation for users of the DAOBS project, which only harvests records from CSW).

image

Dashboards can then be added to the admin console:

image

Reference document

@fxprunayre fxprunayre added this to the Future release milestone Apr 26, 2017
@fxprunayre fxprunayre changed the title from "[WIP] Indexing / Add the capability to index the catalogue content in Elasticsearch." to "Indexing / Add the capability to index the catalogue content in Elasticsearch." Jun 12, 2017
@fxprunayre fxprunayre changed the base branch from es to develop June 13, 2017 15:01
@@ -566,7 +568,7 @@ public void indexMetadata(final List<String> metadataIds) throws Exception {
/**
* TODO javadoc.
*/
-public void indexMetadata(final String metadataId, boolean forceRefreshReaders) throws Exception {
+public void indexMetadata(final String metadataId, boolean forceRefreshReaders, ISearchManager searchManager) throws Exception {
Member

I think it would be better to also keep a method taking only metadataId and forceRefreshReaders that calls this one with searchManager set to null, and to use it in all the places that were changed to pass null. With this change there are quite a few places where searchManager is passed as null.
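
A minimal sketch of the delegating overload suggested in this comment. The `Indexer` class, the one-method `ISearchManager` stand-in and the `lastUsed` field are hypothetical and exist only to make the example self-contained; treating a null `searchManager` as "use the default one" is an assumption, not something this PR states.

```java
public class OverloadSketch {

    /** Hypothetical stand-in for GeoNetwork's ISearchManager. */
    public interface ISearchManager {
        String name();
    }

    /** Hypothetical stand-in for the class that owns indexMetadata. */
    public static class Indexer {
        private final ISearchManager defaultManager;
        // Recorded only so the example can show which manager was used.
        public String lastUsed;

        public Indexer(ISearchManager defaultManager) {
            this.defaultManager = defaultManager;
        }

        // Two-argument form kept for existing callers: it delegates to the
        // three-argument form and passes null for searchManager.
        public void indexMetadata(final String metadataId, boolean forceRefreshReaders)
                throws Exception {
            indexMetadata(metadataId, forceRefreshReaders, null);
        }

        public void indexMetadata(final String metadataId, boolean forceRefreshReaders,
                                  ISearchManager searchManager) throws Exception {
            // Assumption: a null searchManager means "use the default manager".
            ISearchManager target = searchManager != null ? searchManager : defaultManager;
            lastUsed = target.name();
        }
    }

    public static void main(String[] args) throws Exception {
        Indexer indexer = new Indexer(() -> "lucene");
        indexer.indexMetadata("42", true);                         // existing call sites stay unchanged
        System.out.println(indexer.lastUsed);
        indexer.indexMetadata("42", true, () -> "elasticsearch");  // new call sites pass a manager
        System.out.println(indexer.lastUsed);
    }
}
```

With an overload like this, only the call sites that actually need a specific search manager would have to change.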

Member Author

Once we move totally to ES there will be no need for this, as there will be only one SearchManager. See what was done last year in the Solr POC: https://github.com/titellus/core-geonetwork/blob/solr/core/src/main/java/org/fao/geonet/kernel/search/SolrSearchManager.java. I would say that we can tolerate this for the time being?

@@ -75,6 +76,9 @@
public static final String TREE_FIELD_SUFFIX = "_tree";
public static final String FEATURE_FIELD_PREFIX = "ft_";

@Value("es.index.features")
private String index = "features";

@Autowired
private EsClient client;
Member

Will this work with a multinode setup?

Member Author

Multinode mode could use the same Elasticsearch index for features (and possibly catalog and search stats). BTW, we should maybe ask whether we want to keep this mode and support it, but that's another issue.

@fxprunayre
Member Author

Replaced by #2082

@fxprunayre fxprunayre closed this Jul 24, 2017
@fxprunayre fxprunayre deleted the es-indexing branch August 19, 2021 12:20
2 participants