Using Elasticsearch 7.17.3, GMS generates a lot of log entries "[ignore_throttled] parameter is deprecated" #9745

Closed
githendrik opened this issue Jan 30, 2024 · 4 comments
Labels
bug Bug report stale

Comments

@githendrik
Contributor

Describe the bug

I'm running Elasticsearch 7.17.3, as specified in the DataHub Helm charts. This works, but GMS writes a large number of warnings like the following to its logs:

org.opensearch.client.RestClient:85 - request [POST http://elasticsearch:9200/datahubingestionsourceindex_v2/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true] returned 1 warnings: [299 Elasticsearch-7.17.3-5ad023604c8d7416c9eb6c0eadb62b14e766caff "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]

Apparently it's recommended to update the Elasticsearch / OpenSearch Java clients to get rid of these warnings. I tried updating the OpenSearch client to the latest version (2.11.1), but unfortunately that version has breaking changes and DataHub no longer compiles with it.

It doesn't happen when running datahub docker quickstart, because the ES version in the quickstart Docker Compose files is still 7.10.1.

To Reproduce
Steps to reproduce the behavior:

  1. Set DATAHUB_SEARCH_TAG to 7.17.3
  2. Start DataHub using Docker quickstart
  3. Observe GMS logs

Expected behavior
No deprecation warnings in logs

Additional info
I'm more than happy to contribute a fix. However, I'm missing some context on the implementation, and as stated above, upgrading the client unfortunately wasn't a "zero-touch" change.
If someone could give me some pointers, I'd be happy to try and find a fix.

@githendrik githendrik added the bug Bug report label Jan 30, 2024
github-actions bot commented Mar 1, 2024

This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io

@github-actions github-actions bot added the stale label Mar 1, 2024
@Masterchen09
Contributor

In the Elasticsearch repository there is a pull request that stops the ignore_throttled parameter from being added to the search request when it is set to its default value of true (elastic/elasticsearch#84827). However, DataHub uses the OpenSearch client (see here), and unfortunately that client has no such logic to keep the parameter out of the search request (see here). I am not even sure whether Elasticsearch 7.17.3 can be used with the OpenSearch client, because the OpenSearch client seems to be guaranteed compatible only up to Elasticsearch 7.10.2 (see here). It also seems that frozen tiers are not deprecated in OpenSearch...? Maybe the Helm chart should deploy OpenSearch instead of Elasticsearch if the OpenSearch client is used?
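To make the difference concrete, here is a minimal sketch of the "omit the parameter when it has its default value" idea described above. It is not the code from elastic/elasticsearch#84827 or from the OpenSearch client; the class `SearchParamsSketch` and the method `buildParams` are hypothetical names used only for illustration, and the default of true is taken from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the
// Elasticsearch or OpenSearch client APIs.
final class SearchParamsSketch {

    // Assumption from the discussion above: the default value is true.
    static final boolean IGNORE_THROTTLED_DEFAULT = true;

    private SearchParamsSketch() {}

    // Builds the query-string parameters for a search request.
    static Map<String, String> buildParams(boolean ignoreUnavailable, boolean ignoreThrottled) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("ignore_unavailable", Boolean.toString(ignoreUnavailable));

        // Only send ignore_throttled when it differs from the default.
        // A request that never carries the parameter gives the server
        // no reason to attach the deprecation warning for it.
        if (ignoreThrottled != IGNORE_THROTTLED_DEFAULT) {
            params.put("ignore_throttled", Boolean.toString(ignoreThrottled));
        }
        return params;
    }
}
```

With this sketch, `SearchParamsSketch.buildParams(false, true)` yields a map without an ignore_throttled entry, whereas the request in the log above always carries ignore_throttled=true.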

Nonetheless, a quick, though maybe not very nice, solution would be to filter out the corresponding log message using the logback.xml here:

<filter class="com.linkedin.metadata.utils.log.LogMessageFilter">
	<excluded>scanned from multiple locations</excluded>
	<excluded>[ignore_throttled] parameter is deprecated because frozen indices have been deprecated</excluded>
</filter>
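For anyone unfamiliar with this kind of filter, the idea implied by the `<excluded>` elements is roughly: drop a log message if it contains any of the configured fragments. The sketch below shows only that idea in plain Java. It is an assumption about the behaviour, not the actual source of `com.linkedin.metadata.utils.log.LogMessageFilter`, and `ExcludedMessageSketch` is a hypothetical name.

```java
import java.util.List;

// Hypothetical illustration of substring-based log filtering; this is
// NOT the real LogMessageFilter and does not depend on logback.
final class ExcludedMessageSketch {

    private final List<String> excluded;

    ExcludedMessageSketch(List<String> excluded) {
        // Defensive, immutable copy of the configured fragments.
        this.excluded = List.copyOf(excluded);
    }

    // Returns true when the message contains any excluded fragment
    // and should therefore be dropped from the log.
    boolean shouldDrop(String message) {
        if (message == null) {
            return false;
        }
        for (String fragment : excluded) {
            if (message.contains(fragment)) {
                return true;
            }
        }
        return false;
    }
}
```

Note that this only hides the warning in the GMS logs; the ignore_throttled parameter would still be sent with every search request.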

@github-actions github-actions bot added the stale label Apr 14, 2024
This issue was closed because it has been inactive for 30 days since being marked as stale.

@github-actions github-actions bot closed this as not planned May 15, 2024