Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement DiffableStringMap's get and containsKey in terms of the wrapped innerMap #77965

Merged
merged 1 commit into from
Sep 17, 2021

Conversation

joegallo
Copy link
Contributor

@joegallo joegallo commented Sep 16, 2021

In practice, the wrapped innerMap is either empty or a HashMap, so we can get much better performance here by delegating to it for these methods -- as opposed to using the default implementations from AbstractMap (javadoc link) which do a linear scan through via an iterator() of the entrySet().

Totally unscientific (because I'm using it for the first time) JMH results from before:

Benchmark                                           Mode  Cnt     Score    Error  Units
LifecycleExecutionStateBenchmark.fromIndexMetadata  avgt   30  1358.084 ± 34.200  ns/op

and after:

Benchmark                                           Mode  Cnt    Score   Error  Units
LifecycleExecutionStateBenchmark.fromIndexMetadata  avgt   30  332.168 ± 5.678  ns/op

This ends up mattering because LifecycleExecutionState#fromIndexMetadata ends up loading all the keys it's interested in by way of LifecycleExecutionState#fromCustomMetadata and there's a lot of them. The customData map there is a DiffableStringMap, so we're doing a scan through all the keys in fromCustomMetadata invoking get each time, and get itself was doing a further scan through all the keys (well, half the keys, on average) internally, so fromIndexMetadata was accidentally quadratic.

In practice, the wrapped innerMap is either empty or a HashMap, so we
can get much better performance here by delegating to it for these
methods -- as opposed to using the default implementations from
AbstractMap which do a linear scan through via an iterator() of the
entrySet().
@joegallo joegallo added >bug :Data Management/ILM+SLM Index and Snapshot lifecycle management v8.0.0 v7.16.0 labels Sep 16, 2021
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Sep 16, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Copy link
Contributor

@probakowski probakowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @joegallo!

@joegallo joegallo merged commit 6c4c5e2 into elastic:master Sep 17, 2021
@joegallo joegallo deleted the accidentally-quadratic branch September 17, 2021 10:54
wjp719 added a commit to wjp719/elasticsearch that referenced this pull request Sep 18, 2021
* master: (185 commits)
  Implement get and containsKey in terms of the wrapped innerMap (elastic#77965)
  Adjust Lucene version and enable BWC tests (elastic#77933)
  Disable BWC to upgrade to Lucene-8.10-snapshot
  Reenable MlDistributedFailureIT
  [DOCS] Fix typo for `script.painless.regex.enabled` setting value (elastic#77853)
  Upgrade to Lucene-8.10.0-snapshot-bf2fcb53079 (elastic#77801)
  [DOCS] Fix ESS install lead-in (elastic#77887)
  Resolve thirdparty gradle plugin artifacts from mavencentral (elastic#77865)
  Reduce the number of times that `LifecycleExecutionState` is parsed when running a policy. (elastic#77863)
  Utility methods to add and remove backing indices from data streams (elastic#77778)
  Use Objects.equals() instead of == to compare strings (elastic#77840)
  [ML] prefer least allocated model when a new node is added to the cluster (elastic#77756)
  Deprecate ignore_throttled parameter (elastic#77479)
  Improve LifecycleExecutionState parsing. (elastic#77855)
  [DOCS] Removes deprecated word from HLRC title. (elastic#77851)
  Remove legacy geo code from AggregationResultUtils (elastic#77702)
  Adjust SearchableSnapshotsBlobStoreCacheIntegTests.testBlobStoreCache (elastic#77758)
  Laxify SecureSM to allow creation of the JDK's innocuous threads (elastic#77789)
  [Test] Reduce concurrency when testing creation of security index (elastic#75293)
  Refactor metric PipelineAggregation integration test (elastic#77548)
  ...

# Conflicts:
#	server/src/main/java/org/elasticsearch/cluster/metadata/DataStream.java
@martijnvg
Copy link
Member

Thanks @joegallo, this is a great fix! Maybe also backport this change to the 7.15 branch?

@joegallo
Copy link
Contributor Author

Thanks @joegallo, this is a great fix! Maybe also backport this change to the 7.15 branch?

✅, 2fdd5dc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Data Management/ILM+SLM Index and Snapshot lifecycle management Team:Data Management Meta label for data/management team v7.15.1 v7.16.0 v8.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants