Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve FieldFetcher retrieval of fields #66160

Merged
merged 3 commits into from
Dec 14, 2020

Conversation

cbuescher
Copy link
Member

Currently FieldFetcher stores all the FieldContexts that are later used to
retrieve the fields in a List. This has the disadvantage that the same field
path can be retrieved several times (e.g. if multiple patterns match the same
path or if similar paths are defined several times e.g. with different formats).
Currently the last value to be retrieved "wins" and gets returned. We might as
well de-duplicate the FieldContexts by using a map internally, keyed by the
field path that is going to be retrieved, to avoid more work later.

Currently FieldFetcher stores all the FieldContexts that are later used to
retrieve the fields in a List. This has the disadvantage that the same field
path can be retrieved several times (e.g. if multiple patterns match the same
path or if similar paths are defined several times e.g. with different formats).
Currently the last value to be retrieved "wins" and gets returned. We might as
well de-duplicate the FieldContexts by using a map internally, keyed by the
field path that is going to be retrieved, to avoid more work later.
@cbuescher cbuescher added >enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types v8.0.0 v7.11.0 v7.12.0 labels Dec 10, 2020
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Dec 10, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

Copy link
Contributor

@jtibshirani jtibshirani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is a nice simplification.

@@ -48,7 +48,7 @@ public static FieldFetcher create(QueryShardContext context,
SearchLookup searchLookup,
Collection<FieldAndFormat> fieldAndFormats) {

List<FieldContext> fieldContexts = new ArrayList<>();
Map<String, FieldContext> fieldContexts = new HashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd make sense to use a LinkedHashMap here -- then the fields would come back in the order requested. For example if you passed "fields": [ "name", "title"], then it'd be nice to return

"fields": {
  "name": [ "christoph"],
  "title": [ "engineer" ]
}

We wouldn't make a formal guarantee about order (since it's a JSON map), I think it helps readability for debugging, etc.

@@ -48,7 +48,7 @@ public static FieldFetcher create(QueryShardContext context,
SearchLookup searchLookup,
Collection<FieldAndFormat> fieldAndFormats) {

List<FieldContext> fieldContexts = new ArrayList<>();
Map<String, FieldContext> fieldContexts = new HashMap<>();
List<String> unmappedFetchPattern = new ArrayList<>();
Set<String> mappedToExclude = new HashSet<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we remove this set, and instead rely on fieldContexts.containsKey(...)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, great unintended side effect

@cbuescher
Copy link
Member Author

@jtibshirani thanks for the review, I pushed the changes you requested.

@cbuescher
Copy link
Member Author

@elasticmachine update branch

@cbuescher
Copy link
Member Author

@elasticmachine run elasticsearch-ci/1
@elasticmachine run elasticsearch-ci/eql-correctness

@cbuescher cbuescher merged commit 852f6a4 into elastic:master Dec 14, 2020
cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Dec 14, 2020
Currently FieldFetcher stores all the FieldContexts that are later used to
retrieve the fields in a List. This has the disadvantage that the same field
path can be retrieved several times (e.g. if multiple patterns match the same
path or if similar paths are defined several times e.g. with different formats).
Currently the last value to be retrieved "wins" and gets returned. We might as
well de-duplicate the FieldContexts by using a map internally, keyed by the
field path that is going to be retrieved, to avoid more work later.
cbuescher pushed a commit that referenced this pull request Dec 14, 2020
Currently FieldFetcher stores all the FieldContexts that are later used to
retrieve the fields in a List. This has the disadvantage that the same field
path can be retrieved several times (e.g. if multiple patterns match the same
path or if similar paths are defined several times e.g. with different formats).
Currently the last value to be retrieved "wins" and gets returned. We might as
well de-duplicate the FieldContexts by using a map internally, keyed by the
field path that is going to be retrieved, to avoid more work later.
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Dec 14, 2020
* elastic/master: (33 commits)
  Add searchable snapshot cache folder to NodeEnvironment (elastic#66297)
  [DOCS] Add dynamic runtime fields to docs (elastic#66194)
  Add HDFS searchable snapshot integration (elastic#66185)
  Support canceling cross-clusters search requests (elastic#66206)
  Mute testCacheSurviveRestart (elastic#66289)
  Fix cat tasks api params in spec and handler (elastic#66272)
  Snapshot of a searchable snapshot should be empty (elastic#66162)
  [ML] DFA _explain API should not fail when none field is included (elastic#66281)
  Add action to decommission legacy monitoring cluster alerts (elastic#64373)
  move rollup_index param out of RollupActionConfig (elastic#66139)
  Improve FieldFetcher retrieval of fields (elastic#66160)
  Remove unsed fields in `RestAnalyzeAction` (elastic#66215)
  Simplify searchable snapshot CacheKey (elastic#66263)
  Autoscaling remove feature flags (elastic#65973)
  Improve searchable snapshot mount time (elastic#66198)
  [ML] Report cause when datafeed extraction encounters error (elastic#66167)
  Remove suggest reference in some API specs (elastic#66180)
  Fix warning when installing a plugin for different ESversion (elastic#66146)
  [ML] make `xpack.ml.max_ml_node_size` and `xpack.ml.use_auto_machine_memory_percent` dynamically settable (elastic#66132)
  [DOCS] Add `require_alias` to Bulk API (elastic#66259)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team v7.11.0 v7.12.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants