Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix NPE on composite aggregation with sub-aggregations that need scores #28129

Merged
merged 3 commits into from
Jan 15, 2018

Conversation

jimczi
Copy link
Contributor

@jimczi jimczi commented Jan 8, 2018

The composite aggregation defers the collection of sub-aggregations to a second pass that visits documents only if they
appear in the top buckets. Though the scorer for sub-aggregations is not set on this second pass and generates an NPE if any sub-aggregation
tries to access the score. This change creates a scorer for the second pass and makes sure that sub-aggs can use it safely to check the score of
the collected documents.

@@ -95,7 +104,22 @@ public InternalAggregation buildAggregation(long zeroBucket) throws IOException
final CompositeValuesSource.Collector collector =
array.getLeafCollector(context.ctx, getSecondPassCollector(context.subCollector));
int docID;
DocIdSetIterator docIt = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe give it a name that is a bit more specific, eg. scorerIt?

while ((docID = docIdSetIterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
if (needsScores) {
if (docIt.docID() < docID) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be an assertion I think: assert docIt.docID() < docID? the scorer can't match fewer documents than docIdSetIterator?

@@ -95,7 +104,22 @@ public InternalAggregation buildAggregation(long zeroBucket) throws IOException
final CompositeValuesSource.Collector collector =
array.getLeafCollector(context.ctx, getSecondPassCollector(context.subCollector));
int docID;
DocIdSetIterator docIt = null;
if (needsScores()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe reuse the local variable needsScores rather than call again needsScores()?

The composite aggregation defers the collection of sub-aggregations to a second pass that visits documents only if they
appear in the top buckets. Though the scorer for sub-aggregations is not set on this second pass and generates an NPE if any sub-aggregation
tries to access the score. This change creates a scorer for the second pass and makes sure that sub-aggs can use it safely to check the score of
the collected documents.
@jimczi jimczi force-pushed the bug/composite_sub_agg_scores branch from 436d369 to 90b7e20 Compare January 15, 2018 12:39
@jimczi jimczi merged commit bd11e6c into elastic:master Jan 15, 2018
@jimczi jimczi deleted the bug/composite_sub_agg_scores branch January 15, 2018 17:30
jimczi added a commit that referenced this pull request Jan 15, 2018
…es (#28129)

The composite aggregation defers the collection of sub-aggregations to a second pass that visits documents only if they
appear in the top buckets. Though the scorer for sub-aggregations is not set on this second pass and generates an NPE if any sub-aggregation
tries to access the score. This change creates a scorer for the second pass and makes sure that sub-aggs can use it safely to check the score of
the collected documents.
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jan 15, 2018
* master: (21 commits)
  [GEO] Add WKT Support to GeoBoundingBoxQueryBuilder
  Painless: Add whitelist extensions (elastic#28161)
  Fix daitch_mokotoff phonetic filter to use the dedicated Lucene filter (elastic#28225)
  Avoid doing redundant work when checking for self references. (elastic#26927)
  Fix casts in HotThreads. (elastic#27578)
  Ignore the `-snapshot` suffix when comparing the Lucene version in the build and the docs. (elastic#27927)
  Allow update of `eager_global_ordinals` on `_parent`. (elastic#28014)
  Fix NPE on composite aggregation with sub-aggregations that need scores (elastic#28129)
  `MockTcpTransport` to connect asynchronously (elastic#28203)
  Fix synonym phrase query expansion for cross_fields parsing (elastic#28045)
  Introduce elasticsearch-core jar (elastic#28191)
  elastic#28218: Update the Lucene version for 6.2.0 after backport
  upgrade to lucene 7.2.1 (elastic#28218)
  [Docs] Fix an error in painless-types.asciidoc (elastic#28221)
  Adds metadata to rewritten aggregations (elastic#28185)
  Update version of TaskInfo header serialization after backport
  TEST: Tightens file-based condition in peer-recovery
  Correct backport replica rollback to 6.2 (elastic#28181)
  Backport replica rollback to 6.2 (elastic#28181)
  Rename deleteLocalTranslog to createNewTranslog
  ...
martijnvg added a commit that referenced this pull request Jan 16, 2018
* es/6.x: (31 commits)
  Fix eclipse build. (#28236)
  Never return null from Strings.tokenizeToStringArray (#28224)
  Fallback to TransportMasterNodeAction for cluster health retries (#28195)
  [Docs] Changes to ingest.asciidoc (#28212)
  TEST: Update logging for testAckedIndexing
  [GEO] Deprecate field parameter in GeoBoundingBoxQueryBuilder
  [GEO] Add WKT Support to GeoBoundingBoxQueryBuilder
  Avoid doing redundant work when checking for self references. (#26927)
  Fix casts in HotThreads. (#27578)
  Ignore the `-snapshot` suffix when comparing the Lucene version in the build and the docs. (#27927)
  Allow update of `eager_global_ordinals` on `_parent`. (#28014)
  Painless: Add whitelist extensions (#28161)
  Fix daitch_mokotoff phonetic filter to use the dedicated Lucene filter (#28225)
  Fix NPE on composite aggregation with sub-aggregations that need scores (#28129)
  #28045 restore removed import after backport
  Fix synonym phrase query expansion for cross_fields parsing (#28045)
  Introduce elasticsearch-core jar (#28191)
  upgrade to lucene 7.2.1 (#28218)
  [Docs] Fix an error in painless-types.asciidoc (#28221)
  Consistent updates of IndexShardSnapshotStatus (#28130)
  ...
martijnvg added a commit that referenced this pull request Jan 16, 2018
* es/master: (30 commits)
  [Docs] Fix Java Api index administration usage (#28133)
  Fix eclipse build. (#28236)
  Never return null from Strings.tokenizeToStringArray (#28224)
  Fallback to TransportMasterNodeAction for cluster health retries (#28195)
  [Docs] Changes to ingest.asciidoc (#28212)
  TEST: Update logging for testAckedIndexing
  [GEO] Add WKT Support to GeoBoundingBoxQueryBuilder
  Painless: Add whitelist extensions (#28161)
  Fix daitch_mokotoff phonetic filter to use the dedicated Lucene filter (#28225)
  Avoid doing redundant work when checking for self references. (#26927)
  Fix casts in HotThreads. (#27578)
  Ignore the `-snapshot` suffix when comparing the Lucene version in the build and the docs. (#27927)
  Allow update of `eager_global_ordinals` on `_parent`. (#28014)
  Fix NPE on composite aggregation with sub-aggregations that need scores (#28129)
  `MockTcpTransport` to connect asynchronously (#28203)
  Fix synonym phrase query expansion for cross_fields parsing (#28045)
  Introduce elasticsearch-core jar (#28191)
  #28218: Update the Lucene version for 6.2.0 after backport
  upgrade to lucene 7.2.1 (#28218)
  [Docs] Fix an error in painless-types.asciidoc (#28221)
  ...
@jimczi jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants