Aggregations/HL Rest client fix: missing scores #32774

Merged
merged 3 commits into from
Aug 14, 2018
@@ -44,6 +44,7 @@
import org.elasticsearch.index.query.MatchQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.ScriptQueryBuilder;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.index.query.TermsQueryBuilder;
import org.elasticsearch.join.aggregations.Children;
import org.elasticsearch.join.aggregations.ChildrenAggregationBuilder;
@@ -59,6 +60,9 @@
import org.elasticsearch.search.aggregations.BucketOrder;
import org.elasticsearch.search.aggregations.bucket.range.Range;
import org.elasticsearch.search.aggregations.bucket.range.RangeAggregationBuilder;
import org.elasticsearch.search.aggregations.bucket.significant.SignificantTerms;
import org.elasticsearch.search.aggregations.bucket.significant.SignificantTermsAggregationBuilder;
import org.elasticsearch.search.aggregations.bucket.significant.heuristics.PercentageScore;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.aggregations.matrix.stats.MatrixStats;
@@ -267,6 +271,33 @@ public void testSearchWithTermsAgg() throws IOException {
        assertEquals(2, type2.getDocCount());
        assertEquals(0, type2.getAggregations().asList().size());
    }

    public void testSearchWithSignificantTermsAgg() throws IOException {
        SearchRequest searchRequest = new SearchRequest();
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        searchSourceBuilder.query(new MatchQueryBuilder("num","50"));
        searchSourceBuilder.aggregation(new SignificantTermsAggregationBuilder("agg1", ValueType.STRING)
                .field("type.keyword")
                .minDocCount(1)
                .significanceHeuristic(new PercentageScore()));
        searchSourceBuilder.size(0);
        searchRequest.source(searchSourceBuilder);
        SearchResponse searchResponse = execute(searchRequest, highLevelClient()::search, highLevelClient()::searchAsync);
        assertSearchHeader(searchResponse);
        assertNull(searchResponse.getSuggest());
        assertEquals(Collections.emptyMap(), searchResponse.getProfileResults());
        assertEquals(0, searchResponse.getHits().getHits().length);
        assertEquals(0f, searchResponse.getHits().getMaxScore(), 0f);
        SignificantTerms significantTermsAgg = searchResponse.getAggregations().get("agg1");
        assertEquals("agg1", significantTermsAgg.getName());
        assertEquals(1, significantTermsAgg.getBuckets().size());
        SignificantTerms.Bucket type1 = significantTermsAgg.getBucketByKey("type1");
        assertEquals(1, type1.getDocCount());
        assertEquals(1, type1.getSubsetDf());
        assertEquals(1, type1.getSubsetSize());
        assertEquals(3, type1.getSupersetDf());
        assertEquals(1d/3d, type1.getSignificanceScore(), 0d);
    }
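
The expected score of 1/3 in the last assertion of testSearchWithSignificantTermsAgg appears to follow from the percentage heuristic, which, as far as I understand it, divides the number of matching documents containing the term (subsetDf) by the number of background documents containing it (supersetDf). A minimal standalone sketch of that arithmetic is below. `PercentageScoreSketch` and `percentageScore` are hypothetical names for illustration, not the real `PercentageScore` class, and the zero check is an assumption.

```java
// Hypothetical helper, not the Elasticsearch implementation: it only mirrors
// the subsetDf / supersetDf ratio that the test's expected value implies.
class PercentageScoreSketch {
    static double percentageScore(long subsetDf, long supersetDf) {
        if (supersetDf == 0) {
            return 0d; // assumption: avoid dividing by zero for an empty background set
        }
        return (double) subsetDf / (double) supersetDf;
    }

    public static void main(String[] args) {
        // Values asserted in the test: subsetDf = 1, supersetDf = 3.
        System.out.println(percentageScore(1, 3));
    }
}
```

Because the ratio is fractional for any term that is not present in every background document, a client that reads the score as an integer will almost always lose it.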

    public void testSearchWithRangeAgg() throws IOException {
        {

@@ -175,7 +175,7 @@ static <B extends ParsedBucket> B parseSignificantTermsBucketXContent(final XCon
bucket.subsetDf = value;
bucket.setDocCount(value);
} else if (InternalSignificantTerms.SCORE.equals(currentFieldName)) {
- bucket.score = parser.longValue();
+ bucket.score = parser.doubleValue();
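
Why the one-word change matters can be shown without any Elasticsearch classes. The sketch below assumes that reading a fractional JSON number as a long drops the fraction the way a plain Java cast does. `readAsLong` and `readAsDouble` are hypothetical stand-ins for the two parser calls, not the real `XContentParser` API.

```java
// Hypothetical stand-ins for the two parser calls in the diff.
class ScoreTruncationSketch {
    // Assumption: reading a fractional number as a long truncates it, like a cast.
    static long readAsLong(double jsonNumber) {
        return (long) jsonNumber;
    }

    static double readAsDouble(double jsonNumber) {
        return jsonNumber;
    }

    public static void main(String[] args) {
        double serverScore = 1d / 3d;
        double before = readAsLong(serverScore);  // the long is widened back to double on assignment
        double after = readAsDouble(serverScore);
        System.out.println("before fix: " + before + ", after fix: " + after);
    }
}
```

Under that assumption, any score below 1 arrives on the client as 0, which matches the "missing scores" symptom in the PR title.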

Member

oops :)

Member

would it maybe be possible to recreate this in a unit test as well?

Contributor Author

Are there any examples of existing unit tests for "toXContent -> fromXContent" serialization on aggregation buckets I can base this on?

Member

As far as I can see, BaseAggregationTestCase has a testFromXContent method, so SignificantTermsTests sounds like the appropriate test.

Contributor Author

@markharwood Aug 13, 2018

> BaseAggregationTestCase has a testFromXContent method

In which case we might be missing some test infrastructure for aggs here. That method is for testing agg requests, but we have no equivalent for testing responses. I guess toXContent->fromXContent is a transformation of response objects that didn't exist before the high-level REST client.

Member

Right, that is for requests, my bad, but we have tests for responses too, for instance InternalTermsTestCase

Contributor Author

Had a quick chat with @colings86 and it looks like we need a base class for aggs with an abstract assertParsedResponse(InternalX, ParsedX). InternalX here is the object used to hold the response on the server and ParsedX is the client-side equivalent.

That work's beyond the scope of this fix so I suggest we open another issue to address that broader set of work and put this PR to bed as-is?
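
A rough sketch of what such a base class could look like, using toy types so that it runs standalone. Everything here except the `assertParsedResponse` name is a hypothetical illustration: `RoundTripCheckSketch`, `toWire` and `fromWire` stand in for the real test case, toXContent and fromXContent, and the "response" is reduced to a single score.

```java
// Toy subclass: the "response" is just one significance score.
class ScoreRoundTripSketch extends RoundTripCheckSketch<Double, Double> {
    @Override protected Double createServerInstance() { return 1d / 3d; }
    @Override protected String toWire(Double internal) { return Double.toString(internal); }
    @Override protected Double fromWire(String wire) { return Double.parseDouble(wire); }
    @Override protected void assertParsedResponse(Double internal, Double parsed) {
        if (internal.doubleValue() != parsed.doubleValue()) {
            throw new AssertionError("expected " + internal + " but parsed " + parsed);
        }
    }

    public static void main(String[] args) {
        new ScoreRoundTripSketch().runRoundTrip();
        System.out.println("round trip ok");
    }
}

// Hypothetical base class: I is the server-side ("Internal") type,
// P is the client-side ("Parsed") type.
abstract class RoundTripCheckSketch<I, P> {
    protected abstract I createServerInstance();
    protected abstract String toWire(I internal);   // stands in for toXContent
    protected abstract P fromWire(String wire);     // stands in for fromXContent
    protected abstract void assertParsedResponse(I internal, P parsed);

    // Serialize the server object, parse it back, then let the subclass compare.
    final void runRoundTrip() {
        I internal = createServerInstance();
        P parsed = fromWire(toWire(internal));
        assertParsedResponse(internal, parsed);
    }
}
```

The point of the abstract comparison hook is that each aggregation type decides which fields must survive the round trip, while the serialize-then-parse plumbing is written once.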

Contributor

I've just looked, and actually InternalAggregationTestCase already has a method of ensuring the serialisation is correct in its `testFromXContent()` method. So I think this might be a bug in InternalSignificantTermsTestCase, which is not producing random instances that exercise this. We can, however, produce a new test method to make sure this is tested using `parseAndAssert()` and `assertFromXContent()`, even if we can't make the randomised test instances exercise the code path directly.

Contributor Author

Good catch, @colings86. The buckets need updateScore calling on them in the test code to derive scores from all the frequency stats. Otherwise the tests were always testing for score=0 conditions and failing to highlight the bug.
When I remove the fromXContent fix and run with the beefed-up tests, it now reproduces the original error, so it looks like the test infra is there.
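
The coverage gap described here can be reproduced in miniature. The sketch below assumes the buggy read truncates like a Java cast. `BUGGY_PARSE`, `FIXED_PARSE` and `roundTripMatches` are hypothetical names, not part of the Elasticsearch test infrastructure.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical miniature of the round-trip check, with the score as the only field.
class ScoreCoverageSketch {
    // Assumption: the pre-fix read behaves like a cast to long.
    static final DoubleUnaryOperator BUGGY_PARSE = score -> (double) (long) score;
    static final DoubleUnaryOperator FIXED_PARSE = score -> score;

    // True when the parsed score equals the score the server produced.
    static boolean roundTripMatches(DoubleUnaryOperator parse, double serverScore) {
        return parse.applyAsDouble(serverScore) == serverScore;
    }

    public static void main(String[] args) {
        // Score left at 0: the buggy parser still passes, so the bug stays hidden.
        System.out.println(roundTripMatches(BUGGY_PARSE, 0d));
        // Score derived from frequency stats: the buggy parser now fails.
        System.out.println(roundTripMatches(BUGGY_PARSE, 1d / 3d));
        System.out.println(roundTripMatches(FIXED_PARSE, 1d / 3d));
    }
}
```

A test fixture whose scores are always 0 cannot distinguish the two parsers, which is why deriving non-zero scores in the test instances was needed to expose the error.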

Member

Glad to hear it. I am a bit rusty on this topic, but I kind of knew/hoped that we had added tests when adding parsing for the high-level REST client.

} else if (InternalSignificantTerms.BG_COUNT.equals(currentFieldName)) {
bucket.supersetDf = parser.longValue();
}