
scan_type is search & small result set gives SearchContextMissingException #5345

Closed
jillesvangurp opened this issue Mar 5, 2014 · 3 comments


@jillesvangurp
Contributor

May be a duplicate of or related to #5170.

Using Elasticsearch 1.0.0.

I have a query where I would like to use search_type=scan to scroll through all contacts owned by a user. It works fine when a user has enough contacts, but for one user who has only two contacts it fails.

So I do a GET on
http://localhost:9200/users/contact/_search?search_type=scan&scroll=60m&size=100
{
  "query": {
    "term": {
      "userId": {
        "value": "1o"
      }
    }
  }
}

I get back the following response:
{
  "_scroll_id": "c2Nhbjs1OzIwMTpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzIwNDpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzIwMzpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzIwMjpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzIwNTpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzE7dG90YWxfaGl0czoyOw==",
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.0,
    "hits": []
  }
}

and then a GET on http://localhost:9200/_search/scroll?scroll=60m with the scroll id as the request body
c2Nhbjs1OzIwMTpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzIwNDpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzIwMzpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzIwMjpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzIwNTpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzE7dG90YWxfaGl0czoyOw==

fails with a SearchContextMissingException.

The same query for a user with 30000 contacts works fine and pages through the results like I would expect. The above query returns the two results normally if I query without search_type=scan.

So it only fails if the result set is smaller than the page size.
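
For reference, the same two requests as a minimal Python sketch using the requests library (not my actual client code, just an illustration of the calls above):

import requests

base = "http://localhost:9200"

# Start the scan: with search_type=scan the first response only opens the
# scroll and returns a _scroll_id; its hits array is empty.
query = {"query": {"term": {"userId": {"value": "1o"}}}}
resp = requests.get(
    base + "/users/contact/_search",
    params={"search_type": "scan", "scroll": "60m", "size": 100},
    json=query,
).json()

# Fetch the first page of actual hits by passing the scroll id as the body.
page = requests.get(
    base + "/_search/scroll",
    params={"scroll": "60m"},
    data=resp["_scroll_id"],
).json()

# Shard failures such as the SearchContextMissingException are reported here.
print(page["_shards"])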

@jillesvangurp
Contributor Author

This bug may be invalid, but doing some more analysis I discovered an underlying problem that might explain what I'm seeing:

I had a bit of code using search_type=scan that worked fine with 0.90 but broke in several ways when I tried it with 1.0.1 today (upgraded this afternoon):

Problem #1: the exit condition changed.

I used to parse the scroll_id from the response and stop fetching new results when it was no longer included. Now I get the following response for the final page of results:

{
  "_scroll_id": "c2NhbjswOzE7dG90YWxfaGl0czo3NDk2Ow==",
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 4,
    "failed": 1,
    "failures": [
      {
        "status": 500,
        "reason": "SearchContextMissingException[No search context found for id [199]]"
      }
    ]
  },
  "hits": {
    "total": 7496,
    "max_score": 0.0,
    "hits": [...]
  }
}

So it is actually reporting an error for the next page of results from one of the shards while including the final results. This looks weird to me. The only way I have of deducing that this is the final page is to look at the failures object or to keep track of the number of hits I've processed. That can't be right.

Problem #2: the size parameter seems to work in a weird way.

I'm not actually sure whether that is a change or whether it was always broken. In any case, the size seems to apply per shard. So if I specify size=100, I actually get back 500 results per page, which would be the number of shards times the size.

So, getting back to my original bug, I am probably getting the results, but my code fails when trying to fetch another page of results because of the broken exit condition.

I would expect either the old behavior, where the last page fetched no longer includes the scroll_id, or, alternatively, an improved API that explicitly includes a next URL and omits it on the last page. My interpretation of the old behavior was indeed to use the scroll_id like this. In any case, asking for the last page should not return errors from any shard.
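
For completeness, the only reliable exit condition I can see with the current behavior is to stop once a scroll page comes back with no hits. A minimal Python sketch of that loop, again using the requests library rather than my actual client code:

import requests

def process(hit):
    # Placeholder for whatever the client does with each document.
    print(hit["_id"])

base = "http://localhost:9200"

# Open the scroll with search_type=scan; this first response carries no hits,
# only the _scroll_id (same scan request as in the earlier sketch).
query = {"query": {"term": {"userId": {"value": "1o"}}}}
scroll_id = requests.get(
    base + "/users/contact/_search",
    params={"search_type": "scan", "scroll": "60m", "size": 100},
    json=query,
).json()["_scroll_id"]

while True:
    page = requests.get(
        base + "/_search/scroll",
        params={"scroll": "60m"},
        data=scroll_id,  # the scroll id goes in the request body
    ).json()
    hits = page["hits"]["hits"]
    if not hits:
        break  # the scan is exhausted once a page comes back empty
    for hit in hits:
        process(hit)
    # Continue with the scroll id returned in the latest response.
    scroll_id = page["_scroll_id"]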

@jillesvangurp
Contributor Author

Actually, looking closer, it turns out this was pilot error after all. I was passing in the same scroll_id every time instead of using the one returned in the latest response.

The minor issue with the page size above may be valid, but that's explainable and probably not a big deal. So, closing this one.

@clintongormley

@jillesvangurp to explain: with the scan search type, there is no reduce phase, so size is actually per-shard, rather than per-request. Each shard returns a max of size results.
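
To illustrate with the numbers from this issue (a rough back-of-the-envelope sketch, not actual Elasticsearch output):

# With search_type=scan, size applies per shard, so the upper bound on the
# number of hits in a single scroll page is size multiplied by the shard count.
size = 100          # size used on the initial scan request in this issue
primary_shards = 5  # shard total reported in the _shards section above
max_hits_per_page = size * primary_shards
print(max_hits_per_page)  # 500, matching the roughly 500 results per page observed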
