Get document API can specify an alias, but will return documents that are not part of that alias (as defined by the filter for that alias) #3861
Comments
The tricky bit here is that the GET API is realtime: we can get a document by id either from Lucene or from the transaction log, before the next refresh happens and the newly indexed documents are made searchable.
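To make the realtime point concrete, here is a toy model in Python. It is only a sketch, not Elasticsearch internals: `ToyShard`, its methods, and the sample documents are all hypothetical names made up for illustration. It assumes a get-by-id checks the transaction log first and never runs a query, while a search only sees refreshed documents and applies the alias filter.

```python
class ToyShard:
    """Toy model (hypothetical): a translog for fresh writes plus a searchable index."""

    def __init__(self):
        self.translog = {}    # doc id -> source, written but not yet refreshed
        self.searchable = {}  # doc id -> source, visible to search after a refresh

    def index(self, doc_id, source):
        # New writes land in the translog first.
        self.translog[doc_id] = source

    def refresh(self):
        # A refresh makes pending writes visible to search.
        self.searchable.update(self.translog)
        self.translog.clear()

    def get(self, doc_id):
        # Realtime lookup by id: no query runs, so there is no place
        # where an alias filter could be applied.
        if doc_id in self.translog:
            return self.translog[doc_id]
        return self.searchable.get(doc_id)

    def search(self, alias_filter):
        # Search only sees refreshed documents and applies the filter.
        return sorted(
            doc_id for doc_id, src in self.searchable.items() if alias_filter(src)
        )


def only_alice(source):
    # Stand-in for an alias filter such as a term filter on "user".
    return source["user"] == "alice"


shard = ToyShard()
shard.index("1", {"user": "alice"})
shard.index("2", {"user": "bob"})

before_refresh = shard.search(only_alice)  # nothing is searchable yet
realtime_hit = shard.get("2")              # served from the translog, filter ignored
shard.refresh()
after_refresh = shard.search(only_alice)   # only the matching document
```

In this model, document `2` is returned by `get` both before and after the refresh, even though it never matches the filter.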
I would like to know what people think about the following options:
I am personally afraid of the complexity and slowness that would be introduced with option 1. Do comment if you have better ideas around this.
We discussed this; we'd prefer to reject get requests performed against a filtered alias, given that we cannot give users the correct answer. If the get request has the
Documents that don't belong to the filtered alias might get returned, as the filter cannot be taken into account when the document is retrieved from the transaction log. We simply reject the request, given that we cannot answer it properly. This change also affects APIs that use the get API internally, such as multi_get, term_vector, multi_term_vector, explain and update. Percolator is not affected, as it already knows how to execute alias filters against the Lucene index in non-real-time when needed. Closes elastic#3861
Remarking for discussion... I attempted to solve this as explained above, but it's more complicated than it initially seemed (see comment here). We have to discuss how we want to move forward; again, we have a few options:
Other opinions are welcome.
My preference is number 2. I don't think we can make filtered aliases behave exactly like indices. We would just end up creating complexity and poorer performance, and would still have edge cases. Rather just explain how filtered aliases work and leave it at that. If users need the functionality of indices, then they should use a real index instead.
Closing this issue and leaving things as they are today
My feeling is that this has not been a big problem for users. Plus there is a workaround: they can retrieve docs with a search-by-id against the alias instead of using GET (i.e. opt in to doing what you're suggesting themselves). As we know from security in x-pack, there's a lot more to security than just making filtered aliases work for GET. I think we should leave things as they are.
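As a rough sketch of the search-by-id workaround mentioned above: the snippet below only builds the request, it does not send it. The helper name `search_by_id_request` and the alias name `my_filtered_alias` are made up for illustration. It assumes the standard `ids` query sent to the alias's `_search` endpoint.

```python
import json


def search_by_id_request(alias, doc_id):
    """Build (method, path, body) for an ids query against an alias.

    Hypothetical helper: it only assembles the request, so it can be passed
    to whatever HTTP client is in use.
    """
    body = {"query": {"ids": {"values": [doc_id]}}}
    return "POST", f"/{alias}/_search", json.dumps(body)


# "my_filtered_alias" is an assumed alias name, not one from this issue.
method, path, body = search_by_id_request("my_filtered_alias", "1")
print(method, path, body)
```

Because this goes through search, the alias filter is applied, but the result is no longer realtime: a document indexed since the last refresh will not be found until the next refresh.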
Discussed in FixItFriday - we're good to implement this.
Was this limitation ever documented (at least in the 2.3 docs)? Led to a lot of confusion for me while trying to handle an edge case.
Pinging @elastic/es-core-infra
I believe this can be closed as get-by-id no longer retrieves documents from the translog, so filtered aliases should always work.
@dakrone do we have tests for this functionality?
@javanna not that I know of, are you concerned about the behavior? We can re-open this to add tests if you'd like?
yea I don't feel like "it should work" is a good resolution to this long standing issue. I am super happy if it's fixed, but I think we should add tests that demonstrate that it is fixed.
Is this supposed to work in ES 6.8.9 or only in newer releases? I'm still seeing the behavior with 6.8.9 - that is I have two aliases for an index, one alias with a term filter ( If I use the get API to get a specific document, the filter is not respected and I get the same document with both aliases:
If I use the search API, the filter is respected:
Linking another report of this (#88425), for
Create an index and populate it with two documents. Create an alias with a filter, such that the alias contains one document.
A search using the alias will return one result.
A get document request using the alias can retrieve both documents.
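The reproduction steps above can be sketched as a sequence of REST requests. This is an illustration under assumptions, not the reporter's actual commands: the index name, alias name, and field values are hypothetical, and the `_doc` path shape follows recent releases (older versions used a type name in its place). The snippet only builds the requests so they can be replayed with any HTTP client.

```python
import json

INDEX = "test_index"   # hypothetical index name
ALIAS = "alice_docs"   # hypothetical filtered alias name


def reproduction_requests():
    """Return (method, path, body) tuples for the reproduction steps."""
    alias_action = {
        "actions": [
            {
                "add": {
                    "index": INDEX,
                    "alias": ALIAS,
                    # Only documents whose "user" is "alice" belong to the alias.
                    "filter": {"term": {"user": "alice"}},
                }
            }
        ]
    }
    return [
        ("PUT", f"/{INDEX}/_doc/1", {"user": "alice"}),  # matches the filter
        ("PUT", f"/{INDEX}/_doc/2", {"user": "bob"}),    # does not match
        ("POST", f"/{INDEX}/_refresh", None),            # make both searchable
        ("POST", "/_aliases", alias_action),             # create the filtered alias
        ("GET", f"/{ALIAS}/_search", None),              # filter applied: one hit
        ("GET", f"/{ALIAS}/_doc/1", None),               # get by id through the alias
        ("GET", f"/{ALIAS}/_doc/2", None),               # reported to return doc 2 too
    ]


requests = reproduction_requests()
for method, path, body in requests:
    print(method, path, json.dumps(body) if body is not None else "")
```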
This may confuse users: "why does this document not turn up in my search results", and makes it hard to implement a security model using aliases.
The UidField.loadDocIdAndVersion() method already uses a filter to check whether the document has been deleted. It would be possible to pass in the alias filter, if defined.