Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for aliases in queries on _index. #46640

Merged
merged 6 commits into from
Sep 20, 2019

Conversation

jtibshirani
Copy link
Contributor

@jtibshirani jtibshirani commented Sep 11, 2019

Previously, queries on the _index field were not able to specify index aliases. This was a regression in functionality compared to the 'indices' query, which was deprecated and removed in 6.0.

Now queries on _index can specify an alias, which is resolved to the concrete index names when we check whether an index matches. To match a remote shard target, the pattern needs to be of the form 'cluster:index' to match the fully-qualified index name. Aliases can be specified in the following query types: term, terms, prefix, and wildcard.

There are two complexities that I was hoping to get your thoughts on:

  • After this change, in order to match a remote index name, the provided pattern must contain the separator :. This allows us to consistently identify what represents the cluster name vs. local index name, and matches the behavior when specifying indices in a cross-cluster search. However, the previous behavior for these queries did not require a colon to be present -- for example you could match cluster:index using the pattern *index. I'm not sure it's possible to avoid this breaking change.
  • Currently, aliases can't be specified in regexp queries. I didn't think it was worth the extra logic to support these queries. I actually added support for regexp speculatively to _index in Support 'string'-style queries on metadata fields when reasonable. #34089 (it wasn't actually requested by users), and am now doubting the decision. I would prefer that we deprecate + remove support for regexp on _index and avoid adding the extra logic.

Addresses #23306.

Previously, queries on the _index field were not able to specify index aliases.
This was a regression in functionality compared to the 'indices' query that was
deprecated and removed in 6.0.

Now queries on _index can specify an alias, which is resolved to the concrete
index names when we check whether an index matches. To match a remote shard
target, the pattern needs to be of the form 'cluster:index' to match the
fully-qualified index name. Aliases can be specified in the following query
types: term, terms, prefix, and wildcard.
@jtibshirani jtibshirani added >bug :Search/Search Search-related issues that do not fall into other categories labels Sep 11, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

However, the previous behavior for these queries did not require a colon to be present -- for example you could match cluster:index using the pattern *index. I'm not sure it's possible to avoid this breaking change.

IMO it is ok, we need a way to ensure that we can detect remote cluster name easily otherwise any pattern could potentially match so I am +1 to introduce this breaking change. We never really documented this feature so we could also backport it to 7x ?

Currently, aliases can't be specified in regexp queries. I didn't think it was worth the extra logic to support these queries. I actually added support for regexp speculatively to _index in #34089 (it wasn't actually requested by users), and am now doubting the decision. I would prefer that we deprecate + remove support for regexp on _index and avoid adding the extra logic.

+1 to deprecate/remove support for regexp on the _index field. This shouldn't be needed.

@jimczi jimczi added the v8.0.0 label Sep 12, 2019
@javanna
Copy link
Member

javanna commented Sep 12, 2019

I tend to see the breaking change for remote indices more as a bugfix. When we resolve remote clusters *:index would not match the local cluster, so it was a bug that querying the _index field would, I think. Your change makes the cluster alias resolution more consistent, and I see that we have nice unit tests for this logic in SearchIndexNameMatcherTests which is great. Thanks!

@jtibshirani
Copy link
Contributor Author

jtibshirani commented Sep 20, 2019

Thank you both for the review! I also think it's okay to backport this to 7.x -- I'll open a PR for that shortly. I also plan to follow-up with a change to remove support for regex queries on the _index field.

@jtibshirani jtibshirani merged commit 127b8d0 into elastic:master Sep 20, 2019
@jtibshirani jtibshirani deleted the index-query-with-aliases branch September 20, 2019 18:19
jtibshirani added a commit that referenced this pull request Sep 24, 2019
We speculatively added support for `regexp` queries on the `_index` field in
#34089 (this functionality was not actually requested by a user). Supporting
regex logic adds complexity to the `_index` field for not much gain, so we
would like to remove it.

From an end-to-end test it turns out this functionality never even worked in
the first place because of an error in how regex flags were interpreted! For
this reason, we can remove support for `regexp` on `_index` without a
deprecation period.

Relates to #46640.
jtibshirani added a commit that referenced this pull request Sep 24, 2019
We speculatively added support for `regexp` queries on the `_index` field in
#34089 (this functionality was not actually requested by a user). Supporting
regex logic adds complexity to the `_index` field for not much gain, so we
would like to remove it.

From an end-to-end test it turns out this functionality never even worked in
the first place because of an error in how regex flags were interpreted! For
this reason, we can remove support for `regexp` on `_index` without a
deprecation period.

Relates to #46640.
@OskarPersson
Copy link

What is the workaround for us stuck in Elasticsearch 6.x that have indices named "foo-*"?

@jtibshirani
Copy link
Contributor Author

jtibshirani commented Oct 1, 2019

@OskarPersson unfortunately we are not planning to backport the fix to 6.x. I'm not totally clear on what you mean by "have indices named "foo-*"", I assume you have the same problem where queries on _index don't work with aliases? On 6.x, the only workaround I see is to issue a preliminary request to resolve the aliases to their concrete indices (through https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-get-alias.html), then use the concrete index names in the query on _index.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>breaking >bug :Search/Search Search-related issues that do not fall into other categories v7.5.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants