Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support 'string'-style queries on metadata fields when reasonable. #34089

Merged
merged 3 commits into from
Sep 28, 2018

Conversation

jtibshirani
Copy link
Contributor

@jtibshirani jtibshirani commented Sep 26, 2018

This change has two components:

  • Support prefix and regexp queries on _index fields for consistency, since we already support wildcard queries.
  • Update ignored and routing field types to inherit from StringFieldType, which allows them to support 'string'-style queries like prefix, regex, and wildcard. We temporarily removed support for wildcard as part of the refactor in Delegate wildcard query creation to MappedFieldType. #34062, and this PR restores that support for compatibility, and also makes sure the other query types are implemented for consistency.

Note that we may eventually deprecate support for 'string'-style queries on ignored and routing fields, pending a separate discussion.

@jtibshirani jtibshirani added >bug :Search/Search Search-related issues that do not fall into other categories v7.0.0 v6.5.0 labels Sep 26, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments and am wondering why you didn't include the IdFieldType ?


Query expected = new WildcardQuery(new Term("field", new BytesRef("foo*")));
assertEquals(expected, ft.wildcardQuery("foo*", null));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a test for prefix query ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


Query expected = new WildcardQuery(new Term("field", new BytesRef("foo*")));
assertEquals(expected, ft.wildcardQuery("foo*", null));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@jtibshirani
Copy link
Contributor Author

Sorry, should've explained that in the PR description. _id fields are actually encoded (see IdFieldType#termsQuery), so it didn't seem to make sense to support queries like wildcard + prefix.

@jimczi
Copy link
Contributor

jimczi commented Sep 26, 2018

Sorry I meant IndexFieldType for the _index field. It seems that we don't support prefix query on this field even though we have support for patterns in IndexFieldType#termQuery so it might be nice to fix.

@jpountz
Copy link
Contributor

jpountz commented Sep 26, 2018

Even though we could add support for these queries to these fields, should we? The _id field is a good example: if we had supported range/prefix/wildcard queries then we couldn't have optimized storage by detecting the format of these ids. Are there use-cases for prefix/wildcard queries on those fields? For instance I can't think of use-cases for searching the _routing field at all except maybe debugging, for which term queries are sufficient?

@jpountz
Copy link
Contributor

jpountz commented Sep 26, 2018

It looks like I might have commented too quickly and this PR is only restoring support that already existed before #34062?

@jtibshirani
Copy link
Contributor Author

jtibshirani commented Sep 26, 2018

@jimczi got it, I will look into adding support for the appropriate queries to _index.

@jpountz you're right, and I maybe should've done this as part of that PR to avoid confusion. This PR restores wildcard support to those field types, and adds support for the other 'string' query types like prefix for consistency. We could instead consider properly deprecating/ removing support for wildcard queries (and not adding more supported queries), if we would be happier with that direction.

@jpountz
Copy link
Contributor

jpountz commented Sep 27, 2018

Thanks @jtibshirani, I don't have objections then. +1 to discuss deprecating wildcard/prefix/range on those special fields.

@jtibshirani jtibshirani changed the title Make sure 'ignored' and 'routing' field types inherit from StringFieldType. Support prefix and wildcard queries on metadata fields where it makes sense. Sep 27, 2018
@jtibshirani jtibshirani changed the title Support prefix and wildcard queries on metadata fields where it makes sense. Support prefix and wildcard queries on metadata fields when reasonable. Sep 27, 2018
@jtibshirani jtibshirani changed the title Support prefix and wildcard queries on metadata fields when reasonable. Support 'string'-style queries on metadata fields when reasonable. Sep 27, 2018
@jtibshirani
Copy link
Contributor Author

jtibshirani commented Sep 27, 2018

@jimczi I think this is ready for another look. I added support for prefix and regexp queries to _index fields, and added more unit tests for the other types. I updated the title + PR description to reflect our discussion (and was being a bit indecisive about the title :))

…dType.

This allows them to support additional query types like prefix and wildcard.
Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes looks good and thanks for handling the _index field as well.

After taking a closer look, I am actually a little unclear on why we want to add support for prefix to _index fields (given it's already a bit inconsistent as we don't support regexp or fuzzy queries).

It is inconsistent, I agree and I was wondering why we accept wildcard queries but not the others. It also appears that term and terms query accept wildcard on the _index field. I was not aware of that. It's not documented but now that we accept prefix and wildcard queries we should maybe discuss the deprecation/removal this use case. There's also a question regarding aliases but that's out of scope here.

@jtibshirani
Copy link
Contributor Author

jtibshirani commented Sep 27, 2018

Sounds good -- I'll file an issue to start the discussion around deprecation.

As an update, I filed two deprecation issues here: #34160, #34162

@jtibshirani jtibshirani merged commit 9cd4f70 into elastic:master Sep 28, 2018
@jtibshirani jtibshirani deleted the metadata-field-types branch September 28, 2018 03:59
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Sep 28, 2018
* master:
  Use more precise does S3 bucket exist method (elastic#34123)
  LLREST: Introduce a strict mode (elastic#33708)
  [CCR] Adjust list retryable errors (elastic#33985)
  Fix AggregationFactories.Builder equality and hash regarding order (elastic#34005)
  MINOR: Remove some deadcode in NodeEnv and Related (elastic#34133)
  Rest-Api-Spec: Correct spelling in filter_path description (elastic#33154)
  Core: Don't rely on java time for epoch seconds formatting (elastic#34086)
  Retry errors when fetching follower global checkpoint. (elastic#34019)
  Watcher: Reenable watcher stats REST tests (elastic#34107)
  Remove special-casing of Synonym filters in AnalysisRegistry (elastic#34034)
  Rename CCR APIs (elastic#34027)
  Fixed CCR stats api serialization issues and (elastic#33983)
  Support 'string'-style queries on metadata fields when reasonable. (elastic#34089)
  Logging: Drop Settings from security logger get calls (elastic#33940)
  SQL: Internal refactoring of operators as functions (elastic#34097)
jtibshirani added a commit that referenced this pull request Sep 28, 2018
…34089)

* Make sure 'ignored' and 'routing' field types inherit from StringFieldType.
* Add tests for prefix and regexp queries.
* Support prefix and regexp queries on _index fields.
kcm pushed a commit that referenced this pull request Oct 30, 2018
…34089)

* Make sure 'ignored' and 'routing' field types inherit from StringFieldType.
* Add tests for prefix and regexp queries.
* Support prefix and regexp queries on _index fields.
jtibshirani added a commit that referenced this pull request Sep 24, 2019
We speculatively added support for `regexp` queries on the `_index` field in
#34089 (this functionality was not actually requested by a user). Supporting
regex logic adds complexity to the `_index` field for not much gain, so we
would like to remove it.

From an end-to-end test it turns out this functionality never even worked in
the first place because of an error in how regex flags were interpreted! For
this reason, we can remove support for `regexp` on `_index` without a
deprecation period.

Relates to #46640.
jtibshirani added a commit that referenced this pull request Sep 24, 2019
We speculatively added support for `regexp` queries on the `_index` field in
#34089 (this functionality was not actually requested by a user). Supporting
regex logic adds complexity to the `_index` field for not much gain, so we
would like to remove it.

From an end-to-end test it turns out this functionality never even worked in
the first place because of an error in how regex flags were interpreted! For
this reason, we can remove support for `regexp` on `_index` without a
deprecation period.

Relates to #46640.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Search/Search Search-related issues that do not fall into other categories v6.5.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants