Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop name from TokenizerFactory #24869

Merged
merged 1 commit into from
May 30, 2017

Conversation

nik9000
Copy link
Member

@nik9000 nik9000 commented May 24, 2017

Drops TokenizerFactory#name, replacing it with
CustomAnalyzer#getTokenizerName which is much better targeted at
its single use case inside the analysis API.

Drops a test that I would have had to refactor which is duplicated by
AnalysisModuleTests.

To keep this change from blowing up in size I've left two mostly
mechanical changes to be done in followups:

  1. TokenizerFactory can now be entirely dropped and replaced with
    Supplier<Tokenizer>.
  2. AbstractTokenizerFactory's ctor still takes a String parameter
    where the name once was.

Drops `TokenizerFactory#name`, replacing it with
`CustomAnalyzer#getTokenizerName` which is much better targeted at
its single use case inside the analysis API.

Drops a test that I would have had to refactor which is duplicated by
`AnalysisModuleTests`.

To keep this change from blowing up in size I've left two mostly
mechanical changes to be done in followups:
1. `TokenizerFactory` can now be entirely dropped and replaced with
`Supplier<Tokenizer>`.
2. `AbstractTokenizerFactory`'s ctor still takes a `String` parameter
where the name once was.
Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I just left one comment.

/**
* The name of the tokenizer as configured by the user.
*/
public String getTokenizerName() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this? The TokenizerFactory.name() was never used here before.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was used in the analyze action.

@nik9000 nik9000 merged commit 6d9ce95 into elastic:master May 30, 2017
@nik9000
Copy link
Member Author

nik9000 commented May 30, 2017

Thanks for the review @rjernst! I merged and added the followups to my list.

jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request May 30, 2017
* master:
  Fix typo in comment in ReplicationOperation.java
  Prevent Index & Delete request primaryTerm getter/setter, setShardId setter
  Drop name from TokenizerFactory (elastic#24869)
  Correctly set doc_count when MovAvg "predicts" values on existing buckets (elastic#24892)
  Handle primary failure handling replica response
  Add missing word to terms-query.asciidoc (elastic#24960)
  Correct some spelling in match-phrase-prefix docs (elastic#24956)
  testConcurrentWriteViewsAndSnapshot shouldn't flush concurrently
  [TEST] Fix FieldSortIT failures
  Add doc_count to ParsedMatrixStats (elastic#24952)
  Add document count to Matrix Stats aggregation response (elastic#24776)
  Fix script field sort returning Double.MAX_VALUE for all documents (elastic#24942)
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request May 30, 2017
A continuation of elastic#24869, this drops the `name()` method from
all remaining analysis component factories, moving name members
to `CustomAnalayzer` in service of the `_analyze` action.

Like elastic#24869 I've kept a `String` in the method signature of the
ctor of some abstract components so that back these analysis
component factories. It is *much* more convenient to remove all
the names at once in a followup. That change will be large but
purely mechanical.

This breaks the API for plugins adding analyzers because they
no longer must implement fairly redundant feeline method.
@nik9000 nik9000 deleted the tokenizer_factory_drop_name branch June 7, 2017 14:51
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Jul 15, 2018
Handles TODOs from elastic#24869
* Replaces all occurences of TokenizerFactory with Supplier<Tokenizer>
* Remove unused parameter from constructor
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants