Better plural stemmer than minimal_english #4738

Merged

nknize
Collaborator

@nknize nknize commented Oct 11, 2022

This originated from elastic/elasticsearch#42892, authored by @markharwood (nice job Mark!). That PR is licensed ALv2 and never found a home in Elasticsearch. The improvements over the buggy Lucene implementation are substantial, so I'm opening a PR to include it in OpenSearch 2.4.

@nknize nknize added the enhancement, Indexing & Search, v3.0.0, and backport 2.x labels Oct 11, 2022
@nknize nknize requested review from a team and reta as code owners October 11, 2022 16:38
@nknize
Collaborator Author

nknize commented Oct 11, 2022

@markharwood - do you think this TokenFilter should eventually make its way back into Lucene?

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@nknize nknize force-pushed the enhancement/englishMinimalPluralStemming branch from a37e76e to ad69f59 Compare October 11, 2022 16:55

@markharwood
Contributor

@markharwood - do you think this TokenFilter should eventually make its way back into Lucene?

Thanks for picking this up, Nick. Yes, I think this would be a natural fit as a Lucene contribution.

nknize and others added 2 commits October 12, 2022 11:22
Drops the trailing "e" in taxes, dresses, watches, dishes etc that otherwise
cause mismatches with plural and singular forms.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>

Co-authored-by: Mark Harwood <markharwood@gmail.com>
Co-authored-by: Nicholas Walter Knize <nknize@apache.org>
Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
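
The commit message above can be illustrated with a minimal sketch. This is not the filter merged in this PR: the class `PluralStemSketch`, the methods `stripS` and `stripPlural`, and the list of endings are hypothetical names chosen for illustration. `stripS` stands in for a stemmer that only removes a final "s", and `stripPlural` shows one way the trailing "e" could also be dropped after x, ss, ch, or sh.

```java
/** Hypothetical illustration only; not the TokenFilter added by this PR. */
public class PluralStemSketch {

    // Assumed set of stem endings after which English forms plurals with "es".
    private static final String[] ES_ENDINGS = {"x", "ss", "ch", "sh"};

    /** Removes only a trailing "s", which leaves a stray "e" on words like "taxes". */
    public static String stripS(String word) {
        if (word.length() > 3 && word.endsWith("s") && !word.endsWith("ss")) {
            return word.substring(0, word.length() - 1);
        }
        return word;
    }

    /** Removes "es" when the remaining stem ends in x, ss, ch or sh; otherwise falls back to stripS. */
    public static String stripPlural(String word) {
        if (word.length() > 4 && word.endsWith("es")) {
            String stem = word.substring(0, word.length() - 2);
            for (String ending : ES_ENDINGS) {
                if (stem.endsWith(ending)) {
                    return stem;
                }
            }
        }
        return stripS(word);
    }

    public static void main(String[] args) {
        // Print both variants side by side for the words named in the commit message.
        for (String w : new String[] {"taxes", "dresses", "watches", "dishes", "horses"}) {
            System.out.println(w + ": " + stripS(w) + " / " + stripPlural(w));
        }
    }
}
```

In this sketch, "taxes" becomes "taxe" under `stripS`, which no longer matches the singular "tax", while `stripPlural` maps both forms to "tax". Words such as "horses" still fall through to the plain "s" rule and keep their "e".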
@nknize nknize force-pushed the enhancement/englishMinimalPluralStemming branch from ad69f59 to 41e7377 Compare October 12, 2022 16:23
@nknize nknize merged commit c92846d into opensearch-project:main Oct 12, 2022
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-4738-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 c92846d0ae121db54ce740790df75b36a24f804e
# Push it to GitHub
git push --set-upstream origin backport/backport-4738-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-4738-to-2.x.

@nknize
Collaborator Author

nknize commented Oct 12, 2022

Yes, I think this would be a natural fit as a Lucene contribution.

Thanks @markharwood!!! Maybe we incubate this here in OpenSearch and then look at contributing it back to Lucene. Let us know if you want to contribute it back yourself; otherwise someone here can, and we'll copy you on the PR.

nknize added a commit to nknize/OpenSearch that referenced this pull request Oct 19, 2022
(cherry picked from commit c92846d)
nknize added a commit to nknize/OpenSearch that referenced this pull request Oct 19, 2022
(cherry picked from commit c92846d)
nknize added a commit that referenced this pull request Oct 19, 2022
(cherry picked from commit c92846d)
ashking94 pushed a commit to ashking94/OpenSearch that referenced this pull request Nov 7, 2022