Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: lower low boundary for accuracy in test_calculate_context_similarity_on_non_matching_contexts #3199

Merged
merged 3 commits into from
Sep 13, 2022

Conversation

ZanSara
Copy link
Contributor

@ZanSara ZanSara commented Sep 12, 2022

Related Issues

Proposed Changes:

  • Lower low boundary for accuracy in test_calculate_context_similarity_on_non_matching_contexts from 0.99 to 0.98

How did you test it?

  • The test passes locally (with accuracy=1) and on CI (with accuracy=0.9860646599777034)

Notes for the reviewer

  • I attempted to use a more refined solution with pytest.approx, but that works with equality checks only, not with larger-than checks.

Checklist

@ZanSara ZanSara marked this pull request as ready for review September 12, 2022 09:10
@ZanSara ZanSara requested a review from a team as a code owner September 12, 2022 09:10
@ZanSara ZanSara requested review from vblagoje and tstadel and removed request for a team and vblagoje September 12, 2022 09:10
Copy link
Member

@tstadel tstadel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a quick fix of main this is fine.
However, that's interesting. When did that start to fail?
I suppose there has been a new rapidfuzz version, that is causing this. Let's make sure this is a feature and not a bug

@tstadel
Copy link
Member

tstadel commented Sep 12, 2022

Ok, in the failing tests we're using rapidfuzz 2.8.0. Last passing test had rapidfuzz 2.6.1.

Copy link
Member

@tstadel tstadel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is clearly caused by the new rapidfuzz version 2.8.0, let's better add a version pin restriction of <2.8.0. We also need to create an issue at https://github.com/maxbachmann/RapidFuzz. I'll take care of this. After the issue has been resolved or it turns out this is a feature we can remove the version restriction.

Copy link
Member

@tstadel tstadel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I looked into the changes of rapidfuzz 2.8.0. This is actually a feature and should enable us to get rid of the code I wrote to boost split overlaps (calculate_context_similarity's boost_split_overlap option). I created an issue for that as it would need some tinkering regarding a proper threshold: #3202
Let's reference this issue in the pin comment and merge :-)

Copy link
Member

@tstadel tstadel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just saw that this dependency entry is a duplicate. Let's merge it with the existing one and add a comment about the issue

pyproject.toml Outdated Show resolved Hide resolved
@ZanSara ZanSara merged commit 49b1c88 into main Sep 13, 2022
@ZanSara ZanSara deleted the fix-context-similarity-test branch September 13, 2022 07:32
brandenchan pushed a commit that referenced this pull request Sep 21, 2022
…larity_on_non_matching_contexts` (#3199)

* Change min value

* revert test change and pin rapidfuzz<2.8.0

* duplicate
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants