Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MRG: fix calculate_gather_stats threshold=0 bug #3052

Merged
merged 8 commits into from
Feb 29, 2024

Conversation

bluegenes
Copy link
Contributor

@bluegenes bluegenes commented Feb 29, 2024

The error shows up when we try to calculate abundance info with a match_size of 0:

match: GCF_002819265.1 Choclo virus, match_size: 0, query_size: 273, query_trackabund: true
thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', /home/ntpierce/sourmash/src/core/src/index/mod.rs:301:55

...we should never be trying to calculate statistics on match_size of 0, but the logic is currently match_size >= threshold.This includes a check to break if match_size is 0.

Copy link

codecov bot commented Feb 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.78%. Comparing base (c7a1265) to head (16597b0).

Additional details and impacted files
@@           Coverage Diff           @@
##           latest    #3052   +/-   ##
=======================================
  Coverage   86.78%   86.78%           
=======================================
  Files         136      136           
  Lines       15523    15524    +1     
  Branches     2626     2626           
=======================================
+ Hits        13471    13472    +1     
  Misses       1751     1751           
  Partials      301      301           
Flag Coverage Δ
hypothesis-py 25.53% <ø> (ø)
python 92.86% <ø> (ø)
rust 61.21% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@bluegenes bluegenes changed the title WIP: handle calculate_gather_stats bug MRG: fix calculate_gather_stats bug Feb 29, 2024
@bluegenes bluegenes changed the title MRG: fix calculate_gather_stats bug WIP: fix calculate_gather_stats bug Feb 29, 2024
@bluegenes bluegenes changed the title WIP: fix calculate_gather_stats bug MRG: fix calculate_gather_stats threshold=0 bug Feb 29, 2024
@bluegenes
Copy link
Contributor Author

@sourmash-bio/devs ready for review!

@ctb
Copy link
Contributor

ctb commented Feb 29, 2024

Do we need a test? I'm ok if the answer is "no".

@bluegenes
Copy link
Contributor Author

bluegenes commented Feb 29, 2024

Do we need a test? I'm ok if the answer is "no".

We don't currently have tests for any of disk_revindex.rs (though I do have python tests for running + results in sourmash_plugin_branchwater), so adding one here would require a fair bit of setup. Can we put this to an issue to handle later? I can add a --threshold 0 test to sourmash_plugin_branchwater.

@ctb
Copy link
Contributor

ctb commented Feb 29, 2024

Do we need a test? I'm ok if the answer is "no".

We don't currently have tests for any of disk_revindex.rs (though I do have python tests for running + results in sourmash_plugin_branchwater), so adding one here would require a fair bit of setup. Can we put this to an issue to handle later? I can add a --threshold 0 test to sourmash_plugin_branchwater.

sure!

@bluegenes bluegenes merged commit fe4790f into latest Feb 29, 2024
41 checks passed
@bluegenes bluegenes deleted the fix-rocksdbgather-bug branch February 29, 2024 22:01
@bluegenes bluegenes added the rust label Feb 29, 2024
@luizirber
Copy link
Member

We don't currently have tests for any of disk_revindex.rs (though I do have python tests for running + results in sourmash_plugin_branchwater), so adding one here would require a fair bit of setup.

Tests for disk_revindex are at the revindex module, because sourmash::index::revindex::RevIndex acts as an unified interface for disk and mem revindex. Tho only the disk one is really used =]

@ctb ctb mentioned this pull request Mar 21, 2024
ctb added a commit that referenced this pull request Mar 21, 2024
# Release notes for sourmash v4.8.7

Note: This release changes the way `sourmash multigather` names output
files in some situations. Please see
#2722 for details.

Minor new features:

* support proper manifest creation with `--relpath` for `sig check` and
`sig collect` (#3054)
* fix `multigather` output by adding md5sum along with
`-U/--output-add-query-md5sum` (#2722)
* enable loading lineages from annotated gather with match_name instead
of name (#3078)

Bug fixes:

* fix output for `sketch ... --singleton` (#3066)
* fix `calculate_gather_stats` `threshold=0` bug (#3052)

Cleanup and documentation updates:

* adjust protein ksize for record/manifest (#3019)
* Resolve `sourmash gather --help` issue (#3032)
* rework the manifest documentation; do misc cleanup (#3027)
* add branchwater web to docs (#3018)

Developer updates:

* make core Manifest booleans python compatible (core) (#3007)
* safer ksize selection while still accommodating k=k*3 (#3028)
* fix clippy beta issues (#3088)
* tell dependabot to ignore upgrades to `byteorder`, `chrono`,
`once_cell`, and `wasm-bindgen` (#3065)
* update rust changelog for r0.13.0 in preparation for release (#3033)
* Allow changing storage location for a collection in RevIndex (#3015)
* Fix tox and nix configs so all tox tests execute correctly (#2992)
* Calculate all gather stats in rust; use for rocksdb gather (#2943)
* bump screed req to 1.1.3 (#3067)
* bump to v4.8.7-dev (#2989)

Dependabot updates:

* Bump DeterminateSystems/magic-nix-cache-action from 1 to 3 (#2994)
* Bump DeterminateSystems/magic-nix-cache-action from 3 to 4 (#3085)
* Bump DeterminateSystems/nix-installer-action from 4 to 9 (#2995)
* Bump DeterminateSystems/nix-installer-action from 9 to 10 (#3083)
* Bump chrono from 0.4.33 to 0.4.34 (#3000)
* Bump conda-incubator/setup-miniconda from 3.0.1 to 3.0.2 (#3046)
* Bump conda-incubator/setup-miniconda from 3.0.2 to 3.0.3 (#3057)
* Bump histogram from 0.9.0 to 0.9.1 (#3002)
* Bump itertools from 0.12.0 to 0.12.1 (#3043)
* Bump log from 0.4.20 to 0.4.21 (#3062)
* Bump num-iter from 0.1.43 to 0.1.44 (#2997)
* Bump pypa/cibuildwheel from 2.16.5 to 2.17.0 (#3084)
* Bump rayon from 1.8.1 to 1.9.0 (#3058)
* Bump roaring from 0.10.2 to 0.10.3 (#3014)
* Bump serde from 1.0.196 to 1.0.197 (#3045)
* Bump serde_json from 1.0.113 to 1.0.114 (#3044)
* Bump tempfile from 3.10.0 to 3.10.1 (#3059)
* Bump thiserror from 1.0.56 to 1.0.57 (#2999)
* Bump thiserror from 1.0.57 to 1.0.58 (#3082)
* Bump wasm-bindgen from 0.2.91 to 0.2.92 (#3060)
* Bump wasm-bindgen-test from 0.3.40 to 0.3.41 (#2996)
* Bump wasm-bindgen-test from 0.3.41 to 0.3.42 (#3063)
* Bump web-sys from 0.3.67 to 0.3.68 (#2998)
* Bump web-sys from 0.3.68 to 0.3.69 (#3061)
* Revert "Bump wasm-bindgen from 0.2.91 to 0.2.92 (#3060)" (#3064)
* Update asv to 0.6.2 (#3025)
* Update pytest requirement from <8.1.0,>=6.2.4 to >=6.2.4,<8.2.0
(#3075)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug in calculate_gather_stats (rocksdb gather)
3 participants