Retrieving top-k results for k >= 257 produces corrupted outputs #317
Comments
It looks like the outputs when k > 256 are indeed top-k results, but they are simply unsorted.
Hi there,
I guess technically it is not a bug, but from the user's point of view, the expected behavior is that it should be sorted (and indeed it is sorted for k <= 256). So, I think being a bit more transparent (e.g., in the documentation, tutorials, etc.) could be a great help.
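Until sorting is documented or configurable, one possible workaround is to sort each query's results by distance after the search. The sketch below is only an illustration: `sort_neighbors` is a hypothetical helper, not part of the library, and it assumes the distances and indices have already been copied to host memory as NumPy arrays of shape (n_queries, k). CuPy provides `argsort` and `take_along_axis` under the same names if the arrays should stay on the GPU.

```python
import numpy as np

def sort_neighbors(distances, indices, descending=False):
    """Sort each row of (distances, indices) by distance.

    Hypothetical helper: assumes both arrays have shape (n_queries, k)
    and that smaller distances are better unless descending=True
    (e.g. for inner-product similarity).
    """
    distances = np.asarray(distances)
    indices = np.asarray(indices)
    key = -distances if descending else distances
    # Per-row permutation that puts the key in ascending order.
    order = np.argsort(key, axis=1, kind="stable")
    sorted_distances = np.take_along_axis(distances, order, axis=1)
    sorted_indices = np.take_along_axis(indices, order, axis=1)
    return sorted_distances, sorted_indices

# Toy data standing in for unsorted search output.
d = np.array([[0.9, 0.1, 0.5], [0.3, 0.2, 0.7]], dtype=np.float32)
i = np.array([[7, 3, 5], [1, 4, 2]])
sd, si = sort_neighbors(d, i)
```

The same permutation is applied to both arrays so that each index stays paired with its distance.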
Hi @NTT123, I was out of the office for a few days last week and lost track of this discussion. What you are asking for is very reasonable, so at the very least I think we could add something to the docs for the algorithms stating when the results should be assumed to be sorted. I also think we could prioritize a feature (likely for our December release) to allow finer user control over the sorting of the results. Let’s keep this one open, and I’m going to mark it for the appropriate release. As always, if you are interested in diving into the code, we can guide you through the change as well. Contributions are always welcome!
Describe the bug
Retrieving top-k results with ivf_flat for k >= 257 produces corrupted outputs. I also encountered the same issue when using ivf_pq.
Steps/Code to reproduce the bug
I followed the tutorial at this link.
When setting k = 257, the outputs are corrupted. I then plotted the distance values:
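To tell whether such outputs are actually wrong or merely unsorted, a check along the following lines may help. It is a sketch under stated assumptions: all helper names are hypothetical, the metric is assumed to be squared Euclidean distance, and the arrays are assumed to be NumPy arrays on the host. Because the index is approximate, recall against brute force is reported instead of requiring an exact match. The demo simulates unsorted output by shuffling columns; it does not call the library.

```python
import numpy as np

def rows_sorted(distances):
    """True for each row whose distances are non-decreasing."""
    distances = np.asarray(distances)
    return np.all(np.diff(distances, axis=1) >= 0, axis=1)

def brute_force_topk(dataset, queries, k):
    """Exact top-k by squared L2 distance (assumed metric)."""
    d2 = ((queries[:, None, :] - dataset[None, :, :]) ** 2).sum(axis=2)
    idx = np.argsort(d2, axis=1, kind="stable")[:, :k]
    return np.take_along_axis(d2, idx, axis=1), idx

def recall_per_row(indices, reference_indices):
    """Fraction of reference neighbors found in each row, ignoring order."""
    k = reference_indices.shape[1]
    return np.array([
        np.intersect1d(row, ref).size / k
        for row, ref in zip(indices, reference_indices)
    ])

# Simulated demo: exact results with columns shuffled, k = 257.
rng = np.random.default_rng(0)
dataset = rng.random((1000, 8), dtype=np.float32)
queries = rng.random((4, 8), dtype=np.float32)
k = 257
ref_d, ref_i = brute_force_topk(dataset, queries, k)
perm = rng.permutation(k)
shuffled_d, shuffled_i = ref_d[:, perm], ref_i[:, perm]

print("rows sorted:", rows_sorted(shuffled_d))
print("recall:", recall_per_row(shuffled_i, ref_i))
```

High recall combined with unsorted rows would point to an ordering issue; low recall would point to genuinely incorrect neighbors.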
Expected behavior
For reference, this is the result when k = 256:
Environment details
Installed packages:
Additional context
Interestingly, performing a search with k > 256 followed by a refinement step with k <= 256 works correctly. However, if I attempt to refine with k > 256, the issue reoccurs.