New solution to handle IVF_FLAT backward compatibility #76

Merged

cydrain
Collaborator

@cydrain cydrain commented Sep 12, 2023

Issue: #30

@sre-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cydrain

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mergify

mergify bot commented Sep 12, 2023

@cydrain 🔍 Important: PR Classification Needed!

For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how:

  1. If you're fixing a bug, label it as kind/bug.
  2. For small tweaks (less than 20 lines without altering any functionality), please use kind/improvement.
  3. Significant changes that don't modify existing functionalities should be tagged as kind/enhancement.
  4. Adjusting APIs or changing functionality? Go with kind/feature.

For any PR outside the kind/improvement category, ensure you link to the associated issue using the format: “issue: #”.

Thanks for your efforts and contribution to the community!

@cydrain
Collaborator Author

cydrain commented Sep 12, 2023

/kind enhancement

@cydrain
Collaborator Author

cydrain commented Sep 12, 2023

/hold


uint8_t dummy8;
READ1(dummy8);
uint16_t dummy16;
Collaborator

8 (used for cosine) + 8 + 16 + 32 (remaining padding) = 64? Please add some comments here, or just use a 64-bit field to store is_cosine.

Collaborator Author

yes
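
The arithmetic confirmed in this thread can be checked with a small sketch. It assumes, as the review comment suggests, that the 1-byte is_cosine flag plus 8-, 16-, and 32-bit padding fields together occupy the same 8 bytes as a single 64-bit placeholder. The names append_pod, read_pod, write_tail, and read_tail are hypothetical and exist only for this illustration; the actual code uses the READ1/WRITE1 macros, and this is not the faiss or Knowhere API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical illustration only: not the actual faiss/Knowhere code.
// Assumed layout of the 8-byte slot discussed in the review:
//   1 byte  is_cosine flag
//   1 byte  padding (dummy8)
//   2 bytes padding (dummy16)
//   4 bytes padding (dummy32)
// Total: 8 + 8 + 16 + 32 = 64 bits.

// Append the raw bytes of a trivially copyable value to the buffer.
template <typename T>
void append_pod(std::vector<std::uint8_t>& buf, const T& value) {
    const auto* p = reinterpret_cast<const std::uint8_t*>(&value);
    buf.insert(buf.end(), p, p + sizeof(T));
}

// Read a trivially copyable value at `offset` and advance the offset.
template <typename T>
T read_pod(const std::vector<std::uint8_t>& buf, std::size_t& offset) {
    assert(offset + sizeof(T) <= buf.size());
    T value;
    std::memcpy(&value, buf.data() + offset, sizeof(T));
    offset += sizeof(T);
    return value;
}

// Write the flag followed by explicit padding so the slot stays 8 bytes wide.
std::vector<std::uint8_t> write_tail(bool is_cosine) {
    std::vector<std::uint8_t> buf;
    append_pod<std::uint8_t>(buf, static_cast<std::uint8_t>(is_cosine ? 1 : 0));
    append_pod<std::uint8_t>(buf, 0);   // dummy8
    append_pod<std::uint16_t>(buf, 0);  // dummy16
    append_pod<std::uint32_t>(buf, 0);  // dummy32
    return buf;
}

// Read the flag, then consume the three padding fields.
bool read_tail(const std::vector<std::uint8_t>& buf, std::size_t& offset) {
    const bool is_cosine = read_pod<std::uint8_t>(buf, offset) != 0;
    (void)read_pod<std::uint8_t>(buf, offset);   // dummy8
    (void)read_pod<std::uint16_t>(buf, offset);  // dummy16
    (void)read_pod<std::uint32_t>(buf, offset);  // dummy32
    return is_cosine;
}
```

Calling write_tail and then read_tail on the result should advance the offset by exactly sizeof(std::int64_t), which is the property the reviewer asked to have documented.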

@@ -81,9 +81,17 @@ namespace faiss {
static void write_index_header(const Index* idx, IOWriter* f) {
    WRITE1(idx->d);
    WRITE1(idx->ntotal);
    Index::idx_t dummy = 1 << 20;
    WRITE1(dummy);
    WRITE1(idx->is_cosine);
Collaborator

Same as above.

{
    std::vector<OnDiskInvertedLists::Slot> v(
            od->slots.begin(), od->slots.end());
    WRITEVECTOR(v);
Collaborator

Does IVF_FLAT use these inverted lists? This may be beyond my knowledge.

Collaborator Author

this is not a read code change

// auto codes_size = ivfl->d * ivfl->ntotal * sizeof(float);

// IVF_FLAT_NM format, need convert to new format
if (remains == invlist_size + ids_size) {
Collaborator

Deciding the index type by binary size is a bit of a hack. Is there any other way?

Collaborator Author

@cydrain cydrain Sep 13, 2023

I can only think of two ways to detect IVF_FLAT_NM:

  1. Parse the binary using the native IVF_FLAT format; if it throws an exception, the binary must be IVF_FLAT_NM.
  2. The approach in this PR: read the IVF_FLAT index header, then calculate the binary size.

Before this PR I used method 1, but it is a little risky, so I changed to method 2.
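
Method 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: IvfFlatLayout, detect_layout, and detect_layout_from_file are hypothetical names, and how invlist_size and ids_size are computed from the header is not shown in this conversation, so they are simply passed in as parameters here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration only: names and parameters are assumptions,
// not the actual Knowhere implementation.
enum class IvfFlatLayout { LegacyNM, Native };

// remains:      bytes left in the binary after the index header was read
// invlist_size: bytes the inverted-list part would occupy in the legacy layout
// ids_size:     bytes the stored ids would occupy in the legacy layout
inline IvfFlatLayout detect_layout(std::size_t remains,
                                   std::size_t invlist_size,
                                   std::size_t ids_size) {
    // Assumption: a legacy IVF_FLAT_NM binary holds exactly these two parts
    // after the header, so its remaining size equals their sum.
    return remains == invlist_size + ids_size ? IvfFlatLayout::LegacyNM
                                              : IvfFlatLayout::Native;
}

// Convenience wrapper taking the total binary size and the offset at which
// the header ended.
inline IvfFlatLayout detect_layout_from_file(std::size_t total_size,
                                             std::size_t header_end,
                                             std::size_t invlist_size,
                                             std::size_t ids_size) {
    assert(header_end <= total_size);
    return detect_layout(total_size - header_end, invlist_size, ids_size);
}
```

Compared with method 1, this check only reads the header and does arithmetic on sizes, so it does not depend on a failed parse to signal the legacy format. The trade-off, as the reviewer notes, is that it infers the format from the size alone.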

@foxspy
Collaborator

foxspy commented Sep 12, 2023

Please verify compatibility with existing index binaries. A single-byte difference may cause loading to fail. @elstic

Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
@cydrain cydrain force-pushed the caiyd_ivf_flat_nm_compatable_new_sol branch from 19e4384 to b67f931 Compare September 13, 2023 01:54
@mergify mergify bot removed the ci-passed label Sep 13, 2023
@foxspy
Collaborator

foxspy commented Sep 13, 2023

/lgtm

@cydrain
Collaborator Author

cydrain commented Sep 13, 2023

/unhold

@sre-ci-robot sre-ci-robot merged commit 6bff288 into zilliztech:main Sep 13, 2023
9 checks passed
@cydrain cydrain deleted the caiyd_ivf_flat_nm_compatable_new_sol branch September 13, 2023 03:45