Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Backport 2.x] Implemented the Streaming Feature to stream vectors from Java to JNI layer to enable creation of larger segments for vector indices (#1604) #1608

Merged
merged 1 commit into from
Apr 10, 2024

Conversation

navneet1v
Copy link
Collaborator

@navneet1v navneet1v commented Apr 10, 2024

Description

This is backport of PR: #1604

Implemented the Streaming Feature to stream vectors from Java to JNI layer to enable creation of larger segments for vector indices (#1604)

Changes include:

  1. Add the interface for streaming the vectors from java to jni layer with initial capacity (Add the interface for streaming the vectors from java to jni layer with initial capacity #1586)
  2. Integrating storeVectors interfaces with createIndex and createIndexTemplate functions. (Integrating storeVectors interfaces with createIndex and createIndexTemplate functions. #1588)
  3. Update KNN80BinaryDocValues reader count live docs and use live docs as initial capacity to initialize vector address(Update KNN80BinaryDocValues reader count live docs and use live docs as initial capacity to initialize vector address #1595)
  4. Move free vectorAddress from Java to JNI layer to reduce the memory footprint for Nmslib (Move free vectorAddress from Java to JNI layer to reduce the memory footprint for Nmslib. #1602)

Issues Resolved

#1506

JNI test

(base) 14:32 ~/workplace/k-NN/jni (2.x)$ ./bin/jni_test 
Running main() from /Users/navneev/workplace/k-NN/jni/googletest-src/googletest/src/gtest_main.cc
[==========] Running 22 tests from 20 test suites.
[----------] Global test environment set-up.
[----------] 1 test from FaissCreateIndexTest
[ RUN      ] FaissCreateIndexTest.BasicAssertions
[       OK ] FaissCreateIndexTest.BasicAssertions (9 ms)
[----------] 1 test from FaissCreateIndexTest (9 ms total)

[----------] 1 test from FaissCreateIndexFromTemplateTest
[ RUN      ] FaissCreateIndexFromTemplateTest.BasicAssertions
[       OK ] FaissCreateIndexFromTemplateTest.BasicAssertions (3 ms)
[----------] 1 test from FaissCreateIndexFromTemplateTest (3 ms total)

[----------] 3 tests from FaissLoadIndexTest
[ RUN      ] FaissLoadIndexTest.BasicAssertions
[       OK ] FaissLoadIndexTest.BasicAssertions (3 ms)
[ RUN      ] FaissLoadIndexTest.HNSWPQDisableSdcTable
WARNING clustering 256 points to 16 centroids: please provide at least 624 training points
[       OK ] FaissLoadIndexTest.HNSWPQDisableSdcTable (410 ms)
[ RUN      ] FaissLoadIndexTest.IVFPQDisablePrecomputeTable
WARNING clustering 256 points to 16 centroids: please provide at least 624 training points
[       OK ] FaissLoadIndexTest.IVFPQDisablePrecomputeTable (387 ms)
[----------] 3 tests from FaissLoadIndexTest (802 ms total)

[----------] 1 test from FaissQueryIndexTest
[ RUN      ] FaissQueryIndexTest.BasicAssertions
[       OK ] FaissQueryIndexTest.BasicAssertions (4 ms)
[----------] 1 test from FaissQueryIndexTest (4 ms total)

[----------] 1 test from FaissQueryIndexWithFilterTest1435
[ RUN      ] FaissQueryIndexWithFilterTest1435.BasicAssertions
[       OK ] FaissQueryIndexWithFilterTest1435.BasicAssertions (10 ms)
[----------] 1 test from FaissQueryIndexWithFilterTest1435 (10 ms total)

[----------] 1 test from FaissQueryIndexWithParentFilterTest
[ RUN      ] FaissQueryIndexWithParentFilterTest.BasicAssertions
[       OK ] FaissQueryIndexWithParentFilterTest.BasicAssertions (6 ms)
[----------] 1 test from FaissQueryIndexWithParentFilterTest (6 ms total)

[----------] 1 test from FaissFreeTest
[ RUN      ] FaissFreeTest.BasicAssertions
[       OK ] FaissFreeTest.BasicAssertions (0 ms)
[----------] 1 test from FaissFreeTest (0 ms total)

[----------] 1 test from FaissInitLibraryTest
[ RUN      ] FaissInitLibraryTest.BasicAssertions
[       OK ] FaissInitLibraryTest.BasicAssertions (0 ms)
[----------] 1 test from FaissInitLibraryTest (0 ms total)

[----------] 1 test from FaissTrainIndexTest
[ RUN      ] FaissTrainIndexTest.BasicAssertions
[       OK ] FaissTrainIndexTest.BasicAssertions (0 ms)
[----------] 1 test from FaissTrainIndexTest (0 ms total)

[----------] 1 test from FaissCreateHnswSQfp16IndexTest
[ RUN      ] FaissCreateHnswSQfp16IndexTest.BasicAssertions
[       OK ] FaissCreateHnswSQfp16IndexTest.BasicAssertions (6 ms)
[----------] 1 test from FaissCreateHnswSQfp16IndexTest (6 ms total)

[----------] 1 test from FaissIsSharedIndexStateRequired
[ RUN      ] FaissIsSharedIndexStateRequired.BasicAssertions
[       OK ] FaissIsSharedIndexStateRequired.BasicAssertions (0 ms)
[----------] 1 test from FaissIsSharedIndexStateRequired (0 ms total)

[----------] 1 test from FaissInitAndSetSharedIndexState
[ RUN      ] FaissInitAndSetSharedIndexState.BasicAssertions
WARNING clustering 256 points to 16 centroids: please provide at least 624 training points
[       OK ] FaissInitAndSetSharedIndexState.BasicAssertions (372 ms)
[----------] 1 test from FaissInitAndSetSharedIndexState (372 ms total)

[----------] 1 test from IDGrouperBitMapTest
[ RUN      ] IDGrouperBitMapTest.BasicAssertions
[       OK ] IDGrouperBitMapTest.BasicAssertions (0 ms)
[----------] 1 test from IDGrouperBitMapTest (0 ms total)

[----------] 1 test from NmslibIndexWrapperSearchTest
[ RUN      ] NmslibIndexWrapperSearchTest.BasicAssertions
[       OK ] NmslibIndexWrapperSearchTest.BasicAssertions (0 ms)
[----------] 1 test from NmslibIndexWrapperSearchTest (0 ms total)

[----------] 1 test from NmslibCreateIndexTest
[ RUN      ] NmslibCreateIndexTest.BasicAssertions
[       OK ] NmslibCreateIndexTest.BasicAssertions (1 ms)
[----------] 1 test from NmslibCreateIndexTest (1 ms total)

[----------] 1 test from NmslibLoadIndexTest
[ RUN      ] NmslibLoadIndexTest.BasicAssertions
[       OK ] NmslibLoadIndexTest.BasicAssertions (1 ms)
[----------] 1 test from NmslibLoadIndexTest (1 ms total)

[----------] 1 test from NmslibQueryIndexTest
[ RUN      ] NmslibQueryIndexTest.BasicAssertions
[       OK ] NmslibQueryIndexTest.BasicAssertions (2 ms)
[----------] 1 test from NmslibQueryIndexTest (2 ms total)

[----------] 1 test from NmslibFreeTest
[ RUN      ] NmslibFreeTest.BasicAssertions
[       OK ] NmslibFreeTest.BasicAssertions (0 ms)
[----------] 1 test from NmslibFreeTest (0 ms total)

[----------] 1 test from NmslibInitLibraryTest
[ RUN      ] NmslibInitLibraryTest.BasicAssertions
[       OK ] NmslibInitLibraryTest.BasicAssertions (0 ms)
[----------] 1 test from NmslibInitLibraryTest (0 ms total)

[----------] 1 test from CommonsTests
[ RUN      ] CommonsTests.BasicAssertions
[       OK ] CommonsTests.BasicAssertions (0 ms)
[----------] 1 test from CommonsTests (0 ms total)

[----------] Global test environment tear-down
[==========] 22 tests from 20 test suites ran. (1225 ms total)
[  PASSED  ] 22 tests.

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…layer to enable creation of larger segments for vector indices (opensearch-project#1604)

Changes include:
1. Add the interface for streaming the vectors from java to jni layer with initial capacity (opensearch-project#1586)
2. Integrating storeVectors interfaces with createIndex and createIndexTemplate functions. (opensearch-project#1588)
3. Update KNN80BinaryDocValues reader count live docs and use live docs as initial capacity to initialize vector address(opensearch-project#1595)
4. Move free vectorAddress from Java to JNI layer to reduce the memory footprint for Nmslib (opensearch-project#1602)

Signed-off-by: Navneet Verma <navneev@amazon.com>
@navneet1v navneet1v added the Enhancements Increases software capabilities beyond original client specifications label Apr 10, 2024
@navneet1v navneet1v merged commit 172dc84 into opensearch-project:2.x Apr 10, 2024
90 of 93 checks passed
navneet1v added a commit to navneet1v/k-NN that referenced this pull request Apr 11, 2024
…layer to enable creation of larger segments for vector indices (opensearch-project#1604) (opensearch-project#1608)

Changes include:
1. Add the interface for streaming the vectors from java to jni layer with initial capacity (opensearch-project#1586)
2. Integrating storeVectors interfaces with createIndex and createIndexTemplate functions. (opensearch-project#1588)
3. Update KNN80BinaryDocValues reader count live docs and use live docs as initial capacity to initialize vector address(opensearch-project#1595)
4. Move free vectorAddress from Java to JNI layer to reduce the memory footprint for Nmslib (opensearch-project#1602)

Signed-off-by: Navneet Verma <navneev@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancements Increases software capabilities beyond original client specifications
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants