Forward-merge branch-24.04 into branch-24.06 [skip ci] #2229

Merged 8 commits on Mar 19, 2024

@rapids-bot rapids-bot bot commented Mar 15, 2024

Forward-merge triggered by push to branch-24.04 that creates a PR to keep branch-24.06 up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge. See forward-merger docs for more info.

This will avoid confusion for users launching only `./build.sh pylibraft`.

Authors:
  - Micka (https://github.com/lowener)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)
  - Corey J. Nolet (https://github.com/cjnolet)
  - Robert Maynard (https://github.com/robertmaynard)
  - Ray Douglass (https://github.com/raydouglass)

URL: #2090
@rapids-bot rapids-bot bot requested review from a team as code owners March 15, 2024 16:25
copy-pr-bot bot commented Mar 15, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

rapids-bot bot commented Mar 15, 2024

FAILURE - Unable to forward-merge due to an error, manual merge is necessary. Do not use the Resolve conflicts option in this PR, follow these instructions https://docs.rapids.ai/maintainers/forward-merger/

IMPORTANT: When merging this PR, do not use the auto-merger (i.e. the /merge comment). Instead, an admin must manually merge by changing the merging strategy to Create a Merge Commit. Otherwise, history will be lost and the branches become incompatible.

This PR is based on @seberg's work in #1928.

From the PR:


This is a follow up on #1926, since the rank sorting seemed a bit hard to understand.

It does modify the logic in the sense that the host is now sorted by IP as a way to group based on it. But I don't really think that host sorting was ever a goal? 

If the goal is really about being deterministic, then this should be more (or at least more clearly) deterministic about the order of worker IPs.

OTOH, if the NVML device order doesn't matter, we could just sort the workers directly. 

The original #1587 mentions:

NCCL>1.11 expects a process with rank r to be mapped to r % num_gpus_per_node
which is something that neither approach seems to quite ensure. If such a requirement exists, I would want to do one of:

* Ensure we can guarantee this, but this requires initializing workers that are not involved in the operation.
* At least raise an error, because if NCCL ends up raising the error it will be very confusing.
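To make the discussion above concrete, here is a minimal host-side sketch of sorting workers by host IP and then checking the quoted `r % num_gpus_per_node` expectation. The `Worker` struct, `assign_ranks`, and `ranks_match_devices` are hypothetical names made up for illustration; they are not part of RAFT, Dask, or NCCL, and the IPs are compared as plain strings.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical worker record; the field names are illustrative only.
struct Worker {
  std::string host_ip;  // IP address of the host running the worker
  int device_index;     // device index of the worker on that host
};

// Sort by (host IP, device index) so that workers on the same host are
// contiguous and the resulting order is deterministic.
// The position of a worker in the returned vector is its rank.
inline std::vector<Worker> assign_ranks(std::vector<Worker> workers) {
  std::sort(workers.begin(), workers.end(), [](const Worker& a, const Worker& b) {
    return std::tie(a.host_ip, a.device_index) < std::tie(b.host_ip, b.device_index);
  });
  return workers;
}

// Check the quoted expectation: rank r is mapped to device r % num_gpus_per_node.
inline bool ranks_match_devices(const std::vector<Worker>& ranked, int num_gpus_per_node) {
  if (num_gpus_per_node <= 0) return false;
  for (std::size_t r = 0; r < ranked.size(); ++r) {
    const int expected = static_cast<int>(r % static_cast<std::size_t>(num_gpus_per_node));
    if (ranked[r].device_index != expected) return false;
  }
  return true;
}
```

With every device of every host participating, the check passes. If only a subset of workers is involved (for example, device 1 on one host and device 0 on another), the check fails, which is the case where raising an early, explicit error may be preferable to a confusing failure later on.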

Authors:
  - Vibhu Jawa (https://github.com/VibhuJawa)
  - Sebastian Berg (https://github.com/seberg)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #2228
@rapids-bot rapids-bot bot requested a review from a team as a code owner March 15, 2024 21:25
mfoerste4 and others added 2 commits March 16, 2024 04:25
This PR addresses #2204 and #2205. 


* fixes illegal access / test coverage for mean row-wise kernel
* fixes illegal access / test coverage for stdev row-wise kernel
* modified sum kernels to use Kahan/Neumaier summation per thread, and increased the load per thread to benefit from this
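
For readers unfamiliar with compensated summation, the following is a minimal sequential sketch of the Neumaier variant on the host. It is not the kernel code from this PR; the function name `neumaier_sum` is an assumption made for illustration, and a GPU version would keep one such `(sum, comp)` pair per thread before the final reduction.

```cpp
#include <cmath>
#include <vector>

// Neumaier (improved Kahan) summation: a running compensation term collects
// the low-order bits that are lost in each floating point addition.
template <typename T>
T neumaier_sum(const std::vector<T>& values) {
  T sum  = T(0);
  T comp = T(0);  // accumulated rounding error
  for (const T& v : values) {
    const T t = sum + v;
    if (std::abs(sum) >= std::abs(v)) {
      comp += (sum - t) + v;  // low-order bits of v were lost
    } else {
      comp += (v - t) + sum;  // low-order bits of sum were lost
    }
    sum = t;
  }
  return sum + comp;  // apply the correction once at the end
}
```

For the input `{1.0, 1e100, 1.0, -1e100}` in double precision, a plain left-to-right sum loses both `1.0` terms, while the compensated version recovers them. Note that aggressive floating point optimizations (such as fast-math modes) can optimize the compensation away.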

FYI, @tfeher

Authors:
  - Malte Förster (https://github.com/mfoerste4)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)

URL: #2223
…2220)

The local `copyright.py` script is bug-prone. Replace it with a more robust centralized script from `pre-commit-hooks`.

Issue: rapidsai/build-planning#30

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - Jake Awe (https://github.com/AyodeAwe)

URL: #2220
@github-actions github-actions bot added the ci label Mar 18, 2024
achirkin and others added 4 commits March 18, 2024 16:14
Add a `cagra::compress` function that implements CAGRA-Q (VQ + PQ) compression of a given dataset.
The result, `compressed_dataset`, is supposed to complement the CAGRA graph during `cagra::search` in place of a raw dataset.

### Current state:

  - The code runs and produces a meaningful output (tested internally by running the original prototype search with the generated compressed dataset); the recall levels are approximately the same as with the prototype implementation.
  - No test coverage yet (need to coordinate with the search PR #2206)
  - Full `pq_bits` support ([4,5,6,7,8] - same as in IVF-PQ)
  - Any `pq_dim` value is accepted, but the dataset is not padded and thus `dim` must be a multiple of `pq_dim`.
  - The codebook math type is hardcoded to `half` to match the prototype implementation for now. This could be a runtime (build) parameter as well.
  - All common input data types should work (`uint8_t`, `int8_t`, `half`, and `float` compile), but I tested only `float`.
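
The constraints listed above (`pq_bits` in [4, 8], `dim` a multiple of `pq_dim`) can be sketched as a small validation helper. The `pq_params` struct and `validate_pq_params` function below are hypothetical and only illustrate the stated constraints; they are not the `cagra::compress` API.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical parameter bundle; the field names mirror the parameters
// mentioned in the description but this is not the RAFT struct.
struct pq_params {
  std::uint32_t pq_bits;  // bits per PQ code
  std::uint32_t pq_dim;   // number of PQ subspaces
};

// Validate the parameters against the dataset dimensionality and return
// the length of each subspace (dim / pq_dim).
inline std::uint32_t validate_pq_params(const pq_params& p, std::uint32_t dim) {
  if (p.pq_bits < 4 || p.pq_bits > 8) {
    throw std::invalid_argument("pq_bits must be in [4, 8]");
  }
  // No padding is applied, so dim has to split evenly into pq_dim parts.
  if (p.pq_dim == 0 || dim % p.pq_dim != 0) {
    throw std::invalid_argument("dim must be a multiple of pq_dim");
  }
  return dim / p.pq_dim;
}
```

For example, `dim = 128` with `pq_dim = 32` yields subspaces of length 4, while `pq_dim = 48` is rejected because 128 is not a multiple of 48.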

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)

URL: #2213
…2212)

Compilation of IVF-PQ search kernels can be time-consuming. In `libraft.so`, the compilation is done in parallel for kernels without filtering and with the `int64_t` index type.

We have tests with the `uint32_t` index type as well as tests for `bitset_filter` with both 32-bit and 64-bit index types. This PR adds explicit template instantiations for these tests. This avoids repeated compilation of the kernels with a filter and also enables parallel compilation of the `compute_similarity` kernel for different template types. The kernels with these additional type parameters are not added to `libraft.so`; they are only linked together with the test executable.

Note that this PR does not increase the number of compiled kernels, but it enables compiling them in parallel.
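
As a rough illustration of the mechanism, the sketch below shows an `extern template` declaration paired with an explicit instantiation definition. The `count_accepted` function and `even_filter` type are hypothetical stand-ins for a templated kernel launcher and a filter; in a real project the declaration would live in a header and each definition in its own source file, so that every instantiation is compiled exactly once and the source files can be built in parallel.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical filter type, standing in for something like a bitset filter.
struct even_filter {
  bool operator()(std::uint64_t id) const { return id % 2 == 0; }
};

// Hypothetical stand-in for a templated kernel launcher.
template <typename IdxT, typename FilterT>
IdxT count_accepted(const std::vector<IdxT>& ids, FilterT filter) {
  IdxT n = 0;
  for (const IdxT& id : ids) {
    if (filter(id)) ++n;
  }
  return n;
}

// Would go in the header: tells every including file NOT to instantiate
// this combination implicitly.
extern template std::uint32_t count_accepted<std::uint32_t, even_filter>(
  const std::vector<std::uint32_t>&, even_filter);

// Would go in exactly one source file per type combination: the explicit
// instantiation definition that is compiled once and linked into the test.
template std::uint32_t count_accepted<std::uint32_t, even_filter>(
  const std::vector<std::uint32_t>&, even_filter);
```

Shown in a single file here only to keep the example self-contained; the declaration must precede the definition when both appear in the same translation unit.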

Authors:
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Artem M. Chirkin (https://github.com/achirkin)
  - Ben Frederickson (https://github.com/benfred)

URL: #2212
Ground truth generation for ANN bench is affected by bug #2171 when k > 1024. This PR fixes the issue for ground truth generation.

Authors:
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #2180
There was a bug with the max reduce operation on negative floating point numbers. For floating point types, `std::numeric_limits<T>::min()` returns the smallest positive normalized value, which is greater than every negative value, whereas the initial value of a max reduction must not exceed any representable value.

This PR replaces `min()` with `lowest()`.
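
The effect of the initial value can be reproduced with a small host-side sketch. The `max_reduce` helper below is a hypothetical sequential stand-in for the reduction, not the RAFT implementation; it takes the initial value as a parameter so both choices can be compared.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Sequential max reduction with a configurable initial value.
// The default, lowest(), is the most negative finite value of T,
// so it never exceeds any finite input.
template <typename T>
T max_reduce(const std::vector<T>& values, T init = std::numeric_limits<T>::lowest()) {
  T acc = init;
  for (const T& v : values) {
    acc = std::max(acc, v);
  }
  return acc;  // returns init for empty input
}
```

For the input `{-3.0f, -1.5f, -7.0f}`, the default initial value gives `-1.5f`. Passing `std::numeric_limits<float>::min()` as the initial value instead returns that small positive number, because it is larger than every element of the input.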

Authors:
  - Akif ÇÖRDÜK (https://github.com/akifcorduk)
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Tamas Bela Feher (https://github.com/tfeher)

URL: #2226
@AyodeAwe AyodeAwe merged commit bc2513e into branch-24.06 Mar 19, 2024
49 of 50 checks passed