
Update conda recipes for Enhanced Compatibility effort #9456

Merged
merged 1 commit into rapidsai:branch-21.12 on Nov 11, 2021

Conversation

ajschmidt8
Member

This PR updates the conda recipe build strings and cudatoolkit version specifications as part of the Enhanced Compatibility efforts.

The build strings have been updated to include only the major CUDA version (e.g. librmm-21.12.00a-cuda11_gc781527_12.tar.bz2), and the cudatoolkit version specifications are now formatted like cudatoolkit >=x,<y.0a0 (e.g. cudatoolkit >=11,<12.0a0).

Moving forward, we'll build the packages with a single CUDA version (e.g. 11.4) and test them in environments with different CUDA versions (e.g. 11.0, 11.2, 11.4, etc.).
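As a rough illustration of the scheme described above, the relevant recipe pieces might look like the following. This is a hypothetical sketch, not the actual cudf recipe: the `cuda_major` variable and the exact Jinja layout are assumptions, while `GIT_DESCRIBE_HASH`/`GIT_DESCRIBE_NUMBER` are standard conda-build Jinja variables.

```yaml
# Hypothetical excerpt from a conda recipe's meta.yaml (not the real cudf recipe).
build:
  number: {{ GIT_DESCRIBE_NUMBER }}
  # Build string carries only the major CUDA version, e.g. cuda11_gc781527_12
  string: cuda{{ cuda_major }}_{{ GIT_DESCRIBE_HASH }}_{{ GIT_DESCRIBE_NUMBER }}

requirements:
  run:
    # Major-version pin: any 11.x cudatoolkit satisfies this constraint
    - cudatoolkit >=11,<12.0a0
```

With this pin, a package built against CUDA 11.4 can be installed into environments carrying any 11.x toolkit, which is what makes the build-once/test-many approach possible.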

@ajschmidt8 ajschmidt8 added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Oct 15, 2021
@github-actions github-actions bot added the conda label Oct 15, 2021
@codecov

codecov bot commented Oct 15, 2021

Codecov Report

Merging #9456 (a7679c6) into branch-21.12 (ab4bfaa) will increase coverage by 0.04%.
The diff coverage is 0.00%.

❗ Current head a7679c6 differs from pull request most recent head ea5aff9. Consider uploading reports for the commit ea5aff9 to get more accurate results

@@               Coverage Diff                @@
##           branch-21.12    #9456      +/-   ##
================================================
+ Coverage         10.79%   10.83%   +0.04%     
================================================
  Files               116      117       +1     
  Lines             18869    19442     +573     
================================================
+ Hits               2036     2106      +70     
- Misses            16833    17336     +503     
Impacted Files Coverage Δ
python/cudf/cudf/__init__.py 0.00% <0.00%> (ø)
python/cudf/cudf/_lib/__init__.py 0.00% <ø> (ø)
python/cudf/cudf/io/csv.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/hdf.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/orc.py 0.00% <0.00%> (ø)
python/cudf/cudf/_version.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/abc.py 0.00% <0.00%> (ø)
python/cudf/cudf/api/types.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/dlpack.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/frame.py 0.00% <0.00%> (ø)
... and 60 more

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b66f14e...ea5aff9.

@ajschmidt8
Member Author

rerun tests

@kkraus14
Collaborator

Moving forward, we'll build the packages with a single CUDA version (e.g. 11.4) and test them in environments with different CUDA versions (e.g. 11.0, 11.2, 11.4, etc.).

It would be good to get confirmation that this is safe to do while including PTX in the libcudf binary. I believe that if we compile with CUDA 11.4, the embedded PTX has a version matching 11.4, and if that PTX needs to be loaded or JIT-compiled on, say, an 11.0 machine, the driver will error. I believe we decided to do the reverse and only compile with CUDA 11.0 in conda-forge for this same reason. cc @jakirkham

@jakirkham
Member

In conda-forge we had to start with 11.2, but yes this was one issue

There are a few conversations happening internally about how exactly CUDA Enhanced Compatibility (CEC) can be used

@ajschmidt8 ajschmidt8 marked this pull request as ready for review November 8, 2021 21:48
@ajschmidt8 ajschmidt8 requested a review from a team as a code owner November 8, 2021 21:48
@ajschmidt8
Member Author

rerun tests

Collaborator

@kkraus14 kkraus14 left a comment


Please share the results of the discussion with regards to handling of PTX before merging

@robertmaynard
Contributor

robertmaynard commented Nov 10, 2021

Please share the results of the discussion with regards to handling of PTX before merging

@kkraus14

From the C++ side, we either emit SASS for all the necessary architectures, so we are not concerned about PTX issues, or we have projects like jitify convert to SASS at runtime and therefore don't run into this problem.
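For context, the SASS-for-all-architectures approach is typically selected at compile time through nvcc's `-gencode` flags. The flags below are an illustrative sketch, not libcudf's actual build configuration; the architecture list is an assumption.

```
# Illustrative nvcc invocation (not libcudf's real build flags).
# code=sm_XX embeds SASS for that architecture; code=compute_XX embeds PTX.
nvcc kernel.cu -c -o kernel.o \
  -gencode arch=compute_60,code=sm_60 \
  -gencode arch=compute_70,code=sm_70 \
  -gencode arch=compute_80,code=sm_80 \
  -gencode arch=compute_80,code=compute_80   # omit this line to ship no PTX at all
```

Dropping the final `code=compute_80` line is what "not shipping PTX" amounts to: devices with embedded SASS work, and anything else fails fast rather than attempting a JIT compile.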

In some cases, when using a new runtime with an older driver, Numba will currently emit PTX not understood by the driver. We are working with the Numba team to provide a fix for this before the 21.12 release finalizes. In the interim, users of nightlies will need to run on a driver new enough to handle the code emitted by their runtime, or temporarily downgrade to an older runtime.
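The failure mode described above reduces to a version comparison: PTX emitted by a toolkit newer than the installed driver cannot be JIT-compiled. A minimal pure-Python sketch of that rule, with hypothetical version tuples (this is not Numba's or the driver's actual logic):

```python
# Simplified model of PTX/driver compatibility (hypothetical, for illustration):
# a driver can JIT PTX only if it is at least as new as the emitting toolkit.

def ptx_is_loadable(runtime_version: tuple, driver_version: tuple) -> bool:
    """Return True if a driver at `driver_version` can load/JIT PTX emitted
    by a toolkit at `runtime_version` (simplified major/minor comparison)."""
    return driver_version >= runtime_version

# A CUDA 11.4 runtime emitting 11.4-era PTX on an 11.0 driver fails:
print(ptx_is_loadable((11, 4), (11, 0)))  # False -> driver errors
# Older PTX on a newer driver is fine:
print(ptx_is_loadable((11, 0), (11, 4)))  # True
```

This is also why building with the oldest supported toolkit (as conda-forge chose to do) sidesteps the problem: the emitted PTX is never newer than the drivers it meets.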

@JohnZed JohnZed dismissed kkraus14’s stale review November 11, 2021 17:57

Sorry to dismiss, but we need this PR to get through in order to proceed with the 11.5 rollout, so we are going to merge it in. Please see Rob's comment above for more info. Numba changes will take a few more days to come out.

@ajschmidt8
Member Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 98f5a28 into rapidsai:branch-21.12 Nov 11, 2021
@ajschmidt8 ajschmidt8 deleted the update-recipe branch November 11, 2021 17:59
@kkraus14
Collaborator

No worries on dismissing the review. I wonder if at a certain point we should just stop shipping PTX in the libcudf binaries because one of three things will happen if the driver needs to interact with the libcudf PTX:

  1. The driver errors from getting a newer PTX version than it supports due to us needing to be aggressive in adopting new CUDA Toolkit versions.
  2. We hit a newer than currently supported architecture and the driver tries to JIT compile the PTX into SASS and it presents itself as a process hang to the user because of the size of libcudf.
  3. We hit an older unsupported architecture and the driver tries to JIT compile the PTX into SASS and it presents itself as a process hang to the user because of the size of libcudf. At some point it errors either during this JIT compilation or during runtime from the use of architecture features not present on the older architecture.

All of these feel like a distinctly worse user experience than just not shipping the PTX and having the driver error from failing to find SASS for the current device architecture.

@JohnZed
Contributor

JohnZed commented Nov 11, 2021

@kkraus14 #2 is an interesting point in particular. That's basically the specific use case that PTX is meant to support, but I agree that the JIT times can be very surprising. I think this is the norm for most shipping CUDA software, though, since when new architectures come out, it's also very unpleasant if there is no way for them to use popular OSS CUDA code at all. I think the ideal would be if we could detect the need to JIT for a new architecture and output a clear message saying "first-time JIT, this will take a while." I don't know of a way to intercept that JIT and issue a message, though.
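Lacking a driver callback, one coarse workaround is to time the first launch and explain a long delay after the fact. A minimal sketch, where `launch_kernel` is a hypothetical user-supplied callable (not a real libcudf or driver API) and the threshold is an assumed value:

```python
import time

JIT_WARN_SECONDS = 5.0  # assumed threshold; tune for the workload

def launch_with_jit_warning(launch_kernel, *args):
    """Time a (possibly JIT-triggering) first call and explain the delay.
    `launch_kernel` is a hypothetical callable standing in for a kernel launch."""
    start = time.monotonic()
    result = launch_kernel(*args)
    elapsed = time.monotonic() - start
    if elapsed > JIT_WARN_SECONDS:
        print(f"First launch took {elapsed:.1f}s; the driver likely JIT-compiled "
              "PTX for this GPU architecture. Subsequent launches should be fast.")
    return result
```

This only reports after the hang has already happened, so it is a weak substitute for the up-front "first-time JIT" message described above.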

@JohnZed
Contributor

JohnZed commented Nov 12, 2021

Asked around, and it sounds like there is not currently an API to intercept when JIT'ing happens and share a message, unfortunately.

ajschmidt8 added a commit to ajschmidt8/cudf that referenced this pull request Nov 19, 2021
The `dask-cudf` recipe was changed after rapidsai#9456 was opened, so the Enhanced Compatibility changes never made it into this recipe. This PR fixes that.
ajschmidt8 added a commit that referenced this pull request Nov 19, 2021
The `dask-cudf` recipe was changed after #9456 was opened, so the Enhanced Compatibility changes never made it into this recipe. This PR fixes that.
ajschmidt8 added a commit to rapidsai/cucim that referenced this pull request Nov 29, 2021
Similar to rapidsai/cudf#9456.

This PR updates the `conda` recipe build strings and `cudatoolkit` version specifications as part of the Enhanced Compatibility efforts.

The build strings have been updated to include only the major CUDA version (e.g. `librmm-21.12.00a-cuda11_gc781527_12.tar.bz2`), and the `cudatoolkit` version specifications are now formatted like `cudatoolkit >=x,<y.0a0` (e.g. `cudatoolkit >=11,<12.0a0`).

Moving forward, we'll build the packages with a single CUDA version (e.g. `11.5`) and test them in environments with different CUDA versions (e.g. `11.0`, `11.2`, `11.4`, `11.5`, etc.).

6 participants