Add build and test support for CUDA 12 (#606)
* Add build and test support for CUDA 12

* Auto update version

* Fix name

* Avoid overstepping with CUDA 11 on 12

* Fix wheel output naming

* Remove unneeded strings

* Add missing 0 to CUDA 11 install

* Trigger CI

* Update to allow CUDA 11 wheels to build also

* Update fromJSON usage

* Update int to str in json map

* Aim to keep cuda 11.5 wheels for now

* Remove CUDA 11

* Migrate windows tests to clangcl to avoid failures

* Ensure VS generator used

* Update for auditwheel 6.0 changes

* Ensure devtoolset values are used in CUDA 12 build

* Remove unneeded deref

* Retry installing cuda 12.0

* Fix auditwheel arch check

* Fix auditwheel arch check again

* Revert windows tests

* Lower overhead of Windows CI tests (#610)

* Cache vcpkg libs and reuse

* Auto update version

* Trigger CI

* Fix matrix tests for windows

* Add excluded modules for OpenCPPCoverage

* Convert dash to underscore

* Use optimized build for Windows coverage

* Retrigger CI

---------

Co-authored-by: Dev version update bot <github-actions[bot]@users.noreply.github.com>

* Update changelog

* Remove trainability from stateprep in test

* Update dev reqs to use cu12

* Fix formatting

* Update the MPI modules in CI

* Add ls to module dirs

* Remove additional cu11 deps

* Use mpirun from PATH

* Update to cu12 in missing locations

* Remove opt path specifics

* Ensure cuda version env vars are included on Python tests for MPI

---------

Co-authored-by: Dev version update bot <github-actions[bot]@users.noreply.github.com>
mlxd and github-actions[bot] committed Feb 12, 2024
1 parent f3beabc commit 3527765
Showing 17 changed files with 198 additions and 143 deletions.
6 changes: 6 additions & 0 deletions .github/CHANGELOG.md
@@ -7,8 +7,14 @@

### Breaking changes

* Migrate `lightning.gpu` to CUDA 12.
[(#606)](https://github.com/PennyLaneAI/pennylane-lightning/pull/606)

### Improvements

* Lower the overheads of Windows CI tests.
[(#610)](https://github.com/PennyLaneAI/pennylane-lightning/pull/610)

* Decouple LightningQubit memory ownership from numpy and migrate it to Lightning-Qubit managed state-vector class.
[(#601)](https://github.com/PennyLaneAI/pennylane-lightning/pull/601)

2 changes: 1 addition & 1 deletion .github/workflows/compat-check-latest-latest.yml
@@ -20,7 +20,7 @@ jobs:
pennylane-version: latest
tests_lgpu_gpu:
name: Lightning Compatibility test (tests_lgpu_gpu) - latest/latest
uses: ./.github/workflows/tests_gpu_cu11.yml
uses: ./.github/workflows/tests_gpu_cuda.yml
with:
lightning-version: latest
pennylane-version: latest
2 changes: 1 addition & 1 deletion .github/workflows/compat-check-latest-stable.yml
@@ -20,7 +20,7 @@ jobs:
pennylane-version: stable
tests_lgpu_gpu:
name: Lightning Compatibility test (tests_lgpu_gpu) - latest/stable
uses: ./.github/workflows/tests_gpu_cu11.yml
uses: ./.github/workflows/tests_gpu_cuda.yml
with:
lightning-version: latest
pennylane-version: stable
2 changes: 1 addition & 1 deletion .github/workflows/compat-check-release-release.yml
@@ -20,7 +20,7 @@ jobs:
pennylane-version: release
tests_lgpu_gpu:
name: Lightning Compatibility test (tests_lgpu_gpu) - release/release
uses: ./.github/workflows/tests_gpu_cu11.yml
uses: ./.github/workflows/tests_gpu_cuda.yml
with:
lightning-version: release
pennylane-version: release
2 changes: 1 addition & 1 deletion .github/workflows/compat-check-stable-latest.yml
@@ -20,7 +20,7 @@ jobs:
pennylane-version: latest
tests_lgpu_gpu:
name: Lightning Compatibility test (tests_lgpu_gpu) - stable/latest
uses: ./.github/workflows/tests_gpu_cu11.yml
uses: ./.github/workflows/tests_gpu_cuda.yml
with:
lightning-version: stable
pennylane-version: latest
2 changes: 1 addition & 1 deletion .github/workflows/compat-check-stable-stable.yml
@@ -20,7 +20,7 @@ jobs:
pennylane-version: stable
tests_lgpu_gpu:
name: Lightning Compatibility test (tests_lgpu_gpu) - stable/stable
uses: ./.github/workflows/tests_gpu_cu11.yml
uses: ./.github/workflows/tests_gpu_cuda.yml
with:
lightning-version: stable
pennylane-version: stable
@@ -39,25 +39,27 @@ jobs:
matrix:
os: [ubuntu-22.04]
pl_backend: ["lightning_gpu"]
cuda_version: ["12"]

steps:
- name: Validate GPU version and installed compiler
run: |
source /etc/profile.d/modules.sh
module use /opt/modules
module load cuda/11.8
module load cuda/${{ matrix.cuda_version }}
echo "${PATH}" >> $GITHUB_PATH
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" >> $GITHUB_ENV
nvcc --version
nvidia-smi
cpptestswithLGPU_cu11:
cpptestswithLGPU:
if: ${{ !contains(fromJSON('["schedule", "workflow_dispatch"]'), github.event_name) }}
needs: [builddeps]
strategy:
matrix:
os: [ubuntu-22.04]
pl_backend: ["lightning_gpu"]
cuda_version: ["12"]

name: C++ tests (Lightning-GPU)
runs-on:
@@ -70,7 +72,7 @@ jobs:
run: |
source /etc/profile.d/modules.sh
module use /opt/modules
module load cuda/11.8
module load cuda/${{ matrix.cuda_version }}
echo "${PATH}" >> $GITHUB_PATH
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" >> $GITHUB_ENV
nvcc --version
@@ -117,7 +119,7 @@ jobs:
- name: Install required packages
run: |
python -m pip install ninja cmake custatevec-cu11
python -m pip install ninja cmake custatevec-cu${{ matrix.cuda_version }}
sudo apt-get -y -q install liblapack-dev
- name: Build and run unit tests
@@ -161,6 +163,7 @@ jobs:
os: [ubuntu-22.04]
pl_backend: ["lightning_gpu"]
default_backend: ["lightning_qubit"]
cuda_version: ["12"]

name: Python tests with LGPU
runs-on:
@@ -173,7 +176,7 @@ jobs:
run: |
source /etc/profile.d/modules.sh
module use /opt/modules
module load cuda/11.8
module load cuda/${{ matrix.cuda_version }}
echo "${PATH}" >> $GITHUB_PATH
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" >> $GITHUB_ENV
nvcc --version
@@ -238,7 +241,7 @@ jobs:
run: |
cd main
python -m pip install -r requirements-dev.txt
python -m pip install cmake custatevec-cu11 openfermionpyscf
python -m pip install cmake custatevec-cu${{ matrix.cuda_version }} openfermionpyscf
- name: Checkout PennyLane for release build
if: inputs.pennylane-version == 'release'
@@ -344,7 +347,7 @@ jobs:
token: ${{ secrets.CODECOV_TOKEN }}

upload-to-codecov-linux-cpp:
needs: [cpptestswithLGPU_cu11]
needs: [cpptestswithLGPU]
name: Upload coverage data to codecov
runs-on: ubuntu-latest
steps:
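
Note on the `if:` guard of the `cpptestswithLGPU` job above: it parses a JSON array of event names with `fromJSON` and skips the job when `github.event_name` is one of them. A minimal Python sketch of the same check follows. The helper name `job_should_run` is hypothetical, and the lowercasing is only an approximation of the case-insensitive comparison GitHub's `contains` performs.

```python
import json

# Same JSON literal that the workflow passes to fromJSON(...)
SKIPPED_EVENTS = json.loads('["schedule", "workflow_dispatch"]')


def job_should_run(event_name: str) -> bool:
    """Approximate `!contains(fromJSON(...), github.event_name)`."""
    return event_name.lower() not in SKIPPED_EVENTS


if __name__ == "__main__":
    for event in ("push", "pull_request", "schedule", "workflow_dispatch"):
        print(event, job_should_run(event))
```

Under this reading the C++ tests run for events such as `push` and `pull_request`, and are skipped for scheduled and manually dispatched runs.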
37 changes: 21 additions & 16 deletions .github/workflows/tests_linux_x86_mpi_gpu.yml
@@ -39,6 +39,8 @@ jobs:
max-parallel: 1
matrix:
mpilib: ["mpich", "openmpi"]
cuda_version_maj: ["12"]
cuda_version_min: ["2"]
timeout-minutes: 30

steps:
@@ -92,28 +94,29 @@ jobs:
- name: Install required packages
run: |
python -m pip install -r requirements-dev.txt
python -m pip install cmake custatevec-cu11
python -m pip install cmake custatevec-cu12
sudo apt-get -y -q install liblapack-dev
- name: Validate GPU version and installed compiler
- name: Validate GPU version and installed compiler and modules
run: |
source /etc/profile.d/modules.sh && module use /opt/modules && module load cuda/11.8
source /etc/profile.d/modules.sh && module use /opt/modules && module load cuda/${{ matrix.cuda_version_maj }}
which -a nvcc
nvcc --version
ls -R /opt/modules
- name: Validate Multi-GPU packages
run: |
source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}
source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}/cuda-${{ matrix.cuda_version_maj }}.${{ matrix.cuda_version_min }}
echo 'Checking for ${{ matrix.mpilib }}'
which -a mpirun
mpirun --version
which -a mpicxx
mpicxx --version
module unload ${{ matrix.mpilib }}
module unload ${{ matrix.mpilib }}/cuda-${{ matrix.cuda_version_maj }}.${{ matrix.cuda_version_min }}
- name: Build and run unit tests
run: |
source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}
source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}/cuda-${{ matrix.cuda_version_maj }}.${{ matrix.cuda_version_min }}
export CUQUANTUM_SDK=$(python -c "import site; print( f'{site.getsitepackages()[0]}/cuquantum/lib')")
cmake . -BBuild \
-DPL_BACKEND=lightning_gpu \
@@ -123,15 +126,15 @@ jobs:
-DBUILD_TESTS=ON \
-DENABLE_LAPACK=ON \
-DCMAKE_CXX_COMPILER=mpicxx \
-DCMAKE_CUDA_COMPILER="/usr/local/cuda/bin/nvcc" \
-DCMAKE_CUDA_COMPILER=$(which nvcc) \
-DCMAKE_CUDA_ARCHITECTURES="86" \
-DPython_EXECUTABLE:FILE="${{ steps.python_path.outputs.python }}" \
-G Ninja
cmake --build ./Build
cd ./Build
mkdir -p ./tests/results
for file in *runner ; do ./$file --order lex --reporter junit --out ./tests/results/report_$file.xml; done;
for file in *runner_mpi ; do /opt/mpi/${{ matrix.mpilib }}/bin/mpirun -np 2 ./$file --order lex --reporter junit --out ./tests/results/report_$file.xml; done;
for file in *runner_mpi ; do mpirun -np 2 ./$file --order lex --reporter junit --out ./tests/results/report_$file.xml; done;
lcov --directory . -b ../pennylane_lightning/src --capture --output-file coverage.info
lcov --remove coverage.info '/usr/*' --output-file coverage.info
mv coverage.info coverage-${{ github.job }}-lightning_gpu_${{ matrix.mpilib }}.info
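
Note on the build step above: it derives `CUQUANTUM_SDK` from the first site-packages directory, and after this change it takes `nvcc` from `PATH` (via `which nvcc`) instead of the fixed `/usr/local/cuda/bin/nvcc`. The Python sketch below reproduces both lookups so a local environment can be inspected before invoking CMake. The function names are hypothetical and not part of the repository, and the sketch assumes `custatevec-cu12` installs its libraries under `cuquantum/lib`, as the workflow one-liner implies.

```python
import shutil
import site
from pathlib import Path
from typing import Optional


def cuquantum_lib_dir() -> Path:
    """Mirror the workflow one-liner: <first site-packages dir>/cuquantum/lib."""
    return Path(site.getsitepackages()[0]) / "cuquantum" / "lib"


def nvcc_from_path() -> Optional[str]:
    """Python analogue of `which nvcc`; returns None when nvcc is not on PATH."""
    return shutil.which("nvcc")


if __name__ == "__main__":
    lib_dir = cuquantum_lib_dir()
    # Report the values the CMake invocation would receive.
    print(f"CUQUANTUM_SDK={lib_dir} (exists: {lib_dir.is_dir()})")
    print(f"CMAKE_CUDA_COMPILER={nvcc_from_path()}")
```

Resolving the compiler from `PATH` means the value follows whichever CUDA module is currently loaded, which appears to be the motivation for dropping the hard-coded path.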
@@ -171,6 +174,8 @@ jobs:
max-parallel: 1
matrix:
mpilib: ["mpich", "openmpi"]
cuda_version_maj: ["12"]
cuda_version_min: ["2"]
timeout-minutes: 30

steps:
@@ -232,9 +237,9 @@ jobs:
- name: Install required packages
run: |
source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}
source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}/cuda-${{ matrix.cuda_version_maj }}.${{ matrix.cuda_version_min }}
python -m pip install -r requirements-dev.txt
python -m pip install custatevec-cu11 mpi4py openfermionpyscf
python -m pip install custatevec-cu${{ matrix.cuda_version_maj }} mpi4py openfermionpyscf
SKIP_COMPILATION=True PL_BACKEND=lightning_qubit python -m pip install -e . -vv
- name: Checkout PennyLane for release build
@@ -256,25 +261,25 @@ jobs:
env:
CUQUANTUM_SDK: $(python -c "import site; print( f'{site.getsitepackages()[0]}/cuquantum/lib')")
run: |
source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}
CMAKE_ARGS="-DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx -DENABLE_MPI=ON -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc -DCMAKE_CUDA_ARCHITECTURES=${{ env.CI_CUDA_ARCH }} -DPython_EXECUTABLE=${{ steps.python_path.outputs.python }}" \
source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}/cuda-${{ matrix.cuda_version_maj }}.${{ matrix.cuda_version_min }}
CMAKE_ARGS="-DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx -DENABLE_MPI=ON -DCMAKE_CUDA_COMPILER=$(which nvcc) -DCMAKE_CUDA_ARCHITECTURES=${{ env.CI_CUDA_ARCH }} -DPython_EXECUTABLE=${{ steps.python_path.outputs.python }}" \
PL_BACKEND=lightning_gpu python -m pip install -e . --verbose
# There are issues running py-cov with MPI. A solution is to use coverage as reported
# [here](https://github.com/pytest-dev/pytest-cov/issues/237#issuecomment-544824228)
- name: Run unit tests for MPI-enabled lightning.gpu device
run: |
source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}
PL_DEVICE=lightning.gpu /opt/mpi/${{ matrix.mpilib }}/bin/mpirun -np 2 \
source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}/cuda-${{ matrix.cuda_version_maj }}.${{ matrix.cuda_version_min }}
PL_DEVICE=lightning.gpu mpirun -np 2 \
coverage run --rcfile=.coveragerc --source=pennylane_lightning -p -m mpi4py -m pytest ./mpitests --tb=native
coverage combine
coverage xml -o coverage-${{ github.job }}-lightning_gpu_${{ matrix.mpilib }}-main.xml
coverage xml -o coverage-${{ github.job }}-lightning_gpu_${{ matrix.mpilib }}_cu${{ matrix.cuda_version_maj }}-main.xml
- name: Upload code coverage results
uses: actions/upload-artifact@v3
with:
name: ubuntu-codecov-results-python
path: coverage-${{ github.job }}-lightning_gpu_${{ matrix.mpilib }}-*.xml
path: coverage-${{ github.job }}-lightning_gpu_${{ matrix.mpilib }}_cu${{ matrix.cuda_version_maj }}-*.xml
if-no-files-found: error

- name: Cleanup
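
Note on the MPI workflow matrix: the new `cuda_version_maj` and `cuda_version_min` entries are combined with `mpilib` to form the environment-module name and the `custatevec` wheel name. The sketch below expands the matrix as a Cartesian product, which is how GitHub Actions generates one job per combination, and fills in the same templates used in the diff. The helper names are hypothetical and exist only for illustration.

```python
from itertools import product

# Matrix values as declared in tests_linux_x86_mpi_gpu.yml
MATRIX = {
    "mpilib": ["mpich", "openmpi"],
    "cuda_version_maj": ["12"],
    "cuda_version_min": ["2"],
}


def expand_matrix(matrix):
    """Return one dict per job: the Cartesian product of all matrix axes."""
    keys = list(matrix)
    return [dict(zip(keys, combo)) for combo in product(*(matrix[k] for k in keys))]


def module_name(job):
    """Template from the diff: <mpilib>/cuda-<maj>.<min>."""
    return f"{job['mpilib']}/cuda-{job['cuda_version_maj']}.{job['cuda_version_min']}"


def custatevec_package(job):
    """Template from the diff: custatevec-cu<maj>."""
    return f"custatevec-cu{job['cuda_version_maj']}"


if __name__ == "__main__":
    for job in expand_matrix(MATRIX):
        print(module_name(job), custatevec_package(job))
    # prints:
    # mpich/cuda-12.2 custatevec-cu12
    # openmpi/cuda-12.2 custatevec-cu12
```

With a single major and minor version the matrix still yields two jobs, one per MPI library. Adding another CUDA version to either list would multiply the job count accordingly.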