Add build and test support for CUDA 12 #606

Merged on Feb 12, 2024 (35 commits)
Changes from 29 commits

Commits
110b4f2
Add build and test support for CUDA 12
mlxd Feb 2, 2024
6681f48
Auto update version
github-actions[bot] Feb 2, 2024
ddc0a65
Fix name
mlxd Feb 2, 2024
aa5d5dc
Merge branch 'update/lgpu_cuda12' of github.com:PennyLaneAI/pennylane…
mlxd Feb 2, 2024
0c4471a
Avoid overstepping with CUDA 11 on 12
mlxd Feb 2, 2024
eb10a24
Fix wheel output naming
mlxd Feb 2, 2024
b82b4d7
Remove unneeded strings
mlxd Feb 2, 2024
8570193
Add missing 0 to CUDA 11 install
mlxd Feb 2, 2024
115a844
Trigger CI
mlxd Feb 2, 2024
3f8c8f7
Update to allow CUDA 11 wheels to build also
mlxd Feb 5, 2024
a2aaac3
Update fromJSON usage
mlxd Feb 5, 2024
f2d4115
Update int to str in json map
mlxd Feb 5, 2024
198c396
Aim to keep cuda 11.5 wheels for now
mlxd Feb 5, 2024
6566608
Remove CUDA 11
mlxd Feb 5, 2024
8c0e238
Migrate windows tests to clangcl to avoid failures
mlxd Feb 6, 2024
44fcb00
Ensure VS generator used
mlxd Feb 6, 2024
47e7387
Update for auditwheel 6.0 changes
mlxd Feb 6, 2024
e98f183
Ensure devtoolset values are used in CUDA 12 build
mlxd Feb 7, 2024
baedf47
Remove unneeded deref
mlxd Feb 7, 2024
69ddde1
Retry installing cuda 12.0
mlxd Feb 7, 2024
c285319
Fix auditwheel arch check
mlxd Feb 7, 2024
9a21ea9
Fix auditwheel arch check again
mlxd Feb 7, 2024
38450e9
Revert windows tests
mlxd Feb 7, 2024
a52e40f
Lower overhead of Windows CI tests (#610)
mlxd Feb 7, 2024
9cc12d4
Update changelog
mlxd Feb 7, 2024
a9feabf
Remove trainability from stateprep in test
mlxd Feb 9, 2024
49f3d6e
Update dev reqs to use cu12
mlxd Feb 9, 2024
0f7caa2
Fix formatting
mlxd Feb 9, 2024
655d966
Update the MPI modules in CI
mlxd Feb 9, 2024
7f0bf00
Add ls to module dirs
mlxd Feb 9, 2024
a7e7eae
Remove additional cu11 deps
mlxd Feb 9, 2024
29427b4
Use mpirun from PATH
mlxd Feb 9, 2024
12e4f56
Update to cu12 in missing locations
mlxd Feb 9, 2024
4812a93
Remove opt path specifics
mlxd Feb 9, 2024
c0c5a25
Ensure cuda version env vars are included on Python tests for MPI
mlxd Feb 9, 2024
6 changes: 6 additions & 0 deletions .github/CHANGELOG.md
@@ -7,8 +7,14 @@

 ### Breaking changes

+* Migrate `lightning.gpu` to CUDA 12.
+[(#606)](https://github.com/PennyLaneAI/pennylane-lightning/pull/606)
+
 ### Improvements

+* Lower the overheads of Windows CI tests.
+[(#610)](https://github.com/PennyLaneAI/pennylane-lightning/pull/610)
+
 * Decouple LightningQubit memory ownership from numpy and migrate it to Lightning-Qubit managed state-vector class.
 [(#601)](https://github.com/PennyLaneAI/pennylane-lightning/pull/601)

@@ -39,25 +39,27 @@ jobs:
 matrix:
 os: [ubuntu-22.04]
 pl_backend: ["lightning_gpu"]
+cuda_version: ["12"]

 steps:
 - name: Validate GPU version and installed compiler
 run: |
 source /etc/profile.d/modules.sh
 module use /opt/modules
-module load cuda/11.8
+module load cuda/${{ matrix.cuda_version }}
 echo "${PATH}" >> $GITHUB_PATH
 echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" >> $GITHUB_ENV
 nvcc --version
 nvidia-smi

-cpptestswithLGPU_cu11:
+cpptestswithLGPU:
 if: ${{ !contains(fromJSON('["schedule", "workflow_dispatch"]'), github.event_name) }}
 needs: [builddeps]
 strategy:
 matrix:
 os: [ubuntu-22.04]
 pl_backend: ["lightning_gpu"]
+cuda_version: ["12"]

 name: C++ tests (Lightning-GPU)
 runs-on:
@@ -70,7 +72,7 @@ jobs:
 run: |
 source /etc/profile.d/modules.sh
 module use /opt/modules
-module load cuda/11.8
+module load cuda/${{ matrix.cuda_version }}
 echo "${PATH}" >> $GITHUB_PATH
 echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" >> $GITHUB_ENV
 nvcc --version
@@ -117,7 +119,7 @@ jobs:

 - name: Install required packages
 run: |
-python -m pip install ninja cmake custatevec-cu11
+python -m pip install ninja cmake custatevec-cu${{ matrix.cuda_version }}
 sudo apt-get -y -q install liblapack-dev

 - name: Build and run unit tests
@@ -161,6 +163,7 @@ jobs:
 os: [ubuntu-22.04]
 pl_backend: ["lightning_gpu"]
 default_backend: ["lightning_qubit"]
+cuda_version: ["12"]

 name: Python tests with LGPU
 runs-on:
@@ -173,7 +176,7 @@ jobs:
 run: |
 source /etc/profile.d/modules.sh
 module use /opt/modules
-module load cuda/11.8
+module load cuda/${{ matrix.cuda_version }}
 echo "${PATH}" >> $GITHUB_PATH
 echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" >> $GITHUB_ENV
 nvcc --version
@@ -238,7 +241,7 @@ jobs:
 run: |
 cd main
 python -m pip install -r requirements-dev.txt
-python -m pip install cmake custatevec-cu11 openfermionpyscf
+python -m pip install cmake custatevec-cu${{ matrix.cuda_version }} openfermionpyscf

 - name: Checkout PennyLane for release build
 if: inputs.pennylane-version == 'release'
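
Note on the hunks above: a single `cuda_version` matrix value now drives both the environment module that gets loaded and the `custatevec` wheel that gets installed. The sketch below mirrors that substitution in plain shell. The helper names `cuda_module_name` and `custatevec_package` are hypothetical and not part of this PR; the module and package naming patterns are taken from the diff.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not in the PR) that mirror how the workflow expands
# ${{ matrix.cuda_version }} into a module name and a pip package name.

# Environment module name, e.g. "cuda/12".
cuda_module_name() {
  printf 'cuda/%s\n' "$1"
}

# cuStateVec wheel name for a CUDA major version, e.g. "custatevec-cu12".
custatevec_package() {
  printf 'custatevec-cu%s\n' "$1"
}

cuda_version="12"

# Print the commands the workflow steps would run; nothing is installed here.
echo "module load $(cuda_module_name "$cuda_version")"
echo "python -m pip install ninja cmake $(custatevec_package "$cuda_version")"
```

Keeping the version in one place means a future bump should only need the matrix entry changed, provided a matching `cuda/<version>` module exists on the runner.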
27 changes: 15 additions & 12 deletions .github/workflows/tests_linux_x86_mpi_gpu.yml
@@ -39,6 +39,8 @@ jobs:
 max-parallel: 1
 matrix:
 mpilib: ["mpich", "openmpi"]
+cuda_version_maj: ["12"]
+cuda_version_min: ["2"]
 timeout-minutes: 30

 steps:
@@ -97,23 +99,23 @@ jobs:

 - name: Validate GPU version and installed compiler
 run: |
-source /etc/profile.d/modules.sh && module use /opt/modules && module load cuda/11.8
+source /etc/profile.d/modules.sh && module use /opt/modules && module load cuda/${{ matrix.cuda_version_maj }}
 which -a nvcc
 nvcc --version

 - name: Validate Multi-GPU packages
 run: |
-source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}
+source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}/cuda-${{ matrix.cuda_version_maj }}.${{ matrix.cuda_version_min }}
 echo 'Checking for ${{ matrix.mpilib }}'
 which -a mpirun
 mpirun --version
 which -a mpicxx
 mpicxx --version
-module unload ${{ matrix.mpilib }}
+module unload ${{ matrix.mpilib }}/cuda-${{ matrix.cuda_version_maj }}.${{ matrix.cuda_version_min }}

 - name: Build and run unit tests
 run: |
-source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}
+source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}/cuda-${{ matrix.cuda_version_maj }}.${{ matrix.cuda_version_min }}
 export CUQUANTUM_SDK=$(python -c "import site; print( f'{site.getsitepackages()[0]}/cuquantum/lib')")
 cmake . -BBuild \
 -DPL_BACKEND=lightning_gpu \
@@ -123,7 +125,7 @@ jobs:
 -DBUILD_TESTS=ON \
 -DENABLE_LAPACK=ON \
 -DCMAKE_CXX_COMPILER=mpicxx \
--DCMAKE_CUDA_COMPILER="/usr/local/cuda/bin/nvcc" \
+-DCMAKE_CUDA_COMPILER=$(which nvcc) \
 -DCMAKE_CUDA_ARCHITECTURES="86" \
 -DPython_EXECUTABLE:FILE="${{ steps.python_path.outputs.python }}" \
 -G Ninja
@@ -135,6 +137,7 @@ jobs:
 lcov --directory . -b ../pennylane_lightning/src --capture --output-file coverage.info
 lcov --remove coverage.info '/usr/*' --output-file coverage.info
 mv coverage.info coverage-${{ github.job }}-lightning_gpu_${{ matrix.mpilib }}.info
+module unload ${{ matrix.mpilib }}/cuda-${{ matrix.cuda_version_maj }}.${{ matrix.cuda_version_min }}

 - name: Upload test results
 uses: actions/upload-artifact@v3
@@ -232,9 +235,9 @@ jobs:

 - name: Install required packages
 run: |
-source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}
+source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}/cuda-${{ matrix.cuda_version_maj }}.${{ matrix.cuda_version_min }}
 python -m pip install -r requirements-dev.txt
-python -m pip install custatevec-cu11 mpi4py openfermionpyscf
+python -m pip install custatevec-cu${{ matrix.cuda_version_maj }} mpi4py openfermionpyscf
 SKIP_COMPILATION=True PL_BACKEND=lightning_qubit python -m pip install -e . -vv

 - name: Checkout PennyLane for release build
@@ -256,25 +259,25 @@ jobs:
 env:
 CUQUANTUM_SDK: $(python -c "import site; print( f'{site.getsitepackages()[0]}/cuquantum/lib')")
 run: |
-source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}
-CMAKE_ARGS="-DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx -DENABLE_MPI=ON -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc -DCMAKE_CUDA_ARCHITECTURES=${{ env.CI_CUDA_ARCH }} -DPython_EXECUTABLE=${{ steps.python_path.outputs.python }}" \
+source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}/cuda-${{ matrix.cuda_version_maj }}.${{ matrix.cuda_version_min }}
+CMAKE_ARGS="-DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx -DENABLE_MPI=ON -DCMAKE_CUDA_COMPILER=$(which nvcc) -DCMAKE_CUDA_ARCHITECTURES=${{ env.CI_CUDA_ARCH }} -DPython_EXECUTABLE=${{ steps.python_path.outputs.python }}" \
 PL_BACKEND=lightning_gpu python -m pip install -e . --verbose

 # There are issues running py-cov with MPI. A solution is to use coverage as reported
 # [here](https://github.com/pytest-dev/pytest-cov/issues/237#issuecomment-544824228)
 - name: Run unit tests for MPI-enabled lightning.gpu device
 run: |
-source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}
+source /etc/profile.d/modules.sh && module use /opt/modules/ && module load ${{ matrix.mpilib }}/cuda-${{ matrix.cuda_version_maj }}.${{ matrix.cuda_version_min }}
 PL_DEVICE=lightning.gpu /opt/mpi/${{ matrix.mpilib }}/bin/mpirun -np 2 \
 coverage run --rcfile=.coveragerc --source=pennylane_lightning -p -m mpi4py -m pytest ./mpitests --tb=native
 coverage combine
-coverage xml -o coverage-${{ github.job }}-lightning_gpu_${{ matrix.mpilib }}-main.xml
+coverage xml -o coverage-${{ github.job }}-lightning_gpu_${{ matrix.mpilib }}_cu${{ matrix.cuda_version_maj }}-main.xml

 - name: Upload code coverage results
 uses: actions/upload-artifact@v3
 with:
 name: ubuntu-codecov-results-python
-path: coverage-${{ github.job }}-lightning_gpu_${{ matrix.mpilib }}-*.xml
+path: coverage-${{ github.job }}-lightning_gpu_${{ matrix.mpilib }}_cu${{ matrix.cuda_version_maj }}-*.xml
 if-no-files-found: error

 - name: Cleanup
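
Note on the MPI workflow changes: the module name is now composed from the MPI library plus the CUDA major and minor versions, and the CUDA compiler is resolved from `PATH` instead of the hard-coded `/usr/local/cuda/bin/nvcc`. The sketch below illustrates both ideas in plain shell. The helper names `mpi_module_name` and `resolve_nvcc` are hypothetical and not part of this PR, and `command -v` is used here as a portable stand-in for the `which nvcc` call in the diff.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not in the PR) illustrating the naming scheme and the
# PATH-based compiler lookup used by the updated MPI workflow.

# Compose "<mpilib>/cuda-<major>.<minor>", e.g. "openmpi/cuda-12.2".
mpi_module_name() {
  printf '%s/cuda-%s.%s\n' "$1" "$2" "$3"
}

# Print the nvcc found on PATH; return non-zero if none is available.
resolve_nvcc() {
  command -v nvcc 2>/dev/null || {
    echo "nvcc not found on PATH; load a cuda module first" >&2
    return 1
  }
}

# The matrix in the diff covers mpich and openmpi with CUDA 12.2.
for mpilib in mpich openmpi; do
  echo "module load $(mpi_module_name "$mpilib" 12 2)"
done

# Only emit the CMake flag when a compiler was actually found.
if nvcc_path="$(resolve_nvcc)"; then
  echo "-DCMAKE_CUDA_COMPILER=${nvcc_path}"
fi
```

Resolving the compiler from `PATH` ties the build to whichever CUDA module was loaded earlier in the step, which is presumably why the hard-coded path was dropped.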