Add support for multi-arch docker images #399

Merged: 14 commits, Jun 28, 2024
Changes from 8 commits
58 changes: 28 additions & 30 deletions .github/workflows/CondaLock.yml
@@ -37,39 +37,37 @@ jobs:
# Could run as single step in parallel, but would complicate logs...
- name: Run conda-lock ${{ matrix.IMAGE }}
run: |
cd ${{ matrix.IMAGE }}
if [ ${{ matrix.IMAGE }} = "base-notebook" ]; then
conda-lock lock -f environment.yml -p linux-64
elif [ ${{ matrix.IMAGE }} = "pangeo-notebook" ]; then
conda-lock lock -f environment.yml -f ../base-notebook/environment.yml -p linux-64
else
conda-lock lock -f environment.yml -f ../pangeo-notebook/environment.yml -f ../base-notebook/environment.yml -p linux-64
fi
conda-lock render -k explicit
../generate-packages-list.py conda-linux-64.lock > packages.txt

- name: Upload lockfiles
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.image }}
path: ${{ matrix.image }}
cd base-notebook
conda-lock lock -p linux-64 -p linux-aarch64 -p osx-64 -p osx-arm64
conda-lock render -k explicit -p linux-64
../generate-packages-list.py conda-linux-64.lock > packages.txt

- name: Run conda-lock pangeo-notebook
timeout-minutes: 5
continue-on-error: true
run: |
cd pangeo-notebook
conda-lock lock -f environment.yml -f ../base-notebook/environment.yml -p linux-64 -p linux-aarch64 -p osx-64 -p osx-arm64
conda-lock render -k explicit -p linux-64
../generate-packages-list.py conda-linux-64.lock > packages.txt
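
Each step above renders an explicit linux-64 lock file and then derives `packages.txt` from it. As a rough illustration of what such a listing involves, here is a minimal sketch that extracts package names and versions from explicit lock file text. It is not the repository's `generate-packages-list.py`; the function name `list_packages` and the sample URLs in the usage note are assumptions made for this example.

```python
import re

# Matches "<name>-<version>-<build>.conda" (or ".tar.bz2") at the end of a URL.
# Package names may contain hyphens; version and build may not.
_PKG = re.compile(
    r"/(?P<name>[^/]+)-(?P<version>[^-/]+)-(?P<build>[^-/]+)\.(?:conda|tar\.bz2)$"
)


def list_packages(lock_text):
    """Return sorted (name, version) pairs found in explicit lock file text."""
    packages = []
    for line in lock_text.splitlines():
        line = line.strip()
        # Skip blanks, comment lines, and the "@EXPLICIT" marker.
        if not line or line.startswith("#") or line.startswith("@"):
            continue
        url = line.split("#", 1)[0]  # drop the trailing "#<hash>" fragment
        match = _PKG.search(url)
        if match:
            packages.append((match["name"], match["version"]))
    return sorted(packages)
```

Given text containing `@EXPLICIT` followed by a hypothetical URL such as `https://conda.anaconda.org/conda-forge/linux-64/zlib-1.3.1-h4ab18f5_1.conda#abc`, the function yields `("zlib", "1.3.1")`.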

# Each job will commit files, so we know it succeeds based on commits
commit-lockfiles:
needs: condalock
runs-on: ubuntu-latest
steps:
- name: Checkout Repository
uses: actions/checkout@v4
with:
token: ${{ secrets.PANGEOBOT_TOKEN }}
# These lines are critical, otherwise Pangeo-bot pushes changes directly to master from PRs!
repository: ${{ github.event.client_payload.pull_request.head.repo.full_name }}
ref: ${{ github.event.client_payload.pull_request.head.ref }}
- name: Run conda-lock ml-notebook
timeout-minutes: 5
continue-on-error: true
run: |
cd ml-notebook
conda-lock lock -f environment.yml -f ../pangeo-notebook/environment.yml -f ../base-notebook/environment.yml -p linux-64 -p linux-aarch64 -p osx-64 -p osx-arm64
@weiji14 (Member) commented on Jun 26, 2024:
TensorFlow doesn't have a linux-aarch64 package yet (see conda-forge/tensorflow-feedstock#136), so locking fails:

Could not lock the environment for platform linux-aarch64
Could not solve for environment specs
The following package could not be installed
└─ tensorflow >=2.15.0 cuda120* does not exist (perhaps a typo or a missing channel).

There are osx-64/osx-arm64 packages on conda-forge, though only CPU builds. Not sure if the osx-arm64 build actually uses the Apple Silicon GPU.
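
One possible way to handle this, sketched below, is to keep a per-image list of platforms to skip and build the `-p` flags from it. This is only an illustration: the `UNSUPPORTED` table and both function names are hypothetical and not part of this PR, and the single exclusion shown is taken from the failure reported in this comment.

```python
ALL_PLATFORMS = ["linux-64", "linux-aarch64", "osx-64", "osx-arm64"]

# Hypothetical exclusions, based only on the failure reported in this review.
UNSUPPORTED = {
    "ml-notebook": {"linux-aarch64"},
}


def platforms_for(image):
    """Return the platforms to lock for an image, minus known-unsupported ones."""
    skip = UNSUPPORTED.get(image, set())
    return [p for p in ALL_PLATFORMS if p not in skip]


def platform_flags(image):
    """Return the '-p <platform>' arguments to pass to 'conda-lock lock'."""
    flags = []
    for platform_name in platforms_for(image):
        flags += ["-p", platform_name]
    return flags
```

With this table, `platform_flags("ml-notebook")` omits `linux-aarch64`, while images without an entry keep all four platforms.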

conda-lock render -k explicit -p linux-64
../generate-packages-list.py conda-linux-64.lock > packages.txt

# Download all artifacts from previous matrix job
- uses: actions/download-artifact@v4
- name: Run conda-lock pytorch-notebook
timeout-minutes: 5
continue-on-error: true
run: |
cd pytorch-notebook
conda-lock lock -f environment.yml -f ../pangeo-notebook/environment.yml -f ../base-notebook/environment.yml -p linux-64 -p linux-aarch64 -p osx-64 -p osx-arm64
Member commented:

PyTorch does have linux-aarch64/osx-64/osx-arm64 packages on conda-forge, but we set a cuda120 pin in the environment.yml file, so locking fails:

Could not lock the environment for platform linux-aarch64
Could not solve for environment specs
The following packages are incompatible
├─ pytorch >=2.1.0 cuda120* is not installable because it requires
│  └─ __cuda, which is missing on the system;
└─ torchvision >=0.16.1 cuda120* does not exist (perhaps a typo or a missing channel).
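
To illustrate why a `cuda120*` pin leaves nothing to install on a platform that only has CPU builds, here is a small sketch that treats the pin as a glob over build strings. The build strings below are made up for the example and are not real conda-forge builds, and `matching_builds` is a hypothetical helper, not conda's solver.

```python
from fnmatch import fnmatchcase


def matching_builds(build_strings, pin):
    """Return the build strings that satisfy a glob-style build pin."""
    return [build for build in build_strings if fnmatchcase(build, pin)]


# Hypothetical build strings, for illustration only.
available = {
    "linux-64": ["cuda120py311h0000000_300", "cpu_generic_py311h0000000_0"],
    "linux-aarch64": ["cpu_generic_py311h0000000_0"],
}

for platform_name, builds in available.items():
    print(platform_name, matching_builds(builds, "cuda120*"))
```

In this toy data the pin selects one build on linux-64 and none on linux-aarch64, which mirrors the "does not exist" message in the log above.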

conda-lock render -k explicit -p linux-64
../generate-packages-list.py conda-linux-64.lock > packages.txt

- name: Commit condalock files to PR
run: |
12 changes: 8 additions & 4 deletions Makefile
@@ -11,31 +11,35 @@ base-image :
.PHONY: base-notebook
base-notebook : base-image
cd base-notebook ; \
conda-lock lock --mamba -k explicit -f environment.yml -p linux-64; \
conda-lock lock -p linux-64 -p linux-aarch64 -p osx-64 -p osx-arm64; \
conda-lock render -k explicit -p linux-64; \
../generate-packages-list.py conda-linux-64.lock > packages.txt; \
docker build -t pangeo/base-notebook:master . ; \
docker run -w $(TESTDIR) -v $(PWD):$(TESTDIR) pangeo/base-notebook:master ./run_tests.sh base-notebook

.PHONY: pangeo-notebook
pangeo-notebook : base-image
cd pangeo-notebook ; \
conda-lock lock --mamba -k explicit -f environment.yml -f ../base-notebook/environment.yml -p linux-64; \
conda-lock lock -f environment.yml -f ../base-notebook/environment.yml -p linux-64 -p linux-aarch64 -p osx-64 -p osx-arm64; \
conda-lock render -k explicit -p linux-64; \
../generate-packages-list.py conda-linux-64.lock > packages.txt; \
docker build -t pangeo/pangeo-notebook:master . ; \
docker run -w $(TESTDIR) -v $(PWD):$(TESTDIR) pangeo/pangeo-notebook:master ./run_tests.sh pangeo-notebook

.PHONY: ml-notebook
ml-notebook : base-image
cd ml-notebook ; \
conda-lock lock --mamba -k explicit -f environment.yml -f ../pangeo-notebook/environment.yml -f ../base-notebook/environment.yml -p linux-64; \
conda-lock lock -f environment.yml -f ../pangeo-notebook/environment.yml -f ../base-notebook/environment.yml -p linux-64 -p linux-aarch64 -p osx-64 -p osx-arm64; \
conda-lock render -k explicit -p linux-64; \
../generate-packages-list.py conda-linux-64.lock > packages.txt; \
docker build -t pangeo/ml-notebook:master . ; \
docker run -w $(TESTDIR) -v $(PWD):$(TESTDIR) pangeo/ml-notebook:master ./run_tests.sh ml-notebook

.PHONY: pytorch-notebook
pytorch-notebook : base-image
cd pytorch-notebook ; \
conda-lock lock --mamba -k explicit -f environment.yml -f ../pangeo-notebook/environment.yml -f ../base-notebook/environment.yml -p linux-64; \
conda-lock lock -f environment.yml -f ../pangeo-notebook/environment.yml -f ../base-notebook/environment.yml -p linux-64 -p linux-aarch64 -p osx-64 -p osx-arm64; \
conda-lock render -k explicit -p linux-64; \
../generate-packages-list.py conda-linux-64.lock > packages.txt; \
docker build -t pangeo/pytorch-notebook:master . ; \
docker run -w $(TESTDIR) -v $(PWD):$(TESTDIR) pangeo/pytorch-notebook:master ./run_tests.sh pytorch-notebook
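
The four Makefile targets repeat one pattern: each image locks its own `environment.yml` together with those of the images it builds on. A minimal sketch of that pattern follows. The `PARENT` table and function names are assumptions for illustration, and unlike the base-notebook recipe above, this sketch always passes `-f environment.yml` explicitly.

```python
# Image inheritance as implied by the -f arguments in the recipes above.
PARENT = {
    "base-notebook": None,
    "pangeo-notebook": "base-notebook",
    "ml-notebook": "pangeo-notebook",
    "pytorch-notebook": "pangeo-notebook",
}

PLATFORMS = ["linux-64", "linux-aarch64", "osx-64", "osx-arm64"]


def environment_files(image):
    """Return the image's environment.yml followed by those of its ancestors."""
    files = ["environment.yml"]
    parent = PARENT[image]
    while parent is not None:
        files.append(f"../{parent}/environment.yml")
        parent = PARENT[parent]
    return files


def lock_command(image):
    """Build the 'conda-lock lock' argument list for one image directory."""
    command = ["conda-lock", "lock"]
    for path in environment_files(image):
        command += ["-f", path]
    for platform_name in PLATFORMS:
        command += ["-p", platform_name]
    return command
```

For example, `" ".join(lock_command("ml-notebook"))` reproduces the command line used in the ml-notebook recipe.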
20 changes: 10 additions & 10 deletions base-image/Dockerfile
@@ -66,8 +66,8 @@ USER ${NB_USER}
WORKDIR ${HOME}

# Install latest mambaforge in ${CONDA_DIR}
RUN echo "Installing Miniforge..." \
&& URL="https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh" \
RUN echo "Installing Mambaforge..." \
&& URL="https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-$(uname -m).sh" \
&& wget --quiet ${URL} -O installer.sh \
&& /bin/bash installer.sh -u -b -p ${CONDA_DIR} \
&& rm installer.sh \
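
The change above swaps the hard-coded `x86_64` for `$(uname -m)`, which prints `x86_64` on 64-bit Intel/AMD Linux and `aarch64` on 64-bit ARM Linux. The same URL construction can be sketched as follows; `installer_url` is a hypothetical helper for illustration, and the sketch does not check that the release asset exists.

```python
import platform

BASE_URL = "https://github.com/conda-forge/miniforge/releases/latest/download"


def installer_url(machine=None):
    """Return the Linux installer URL for a machine type such as 'aarch64'."""
    # platform.machine() reports the same value as 'uname -m' on Linux.
    machine = machine or platform.machine()
    return f"{BASE_URL}/Mambaforge-Linux-{machine}.sh"
```

Calling `installer_url("aarch64")` returns the URL ending in `Mambaforge-Linux-aarch64.sh`.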
@@ -144,8 +144,10 @@ ONBUILD USER ${NB_USER}
# We want to keep our images as reproducible as possible. If a lock
# file with exact versions of all required packages is present, we use
# it to install packages. conda-lock (https://github.com/conda-incubator/conda-lock)
# is used to generate this conda-linux-64.lock file from a given environment.yml
# file - so we get the exact same versions each time the image is built. This
# is used to generate this conda-lock.yml file from a given environment.yml
# file - so we get the exact same versions each time the image is built. Since the lock file
# pins different packages for different CPU architectures,
# the same Dockerfile can be used to build images for different architectures. This
# also lets us see what packages have changed between two images by diffing
# the contents of the lock file between those image versions.
# If a lock file is not present, we use the environment.yml file. And
@@ -157,13 +159,11 @@ ONBUILD USER ${NB_USER}
ONBUILD RUN echo "Checking for 'conda-lock.yml' 'conda-linux-64.lock' or 'environment.yml'..." \
; [ -d binder ] && cd binder \
; [ -d .binder ] && cd .binder \
; if test -f "conda-lock.yml" ; then \
conda-lock install --name ${CONDA_ENV} conda-lock.yml \
; elif test -f "conda-linux-64.lock" ; then \
mamba create --name ${CONDA_ENV} --file conda-linux-64.lock \
; elif test -f "environment.yml" ; then \
; if test -f "conda-lock.yml" ; then echo "Using conda-lock.yml" && \
conda-lock install --name ${CONDA_ENV} \
; elif test -f "environment.yml" ; then echo "Using environment.yml" && \
mamba env create --name ${CONDA_ENV} -f environment.yml \
; else echo "No conda-lock.yml, conda-linux-64.lock, or environment.yml! *creating default env*" ; \
; else echo "No conda-lock.yml or environment.yml! *creating default env*" ; \
mamba create --name ${CONDA_ENV} pangeo-notebook \
; fi \
&& mamba clean -yaf \
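
The new ONBUILD logic picks an install path in a fixed order: `conda-lock.yml` first, then `environment.yml`, then a default environment. A minimal sketch of that precedence is below; `install_command` and the default `env_name` value are hypothetical stand-ins (the Dockerfile uses `${CONDA_ENV}`), and nothing here is executed by the image build.

```python
def install_command(present_files, env_name="notebook"):
    """Return the install command chosen for the files found in the build context."""
    if "conda-lock.yml" in present_files:
        # With no file argument, 'conda-lock install' reads conda-lock.yml.
        return ["conda-lock", "install", "--name", env_name]
    if "environment.yml" in present_files:
        return ["mamba", "env", "create", "--name", env_name, "-f", "environment.yml"]
    # Nothing usable was found, so fall back to the default environment.
    return ["mamba", "create", "--name", env_name, "pangeo-notebook"]
```

One consequence worth noting: a directory that contains only `conda-linux-64.lock` no longer matches any branch, so it falls through to the default environment.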
2 changes: 1 addition & 1 deletion pangeo-notebook/environment.yml
@@ -13,7 +13,7 @@ dependencies:
- cdsapi
- cfgrib
- cf_xarray
- ciso
- ciso
- cmocean
- dask-ml
- datashader