forked from NVIDIA/spark-rapids
Add CentOS documentation and improve dockerfiles for UCX (NVIDIA#2537)
* Create Dockerfiles for Ubuntu and CentOS for RDMA and basic UCX installs
* Add a section specific to CentOS baremetal in docs

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>

* Fix typos
* Update docs/additional-functionality/rapids-shuffle.md

Co-authored-by: Jason Lowe <jlowe@nvidia.com>

* Update docs/additional-functionality/rapids-shuffle.md

Co-authored-by: Jason Lowe <jlowe@nvidia.com>

* PR review comments
* Add info on where to fetch CUDA rpm from
* Fix typos
* Update docs/additional-functionality/rapids-shuffle.md

Co-authored-by: Jason Lowe <jlowe@nvidia.com>
Co-authored-by: Jason Lowe <jlowe@nvidia.com>
Showing 5 changed files with 251 additions and 44 deletions.
37 changes: 37 additions & 0 deletions
docs/additional-functionality/shuffle-docker-examples/Dockerfile.centos_no_rdma
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
# | ||
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
# Sample Dockerfile to install UCX in a CentOS 7 image | ||
# | ||
# The parameters are: | ||
# - CUDA_VER: 11.0.3 to pick up the latest 11.x CUDA base layer | ||
# - UCX_VER and UCX_CUDA_VER: these are used to pick a package matching a specific UCX version and | ||
# CUDA runtime from the UCX github repo. | ||
# See: https://github.com/openucx/ucx/releases/ | ||
|
||
ARG CUDA_VER=11.0.3 | ||
ARG UCX_VER=v1.10.1 | ||
ARG UCX_CUDA_VER=11.0 | ||
|
||
FROM nvidia/cuda:${CUDA_VER}-runtime-centos7 | ||
ARG UCX_VER | ||
ARG UCX_CUDA_VER | ||
|
||
RUN yum update -y && yum install -y wget bzip2 | ||
RUN cd /tmp && wget https://github.com/openucx/ucx/releases/download/$UCX_VER/ucx-$UCX_VER-centos7-mofed5.x-cuda$UCX_CUDA_VER.tar.bz2 | ||
RUN cd /tmp && tar -xvf *.bz2 && \ | ||
yum install -y ucx-1.10.1-1.el7.x86_64.rpm && \ | ||
yum install -y ucx-cuda-1.10.1-1.el7.x86_64.rpm && \ | ||
rm -rf /tmp/*.rpm |
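
The `wget` step above composes the release URL from `UCX_VER` and `UCX_CUDA_VER`, while the `yum install` steps hard-code `1.10.1`. The sketch below shows how the two relate. It assumes the RPM names inside the tarball follow the `<package>-<version>-1.el7.x86_64.rpm` pattern seen in this Dockerfile; `ucx_tarball_url` and `ucx_rpm_name` are hypothetical helpers for illustration and are not part of the commit.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the commit.

# Compose the release tarball URL the same way the wget line does.
ucx_tarball_url() {
  ucx_ver="$1"       # e.g. v1.10.1 (tag name, with the leading "v")
  ucx_cuda_ver="$2"  # e.g. 11.0
  echo "https://github.com/openucx/ucx/releases/download/${ucx_ver}/ucx-${ucx_ver}-centos7-mofed5.x-cuda${ucx_cuda_ver}.tar.bz2"
}

# Derive an RPM file name from UCX_VER by stripping the leading "v".
# Assumption: the "-1.el7.x86_64.rpm" suffix used in the Dockerfile above.
ucx_rpm_name() {
  pkg="$1"      # e.g. ucx or ucx-cuda
  ucx_ver="$2"  # e.g. v1.10.1
  echo "${pkg}-${ucx_ver#v}-1.el7.x86_64.rpm"
}

ucx_tarball_url v1.10.1 11.0
ucx_rpm_name ucx-cuda v1.10.1
```

If `UCX_VER` is overridden at build time, the hard-coded RPM names in the install steps would presumably need to change with it; deriving them as above is one possible way to keep the two in step.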
70 changes: 70 additions & 0 deletions
docs/additional-functionality/shuffle-docker-examples/Dockerfile.centos_rdma
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
# | ||
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
|
||
# Sample Dockerfile to install UCX in a CentOS 7 image with RDMA support. | ||
# | ||
# The parameters are: | ||
# - RDMA_CORE_VERSION: Set to 32.1 to match the rdma-core line in the latest | ||
# released MLNX_OFED 5.x driver | ||
# - CUDA_VER: 11.0.3 to pick up the latest 11.x CUDA base layer | ||
# - UCX_VER and UCX_CUDA_VER: these are used to pick a package matching a specific UCX version and | ||
# CUDA runtime from the UCX github repo. | ||
# See: https://github.com/openucx/ucx/releases/ | ||
# | ||
# The Dockerfile first fetches and builds `rdma-core` to satisfy requirements for | ||
# the ucx-ib and ucx-rdmacm RPMs. | ||
|
||
ARG RDMA_CORE_VERSION=32.1 | ||
ARG CUDA_VER=11.0.3 | ||
ARG UCX_VER=v1.10.1 | ||
ARG UCX_CUDA_VER=11.0 | ||
|
||
# Throwaway image to build rdma_core | ||
FROM centos:7 as rdma_core | ||
ARG RDMA_CORE_VERSION | ||
|
||
RUN yum update -y | ||
RUN yum install -y wget gcc cmake libnl3-devel libudev-devel make pkgconfig valgrind-devel epel-release | ||
RUN yum install -y cmake3 ninja-build pandoc rpm-build python-docutils | ||
|
||
RUN wget https://github.com/linux-rdma/rdma-core/releases/download/v${RDMA_CORE_VERSION}/rdma-core-${RDMA_CORE_VERSION}.tar.gz | ||
|
||
# Build RPM | ||
RUN mkdir -p rpmbuild/SOURCES tmp && \ | ||
tar --wildcards -xzf rdma-core*.tar.gz */redhat/rdma-core.spec --strip-components=2 && \ | ||
RPM_SRC=$((rpmspec -P *.spec || grep ^Source: *.spec) | awk '/^Source:/{split($0,a,"[ \t]+");print(a[2])}') && \ | ||
(cd rpmbuild/SOURCES && ln -sf ../../rdma-core*.tar.gz "$RPM_SRC") | ||
RUN rpmbuild --define '_tmppath '$(pwd)'/tmp' --define '_topdir '$(pwd)'/rpmbuild' -bb *.spec | ||
RUN mv rpmbuild/RPMS/x86_64/*.rpm /tmp | ||
|
||
# Now start the main container | ||
FROM nvidia/cuda:${CUDA_VER}-runtime-centos7 | ||
ARG UCX_VER | ||
ARG UCX_CUDA_VER | ||
|
||
COPY --from=rdma_core /tmp/*.rpm /tmp/ | ||
|
||
RUN yum update -y | ||
RUN yum install -y wget bzip2 | ||
RUN cd /tmp && wget https://github.com/openucx/ucx/releases/download/$UCX_VER/ucx-$UCX_VER-centos7-mofed5.x-cuda$UCX_CUDA_VER.tar.bz2 | ||
RUN cd /tmp && \ | ||
yum install -y *.rpm && \ | ||
tar -xvf *.bz2 && \ | ||
yum install -y ucx-1.10.1-1.el7.x86_64.rpm && \ | ||
yum install -y ucx-cuda-1.10.1-1.el7.x86_64.rpm && \ | ||
yum install -y ucx-ib-1.10.1-1.el7.x86_64.rpm && \ | ||
yum install -y ucx-rdmacm-1.10.1-1.el7.x86_64.rpm | ||
RUN rm -rf /tmp/*.rpm && rm /tmp/*.bz2 |
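
The `RPM_SRC` step in the build stage reads the source archive name out of the spec file so that the downloaded tarball can be symlinked under that name in `rpmbuild/SOURCES`. The awk program can be tried in isolation, as sketched below. The spec fragment fed to it is a made-up sample for illustration, not text copied from `rdma-core.spec`, and `extract_source` is a hypothetical wrapper name.

```shell
#!/bin/sh
# Same awk program as in the Dockerfile: on a line starting with "Source:",
# split on runs of spaces/tabs and print the second field (the archive name).
extract_source() {
  awk '/^Source:/{split($0,a,"[ \t]+");print(a[2])}'
}

# Made-up spec fragment, for illustration only.
printf 'Name: rdma-core\nSource: rdma-core-32.1.tgz\n' | extract_source
```

In the Dockerfile the input comes from `rpmspec -P`, with a plain `grep` of the spec file as a fallback when `rpmspec` fails.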
35 changes: 35 additions & 0 deletions
docs/additional-functionality/shuffle-docker-examples/Dockerfile.ubuntu_no_rdma
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
# | ||
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
# Sample Dockerfile to install UCX in an Ubuntu 18.04 image | ||
# | ||
# The parameters are: | ||
# - CUDA_VER: 11.0.3 to pick up the latest 11.x CUDA base layer | ||
# - UCX_VER and UCX_CUDA_VER: these are used to pick a package matching a specific UCX version and | ||
# CUDA runtime from the UCX github repo. | ||
# See: https://github.com/openucx/ucx/releases/ | ||
|
||
ARG CUDA_VER=11.0.3 | ||
ARG UCX_VER=v1.10.1 | ||
ARG UCX_CUDA_VER=11.0 | ||
|
||
FROM nvidia/cuda:${CUDA_VER}-runtime-ubuntu18.04 | ||
ARG UCX_VER | ||
ARG UCX_CUDA_VER | ||
|
||
RUN apt update | ||
RUN apt-get install -y wget | ||
RUN cd /tmp && wget https://github.com/openucx/ucx/releases/download/$UCX_VER/ucx-$UCX_VER-ubuntu18.04-mofed5.x-cuda$UCX_CUDA_VER.deb | ||
RUN apt install -y /tmp/*.deb && rm -rf /tmp/*.deb |
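
The `ARG` defaults above can be overridden at build time with `--build-arg`. The sketch below only prints the `docker build` command it would run, so it can be inspected first. `ucx_build_cmd` and the image tag `ucx-ubuntu:example` are hypothetical names chosen for this example.

```shell
#!/bin/sh
# Hypothetical dry-run helper, not part of the commit: prints a docker build
# command that overrides the UCX-related build arguments.
ucx_build_cmd() {
  dockerfile="$1"            # e.g. Dockerfile.ubuntu_no_rdma
  tag="$2"                   # image tag (arbitrary example name)
  ucx_ver="${3:-v1.10.1}"    # defaults mirror the ARG defaults above
  ucx_cuda_ver="${4:-11.0}"
  echo "docker build -f ${dockerfile} --build-arg UCX_VER=${ucx_ver} --build-arg UCX_CUDA_VER=${ucx_cuda_ver} -t ${tag} ."
}

ucx_build_cmd Dockerfile.ubuntu_no_rdma ucx-ubuntu:example
```

`CUDA_VER` can be overridden the same way; it is declared before `FROM`, so it selects the base image tag.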
55 changes: 55 additions & 0 deletions
docs/additional-functionality/shuffle-docker-examples/Dockerfile.ubuntu_rdma
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
# | ||
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
|
||
# Sample Dockerfile to install UCX in an Ubuntu 18.04 image with RDMA support. | ||
# | ||
# The parameters are: | ||
# - RDMA_CORE_VERSION: Set to 32.1 to match the rdma-core line in the latest | ||
# released MLNX_OFED 5.x driver | ||
# - CUDA_VER: 11.0.3 to pick up the latest 11.x CUDA base layer | ||
# - UCX_VER and UCX_CUDA_VER: these are used to pick a package matching a specific UCX version and | ||
# CUDA runtime from the UCX github repo. | ||
# See: https://github.com/openucx/ucx/releases/ | ||
# | ||
# The Dockerfile first fetches and builds `rdma-core` to satisfy requirements for | ||
# the UCX deb package. | ||
|
||
ARG RDMA_CORE_VERSION=32.1 | ||
ARG CUDA_VER=11.0.3 | ||
ARG UCX_VER=v1.10.1 | ||
ARG UCX_CUDA_VER=11.0 | ||
|
||
# Throwaway image to build rdma_core | ||
FROM ubuntu:18.04 as rdma_core | ||
ARG RDMA_CORE_VERSION | ||
|
||
RUN apt update | ||
RUN apt-get install -y dh-make wget build-essential cmake gcc libudev-dev libnl-3-dev libnl-route-3-dev ninja-build pkg-config valgrind python3-dev cython3 python3-docutils pandoc | ||
|
||
RUN wget https://github.com/linux-rdma/rdma-core/releases/download/v${RDMA_CORE_VERSION}/rdma-core-${RDMA_CORE_VERSION}.tar.gz | ||
RUN tar -xvf *.tar.gz && cd rdma-core*/ && dpkg-buildpackage -b -d | ||
|
||
# Now start the main container | ||
FROM nvidia/cuda:${CUDA_VER}-runtime-ubuntu18.04 | ||
ARG UCX_VER | ||
ARG UCX_CUDA_VER | ||
|
||
COPY --from=rdma_core /*.deb /tmp/ | ||
|
||
RUN apt update | ||
RUN apt-get install -y cuda-compat-11-0 wget udev dh-make libudev-dev libnl-3-dev libnl-route-3-dev python3-dev cython3 | ||
RUN cd /tmp && wget https://github.com/openucx/ucx/releases/download/$UCX_VER/ucx-$UCX_VER-ubuntu18.04-mofed5.x-cuda$UCX_CUDA_VER.deb | ||
RUN apt install -y /tmp/*.deb && rm -rf /tmp/*.deb |
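
Once an RDMA-enabled image is built, one way to check which transports UCX reports is to filter the output of `ucx_info -d`. The sketch below assumes that output contains lines of the form `#   Transport: <name>`. `list_transports` is a hypothetical helper, and the sample input is hand-written for illustration rather than captured from a real run.

```shell
#!/bin/sh
# Hypothetical helper, not part of the commit: reads `ucx_info -d` style text
# on stdin and prints each distinct transport name once, sorted.
# Assumption: transports appear on lines like "#      Transport: rc_verbs".
list_transports() {
  grep 'Transport:' | awk '{print $NF}' | sort -u
}

# Hand-written sample input, for illustration only.
printf '#      Transport: tcp\n#      Transport: rc_verbs\n#      Transport: tcp\n' | list_transports
```

Against a real image, something like `docker run --rm <image> ucx_info -d | list_transports` would feed it actual output, where `<image>` is a placeholder for whatever tag was used at build time.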