update package version tags to 24.10, switch to nightly 24.10 rapids #746

Status: Draft. Wants to merge 1 commit into base: branch-24.10.
4 changes: 2 additions & 2 deletions ci/Dockerfile
@@ -37,6 +37,6 @@ RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86
     && conda config --set solver libmamba
 
 # install cuML
-ARG CUML_VER=24.08
-RUN conda install -y -c rapidsai -c conda-forge -c nvidia cuml=$CUML_VER cuvs=$CUML_VER python=3.9 cuda-version=11.8 \
+ARG CUML_VER=24.10
+RUN conda install -y -c rapidsai-nightly -c conda-forge -c nvidia cuml=$CUML_VER cuvs=$CUML_VER python=3.10 cuda-version=11.8 numpy~=1.0 \
     && conda clean --all -f -y
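
Since this hunk switches the CI image from the release channel to rapidsai-nightly, a quick post-build sanity check can confirm which cuML actually resolved. A minimal sketch in Python; the image tag spark-rapids-ml-ci and the assumption that python on the image PATH is the conda-installed one are mine, not part of this PR:

# hypothetical check, not part of this PR: verify the CI image picked up
# a 24.10 nightly cuML from the rapidsai-nightly channel
import subprocess

result = subprocess.run(
    ["docker", "run", "--rm", "spark-rapids-ml-ci",
     "python", "-c", "import cuml; print(cuml.__version__)"],
    capture_output=True, text=True, check=True,
)
assert result.stdout.strip().startswith("24.10"), result.stdout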
2 changes: 1 addition & 1 deletion docs/source/conf.py
@@ -9,7 +9,7 @@
 project = 'spark-rapids-ml'
 copyright = '2024, NVIDIA'
 author = 'NVIDIA'
-release = '24.08.0'
+release = '24.10.0'
 
 # -- General configuration ---------------------------------------------------
 # https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
2 changes: 1 addition & 1 deletion notebooks/databricks/README.md
@@ -51,7 +51,7 @@ If you already have a Databricks account, you can run the example notebooks on a
 spark.task.resource.gpu.amount 1
 spark.databricks.delta.preview.enabled true
 spark.python.worker.reuse true
-spark.executorEnv.PYTHONPATH /databricks/jars/rapids-4-spark_2.12-24.06.1.jar:/databricks/spark/python
+spark.executorEnv.PYTHONPATH /databricks/jars/rapids-4-spark_2.12-24.08.1.jar:/databricks/spark/python
 spark.sql.execution.arrow.maxRecordsPerBatch 100000
 spark.rapids.memory.gpu.minAllocFraction 0.0001
 spark.plugins com.nvidia.spark.SQLPlugin
2 changes: 1 addition & 1 deletion notebooks/databricks/init-pip-cuda-11.8.sh
@@ -5,7 +5,7 @@ SPARK_RAPIDS_ML_ZIP=/dbfs/path/to/zip/file
 # also in general, RAPIDS_VERSION (python) fields should omit any leading 0 in month/minor field (i.e. 23.8.0 and not 23.08.0)
 # while SPARK_RAPIDS_VERSION (jar) should have leading 0 in month/minor (e.g. 23.08.2 and not 23.8.2)
 RAPIDS_VERSION=24.8.0
-SPARK_RAPIDS_VERSION=24.06.1
+SPARK_RAPIDS_VERSION=24.08.1
 
 curl -L https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/${SPARK_RAPIDS_VERSION}/rapids-4-spark_2.12-${SPARK_RAPIDS_VERSION}-cuda11.jar -o /databricks/jars/rapids-4-spark_2.12-${SPARK_RAPIDS_VERSION}.jar
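
The comment in this hunk describes the leading-zero convention for the two version fields; a small illustrative Python helper (to_python_ver is hypothetical, not part of this repo) makes the conversion concrete:

# jar versions keep the leading zero in the month field (24.08.1), while
# Python wheel versions drop it (24.8.1); to_python_ver is illustrative only
import re

def to_python_ver(jar_ver: str) -> str:
    return re.sub(r"\.0(\d)", r".\1", jar_ver, count=1)

assert to_python_ver("24.08.1") == "24.8.1"
assert to_python_ver("24.10.0") == "24.10.0"  # months without a leading zero pass through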
2 changes: 1 addition & 1 deletion python/benchmark/databricks/gpu_etl_cluster_spec.sh
@@ -9,7 +9,7 @@ cat <<EOF
 "spark.task.cpus": "1",
 "spark.databricks.delta.preview.enabled": "true",
 "spark.python.worker.reuse": "true",
-"spark.executorEnv.PYTHONPATH": "/databricks/jars/rapids-4-spark_2.12-24.06.1.jar:/databricks/spark/python",
+"spark.executorEnv.PYTHONPATH": "/databricks/jars/rapids-4-spark_2.12-24.08.1.jar:/databricks/spark/python",
 "spark.sql.files.minPartitionNum": "2",
 "spark.sql.execution.arrow.maxRecordsPerBatch": "10000",
 "spark.executor.cores": "8",
19 changes: 12 additions & 7 deletions python/benchmark/databricks/init-pip-cuda-11.8.sh
@@ -6,7 +6,7 @@ BENCHMARK_ZIP=/dbfs/path/to/benchmark.zip
 # also, in general, RAPIDS_VERSION (python) fields should omit any leading 0 in month/minor field (i.e. 23.8.0 and not 23.08.0)
 # while SPARK_RAPIDS_VERSION (jar) should have leading 0 in month/minor (e.g. 23.08.2 and not 23.8.2)
 RAPIDS_VERSION=24.8.0
-SPARK_RAPIDS_VERSION=24.06.1
+SPARK_RAPIDS_VERSION=24.08.1
 
 curl -L https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/${SPARK_RAPIDS_VERSION}/rapids-4-spark_2.12-${SPARK_RAPIDS_VERSION}-cuda11.jar -o /databricks/jars/rapids-4-spark_2.12-${SPARK_RAPIDS_VERSION}.jar
 
@@ -24,12 +24,17 @@ ln -s /usr/local/cuda-11.8 /usr/local/cuda
 
 # install cudf and cuml
 # using ~= pulls in micro version patches
-/databricks/python/bin/pip install cudf-cu11~=${RAPIDS_VERSION} \
-    cuml-cu11~=${RAPIDS_VERSION} \
-    cuvs-cu11~=${RAPIDS_VERSION} \
-    pylibraft-cu11~=${RAPIDS_VERSION} \
-    rmm-cu11~=${RAPIDS_VERSION} \
-    --extra-index-url=https://pypi.nvidia.com
+# /databricks/python/bin/pip install cudf-cu11~=${RAPIDS_VERSION} \
+#     cuml-cu11~=${RAPIDS_VERSION} \
+#     cuvs-cu11~=${RAPIDS_VERSION} \
+#     pylibraft-cu11~=${RAPIDS_VERSION} \
+#     rmm-cu11~=${RAPIDS_VERSION} \
+#     --extra-index-url=https://pypi.nvidia.com
+
+/databricks/python/bin/pip install \
+    --extra-index-url=https://pypi.anaconda.org/rapidsai-wheels-nightly/simple \
+    "cudf-cu11>=24.10.0a0,<=24.10" "dask-cudf-cu11>=24.10.0a0,<=24.10" \
+    "cuml-cu11>=24.10.0a0,<=24.10" "dask-cuda>=24.10.0a0,<=24.10"

Collaborator review comment on the lines above:

Would cuvs be automatically installed even if it is not specified in the pip install command?
It seems we introduce cudf, dask-cudf, and dask-cuda to the pip install command. Any reason for that?
 
 # install spark-rapids-ml
 python_ver=`python --version | grep -oP '3\.[0-9]+'`
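
For context on the pins in this hunk: the old ~= constraint is a PEP 440 compatible release (cuml-cu11~=24.8.0 means >=24.8.0, ==24.8.*), which is why it pulls in micro patches, while the new nightly constraints must also admit pre-release versions. A minimal sketch with the packaging library (illustrative only, not used by this script) of what the new specifiers accept:

# illustrative only: how pip interprets the nightly pins above under PEP 440
from packaging.specifiers import SpecifierSet
from packaging.version import Version

nightly = SpecifierSet(">=24.10.0a0,<=24.10")
print(Version("24.10.0a240915") in nightly)  # True: pre-release nightlies match
print(Version("24.10.0") in nightly)         # True: the 24.10 final also matches
print(Version("24.10.1") in nightly)         # False: patch releases above 24.10 are capped out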
2 changes: 1 addition & 1 deletion python/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "spark-rapids-ml"
-version = "24.8.0"
+version = "24.10.0"
 authors = [
     { name="Jinfeng Li", email="jinfeng@nvidia.com" },
     { name="Bobby Wang", email="bobwang@nvidia.com" },
2 changes: 1 addition & 1 deletion python/run_benchmark.sh
@@ -107,7 +107,7 @@ EOF
 
 if [[ $cluster_type == "gpu_etl" ]]
 then
-    SPARK_RAPIDS_VERSION=24.06.1
+    SPARK_RAPIDS_VERSION=24.08.1
     rapids_jar=${rapids_jar:-rapids-4-spark_2.12-$SPARK_RAPIDS_VERSION.jar}
     if [ ! -f $rapids_jar ]; then
         echo "downloading spark rapids jar"
2 changes: 1 addition & 1 deletion python/src/spark_rapids_ml/__init__.py
@@ -13,7 +13,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-__version__ = "24.08.0"
+__version__ = "24.10.0"
 
 import pandas as pd
 import pyspark
2 changes: 1 addition & 1 deletion python/src/spark_rapids_ml/clustering.py
@@ -483,7 +483,7 @@ def _construct_kmeans() -> CumlT:
         kmeans = CumlKMeansMG(output_type="cudf", **cuml_alg_params)
         from spark_rapids_ml.utils import cudf_to_cuml_array
 
-        kmeans.n_cols = n_cols
+        kmeans.n_features_in_ = n_cols
         kmeans.dtype = np.dtype(dtype)
         kmeans.cluster_centers_ = cudf_to_cuml_array(
             np.array(cluster_centers_).astype(dtype), order=array_order
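
This hunk tracks cuML's rename of the fitted-feature-count attribute to the scikit-learn-style n_features_in_. If a single code path ever has to work against both cuML generations, one option is to populate both spellings; a hedged sketch (the helper name is hypothetical, and that older releases only read n_cols is inferred from this diff, not confirmed):

# illustrative shim, not part of this PR: set both attribute spellings so the
# reconstructed estimator works against old (n_cols) and new (n_features_in_) cuML
def set_feature_count(kmeans, n_cols: int) -> None:
    kmeans.n_features_in_ = n_cols  # cuML >= 24.10, per this diff
    kmeans.n_cols = n_cols          # earlier cuML releases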