
Run '--packages' only with default cuda11 jar #10279

Merged 2 commits into NVIDIA:branch-24.02 on Jan 30, 2024

Conversation

@NvTimLiu (Collaborator) commented Jan 26, 2024

Since '--packages' only works with the default cuda11 jar and does not support the classifier parameter (see https://issues.apache.org/jira/browse/SPARK-20075), we cannot specify a classifier jar to run the plugin tests with '--packages'. The error log:

    Exception in thread "main" java.lang.IllegalArgumentException:
        requirement failed: Provided Maven Coordinates must be in the form 'groupId:artifactId:version'.
        The coordinate provided is: com.nvidia:rapids-4-spark_2.12:23.12.0:jar:cuda12
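For illustration, Spark's coordinate check accepts only the three-field `groupId:artifactId:version` form, so any classifier suffix such as `:jar:cuda12` is rejected. A minimal shell sketch of that validation (the helper name `is_valid_coordinate` and its field-counting logic are my own re-creation for illustration, not Spark's actual code):

```shell
#!/usr/bin/env sh
# Hypothetical re-creation of Spark's '--packages' coordinate check:
# only 'groupId:artifactId:version' (exactly two colons) is accepted,
# so a classifier suffix like ':jar:cuda12' fails the check.
is_valid_coordinate() {
  case "$1" in
    *:*:*:*) return 1 ;;   # four or more fields: classifier present, rejected
    *:*:*)   return 0 ;;   # exactly three fields: groupId:artifactId:version
    *)       return 1 ;;   # fewer fields: malformed
  esac
}

is_valid_coordinate "com.nvidia:rapids-4-spark_2.12:23.12.0" \
  && echo "default jar coordinate: OK"
is_valid_coordinate "com.nvidia:rapids-4-spark_2.12:23.12.0:jar:cuda12" \
  || echo "cuda12 classifier coordinate: rejected"
```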

Follow-up to #10238 (comment)

Signed-off-by: Tim Liu <timl@nvidia.com>

@NvTimLiu NvTimLiu added the test Only impacts tests label Jan 26, 2024
@NvTimLiu NvTimLiu self-assigned this Jan 26, 2024
@NvTimLiu (Collaborator, Author) commented:

build

Review thread on `run_pyspark_from_build.sh`:

    # As '--packages' only works on the default cuda11 jar, it does not support classifiers,
    # refer to issue: https://issues.apache.org/jira/browse/SPARK-20075
    if [[ "$CLASSIFIER" == "" || "$CLASSIFIER" == "cuda11" ]]; then
@gerashegalov (Collaborator) commented:

Can you explain in a comment why OR-ing with "$CLASSIFIER" == "cuda11" is necessary (if it is)? As you point out, we know that Spark does not support classifiers in Maven coordinates, and the classifier does not look used inside this if branch.

@NvTimLiu (Collaborator, Author) commented Jan 26, 2024:

@gerashegalov, good question.

The cuda11 jar is the same as the main jar, so we do not put cuda11 in the '--packages' parameter. However, CLASSIFIER == "cuda11" is the most common case for running integration tests (e.g. https://github.com/NVIDIA/spark-rapids/blob/branch-24.02/jenkins/version-def.sh#L30-L31). If we only filter with CLASSIFIER == "", we may skip the packages test in all CI jobs.
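A sketch of how the guard could carry the requested explanatory comment (the helper name `should_run_packages_test` is hypothetical; the real script inlines the `[[ ... ]]` check shown in the review thread above):

```shell
#!/usr/bin/env bash
# Hypothetical helper wrapping the CLASSIFIER check from run_pyspark_from_build.sh.
# '--packages' can only resolve the default artifact, because Spark rejects
# classifiers in Maven coordinates (SPARK-20075). An empty CLASSIFIER covers
# local developer runs; "cuda11" covers CI jobs, where version-def.sh sets the
# classifier explicitly even though that artifact is identical to the default jar.
should_run_packages_test() {
  [[ "$1" == "" || "$1" == "cuda11" ]]
}

should_run_packages_test ""       && echo "dev run: packages test enabled"
should_run_packages_test "cuda11" && echo "CI run: packages test enabled"
should_run_packages_test "cuda12" || echo "cuda12: packages test skipped"
```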

@gerashegalov (Collaborator) commented:

I understand that the default artifact without a classifier is the same as the cuda11-classified artifact.

IIUC, "$CLASSIFIER" == "" is for the case when the script is run by a developer, and "$CLASSIFIER" == "cuda11" is for cases when it is run on CI. If so, please add a comment about this. It is strange to see a check for the cuda11 classifier right next to a comment saying classifiers are not supported.

@NvTimLiu (Collaborator, Author) commented:

Good suggestion, let me add a comment for it.

@NvTimLiu (Collaborator, Author) commented:

Fixed.

@NvTimLiu (Collaborator, Author) commented:

build

@NvTimLiu (Collaborator, Author) commented:

@gerashegalov @jlowe can you help review? Thanks!

@gerashegalov (Collaborator) left a review:

LGTM

@NvTimLiu NvTimLiu merged commit e375368 into NVIDIA:branch-24.02 Jan 30, 2024
40 checks passed
Review thread on the diff:

    PYSP_TEST_spark_jars_packages=com.nvidia:rapids-4-spark_${SCALA_BINARY_VER}:${PROJECT_VER} \
    PYSP_TEST_spark_jars_repositories=${PROJECT_REPO} \
    ./run_pyspark_from_build.sh
    if
@NvTimLiu (Collaborator, Author) commented:

Typo: the trailing 'if' should be 'fi'.
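For context on the snippet above, the `PYSP_TEST_spark_*` environment variables are a naming convention for passing Spark configuration into the test launcher; a hedged sketch of how such a name could map to a `spark.*` conf key (this mapping function is my own assumption about the mechanism, not the script's actual code):

```shell
#!/usr/bin/env bash
# Hypothetical sketch: derive a Spark conf key from a PYSP_TEST_-prefixed
# environment variable name by stripping the prefix and replacing
# underscores with dots, e.g. PYSP_TEST_spark_jars_packages -> spark.jars.packages.
to_spark_conf_key() {
  local name="${1#PYSP_TEST_}"   # drop the PYSP_TEST_ prefix
  echo "${name//_/.}"            # turn remaining underscores into dots
}

to_spark_conf_key "PYSP_TEST_spark_jars_packages"
to_spark_conf_key "PYSP_TEST_spark_jars_repositories"
```

Note this simple rule could not express conf keys whose names themselves contain underscores; it is only meant to illustrate the convention used in the snippet.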
