Dependency info (#1095)
* Get cudf/spark dependency from the correct .m2 dir (#1062)

* Get cudf/spark dependency from the correct .m2 dir

The 'WORKSPACE' & 'M2DIR' variables are needed for the shims to generate the correct cudf/spark dependency info.

The error below in 'spark*-info.properties' is caused by 'WORKSPACE' & 'M2DIR' being unset:
    build/dependency-info.sh: line 30: /jenkins/printJarVersion.sh: No such file or directory
    build/dependency-info.sh: line 33: /jenkins/printJarVersion.sh: No such file or directory

To fix the error, we set default values for them in 'build/dependency-info.sh':
    'M2DIR=$HOME/.m2/repository'
    'WORKSPACE=../..'

We also need to explicitly set the correct 'M2DIR' path in case it was changed via '-Dmaven.repo.local=$M2DIR'.
The Jenkins scripts have already been updated to set the correct 'M2DIR'.
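The defaulting described above relies on standard bash parameter expansion and BASH_SOURCE; a minimal sketch of the pattern (the echoed output is only for illustration):

```shell
#!/usr/bin/env bash
# Sketch of the defaulting pattern: ':-' keeps a value that Jenkins
# already exported and falls back to the local-dev default otherwise.
M2DIR=${M2DIR:-"$HOME/.m2/repository"}

# Resolve the workspace relative to this script's own location,
# so the helper works no matter which directory invokes it.
MY_PATH="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
WORKSPACE=${WORKSPACE:-"$MY_PATH/.."}

echo "M2DIR=$M2DIR"
echo "WORKSPACE=$WORKSPACE"
```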

Signed-off-by: Tim Liu <timl@nvidia.com>

* Let 'mvn package' fail if the script 'build/dependency-info.sh' fails

* Stop mvn build if `build/build-info` fails

Signed-off-by: Tim Liu <timl@nvidia.com>

* Copyright 2020

Signed-off-by: Tim Liu <timl@nvidia.com>

* List the latest SNAPSHOT jar file in the local maven repo

Signed-off-by: Tim Liu <timl@nvidia.com>

* Get the path of 'dependency-info.sh', then set 'WORKSPACE' relative to it

Signed-off-by: Tim Liu <timl@nvidia.com>

* Only collect dependency info on Jenkins build

* Only collect timestamped dependency in Jenkins build

Collect timestamped snapshot dependency info only in Jenkins builds.
In dev builds, print the 'SNAPSHOT' tag without a timestamp, e.g. cudf-0.17-SNAPSHOT.jar
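The rule can be sketched as follows. JENKINS_URL is set automatically on Jenkins agents; the repo path and jar names here are illustrative:

```shell
#!/usr/bin/env bash
# Sketch: timestamped snapshot jar name only under Jenkins,
# plain -SNAPSHOT tag otherwise.
VERSION="0.17-SNAPSHOT"
REPO="$HOME/.m2/repository/ai/rapids/cudf/$VERSION"   # illustrative path

if [[ "$VERSION" == *"-SNAPSHOT" && -n "$JENKINS_URL" ]]; then
    # Jenkins build: pick the most recently modified timestamped jar,
    # e.g. cudf-0.17-20201118.123456-7.jar
    ls -t "$REPO"/cudf-[0-9]*.jar | head -1 | xargs basename
else
    # Dev build: no timestamp, just the -SNAPSHOT tag
    echo "cudf-${VERSION}.jar"
fi
```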

* Simplify the dependency logic
NvTimLiu authored Nov 18, 2020
1 parent a614dbf commit b992b39
Showing 13 changed files with 34 additions and 23 deletions.
3 changes: 2 additions & 1 deletion build/build-info
@@ -1,7 +1,7 @@
#!/usr/bin/env bash

#
# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
# Copyright (c) 2019-2020, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -19,6 +19,7 @@
# This script generates the build info.
# Arguments:
# rapids4spark_version - The current version of spark plugin
set -e

echo_build_properties() {
echo version=$1
9 changes: 7 additions & 2 deletions build/dependency-info.sh
@@ -23,14 +23,19 @@
# SPARK_VER - The version of spark

# Parse cudf and spark dependency versions
set -e

CUDF_VER=$1
CUDA_CLASSIFIER=$2
SERVER_ID=snapshots
${WORKSPACE}/jenkins/printJarVersion.sh "cudf_version" "${HOME}/.m2/repository/ai/rapids/cudf/${CUDF_VER}" "cudf-${CUDF_VER}" "-${CUDA_CLASSIFIER}.jar" $SERVER_ID
# set default values for 'M2DIR' & 'WORKSPACE' so that shims can get the correct cudf/spark dependency
M2DIR=${M2DIR:-"$HOME/.m2/repository"}
MY_PATH="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
WORKSPACE=${WORKSPACE:-"$MY_PATH/.."}
${WORKSPACE}/jenkins/printJarVersion.sh "cudf_version" "${M2DIR}/ai/rapids/cudf/${CUDF_VER}" "cudf-${CUDF_VER}" "-${CUDA_CLASSIFIER}.jar" $SERVER_ID

SPARK_VER=$3
SPARK_SQL_VER=`${WORKSPACE}/jenkins/printJarVersion.sh "spark_version" "${HOME}/.m2/repository/org/apache/spark/spark-sql_2.12/${SPARK_VER}" "spark-sql_2.12-${SPARK_VER}" ".jar" $SERVER_ID`
SPARK_SQL_VER=`${WORKSPACE}/jenkins/printJarVersion.sh "spark_version" "${M2DIR}/org/apache/spark/spark-sql_2.12/${SPARK_VER}" "spark-sql_2.12-${SPARK_VER}" ".jar" $SERVER_ID`

# Split spark version from spark-sql_2.12 jar filename
echo ${SPARK_SQL_VER/"-sql_2.12"/}
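The substitution on the last line uses bash's `${var/pattern/replacement}` expansion to drop the artifact infix; for example (with an illustrative version string):

```shell
#!/usr/bin/env bash
# Bash pattern substitution: remove the '-sql_2.12' infix so only the
# plain spark version string remains in the jar name.
SPARK_SQL_VER="spark_version=spark-sql_2.12-3.0.1.jar"
echo "${SPARK_SQL_VER/-sql_2.12/}"   # prints spark_version=spark-3.0.1.jar
```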
3 changes: 2 additions & 1 deletion jenkins/databricks/build.sh
@@ -49,7 +49,8 @@ RAPIDS_BUILT_JAR=rapids-4-spark_$SCALA_VERSION-$SPARK_PLUGIN_JAR_VERSION.jar

echo "Scala version is: $SCALA_VERSION"
mvn -B -P${BUILD_PROFILES} clean package -DskipTests || true
M2DIR=/home/ubuntu/.m2/repository
# export 'M2DIR' so that shims can get the correct cudf/spark dependency info
export M2DIR=/home/ubuntu/.m2/repository
CUDF_JAR=${M2DIR}/ai/rapids/cudf/${CUDF_VERSION}/cudf-${CUDF_VERSION}-${CUDA_VERSION}.jar

# pull normal Spark artifacts and ignore errors then install databricks jars, then build again
10 changes: 6 additions & 4 deletions jenkins/printJarVersion.sh
@@ -14,6 +14,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
#
set -e

function print_ver(){
TAG=$1
@@ -22,11 +23,12 @@ function print_ver(){
SUFFIX=$4
SERVER_ID=$5

if [[ "$VERSION" == *"-SNAPSHOT" ]]; then
# Collect snapshot dependency info only in Jenkins build
# In dev build, print 'SNAPSHOT' tag without time stamp, e.g.: cudf-0.17-SNAPSHOT.jar
if [[ "$VERSION" == *"-SNAPSHOT" && -n "$JENKINS_URL" ]]; then
PREFIX=${VERSION%-SNAPSHOT}
TIMESTAMP=`grep -oP '(?<=timestamp>)[^<]+' < $REPO/maven-metadata-$SERVER_ID.xml`
BUILD_NUM=`grep -oP '(?<=buildNumber>)[^<]+' < $REPO/maven-metadata-$SERVER_ID.xml`
echo $TAG=$PREFIX-$TIMESTAMP-$BUILD_NUM$SUFFIX
# List the latest SNAPSHOT jar file in the maven repo
echo $TAG=`ls -t $REPO/$PREFIX-[0-9]*$SUFFIX | head -1 | xargs basename`
else
echo $TAG=$VERSION$SUFFIX
fi
14 changes: 8 additions & 6 deletions jenkins/spark-nightly-build.sh
@@ -19,12 +19,14 @@ set -ex

. jenkins/version-def.sh

mvn -U -B -Pinclude-databricks,snapshot-shims clean deploy $MVN_URM_MIRROR -Dmaven.repo.local=$WORKSPACE/.m2
# export 'M2DIR' so that shims can get the correct cudf/spark dependency info
export M2DIR="$WORKSPACE/.m2"
mvn -U -B -Pinclude-databricks,snapshot-shims clean deploy $MVN_URM_MIRROR -Dmaven.repo.local=$M2DIR
# Run unit tests against other spark versions
mvn -U -B -Pspark301tests,snapshot-shims test $MVN_URM_MIRROR -Dmaven.repo.local=$WORKSPACE/.m2
mvn -U -B -Pspark302tests,snapshot-shims test $MVN_URM_MIRROR -Dmaven.repo.local=$WORKSPACE/.m2
mvn -U -B -Pspark310tests,snapshot-shims test $MVN_URM_MIRROR -Dmaven.repo.local=$WORKSPACE/.m2
mvn -U -B -Pspark301tests,snapshot-shims test $MVN_URM_MIRROR -Dmaven.repo.local=$M2DIR
mvn -U -B -Pspark302tests,snapshot-shims test $MVN_URM_MIRROR -Dmaven.repo.local=$M2DIR
mvn -U -B -Pspark310tests,snapshot-shims test $MVN_URM_MIRROR -Dmaven.repo.local=$M2DIR

# Parse cudf and spark files from local mvn repo
jenkins/printJarVersion.sh "CUDFVersion" "${WORKSPACE}/.m2/ai/rapids/cudf/${CUDF_VER}" "cudf-${CUDF_VER}" "-${CUDA_CLASSIFIER}.jar" $SERVER_ID
jenkins/printJarVersion.sh "SPARKVersion" "${WORKSPACE}/.m2/org/apache/spark/spark-core_2.12/${SPARK_VER}" "spark-core_2.12-${SPARK_VER}" ".jar" $SERVER_ID
jenkins/printJarVersion.sh "CUDFVersion" "$M2DIR/ai/rapids/cudf/${CUDF_VER}" "cudf-${CUDF_VER}" "-${CUDA_CLASSIFIER}.jar" $SERVER_ID
jenkins/printJarVersion.sh "SPARKVersion" "$M2DIR/org/apache/spark/spark-core_2.12/${SPARK_VER}" "spark-core_2.12-${SPARK_VER}" ".jar" $SERVER_ID
4 changes: 2 additions & 2 deletions pom.xml
@@ -536,8 +536,8 @@
<target>
<mkdir dir="${project.build.directory}/extra-resources"/>
<mkdir dir="${project.build.directory}/tmp"/>
<exec executable="bash" output="${project.build.directory}/extra-resources/rapids4spark-version-info.properties">
<arg value="${project.basedir}/../build/build-info"/>
<exec executable="bash" failonerror="true" output="${project.build.directory}/extra-resources/rapids4spark-version-info.properties">
<arg value="${user.dir}/build/build-info"/>
<arg value="${project.version}"/>
<arg value="${cudf.version}"/>
</exec>
2 changes: 1 addition & 1 deletion shims/spark300/pom.xml
@@ -44,7 +44,7 @@
<configuration>
<target>
<mkdir dir="${project.build.directory}/extra-resources"/>
<exec executable="bash" output="${project.build.directory}/extra-resources/spark-${spark300.version}-info.properties">
<exec executable="bash" failonerror="true" output="${project.build.directory}/extra-resources/spark-${spark300.version}-info.properties">
<arg value="${user.dir}/build/dependency-info.sh"/>
<arg value="${cudf.version}"/>
<arg value="${cuda.version}"/>
2 changes: 1 addition & 1 deletion shims/spark300emr/pom.xml
@@ -44,7 +44,7 @@
<configuration>
<target>
<mkdir dir="${project.build.directory}/extra-resources"/>
<exec executable="bash" output="${project.build.directory}/extra-resources/spark-${spark300emr.version}-info.properties">
<exec executable="bash" failonerror="true" output="${project.build.directory}/extra-resources/spark-${spark300emr.version}-info.properties">
<arg value="${user.dir}/build/dependency-info.sh"/>
<arg value="${cudf.version}"/>
<arg value="${cuda.version}"/>
2 changes: 1 addition & 1 deletion shims/spark301/pom.xml
@@ -44,7 +44,7 @@
<configuration>
<target>
<mkdir dir="${project.build.directory}/extra-resources"/>
<exec executable="bash" output="${project.build.directory}/extra-resources/spark-${spark301.version}-info.properties">
<exec executable="bash" failonerror="true" output="${project.build.directory}/extra-resources/spark-${spark301.version}-info.properties">
<arg value="${user.dir}/build/dependency-info.sh"/>
<arg value="${cudf.version}"/>
<arg value="${cuda.version}"/>
2 changes: 1 addition & 1 deletion shims/spark301db/pom.xml
@@ -44,7 +44,7 @@
<configuration>
<target>
<mkdir dir="${project.build.directory}/extra-resources"/>
<exec executable="bash" output="${project.build.directory}/extra-resources/spark-${spark301db.version}-info.properties">
<exec executable="bash" failonerror="true" output="${project.build.directory}/extra-resources/spark-${spark301db.version}-info.properties">
<arg value="${user.dir}/build/dependency-info.sh"/>
<arg value="${cudf.version}"/>
<arg value="${cuda.version}"/>
2 changes: 1 addition & 1 deletion shims/spark301emr/pom.xml
@@ -44,7 +44,7 @@
<configuration>
<target>
<mkdir dir="${project.build.directory}/extra-resources"/>
<exec executable="bash" output="${project.build.directory}/extra-resources/spark-${spark301emr.version}-info.properties">
<exec executable="bash" failonerror="true" output="${project.build.directory}/extra-resources/spark-${spark301emr.version}-info.properties">
<arg value="${user.dir}/build/dependency-info.sh"/>
<arg value="${cudf.version}"/>
<arg value="${cuda.version}"/>
2 changes: 1 addition & 1 deletion shims/spark302/pom.xml
@@ -44,7 +44,7 @@
<configuration>
<target>
<mkdir dir="${project.build.directory}/extra-resources"/>
<exec executable="bash" output="${project.build.directory}/extra-resources/spark-${spark302.version}-info.properties">
<exec executable="bash" failonerror="true" output="${project.build.directory}/extra-resources/spark-${spark302.version}-info.properties">
<arg value="${user.dir}/build/dependency-info.sh"/>
<arg value="${cudf.version}"/>
<arg value="${cuda.version}"/>
2 changes: 1 addition & 1 deletion shims/spark310/pom.xml
@@ -44,7 +44,7 @@
<configuration>
<target>
<mkdir dir="${project.build.directory}/extra-resources"/>
<exec executable="bash" output="${project.build.directory}/extra-resources/spark-${spark310.version}-info.properties">
<exec executable="bash" failonerror="true" output="${project.build.directory}/extra-resources/spark-${spark310.version}-info.properties">
<arg value="${user.dir}/build/dependency-info.sh"/>
<arg value="${cudf.version}"/>
<arg value="${cuda.version}"/>
