Add dynamic Spark configuration for Databricks #2116

Merged · NvTimLiu merged 6 commits into NVIDIA:branch-0.5 on Apr 15, 2021

Conversation

@NvTimLiu (Collaborator) commented on Apr 10, 2021

We need a way to set Spark confs dynamically for Databricks. For example, when we test cuDF Sonatype release jars, we need to disable the cudf-rapids version match by adding
"--conf spark.rapids.cudfVersionOverride=true", or enable/disable AQE, or anything else.

By adding the parameter -f "spark.foo=1,spark.bar=2,..." to the script 'run-tests.py',
we can dynamically add whatever confs we need for the Databricks cluster.
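
For example, a hypothetical invocation (the other required arguments of run-tests.py are omitted, and the conf values are only illustrative):

    python run-tests.py ... -f "spark.rapids.cudfVersionOverride=true,spark.sql.adaptive.enabled=true"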

Signed-off-by: Tim Liu <timl@nvidia.com>

@NvTimLiu added the 'build' label (Related to CI / CD or cleanly building) on Apr 10, 2021
@NvTimLiu self-assigned this on Apr 10, 2021
@@ -35,7 +35,7 @@ def main():
print("rsync command: %s" % rsync_command)
subprocess.check_call(rsync_command, shell = True)

ssh_command = "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null ubuntu@%s -p 2200 -i %s %s %s 2>&1 | tee testout; if [ `echo ${PIPESTATUS[0]}` -ne 0 ]; then false; else true; fi" % (master_addr, params.private_key_file, params.script_dest, params.jar_path)
ssh_command = "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null ubuntu@%s -p 2200 -i %s %s %s '%s' 2>&1 | tee testout; if [ `echo ${PIPESTATUS[0]}` -ne 0 ]; then false; else true; fi" % (master_addr, params.private_key_file, params.script_dest, params.jar_path, params.spark_conf)
NvTimLiu (author) commented:

Here we add the single quotes around '%s' because the conf string --conf spark.xxx.xxx=xxx --conf spark.xxx.yyy=zzz ... contains spaces and must be passed as a single parameter.

Otherwise the config string would be split on the spaces and parsed as several parameters.
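
A minimal sketch of the difference (the host, script name, and conf values below are hypothetical):

# without quotes, the conf string is split on spaces into several remote arguments
ssh ubuntu@master ./run-tests.sh rapids.jar --conf spark.foo=1 --conf spark.bar=2
# with single quotes, the whole conf string arrives as one argument
ssh ubuntu@master ./run-tests.sh rapids.jar '--conf spark.foo=1 --conf spark.bar=2'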

@@ -38,21 +39,32 @@ CUDF_UDF_TEST_ARGS="--conf spark.python.daemon.module=rapids.daemon_databricks \
--conf spark.rapids.python.memory.gpu.allocFraction=0.1 \
--conf spark.rapids.python.concurrentPythonWorkers=2"

## '--conf spark.xxx.xxx=xxx' to 'export PYSP_TEST_spark_xxx_xxx=xxx'
if [ -n "$SPARK_CONF" ]; then
NvTimLiu (author) commented:

We need to translate --conf spark.xxx=xxx into export PYSP_TEST_spark_xxx=xxx because we call the script run_pyspark_from_build.sh to run the integration tests in parallel.

run_pyspark_from_build.sh requires the Spark confs to be exported as PYSP_TEST_spark_xxx=xxx in order to run the IT suites with parallelism (see the sketch after this diff).

TEST_TYPE="nightly"
if [ -d "$LOCAL_JAR_PATH" ]; then
## Run tests with jars in the LOCAL_JAR_PATH dir downloaded from the dependency repo
LOCAL_JAR_PATH=$LOCAL_JAR_PATH bash $LOCAL_JAR_PATH/integration_tests/run_pyspark_from_build.sh --runtime_env="databricks" --test_type=$TEST_TYPE

## Run cudf-udf tests
CUDF_UDF_TEST_ARGS="$CUDF_UDF_TEST_ARGS --conf spark.executorEnv.PYTHONPATH=`ls $LOCAL_JAR_PATH/rapids-4-spark_*.jar | grep -v 'tests.jar'`"
-LOCAL_JAR_PATH=$LOCAL_JAR_PATH SPARK_SUBMIT_FLAGS=$CUDF_UDF_TEST_ARGS TEST_PARALLEL=1 \
+LOCAL_JAR_PATH=$LOCAL_JAR_PATH SPARK_SUBMIT_FLAGS="$SPARK_CONF $CUDF_UDF_TEST_ARGS" TEST_PARALLEL=1 \
NvTimLiu (author) commented:

Append the dynamic Spark confs to SPARK_SUBMIT_FLAGS here.
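
A rough sketch of the '--conf spark.xxx=xxx' to 'export PYSP_TEST_spark_xxx=xxx' translation described above (illustrative only; not necessarily the exact code in test.sh):

# Turn "--conf spark.foo=1 --conf spark.bar=2" into
# "export PYSP_TEST_spark_foo=1" and "export PYSP_TEST_spark_bar=2"
if [ -n "$SPARK_CONF" ]; then
    for conf in ${SPARK_CONF//--conf/}; do
        key=${conf%%=*}
        value=${conf#*=}
        # dots in the conf key become underscores in the env var name
        export "PYSP_TEST_${key//./_}"="$value"
    done
fi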

@NvTimLiu commented:

build

@NvTimLiu requested a review from pxLi on April 11, 2021
@jlowe changed the title from "Add daynamic spark confs for the Databricks" to "Add dynamic Spark configuration for Databricks" on Apr 12, 2021
@@ -26,19 +26,21 @@
clusterid = ''
build_profiles = 'databricks,!snapshot-shims'
jar_path = ''
# `spark_conf` can take multiple comma-separated Spark configurations, e.g., 'spark.foo=1,spark.bar=2,...'
NvTimLiu (author) commented:

Added a comment to make the '-f' option's string format clear.

@NvTimLiu commented:

build

@@ -38,21 +39,35 @@ CUDF_UDF_TEST_ARGS="--conf spark.python.daemon.module=rapids.daemon_databricks \
--conf spark.rapids.python.memory.gpu.allocFraction=0.1 \
--conf spark.rapids.python.concurrentPythonWorkers=2"

## 'spark.foo=1,spar.bar=2,...' to 'export PYSP_TEST_spark_foo=1 export PYSP_TEST_spark_bar=2'
A collaborator commented:

nit s/spar.bar/spark.bar/

NvTimLiu (author) replied:

fixed typo

TEST_TYPE="nightly"
if [ -d "$LOCAL_JAR_PATH" ]; then
## Run tests with jars in the LOCAL_JAR_PATH dir downloaded from the dependency repo
LOCAL_JAR_PATH=$LOCAL_JAR_PATH bash $LOCAL_JAR_PATH/integration_tests/run_pyspark_from_build.sh --runtime_env="databricks" --test_type=$TEST_TYPE

## Run cudf-udf tests
CUDF_UDF_TEST_ARGS="$CUDF_UDF_TEST_ARGS --conf spark.executorEnv.PYTHONPATH=`ls $LOCAL_JAR_PATH/rapids-4-spark_*.jar | grep -v 'tests.jar'`"
-LOCAL_JAR_PATH=$LOCAL_JAR_PATH SPARK_SUBMIT_FLAGS=$CUDF_UDF_TEST_ARGS TEST_PARALLEL=1 \
+LOCAL_JAR_PATH=$LOCAL_JAR_PATH SPARK_SUBMIT_FLAGS="$SPARK_ARGS $CUDF_UDF_TEST_ARGS" TEST_PARALLEL=1 \
A collaborator commented:

What is SPARK_ARGS? I'm not seeing it.

NvTimLiu (author) replied:

Sorry, it should be SPARK_CONF; I forgot to change it back.
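
For reference, a rough sketch of how the comma-separated SPARK_CONF could be turned into PYSP_TEST_ exports and then re-expanded into '--conf' flags for spark-submit (illustrative only; the exact steps may differ from the real test.sh):

# Turn "spark.foo=1,spark.bar=2" into PYSP_TEST_spark_foo=1 and PYSP_TEST_spark_bar=2
if [ -n "$SPARK_CONF" ]; then
    IFS=',' read -ra confs <<< "$SPARK_CONF"
    for conf in "${confs[@]}"; do
        key=${conf%%=*}
        value=${conf#*=}
        export "PYSP_TEST_${key//./_}"="$value"
    done
    # the cudf-udf run still goes through spark-submit, so also rebuild a '--conf' string
    SPARK_CONF="--conf ${SPARK_CONF//,/ --conf }"
fi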

@NvTimLiu commented:

build

@NvTimLiu commented:

As Blossom is having some network issues, I'll trigger the pre-merge build after the Blossom team fixes them.

@NvTimLiu commented:

build

@NvTimLiu commented:

Both the nightly build and nightly test pipelines PASS.

@NvTimLiu merged commit 0509509 into NVIDIA:branch-0.5 on Apr 15, 2021
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
* Add daynamic spark confs for the Databricks

We need a way to set spark confs dynamically for the Databricks,
e.g., when we test cuDF sonatype release jars, we need to disable cudf-rapids version match by adding
"--conf spark.rapids.cudfVersionOverride=true", or enable/disable AQE, or anything else.

By adding the parameter spark_conf="--conf spark.xxx.xxx=xxx --conf ......" for the script 'run-tests.py',
we can dynamically add whatever confs for the Databricks cluster.

Signed-off-by: Tim Liu <timl@nvidia.com>

* Comma separated list of spark configurations

Signed-off-by: Tim Liu <timl@nvidia.com>

* Add a comment to make the '-f' format clear

* Add a comment to make the '-f' format clear

* Fix typo

* Add '--conf' if the SPARK_CONF is not empty