Update udf-compiler descriptions in related docs
Signed-off-by: Allen Xu <allxu@nvidia.com>
wjxiz1992 committed Nov 13, 2020
1 parent 9afb81b commit 78e7fcc
Showing 2 changed files with 3 additions and 3 deletions.
docs/compatibility.md (2 additions, 2 deletions)
@@ -290,10 +290,10 @@ Casting from string to timestamp currently has the following limitations.
Only timezone 'Z' (UTC) is supported. Casting unsupported formats will result in null values.

## UDF to Catalyst Expressions
-To speedup the process of UDF, spark-rapids introduces a udf-compiler extension to translate UDFs to Catalyst expressions.
+To speed up UDF processing, spark-rapids introduces a udf-compiler extension that translates UDFs to Catalyst expressions. The compiler is injected into Spark's extensions automatically when `spark.plugins=com.nvidia.spark.SQLPlugin` is set, but it is disabled by default.

To enable this operation on the GPU, set
-[`spark.rapids.sql.udfCompiler.enabled`](configs.md#sql.udfCompiler.enabled) to `true`, and `spark.sql.extensions=com.nvidia.spark.udf.Plugin`.
+[`spark.rapids.sql.udfCompiler.enabled`](configs.md#sql.udfCompiler.enabled) to `true`.
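
For illustration, enabling the UDF compiler with these two settings could look like the following sketch, assuming the rapids-4-spark jar is already on the application classpath (the application name is illustrative):

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch: load the RAPIDS plugin and enable the UDF compiler at launch
// time. Assumes the rapids-4-spark jar is on the classpath; the app name is
// illustrative.
val spark = SparkSession.builder()
  .appName("udf-compiler-example")
  .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
  .config("spark.rapids.sql.udfCompiler.enabled", "true")
  .getOrCreate()
```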

However, Spark may produce different results for a compiled UDF than for the non-compiled one. For example, for a UDF that computes `x/y` where `y` happens to be `0`, the compiled Catalyst expression returns `NULL`, while the original UDF would fail the entire job with `java.lang.ArithmeticException: / by zero`.
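
A minimal sketch of that divide-by-zero case, reusing the `spark` session from the sketch above (the data and column names are illustrative):

```scala
import org.apache.spark.sql.functions.udf

// Illustrative data: the (1, 0) row triggers the x/y-with-y-equal-to-0 case.
import spark.implicits._

val divide = udf((x: Int, y: Int) => x / y)
val df = Seq((4, 2), (1, 0)).toDF("x", "y")

// With spark.rapids.sql.udfCompiler.enabled=true the UDF is translated to a
// Catalyst divide expression, so the (1, 0) row yields NULL. Without the
// compiler, the same row fails the job with
// java.lang.ArithmeticException: / by zero.
df.select(divide($"x", $"y").as("quotient")).show()
```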

udf-compiler/README.md (1 addition, 1 deletion)
@@ -14,6 +14,6 @@ How to run
----------

The UDF compiler is included in the rapids-4-spark jar that is produced by the `dist` maven project. Set up your cluster to run the RAPIDS Accelerator for Apache Spark
-and set the spark config `spark.sql.extensions` to include `com.nvidia.spark.udf.Plugin`.
+and this udf plugin will be injected into Spark's extensions automatically when `com.nvidia.spark.SQLPlugin` is set.

The plugin is still disabled by default and you will need to set `spark.rapids.sql.udfCompiler.enabled` to `true` to enable it.
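
As an illustration, enabling the compiler from a spark-shell session launched as described above (rapids-4-spark jar on the classpath, `spark.plugins=com.nvidia.spark.SQLPlugin` set) might look like the sketch below; it assumes `spark.rapids.sql.udfCompiler.enabled` can be set at runtime like other `spark.rapids.sql.*` configs, otherwise pass it at launch time instead:

```scala
import org.apache.spark.sql.functions.{col, udf}

// Sketch only: assumes a spark-shell session with the RAPIDS plugin loaded and
// assumes this config is runtime-settable; otherwise set it via --conf at launch.
spark.conf.set("spark.rapids.sql.udfCompiler.enabled", "true")

val plusOne = udf((i: Long) => i + 1)
val df = spark.range(10).toDF("i")

// If the compiler translated the UDF, the physical plan should show a plain
// arithmetic expression rather than a Scala UDF invocation.
df.select(plusOne(col("i"))).explain()
```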
