[SPARK-45159][PYTHON] Handle named arguments only when necessary
### What changes were proposed in this pull request?

Handles named arguments only when necessary.

### Why are the changes needed?

Constructing `kwargs` as a `dict` can be expensive, so it should be done only when necessary.
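
The idea can be sketched in plain Python. This is only an illustration, not the code changed by this commit: `make_caller`, `arg_offsets`, and `kwarg_offsets` are hypothetical names. The sketch assumes each input row is a sequence and that arguments are selected by index.

```python
from typing import Any, Callable, Dict, Sequence


def make_caller(
    func: Callable[..., Any],
    arg_offsets: Sequence[int],
    kwarg_offsets: Dict[str, int],
) -> Callable[[Sequence[Any]], Any]:
    """Return a callable that applies `func` to values picked from a row.

    Hypothetical helper: the branch is chosen once, up front, so the
    positional-only path never builds a kwargs dict per row.
    """
    if kwarg_offsets:
        # Named arguments are present: a kwargs dict is built on every call.
        def call(row: Sequence[Any]) -> Any:
            args = [row[i] for i in arg_offsets]
            kwargs = {name: row[i] for name, i in kwarg_offsets.items()}
            return func(*args, **kwargs)

    else:
        # Positional arguments only: no dict is constructed at all.
        def call(row: Sequence[Any]) -> Any:
            return func(*[row[i] for i in arg_offsets])

    return call


def add(a: int, b: int) -> int:
    return a + b


positional_only = make_caller(add, [0, 1], {})
with_named = make_caller(add, [0], {"b": 1})
```

Both `positional_only((1, 10))` and `with_named((1, 10))` call `add(1, 10)`, but only the second pays for building a dictionary on each row.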

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #42915 from ueshin/issues/SPARK-45159/kwargs.

Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
ueshin authored and HyukjinKwon committed Sep 15, 2023
1 parent 25c624f commit 822f58f
Showing 2 changed files with 137 additions and 86 deletions.
24 changes: 24 additions & 0 deletions python/pyspark/sql/tests/connect/test_parity_arrow_python_udf.py
@@ -17,6 +17,8 @@

import unittest

from pyspark.errors import AnalysisException, PythonException
from pyspark.sql.functions import udf
from pyspark.sql.tests.connect.test_parity_udf import UDFParityTests
from pyspark.sql.tests.test_arrow_python_udf import PythonUDFArrowTestsMixin

@@ -34,6 +36,28 @@ def tearDownClass(cls):
        finally:
            super(ArrowPythonUDFParityTests, cls).tearDownClass()

    def test_named_arguments_negative(self):
        @udf("int")
        def test_udf(a, b):
            return a + b

        self.spark.udf.register("test_udf", test_udf)

        with self.assertRaisesRegex(
            AnalysisException,
            "DUPLICATE_ROUTINE_PARAMETER_ASSIGNMENT.DOUBLE_NAMED_ARGUMENT_REFERENCE",
        ):
            self.spark.sql("SELECT test_udf(a => id, a => id * 10) FROM range(2)").show()

        with self.assertRaisesRegex(AnalysisException, "UNEXPECTED_POSITIONAL_ARGUMENT"):
            self.spark.sql("SELECT test_udf(a => id, id * 10) FROM range(2)").show()

        with self.assertRaises(PythonException):
            self.spark.sql("SELECT test_udf(c => 'x') FROM range(2)").show()

        with self.assertRaises(PythonException):
            self.spark.sql("SELECT test_udf(id, a => id * 10) FROM range(2)").show()


if __name__ == "__main__":
    import unittest