-
Notifications
You must be signed in to change notification settings - Fork 28.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-41382][CONNECT][PYTHON] Implement product
function
#38915
Conversation
@@ -539,6 +539,15 @@ class SparkConnectPlanner(session: SparkSession) { | |||
* @return | |||
*/ | |||
private def transformScalarFunction(fun: proto.Expression.UnresolvedFunction): Expression = { | |||
if (fun.getFunctionName == "product") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had a discussion with @cloud-fan , we don't want to expose all expressions in SQL syntax, and product
is such an example.
I think we will also deal with dedicated expressions for Pandas-on-Spark in this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking we wanted to add product
to FunctionRegistry
as a follow-up for #38905. I didn't see how we've done this in this PR. Would you help me understand?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason for not allowing this in SQL? And defining it in the function registry should not necessarily mean it can be used from SQL. It might be easier to disallow usage in SQL in the function registry then dealing with it here. I think the function registry is the right place for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @cloud-fan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have a mechanism to disallow functions in FunctionRegistry
. We can build such a mechanism but it's simpler to define the list of "hidden functions" here.
There is no specific reason why product
can't be a SQL function. We just need a separate PR to propose it and justify it (is it in the SQL standard? is it commonly supported in other databases?)
@@ -539,6 +539,15 @@ class SparkConnectPlanner(session: SparkSession) { | |||
* @return | |||
*/ | |||
private def transformScalarFunction(fun: proto.Expression.UnresolvedFunction): Expression = { | |||
if (fun.getFunctionName == "product") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern on such way is it looks too specialized. Not sure how many such case we will have, but at least can we pull this if and if
to a validation function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, I think we will have more similar cases, let me add a new function for it.
77716df
to
2b4b3fa
Compare
fec47cd
to
71855d2
Compare
fun.getFunctionName match { | ||
case "product" => | ||
if (fun.getArgumentsCount != 1) { | ||
throw InvalidPlanInput("Product requires single child expression") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this throw an Analysis exception instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about the exception type here
71855d2
to
8cbe398
Compare
@zhengruifeng mind fixing the conflicts? Otherwise should be good to go. |
8cbe398
to
6f06863
Compare
merged into master, thank you for reviews |
### What changes were proposed in this pull request? Implement `product` function ### Why are the changes needed? for API coverage ### Does this PR introduce _any_ user-facing change? new API ### How was this patch tested? added test Closes apache#38915 from zhengruifeng/connect_function_product. Authored-by: Ruifeng Zheng <ruifengz@apache.org> Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
What changes were proposed in this pull request?
Implement
product
functionWhy are the changes needed?
for API coverage
Does this PR introduce any user-facing change?
new API
How was this patch tested?
added test