Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-41382][CONNECT][PYTHON] Implement product function #38915

Closed

Conversation

zhengruifeng
Copy link
Contributor

What changes were proposed in this pull request?

Implement product function

Why are the changes needed?

for API coverage

Does this PR introduce any user-facing change?

new API

How was this patch tested?

added test

@@ -539,6 +539,15 @@ class SparkConnectPlanner(session: SparkSession) {
* @return
*/
private def transformScalarFunction(fun: proto.Expression.UnresolvedFunction): Expression = {
if (fun.getFunctionName == "product") {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a discussion with @cloud-fan , we don't want to expose all expressions in SQL syntax, and product is such an example.

I think we will also deal with dedicated expressions for Pandas-on-Spark in this way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking we wanted to add product to FunctionRegistry as a follow-up for #38905. I didn't see how we've done this in this PR. Would you help me understand?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for not allowing this in SQL? And defining it in the function registry should not necessarily mean it can be used from SQL. It might be easier to disallow usage in SQL in the function registry then dealing with it here. I think the function registry is the right place for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@cloud-fan cloud-fan Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have a mechanism to disallow functions in FunctionRegistry. We can build such a mechanism but it's simpler to define the list of "hidden functions" here.

There is no specific reason why product can't be a SQL function. We just need a separate PR to propose it and justify it (is it in the SQL standard? is it commonly supported in other databases?)

@@ -539,6 +539,15 @@ class SparkConnectPlanner(session: SparkSession) {
* @return
*/
private def transformScalarFunction(fun: proto.Expression.UnresolvedFunction): Expression = {
if (fun.getFunctionName == "product") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern on such way is it looks too specialized. Not sure how many such case we will have, but at least can we pull this if and if to a validation function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, I think we will have more similar cases, let me add a new function for it.

fun.getFunctionName match {
case "product" =>
if (fun.getArgumentsCount != 1) {
throw InvalidPlanInput("Product requires single child expression")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this throw an Analysis exception instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the exception type here

@HyukjinKwon
Copy link
Member

@zhengruifeng mind fixing the conflicts? Otherwise should be good to go.

@zhengruifeng
Copy link
Contributor Author

merged into master, thank you for reviews

@zhengruifeng zhengruifeng deleted the connect_function_product branch December 7, 2022 03:43
beliefer pushed a commit to beliefer/spark that referenced this pull request Dec 18, 2022
### What changes were proposed in this pull request?
Implement `product` function

### Why are the changes needed?
for API coverage

### Does this PR introduce _any_ user-facing change?
new API

### How was this patch tested?
added test

Closes apache#38915 from zhengruifeng/connect_function_product.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants