Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for StringConcatFactory.makeConcatWithConstants (#9555) #10057

Merged
merged 2 commits into from
Dec 20, 2023

Conversation

seanprime7
Copy link
Contributor

This PR adds support for StringConcatFactory.makeConcatWithConstants.
4 of the existing unit tests started failing with JDK11 as the bytecode generation target due to the introduction of StringConcatFactory.makeConcatWithConstants in JDK9.
See #9555 for more details about the failures.

It resolves #9555

@seanprime7
Copy link
Contributor Author

@abellina

Copy link
Collaborator

@abellina abellina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! It needs copyright updated, but the catalyst it produces matches what we did in JDK8/2.12.

@abellina
Copy link
Collaborator

abellina commented Dec 15, 2023

One thing separate from this PR we might consider is to try and simplify the concat catalyst expressions. With JDK8/scala 2.12 and with JDK11/scala 2.13 I see the same thing:

== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- Project [x#2856, y#2857, z#2858, if (z#2858) concat(, , x#2856,  , y#2857, @@@,  true) else concat(, , y#2857,  , x#2856, !!!,  false) AS new#2862]
   +- Exchange SinglePartition, REPARTITION_BY_NUM, [id=#883]
      +- LocalTableScan [x#2856, y#2857, z#2858]

I think the concats cold be slightly simpler to remove that first empty string:

concat(x#2856, ' ', y#2857, '@@@',  ' true')

The code we are generating does the right thing, I am just saying it could be improved. @seanprime7 would this be an easy change? And it's ok for us to file an issue and do as a follow on, especially since it was already there before.

Signed-off-by: Sean Lee <selee@nvidia.com>
@seanprime7
Copy link
Contributor Author

One thing separate from this PR we might consider is to try and simplify the concat catalyst expressions. With JDK8/scala 2.12 and with JDK11/scala 2.13 I see the same thing:

== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- Project [x#2856, y#2857, z#2858, if (z#2858) concat(, , x#2856,  , y#2857, @@@,  true) else concat(, , y#2857,  , x#2856, !!!,  false) AS new#2862]
   +- Exchange SinglePartition, REPARTITION_BY_NUM, [id=#883]
      +- LocalTableScan [x#2856, y#2857, z#2858]

I think the concats cold be slightly simpler to remove that first empty string:

concat(x#2856, ' ', y#2857, '@@@',  ' true')

The code we are generating does the right thing, I am just saying it could be improved. @seanprime7 would this be an easy change? And it's ok for us to file an issue and do as a follow on, especially since it was already there before.

I've made a change not to generate Concat ops with empty strings in it. If you want to further simplify the expression, you can add more simplification rules to CatalystExpressionBuilder.scala.

abellina
abellina previously approved these changes Dec 15, 2023
Copy link
Collaborator

@abellina abellina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The optimization (for empty space concat) only went to scala 2.13, but that should be fine (the old code has been there for a long time).

@abellina
Copy link
Collaborator

build

@abellina
Copy link
Collaborator

@seanprime7 so I tried the optimization, but the catalyst plan looks the same?

== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- Project [x#2856, y#2857, z#2858, if (z#2858) concat(, x#2856,  , y#2857, @@@ true) else concat(, y#2857,  , x#2856, !!! false) AS new#2862]
   +- Exchange SinglePartition, REPARTITION_BY_NUM, [id=#883]
      +- LocalTableScan [x#2856, y#2857, z#2858]

I'd rather split that off to a different issue, and back-out 047b416, we can take a look at it later for both scala 2.12 and 2.13.

Thoughts?

@seanprime7
Copy link
Contributor Author

047b416 has been backed out

@abellina
Copy link
Collaborator

build

@sameerz sameerz added Scala 2.13 Issues related to Scala 2.13 binaries bug Something isn't working labels Dec 18, 2023
@abellina abellina merged commit 0ca4a2c into branch-24.02 Dec 20, 2023
73 of 75 checks passed
@abellina abellina deleted the selee/9555 branch December 20, 2023 21:34
@abellina
Copy link
Collaborator

thanks @seanprime7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Scala 2.13 Issues related to Scala 2.13 binaries
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Scala 2.13 build with JDK 11 or 17 fails OpcodeSuite tests
3 participants