-
Notifications
You must be signed in to change notification settings - Fork 28.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-45760][SQL][FOLLOWUP] Inline With inside conditional branches #43978
Conversation
val newAlwaysEvaluatedInputs = c.alwaysEvaluatedInputs.map( | ||
rewriteWithExprAndInputPlans(_, inputPlans)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is dealing with common expressions only in always evaluated input e.g., predicate of If
.
How about common expressions shared between predicate and branches?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about it before. The problem is that it's hard to update the original ConditionalExpression
with the new shared common expressions
. alwaysEvaluatedInputs
is static so that I can let every ConditionalExpression
to implement a method to update it.
e match { | ||
case w: With => | ||
// Rewrite nested With expression in CommonExpressionDef first. | ||
val defs = w.defs.map(rewriteWithExprAndInputPlans(_, inputPlans)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we have "manual" recursion (instead of transformExpressionsUp()
), shall we deal with nested With
s in w.child
too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, the current logic seems to behave correctly if there is an inner With
in an outer With
's child
and the inner has a definition with a reference to an outer definition . (The previous transformExpressionsUp()
had issues in that case.) But the rule is not idempotent now, so maybe we should recurse into w.child
after replacing CommonExpressionRef
s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good catch! It seems doesn't matter when to recurse into w.child
, either before replacing CommonExpressionRef
or after is fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe before is better, as the expression tree may be much larger after replacing CommonExpressionRef
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. E.g. if we have With(With(x + x, Seq(x = y + y)), Seq(y = a + 1))
where x
and y
are references and a
is an attribute and we would recurse into With(x + x, Seq(x = y + y))
before replacing the y
references to actual attributes, that aliases a + 1
, then the childProjectionIndex
calculation for y + y
won't find the right child, will it? But an UT covering this case would be good. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh correlated nested With
! I'm not sure if we want to support it or not... But at least we should fail if we don't want to support it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we may need a test for that (either supported or failed if not).
case With(child, defs) => | ||
// For With in the conditional branches, they may not be evaluated at all and we can't | ||
// pull the common expressions into a project which will always be evaluated. Inline it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, for specific conditional expression, e.g., If
, we can still extract common expression shared on both branches which will be evaluated for sure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as https://github.com/apache/spark/pull/43978/files#r1403392772 .
It's easy to find these common expressions shared on both branches, but it's hard to put them back to If
. I think it's better to do it when we make it into a general rule that find shared common expressions and create With
to deduplicate.
Trying to follow along with this since duplicate expression evaluations have been a huge pain for us for a while (mostly on the execution side, but having incomprehensible explain strings isn't fun either). Am I understanding this correctly that this effectively negates the |
|
So is this just saying not to recurse into conditional expressions to find |
@Kimahriman Yes. |
The doc issue is unrelated
I'm merging it to master, thanks for review! |
What changes were proposed in this pull request?
This is a followup of #43623 to fix a regression. For
With
inside conditional branches, they may not be evaluated at all and we should not pull out the common expressions into aProject
, but just inline.Why are the changes needed?
avoid perf regression
Does this PR introduce any user-facing change?
No
How was this patch tested?
new test
Was this patch authored or co-authored using generative AI tooling?
No