[FEA] Support common subexpression elimination for expand operator #10249
Labels
performance
A performance related task/issue
task
Work required that improves the product but is not user facing
Is your feature request related to a problem? Please describe.
The expand operator may have multiple `Seq[Expression]` projections, as in spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuExpandExec.scala (line 91 in 5d08aec). A logical rule is meant to extract the expressions into a child project node, but in some cases it fails to do so. As a result, some computations are duplicated, because the projections form a forest of expression sequences rather than a single expression tree. For example, if the logical plan fails to optimize an expand whose projections contain `complex_expr`, then `complex_expr` is evaluated repeatedly.
Describe the solution you'd like
Introduce a new approach that extracts a projection from the expand operator, so that common subexpression elimination can be applied via tiered project evaluation.
With this optimization inside the expand node, we can handle at the physical level the cases where the expand expressions were not extracted.
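To make the idea concrete, here is a minimal, self-contained sketch of tiered evaluation over a forest of projections. Everything in it is hypothetical and for illustration only: the `Expr` model, `Ref`, `Plan`, and `TieredExpandSketch` are not Spark Catalyst or spark-rapids classes, and the `Lit(0)` placeholders stand in for whatever an expand projection emits for columns it does not select. It assumes expressions are deterministic, collects every non-leaf subexpression that occurs more than once across all projections, evaluates those once in an earlier tier (smallest first, so larger shared expressions can reuse smaller ones), and lets the projections refer to the cached results.

```scala
// Hypothetical mini expression model (NOT the Catalyst / spark-rapids API).
sealed trait Expr
case class Col(name: String) extends Expr
case class Lit(value: Long) extends Expr
case class Add(left: Expr, right: Expr) extends Expr
case class Mul(left: Expr, right: Expr) extends Expr
case class Ref(id: Int) extends Expr // result of a shared expression from an earlier tier

// shared: expressions evaluated once up front; projections: the expand projections
case class Plan(shared: Seq[Expr], projections: Seq[Seq[Expr]])

object TieredExpandSketch {
  type Row = Map[String, Long]

  private var evalCount = 0 // number of Add/Mul nodes actually computed

  private def eval(e: Expr, row: Row, tier: Map[Int, Long]): Long = e match {
    case Col(n)    => row(n)
    case Lit(v)    => v
    case Ref(i)    => tier(i)
    case Add(l, r) => evalCount += 1; eval(l, row, tier) + eval(r, row, tier)
    case Mul(l, r) => evalCount += 1; eval(l, row, tier) * eval(r, row, tier)
  }

  // All non-leaf subexpressions of e, including e itself.
  private def subtrees(e: Expr): Seq[Expr] = e match {
    case Add(l, r) => e +: (subtrees(l) ++ subtrees(r))
    case Mul(l, r) => e +: (subtrees(l) ++ subtrees(r))
    case _         => Seq.empty
  }

  private def nodeCount(e: Expr): Int = e match {
    case Add(l, r) => 1 + nodeCount(l) + nodeCount(r)
    case Mul(l, r) => 1 + nodeCount(l) + nodeCount(r)
    case _         => 1
  }

  // Rewrite only the children of e, keeping its root operator.
  private def rewriteChildren(e: Expr, ids: Map[Expr, Int]): Expr = e match {
    case Add(l, r) => Add(rewrite(l, ids), rewrite(r, ids))
    case Mul(l, r) => Mul(rewrite(l, ids), rewrite(r, ids))
    case leaf      => leaf
  }

  // Replace any known shared subexpression with a Ref to its cached result.
  private def rewrite(e: Expr, ids: Map[Expr, Int]): Expr =
    ids.get(e).map(id => Ref(id)).getOrElse(rewriteChildren(e, ids))

  // Baseline: no shared tier, every projection evaluates its own expressions.
  def naive(projections: Seq[Seq[Expr]]): Plan = Plan(Seq.empty, projections)

  // Tiered plan: subexpressions seen more than once across the whole forest
  // move to the shared tier, ordered by size so a Ref always points backwards.
  def tiered(projections: Seq[Seq[Expr]]): Plan = {
    val common = projections.flatten.flatMap(subtrees)
      .groupBy(identity)
      .collect { case (e, occurrences) if occurrences.size > 1 => e }
      .toSeq
      .sortBy(e => nodeCount(e))
    val ids = common.zipWithIndex.toMap
    Plan(common.map(e => rewriteChildren(e, ids)),
         projections.map(_.map(e => rewrite(e, ids))))
  }

  // Returns the expanded output rows and how many Add/Mul nodes were computed.
  def run(p: Plan, row: Row): (Seq[Seq[Long]], Int) = {
    evalCount = 0
    val tier = p.shared.zipWithIndex.foldLeft(Map.empty[Int, Long]) {
      case (acc, (e, id)) => acc + (id -> eval(e, row, acc))
    }
    val out = p.projections.map(_.map(e => eval(e, row, tier)))
    (out, evalCount)
  }

  // Sample input: two projections that both contain the same complex expression.
  val complexExpr: Expr = Mul(Add(Col("a"), Col("b")), Col("c"))
  val sampleProjections: Seq[Seq[Expr]] = Seq(
    Seq(complexExpr, Col("k"), Lit(0)),
    Seq(complexExpr, Lit(0), Lit(1))
  )
  val sampleRow: Row = Map("a" -> 2L, "b" -> 3L, "c" -> 4L, "k" -> 7L)

  def main(args: Array[String]): Unit = {
    println(run(naive(sampleProjections), sampleRow))
    println(run(tiered(sampleProjections), sampleRow))
  }
}
```

On the sample row both plans produce the same two output rows, while the number of computed `Add`/`Mul` nodes drops from 4 to 2, because `complexExpr` is evaluated once in the shared tier instead of once per projection. A real implementation would also have to consider non-deterministic expressions and the memory cost of the extra intermediate columns, which this sketch ignores.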
Describe alternatives you've considered
Apply case-by-case fixups when column pruning fails to happen in the expand node.