[FEA] Add new Expression execution APIs #7259
Labels
feature request
New feature or request
reliability
Features to improve reliability or bugs that severly impact the reliability of the plugin
Is your feature request related to a problem? Please describe.
In parallel with or after #7258 we should add in some new expression execution APIs to let us execute Expressions without going over memory budgets.
There are several different classes of expressions that we need to worry about.
There are also several expressions that are really only expressions in name, because they cannot be executed by a project. These include aggregations, window functions, and explode functions. We don't need to worry about them in this context.
The idea here is to take a few up front expressions, and a few predictable expressions and write some APIs along with implementations for them that would allow an execution framework to avoid running past a memory budget. The hard part is that some expressions, like GpuCast, can be different things in different situations so we need to make sure that the APIs are somewhat dynamic and can be selected appropriately at runtime.
The idea is for up front expressions to provide an API that would return the output size and intermediate memory used based off of an input number of rows. predictable expressions would need a new execution API where the inputs to them are passed in as parameters along with a budget, instead of passing in a ColumnarBatch and letting the expression pull the answer from their children. If the expression can complete within the allotted budget, then it would just return the answer. If it could not it would return an error, not throw an error. That would let the framework decide to take action on splitting up the inputs.
This is likely going to need to be done going back and forth with the framework to execute the code. But for now I am splitting it up into a separate issue.
The text was updated successfully, but these errors were encountered: