[FEA] Support GpuHashAggregateExec in LoRe #10942
Do you have any plans on how to support this? Project is fairly simple because it has a one-to-one relationship between input and output batches. Technically it could be one-to-many if we run out of memory and split an input batch to make it work. How would this work for hash aggregate, which can have a many-to-many relationship? Even triggering on how long a batch took to run feels problematic, because we might not see any slowness until after the first batch, which would then require us to keep all input batches around, or save them out, on the chance that they might be needed to reproduce the problem.
@revans2 Sorry for my late reply. Internally we had a discussion around this with @binmahone @res-life @liurenjie1024 @GaryShen2008. As we discussed offline, we will change the granularity from batch to task level, so it should work for stateful operators like aggregation. Also, regarding the dump timing, we're considering introducing two other modes: i. exact ID matching via task ID or split ID; ii. dumping the first few tasks. The latter can help the non-tailing case. For this part, let's explore whether we can be consistent with @jlowe's profiler tool. @liurenjie1024 will help on that later, and @binmahone is exploring options for the ID-matching approach.
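As a rough sketch of the two proposed trigger modes (in Java for illustration; every class and method name here is hypothetical and does not exist in spark-rapids):

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the two dump-trigger modes discussed above.
// Names are illustrative only, not the plugin's actual API.
interface DumpTrigger {
    boolean shouldDump(long taskId);
}

// Mode i: dump only when the task ID matches an explicit target set.
final class ExactIdTrigger implements DumpTrigger {
    private final Set<Long> targetTaskIds;

    ExactIdTrigger(Set<Long> targetTaskIds) {
        this.targetTaskIds = targetTaskIds;
    }

    @Override
    public boolean shouldDump(long taskId) {
        return targetTaskIds.contains(taskId);
    }
}

// Mode ii: dump the first N tasks seen, which helps the non-tailing case
// where the problem shows up early in the run.
final class FirstNTrigger implements DumpTrigger {
    private final AtomicLong seen = new AtomicLong();
    private final long n;

    FirstNTrigger(long n) {
        this.n = n;
    }

    @Override
    public boolean shouldDump(long taskId) {
        return seen.incrementAndGet() <= n;
    }
}
```

The design choice here is that a trigger decides per task, not per batch, matching the task-level granularity described above.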
hi @revans2 In the new LORE implementation we'll use two IDs to uniquely identify the lifespan of a specific operator in a specific task: a LORE ID, which identifies the operator node in the SQL plan, and a partition ID, which identifies the task.
Consider a case where we have skew in e.g. JoinExec: the skewed task will exhibit a consistent LORE ID + Partition ID across different runs of the same SQL (even in the same Spark session). With this design, users can dump only the data related to the problematic operator in a specific task, and we can replay that specific operator locally in a single thread. The LORE ID + Partition ID design can also be extended to enable self-contained profiling (#10870). Currently, #10870 can be enabled based on a time range/job range/stage range; however, job range and stage range should be considered unstable and may result in unexpected traces being dumped. With LORE ID + Partition ID we can express which traces we need more specifically and accurately (LORE ID + Partition ID can uniquely identify which task on which executor). This is how we see it; what do you think @revans2 @jlowe @GaryShen2008? @winningsix @liurenjie1024 please feel free to add your inputs.
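The identification scheme above amounts to a composite key. A minimal sketch in Java (all names here, including `LoreReplayKey` and `dumpPath`, are hypothetical and not the plugin's actual API):

```java
import java.util.Objects;

// Hypothetical composite key identifying one operator instance in one task.
final class LoreReplayKey {
    final int loreId;       // identifies the operator node in the SQL plan
    final int partitionId;  // identifies the task's partition

    LoreReplayKey(int loreId, int partitionId) {
        this.loreId = loreId;
        this.partitionId = partitionId;
    }

    // Illustrative dump location for the batches this operator consumed in
    // this task, so a local single-threaded replay can re-feed them in order.
    String dumpPath(String rootDir) {
        return rootDir + "/lore-" + loreId + "/part-" + partitionId;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof LoreReplayKey)) return false;
        LoreReplayKey k = (LoreReplayKey) o;
        return loreId == k.loreId && partitionId == k.partitionId;
    }

    @Override
    public int hashCode() {
        return Objects.hash(loreId, partitionId);
    }
}
```

Because the key is stable across runs of the same SQL, a skewed task dumped in one run can be matched to the same operator/task pair in a later run.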
Is your feature request related to a problem? Please describe.
Support aggregation in LORE, both partial and final.
spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuAggregateExec.scala
Lines 1711 to 1715 in bb05b17