[FEA] Support union for nested types #1459

revans2 · 2021-01-06T22:36:26Z

Is your feature request related to a problem? Please describe.
Union only works for non-nested types right now. It would be nice to support it for nested types as well. This is mostly blocked on scalar support for nested types. When we do get around to implementing it, we should be sure to test with SPARK-32376, where a struct can have null columns added to it to make the two sides match. I am not sure whether it will require any code changes on our part, though.

kuhushukla · 2021-02-03T00:53:37Z

Thanks for the feature request!

I ran some simple struct tests, and adding the nested types to the expr checks seems to work and match the CPU result. But again, this was only a preliminary test.
@revans2

> This is mostly blocked on scalar support for nested types

Could you elaborate on what you mean by this? Which test should exercise this limitation, and can we do without it? Thank you.

revans2 · 2021-02-03T15:27:38Z

> This is mostly blocked on scalar support for nested types

In Spark 3.1.0 and above, when it sees a union of Struct<A: Int, B: Array<String>> and Struct<A: Int, C: Long>, the output will look like Struct<A: Int, B: Array<String>, C: Long>. To make this work, the first data frame adds a C to the Struct that is always null, and the second data frame adds a B to the Struct that is always null. The way that works is to insert a project that picks apart the original struct and puts it back together with the new scalar null value inserted. This requires us to be able to create a scalar null for an Array<String> and expand that out into a full column.

The problem does not show up directly in union; it is a secondary effect that happens afterwards.
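The null-filling projection described above can be sketched in plain Python. This is only a toy model to illustrate the idea, not Spark's or the plugin's implementation: the names `merge_struct_schemas` and `pad_struct` are hypothetical, schemas are modelled as dicts of field name to type string, and the field ordering (left side first, then new fields from the right) and the pass-through of a null struct are assumptions made for the example.

```python
from typing import Any, Dict, Optional

# Toy model (assumption): a struct schema is an ordered dict of
# field name -> type string, and a struct value is a dict of
# field name -> value, or None when the whole struct is null.
Schema = Dict[str, str]
StructValue = Optional[Dict[str, Any]]


def merge_struct_schemas(left: Schema, right: Schema) -> Schema:
    """Combine the fields of both sides by name.

    Fields from the left side come first, followed by any fields
    that only the right side has (an ordering chosen for this sketch).
    """
    merged = dict(left)
    for name, dtype in right.items():
        if name in merged and merged[name] != dtype:
            raise TypeError(f"field {name!r}: {merged[name]} vs {dtype}")
        merged.setdefault(name, dtype)
    return merged


def pad_struct(value: StructValue, target: Schema) -> StructValue:
    """Pick the struct apart and rebuild it with the target fields.

    Any field the original struct lacks is filled with a null (None),
    which stands in for the scalar null expanded to a full column.
    """
    if value is None:
        return None  # assumption: a null struct stays null
    return {name: value.get(name) for name in target}


left_schema = {"A": "int", "B": "array<string>"}
right_schema = {"A": "int", "C": "long"}
merged_schema = merge_struct_schemas(left_schema, right_schema)

left_row = pad_struct({"A": 1, "B": ["x", "y"]}, merged_schema)
right_row = pad_struct({"A": 2, "C": 7}, merged_schema)
```

In this model the left side gains a `C` that is always null and the right side gains a `B` that is always null, so both sides end up with the same struct type before the union. The `B` case is the one that needs a null of type Array<String>, which is where the scalar support for nested types comes in.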

razajafri · 2021-03-12T23:26:56Z

@sameerz we won't be able to support the union of structs in cases where the struct can have a null value. I have added tests for all the cases in #1919 and xfailed the ones that we don't support ATM.

This is still dependent on the cudf support for Scalar Structs.

sameerz · 2021-07-01T15:58:01Z

Update: at this point we support union of structs and union of nested structs. We still need to support union of lists and maps.

revans2 added the "feature request" and "? - Needs Triage" labels Jan 6, 2021

sameerz removed the "? - Needs Triage" label Jan 12, 2021

This was referenced Jan 27, 2021

[FEA] Support union on nested types #1606

Closed

[FEA] Support features for user query #1608

Closed

sameerz assigned gerashegalov Feb 23, 2021

sameerz added this to the Feb 16 - Feb 26 milestone Feb 23, 2021

gerashegalov modified the milestones: Feb 16 - Feb 26, Mar 1 - Mar 12 Mar 1, 2021

razajafri assigned razajafri and unassigned gerashegalov Mar 10, 2021

sameerz modified the milestones: Mar 1 - Mar 12, Mar 15 - March 26 Mar 15, 2021

razajafri removed their assignment Mar 17, 2021

sameerz removed this from the Mar 15 - March 26 milestone Mar 30, 2021

rwlee mentioned this issue Aug 6, 2021

Support Union on Map types #3164

Merged

rwlee mentioned this issue Aug 31, 2021

UnionExec array and nested array support #3359

Merged

revans2 closed this as completed in #3359 Sep 3, 2021

revans2 mentioned this issue Oct 5, 2021

[BUG] tests marked with xfail missed #3751

Closed

tgravescs pushed a commit to tgravescs/spark-rapids that referenced this issue Nov 30, 2023

Merge pull request NVIDIA#1459 from NVIDIA/bot-auto-merge-branch-23.10

b17c6e9

[auto-merge] bot-auto-merge-branch-23.10 to branch-23.12 [skip ci] [bot]
