Add in the GpuArrayFilter command #10763

Merged (3 commits) on May 6, 2024


revans2 commented May 3, 2024

This fixes #10760.

I still need to do some performance comparisons and testing, but functionality-wise it looks good. I will update this shortly.

Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>

revans2 commented May 3, 2024

build


revans2 commented May 3, 2024

For performance numbers, I ran the following query on Spark 3.4.2:

spark.time(spark.range(0, 10000000000L, 1, 512).selectExpr("array(id, id + 1, id - 100, id + 3, id + 2) as ar").selectExpr("filter(ar, f -> f % 3 == 0) as f").selectExpr("SUM(size(f)) as r").show())
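
To make the shape of the work explicit, here is a minimal plain-Scala sketch of what the query computes per row. It is only a model of the semantics, not the GPU implementation and not Spark code; the object and method names (`ArrayFilterSketch`, `buildArray`, `keepMultiplesOfThree`, `sumOfSizes`) are made up for illustration. It assumes Spark's `%` on longs keeps the sign of the dividend, as Scala's does, which matters because `id - 100` is negative for small ids.

```scala
// Plain-Scala model of the benchmark query; no Spark required.
// All names here are hypothetical and exist only for this sketch.
object ArrayFilterSketch {
  // Mirrors: array(id, id + 1, id - 100, id + 3, id + 2) as ar
  def buildArray(id: Long): Seq[Long] =
    Seq(id, id + 1, id - 100, id + 3, id + 2)

  // Mirrors: filter(ar, f -> f % 3 == 0) as f
  def keepMultiplesOfThree(ar: Seq[Long]): Seq[Long] =
    ar.filter(f => f % 3 == 0)

  // Mirrors: SUM(size(f)) as r, over ids in [0, n)
  def sumOfSizes(n: Long): Long =
    (0L until n).map(id => keepMultiplesOfThree(buildArray(id)).size.toLong).sum

  def main(args: Array[String]): Unit = {
    // For id = 1 the array is (1, 2, -99, 4, 3); only -99 and 3 survive.
    println(keepMultiplesOfThree(buildArray(1L)))
    // Every 3 consecutive ids contribute 2 + 2 + 1 = 5 surviving elements.
    println(sumOfSizes(9L)) // 15
  }
}
```

The real query does the same thing over 10 billion ids split into 512 partitions, so the cost is dominated by building and filtering the five-element arrays rather than by the final sum.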

On an A6000 running with 4 concurrent tasks, it took 36 seconds in the worst case. I ran it 3 times and the best was about 33 seconds.

On a 32-core/64-thread Threadripper PRO 5975WX, using all 64 threads, it took 688 seconds (and I didn't really want to run it again).

That is a speedup of about 19x.
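
The speedup figure can be checked with a short calculation. This is a sketch using only the timings quoted above; `SpeedupSketch` and `speedup` are hypothetical names, and the 33-second figure is itself approximate.

```scala
// Speedup arithmetic from the timings reported in this comment.
// Names are hypothetical; the numbers are the ones quoted above.
object SpeedupSketch {
  val cpuSeconds = 688.0      // single CPU run
  val gpuWorstSeconds = 36.0  // slowest of the 3 GPU runs
  val gpuBestSeconds = 33.0   // fastest of the 3 GPU runs (approximate)

  // Speedup = baseline time / candidate time.
  def speedup(baselineSeconds: Double, candidateSeconds: Double): Double =
    baselineSeconds / candidateSeconds

  def main(args: Array[String]): Unit = {
    println(speedup(cpuSeconds, gpuWorstSeconds))
    println(speedup(cpuSeconds, gpuBestSeconds))
  }
}
```

Against the worst GPU run this gives roughly 19.1x, which matches the "about 19x" above; against the best run it is closer to 20.8x.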


revans2 commented May 3, 2024

build

sameerz added the "feature request" (New feature or request) label May 4, 2024
abellina previously approved these changes May 6, 2024

LGTM other than the copyright and indentation issues that @jlowe pointed out.


revans2 commented May 6, 2024

@jlowe @abellina please take another look


revans2 commented May 6, 2024

build


revans2 commented May 6, 2024

The link check is failing because of #10768.

revans2 merged commit 71ecc9f into NVIDIA:branch-24.06 on May 6, 2024
43 of 44 checks passed
revans2 deleted the array_filter branch on May 6, 2024 18:24
Linked issue: [FEA]Support ArrayFilter