Remove Path key/subscript to save GPU memory usage to improve perf #1987

res-life · 2024-04-22T08:34:36Z

Remove the Key and Subscript path instructions to reduce GPU memory usage and improve performance.

I found that using less GPU stack memory improves performance.
I decreased the max path depth from 16 to 8 and got a performance improvement.

constexpr int max_path_depth = 8;

I also simplified the main flow logic to reduce the number of query path instructions:

key and named are merged into one path instruction => named
subscript and index/wildcard are merged into one path instruction => index/wildcard

By the definition of a JSON query path, key and named always occur together, and subscript and index/wildcard always occur together. So it is safe to reduce (key, named) to (named) and (subscript, index/wildcard) to (index/wildcard).
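The reduction described here can be sketched in host-side C++. This is only an illustration under stated assumptions: the `path_type` enum, the `path_instruction` struct, and the `reduce_path` function are hypothetical names chosen for this sketch, not the identifiers used in get_json_object.cu.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical instruction kinds, named after the terms used in this PR.
enum class path_type { KEY, NAMED, SUBSCRIPT, INDEX, WILDCARD };

// Hypothetical instruction record (assumption, not the real struct).
struct path_instruction {
  path_type type;
  std::string name;  // meaningful only for NAMED
  int index;         // meaningful only for INDEX
};

// KEY always precedes NAMED, and SUBSCRIPT always precedes INDEX or WILDCARD,
// so the marker instructions carry no extra information and can be dropped.
std::vector<path_instruction> reduce_path(std::vector<path_instruction> const& in)
{
  std::vector<path_instruction> out;
  for (auto const& inst : in) {
    if (inst.type == path_type::KEY || inst.type == path_type::SUBSCRIPT) { continue; }
    out.push_back(inst);
  }
  return out;
}
```

For example, `$.a[1]` in the old encoding is (KEY, NAMED a, SUBSCRIPT, INDEX 1) and reduces to (NAMED a, INDEX 1). Every path step shrinks from two instructions to one, which is why the instruction limit can drop from 16 to 8.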

Note: If the query path is a single wildcard, e.g. $.*, Spark returns null, so that path is equivalent to an invalid path.

Performance test

Customer	Baseline (1st, 2nd, 3rd run), cudf legacy	After this PR (1st, 2nd, 3rd run)	Speed
customer 1	231500 ms, 232760 ms, 232647 ms	275719 ms, 272898 ms, 274407 ms	about 84%
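One way to read the "about 84%" figure is as the ratio of mean baseline time to mean time after this PR. That reading is an assumption, since the table does not define the speed column. The helper names `mean_ms` and `relative_speed` below are hypothetical.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a set of run times in milliseconds.
double mean_ms(std::vector<double> const& runs)
{
  assert(!runs.empty());
  return std::accumulate(runs.begin(), runs.end(), 0.0) / static_cast<double>(runs.size());
}

// Assumed definition of "speed": baseline time divided by new time.
// A value below 1.0 means the new code took longer than the baseline.
double relative_speed(std::vector<double> const& baseline, std::vector<double> const& after)
{
  return mean_ms(baseline) / mean_ms(after);
}
```

With the three runs from the table, the ratio falls between 0.84 and 0.85, which matches the reported figure under this reading.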

Signed-off-by: Chong Gao <res_life@163.com>

res-life · 2024-04-22T08:43:13Z

Some integration test (IT) cases do not pass:

./integration_tests/run_pyspark_from_build.sh -s -k test_get_json_object
========= 9 failed, 58 passed, 27086 deselected, 9 warnings in 18.72s ==========

res-life · 2024-04-23T10:55:38Z

All IT cases passed after applying two PRs:

revans2 · 2024-04-23T13:42:56Z

src/main/cpp/src/get_json_object.cu

@@ -696,7 +400,7 @@ __device__ bool evaluate_path(json_parser& p,
  // There is a same constant in JSONUtil.java, keep them consistent when changing
  // Note: Spark-Rapids should guarantee the path depth is less or equal to this limit,
  // or GPU reports cudaErrorIllegalAddress
-  constexpr int max_path_depth = 16;
+  constexpr int max_path_depth = 8;


Can we please split up these two changes so that we can evaluate the performance of each of them separately? I would really like to see max_path_depth turned into a template parameter for the kernel. That way we can have a few different kernels and select the right one depending on the path depth. Alternatively, we can do what @nvdbaranec did in the original version and set the amount of shared memory needed as part of launching the kernel. But that can be follow-on work.
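A rough sketch of the template-parameter idea, written as host-only C++ so that it stays self-contained. The function names `evaluate_with_depth` and `dispatch_by_depth`, the `context_stack` array, and the 4/8/16 tiers are all assumptions made for illustration; in the real code the templated function would be the CUDA kernel, and the tiers would be chosen by measurement.

```cpp
#include <cassert>

// Stand-in for the kernel body: the per-thread state array is sized at
// compile time, so a smaller MaxPathDepth means less stack per thread.
template <int MaxPathDepth>
int evaluate_with_depth(int path_depth)
{
  assert(path_depth >= 0 && path_depth <= MaxPathDepth);
  int context_stack[MaxPathDepth] = {};  // hypothetical per-thread state
  for (int i = 0; i < path_depth; ++i) { context_stack[i] = i; }
  // Report the capacity that was actually instantiated.
  return static_cast<int>(sizeof(context_stack) / sizeof(context_stack[0]));
}

// Pick the smallest instantiation that can hold the requested path depth.
// Returns -1 when the path is deeper than any available instantiation.
int dispatch_by_depth(int path_depth)
{
  if (path_depth <= 4) { return evaluate_with_depth<4>(path_depth); }
  if (path_depth <= 8) { return evaluate_with_depth<8>(path_depth); }
  if (path_depth <= 16) { return evaluate_with_depth<16>(path_depth); }
  return -1;
}
```

The trade-off in a design like this is binary size and compile time against stack usage: each tier is a separate instantiation, so a small number of tiers is probably enough.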

res-life · 2024-04-29

> Can we please split up these two changes so that we can evaluate the performance of each of them separately?

We now combine the two path instructions [Key, Named] (e.g. .name1) into a single [Named] instruction, so the change from 16 to 8 is not a tuning change.
Likewise:
[Subscript, Index], e.g. [1] => [Index]
[Subscript, Wildcard], e.g. [*] => [Wildcard]
So there is no need to split this into two PRs.

> That or we can do like @nvdbaranec did on the original version and set the amount of shared memory that would be needed as part of launching the kernel. But that can be follow on work.

OK, will add in a follow-up.

res-life · 2024-04-26T08:39:54Z

build

Remove Path key/subscribe to save GPU memory usage to improve perf

4016027

Signed-off-by: Chong Gao <res_life@163.com>

res-life requested review from revans2 and ttnghia April 22, 2024 08:34

Chong Gao added 2 commits April 22, 2024 21:22

Fix

6d2779e

Fix

f8a2c36

res-life mentioned this pull request Apr 23, 2024

Add short circuit path for get-json-object when there is separate wildcard path NVIDIA/spark-rapids#10734

Merged

res-life changed the title ~~Remove Path key/subscribe to save GPU memory usage to improve perf~~ [WIP] Remove Path key/subscribe to save GPU memory usage to improve perf Apr 23, 2024

revans2 approved these changes Apr 23, 2024


res-life changed the title ~~[WIP] Remove Path key/subscribe to save GPU memory usage to improve perf~~ [WIP] Remove Path key/subscript to save GPU memory usage to improve perf Apr 26, 2024

sameerz added the performance label Apr 26, 2024

res-life changed the title ~~[WIP] Remove Path key/subscript to save GPU memory usage to improve perf~~ Remove Path key/subscript to save GPU memory usage to improve perf Apr 28, 2024

res-life marked this pull request as ready for review April 29, 2024 05:53

res-life merged commit 2ca29f7 into NVIDIA:branch-24.06 Apr 29, 2024
3 checks passed

res-life deleted the get-json-object-perf3 branch April 29, 2024 05:54

res-life mentioned this pull request Apr 30, 2024

Get json object comments address #1924

Merged

res-life mentioned this pull request May 17, 2024

Make the max jsonpath depth 16 to match other values in the code #2047

Merged
