-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove Path key/subscript to save GPU memory usage to improve perf #1987
Conversation
Signed-off-by: Chong Gao <res_life@163.com>
Some IT cases are not passed:
|
All IT cases passed after applying two PRs: |
@@ -696,7 +400,7 @@ __device__ bool evaluate_path(json_parser& p, | |||
// There is a same constant in JSONUtil.java, keep them consistent when changing | |||
// Note: Spark-Rapids should guarantee the path depth is less or equal to this limit, | |||
// or GPU reports cudaErrorIllegalAddress | |||
constexpr int max_path_depth = 16; | |||
constexpr int max_path_depth = 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we please split up these two changes so that we can evaluate the performance of each of them separately? I would really like to see max_path_depth be turned into a template parameter for the kernel. That way we can have a few different kernels and select the right one depending on the path done. That or we can do like @nvdbaranec did on the original version and set the amount of shared memory that would be needed as part of launching the kernel. But that can be follow on work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we please split up these two changes so that we can evaluate the performance of each of them separately?
Now, we combine [Key, Named] 2 paths (e.g.: .name1) to a single [Named] path, so the change from 16 to 8 is not a tuning.
And:
[Subscribe, Index] e.g.: [1] => [Index]
[Subscribe, Wildcard] e.g.: [*] => [Wildcard]
No need to separate to two PRs.
That or we can do like @nvdbaranec did on the original version and set the amount of shared memory that would be needed as part of launching the kernel. But that can be follow on work.
OK, will add in a follow-up.
build |
Remove Path key/subscript to save GPU memory usage to improve perf.
I found less GPU stack memory usage will help on perf.
I decreased the max path depth from 16 to 8, and get perf improvement.
And I optimized the main flow logic to reduced query paths:
From the definition of JSON query path, key and named always ocurr in the same time, and also subscript and index/wildcard are always occur in the same time. So it's safe to reduce (key, named) to (named), and it's safe to reduce (subscript, index/wildcard) to (index/wildcard)
Note: If query path is a single
wildcard
, e.g.: $.* , Spark returns null, so the path is equivalent to invalid path.performance test