get-json-object: Add JSON parser and parser utility #1836

Merged
res-life merged 2 commits into NVIDIA:get-json-object-feature on Mar 10, 2024

Conversation

res-life
Collaborator

@res-life res-life commented Mar 5, 2024

closes #1827
closes #1829

  • Add JSON parser
  • JSON parser utility: copy_raw_text, skip_children
  • Define internal interfaces for JSON generator, JSON parser, JSON path parser
  • Copy get-json-obj CUDA code from cuDF

Signed-off-by: Chong Gao <res_life@163.com>
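
To illustrate what a `skip_children` utility typically does, here is a minimal host-side sketch. It assumes the semantics of Jackson's `JsonParser.skipChildren()`: when the current token opens an object or array, advance to the matching close token; otherwise stay put. The `json_token` enum and the vector-of-tokens input are hypothetical stand-ins, not the PR's actual device-side interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical token kinds for illustration; the PR's real enum may differ.
enum class json_token {
  START_OBJECT, END_OBJECT, START_ARRAY, END_ARRAY,
  FIELD_NAME, VALUE_STRING, VALUE_NUMBER, VALUE_TRUE, VALUE_FALSE, VALUE_NULL
};

// If tokens[pos] opens an object or array, return the index of its matching
// close token. Otherwise return pos unchanged. Returns tokens.size() if the
// container is never closed (truncated input).
std::size_t skip_children(std::vector<json_token> const& tokens, std::size_t pos)
{
  assert(pos < tokens.size());
  if (tokens[pos] != json_token::START_OBJECT && tokens[pos] != json_token::START_ARRAY) {
    return pos;
  }
  int depth = 0;  // number of containers currently open
  for (std::size_t i = pos; i < tokens.size(); ++i) {
    switch (tokens[i]) {
      case json_token::START_OBJECT:
      case json_token::START_ARRAY: ++depth; break;
      case json_token::END_OBJECT:
      case json_token::END_ARRAY:
        if (--depth == 0) { return i; }  // closed the container we started in
        break;
      default: break;  // scalars and field names do not change the depth
    }
  }
  return tokens.size();
}
```

A path evaluator can use such a helper to jump over a subtree whose field name does not match the requested path, without materializing the subtree.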

@res-life
Collaborator Author

res-life commented Mar 5, 2024

build

@res-life
Collaborator Author

res-life commented Mar 5, 2024

Epic issue is: #1823

This PR:

  • Adds the JSON parser.
  • Defines the interfaces for the JSON generator, JSON parser, and JSON path parser.

The latest spark-rapids-jni build failed in CI, so I based this work on an earlier commit: 9274bd5
This PR compiles successfully on top of revision 9274bd5.

Haoyang will implement the JSON path parser.
Suraj will implement the JSON generator.

We will post several PRs to make the review easier.
Once get-json-object has been fully tested, we will enable the new implementation.

To enable it, change the JNI code to call the new get-json-object.

@res-life
Collaborator Author

res-life commented Mar 5, 2024

@revans2 Please help review.

Collaborator

@revans2 revans2 left a comment

In general it looks good, but we need to finish the TODOs and add in JNI APIs.

@res-life res-life changed the title get-json-object: Add JSON parser; Define internal interfaces [WIP] get-json-object: Add JSON parser; Define internal interfaces Mar 6, 2024
@res-life res-life marked this pull request as draft March 6, 2024 00:27
@res-life res-life changed the title [WIP] get-json-object: Add JSON parser; Define internal interfaces [WIP] get-json-object: new implementation Mar 6, 2024
Add Json Parser utility;
Define internal interfaces;
Copy get-json-obj CUDA code from cuDF;

Signed-off-by: Chong Gao <res_life@163.com>
@res-life res-life changed the base branch from branch-24.04 to get-json-object-feature March 8, 2024 02:21
@res-life res-life changed the title [WIP] get-json-object: new implementation get-json-object: Add JSON parser and parser utility Mar 8, 2024
@res-life res-life self-assigned this Mar 8, 2024
@res-life res-life marked this pull request as ready for review March 8, 2024 02:25
@res-life
Collaborator Author

res-life commented Mar 8, 2024

I created a new branch: NVIDIA:get-json-object-feature

@revans2 Please help review. Let's merge the code into this feature branch, thanks.

@thirtiseven @SurajAralihalli Let's do the implementation on the new feature branch.
Please post your PRs against it.

Collaborator

@revans2 revans2 left a comment

Most of this looks fine to me, but there are so many TODOs in the code I don't know if this is really ready to go or not, or really how to evaluate what is here.

@res-life res-life merged commit 0c67f0b into NVIDIA:get-json-object-feature Mar 10, 2024
2 checks passed
thirtiseven added a commit that referenced this pull request Mar 27, 2024
* get-json-object:  Add JSON parser and parser utility (#1836)

* Add Json Parser;
Add Json Parser utility;
Define internal interfaces;
Copy get-json-obj CUDA code from cuDF;

Signed-off-by: Chong Gao <res_life@163.com>

* Code format

---------

Signed-off-by: Chong Gao <res_life@163.com>
Co-authored-by: Chong Gao <res_life@163.com>

* get-json-object: match current field name (#1857)

Signed-off-by: Chong Gao <res_life@163.com>
Co-authored-by: Chong Gao <res_life@163.com>

* get-json-object: add utility write_escaped_text for JSON generator (#1863)

Signed-off-by: Chong Gao <res_life@163.com>
Co-authored-by: Chong Gao <res_life@163.com>

* Add JNI for GetJsonObject (#1862)

* Add JNI for GetJsonObject

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* clean up

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* Parse json path in plugin

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* Apply suggestions from code review

Co-authored-by: Nghia Truong <7416935+ttnghia@users.noreply.github.com>

* Use table_view

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* Update java

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* Apply suggestions from code review

Co-authored-by: Nghia Truong <7416935+ttnghia@users.noreply.github.com>

* clean up

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* use matched enum for type

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* clean up

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* upmerge

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* format

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

---------

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
Co-authored-by: Nghia Truong <7416935+ttnghia@users.noreply.github.com>

* get-json-object: main flow (#1868)

Signed-off-by: Chong Gao <res_life@163.com>
Co-authored-by: Chong Gao <res_life@163.com>

* Optimize memory usage in match_current_field_name (#1889)

* Optimize match_current_field_name using less memory

Signed-off-by: Chong Gao <res_life@163.com>

* Convert a function to device code

* Add a JNI test case

* Add JNI test case

* Change nesting depth to 4

* Change nesting depth to 8 to fix test

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* remove clang format change

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

---------

Signed-off-by: Chong Gao <res_life@163.com>
Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
Co-authored-by: Chong Gao <res_life@163.com>

* get-json-object: Recursive to iterative (#1890)

* Change recursive to iterative

Signed-off-by: Chong Gao <res_life@163.com>

---------

Signed-off-by: Chong Gao <res_life@163.com>
Co-authored-by: Chong Gao <res_life@163.com>

* Fix bug

* Format

* Use uppercase for path_instruction_type

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* Add test cases from Baidu

* Fix escape char error; add test case

* getJsonObject number normalization (#1897)

* Support number normalization

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* delete cpp test and add a java test case

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

---------

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* Add test case

* Fix a escape/unescape size bug

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* Fix bug: handle leading zeros for number; Refactor

* Apply suggestions from code review

Co-authored-by: Nghia Truong <7416935+ttnghia@users.noreply.github.com>

* Address comments

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* fix java test

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* Add test cases; Fix a bug

* follow up escape/unescape bug fix

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

* Minor refactor

* Add a case; Fix bug

---------

Signed-off-by: Chong Gao <res_life@163.com>
Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
Co-authored-by: Chong Gao <res_life@163.com>
Co-authored-by: Haoyang Li <haoyangl@nvidia.com>
Co-authored-by: Nghia Truong <7416935+ttnghia@users.noreply.github.com>
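
The commit list above includes "Recursive to iterative" (#1890) and "Change nesting depth to 8". GPU device code generally favors iteration with a bounded explicit stack, because per-thread stack space is limited. The host-side sketch below shows that general pattern on a simple bracket-matching check; it is not the code from #1890. The function name `brackets_balanced` is hypothetical, and using 8 as the stack capacity is an assumption borrowed from the commit message, not a confirmed detail of how the real code uses that limit.

```cpp
#include <cstddef>
#include <string>

// Assumption: 8 mirrors "Change nesting depth to 8" in the commit list.
constexpr int max_depth = 8;

// Iteratively checks that '{' '}' and '[' ']' are balanced and correctly
// nested, ignoring bracket characters inside string literals. Inputs nested
// deeper than max_depth are rejected rather than overflowing the stack.
bool brackets_balanced(std::string const& json)
{
  char stack[max_depth];  // explicit stack replaces the recursion call stack
  int top = 0;
  bool in_string = false;
  for (std::size_t i = 0; i < json.size(); ++i) {
    char const c = json[i];
    if (in_string) {
      if (c == '\\') {
        ++i;  // skip the escaped character
      } else if (c == '"') {
        in_string = false;
      }
      continue;
    }
    switch (c) {
      case '"': in_string = true; break;
      case '{':
      case '[':
        if (top == max_depth) { return false; }  // nesting too deep
        stack[top++] = c;
        break;
      case '}':
      case ']': {
        if (top == 0) { return false; }  // close without a matching open
        char const open = stack[--top];
        if ((c == '}') != (open == '{')) { return false; }  // mismatched pair
        break;
      }
      default: break;
    }
  }
  return top == 0 && !in_string;
}
```

The fixed-size array makes the memory cost per input known at compile time, at the price of rejecting inputs nested deeper than the chosen limit.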