Fix ORC read corruption when specified schema does not match file order #3062

Merged (1 commit) on Jul 29, 2021

@wbo4958 wbo4958 (Collaborator) commented Jul 28, 2021

This PR first fixes the corrupted ORC reads that occur when the read schema can't be pruned.
Second, it adds unit tests for both Parquet and ORC that read with a schema whose column order differs from the file's.

Please refer to issue #3060.

With this PR, the GPU produces the correct result, but the CPU itself produces the wrong result. This PR therefore introduces incompatible behavior between the GPU and the CPU.
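
To illustrate the kind of failure described here, the following is a minimal sketch in plain Python. It is not the plugin's code: `read_by_position`, `read_by_name`, and the sample data are hypothetical names invented for illustration, and the assumption that the corruption comes from matching columns by position rather than by name is inferred from the PR title, not confirmed by the description.

```python
# Hypothetical illustration only; none of these names come from the plugin.

def read_by_position(file_columns, read_schema):
    """Buggy approach: assume the i-th requested column is the i-th file column."""
    data = list(file_columns.values())
    return {name: data[i] for i, name in enumerate(read_schema)}


def read_by_name(file_columns, read_schema):
    """Safer approach: look up each requested column by its name."""
    return {name: file_columns[name] for name in read_schema}


# The file stores its columns in the order a, b, c.
file_columns = {"a": [1, 2], "b": ["x", "y"], "c": [1.5, 2.5]}
# The reader asks for the same columns in a different order.
read_schema = ["c", "a", "b"]

wrong = read_by_position(file_columns, read_schema)  # column "c" receives a's data
right = read_by_name(file_columns, read_schema)      # every column keeps its own data
```

In this sketch, both readers return the columns in the requested order, but only the name-based lookup pairs each column name with the data that belongs to it.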

@wbo4958 wbo4958 requested review from tgravescs and jlowe July 28, 2021 11:21
@wbo4958 wbo4958 (Collaborator, Author) commented Jul 28, 2021

build

This PR first fixes the corrupted ORC reads for a schema that
can't be pruned, and then adds unit tests for both Parquet and ORC
that read with an out-of-order schema.

Signed-off-by: Bobby Wang <wbo4958@gmail.com>

@wbo4958 wbo4958 (Collaborator, Author) commented Jul 28, 2021

build

@jlowe jlowe (Member) left a comment

@wbo4958 can you elaborate in the PR description about the binary files that were added?

@jlowe jlowe changed the title from "Fix orc read data mess up for the schema can't be pruned" to "Fix ORC read corruption when specified schema does not match file order" Jul 28, 2021
@sameerz sameerz added the bug label Jul 29, 2021
@jlowe jlowe merged commit c4b4ae6 into NVIDIA:branch-21.08 Jul 29, 2021
@wbo4958 wbo4958 deleted the fix-orc-issue branch July 29, 2021 21:58