[BUG] databricks orc test_read_nested_pruning hanging #11547

Open
abellina opened this issue Sep 30, 2024 · 3 comments
Labels: "? - Needs Triage" (Need team to review and classify), "bug" (Something isn't working)

Comments

@abellina
Collaborator

We have seen some CI jobs failing over the weekend with this issue:

08:34:04  ../../src/main/python/orc_test.py::test_read_nested_pruning[false-orc-{'spark.rapids.sql.format.orc.reader.type': 'MULTITHREADED', 'spark.rapids.sql.reader.multithreaded.combine.sizeBytes': '64m', 'spark.rapids.sql.reader.multithreaded.read.keepOrder': True, 'spark.rapids.sql.reader.chunked': True, 'spark.rapids.sql.reader.chunked.limitMemoryUsage': False}-[['ar', Array(Struct(['str_1', String],['str_2', String]))]]-[['ar', Array(Struct(['str_2', String]))]]][DATAGEN_SEED=1727612435, TZ=UTC, INJECT_OOM] client_loop: send disconnect: Broken pipe
08:36:25  ssh: connect to host 54.191.207.123 port 2200: Connection timed out

We should investigate. This is with the 24.10 snapshot.
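For anyone trying to reproduce from the log above, the test name and the trailing `[DATAGEN_SEED=..., TZ=..., INJECT_OOM]` suffix can be pulled out of the CI line mechanically. This is only a rough sketch: `parse_ci_line` is a hypothetical helper, not part of the repo, and the regexes assume the suffix format seen in this one log line.

```python
import re

# Trailing bracket appended to the test ID in the CI log, e.g.
# "[DATAGEN_SEED=1727612435, TZ=UTC, INJECT_OOM]" (format assumed from this log).
_SUFFIX_RE = re.compile(r"\[DATAGEN_SEED=(\d+)(?:,\s*TZ=([^,\]]+))?[^\]]*\]")
# "<path>.py::<test_function>" part of the pytest node ID.
_TEST_RE = re.compile(r"(\S+\.py)::(\w+)")


def parse_ci_line(line):
    """Extract test file, test name, seed, TZ and the INJECT_OOM flag from a log line."""
    test = _TEST_RE.search(line)
    suffix = _SUFFIX_RE.search(line)
    return {
        "file": test.group(1) if test else None,
        "test": test.group(2) if test else None,
        "seed": int(suffix.group(1)) if suffix else None,
        "tz": suffix.group(2) if suffix and suffix.group(2) else None,
        "inject_oom": bool(suffix and "INJECT_OOM" in suffix.group(0)),
    }
```

The resulting `seed` can be exported as `DATAGEN_SEED` and the `test` value passed to `-k` when rerunning.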

abellina added the "? - Needs Triage" and "bug" labels Sep 30, 2024
@abellina
Collaborator Author

I'll take a look at reproducing this.

@abellina
Collaborator Author

abellina commented Sep 30, 2024

Note that I can't repro this locally (RTX5000) with the provided datagen seed against Spark 3.3.0. I will try it in Databricks as well (A10).

@abellina
Collaborator Author

abellina commented Sep 30, 2024

I was able to run this against Databricks 12.2 on an A10, and I could not repro the issue by running:

export DATAGEN_SEED=1727612435
./jenkins/databricks/test.sh

I also modified test.sh so that it only runs -k test_read_nested_pruning. Result:

==== 540 passed, 55 warnings in 174.20s (0:02:54) ====
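Since the CI symptom was a hang rather than a test failure, and a single clean run does not rule out an intermittent problem, a repro loop with a wall-clock timeout may help. Below is a minimal sketch, assuming the command to run is supplied by the caller; `run_with_timeout` and `repeat_until_hang` are hypothetical names, and the attempt count and timeout are arbitrary choices, not values used by the CI.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd once and return 'passed', 'failed', or 'hung'."""
    try:
        # subprocess.run kills the direct child and raises if the timeout expires.
        proc = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "hung"
    return "passed" if proc.returncode == 0 else "failed"


def repeat_until_hang(cmd, attempts, timeout_s):
    """Rerun cmd up to `attempts` times, stopping at the first hang."""
    results = []
    for _ in range(attempts):
        outcome = run_with_timeout(cmd, timeout_s)
        results.append(outcome)
        if outcome == "hung":
            break
    return results


if __name__ == "__main__":
    # Trivial stand-in command; replace with the real test invocation.
    print(repeat_until_hang([sys.executable, "-c", "pass"], attempts=3, timeout_s=30))
```

With `DATAGEN_SEED` exported, the command could be something like `["./jenkins/databricks/test.sh"]` with a timeout comfortably above the roughly three minutes the passing run took. Note that on timeout only the direct child process is killed, so any processes it spawned may need separate cleanup.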
