[BUG] cache_test.py failed w/ cache.serializer in spark 3.1.2 #3311
Comments
I wonder if this is due to build changes; there is now a 312 ParquetCachedBatchSerializer. I'll run a few tests. |
Running a part of the test shows that setting it to the new shim-layer version works: This is going to change again before the 21.10 release, hopefully to a non-shim-specific name. |
I'm not quite sure why this is happening; looking at the failure, it only occurs when 'spark.sql.inMemoryColumnarStorage.enableVectorizedReader': 'true' is set. |
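A minimal sketch of that configuration, assuming a plain Spark application (the serializer class name is the common one quoted later in this thread; everything else here is illustrative, and the real repro needs the RAPIDS jars on the classpath):

```scala
import org.apache.spark.sql.SparkSession

object CacheReproSketch {
  def main(args: Array[String]): Unit = {
    // spark.sql.cache.serializer is a static conf, so it must be set when
    // the session is built, before any SQL runs.
    val spark = SparkSession.builder()
      .appName("pcbs-cache-repro")
      .master("local[*]") // local sketch only
      .config("spark.sql.cache.serializer",
        "com.nvidia.spark.ParquetCachedBatchSerializer")
      .config("spark.sql.inMemoryColumnarStorage.enableVectorizedReader", "true")
      .getOrCreate()

    val df = spark.range(0, 1000).selectExpr("id", "id % 10 AS key")
    df.cache()
    df.count() // materialize the cached batches
    df.show(5) // read back through the cache
  }
}
```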
With this test file, the first error I see locally is not mentioned in this issue:
This test passes deterministically when the shim for PCBS matches. So I presume that once we are done with #3223, at least up to the classloader task, the issue will disappear. But it would be good to fully understand how the choice of PCBS shim affects the plan, to be sure. |
#3390 should resolve this issue, but we need to remember to re-enable tests before closing this. |
So I'm running into this same exception when commonizing the code, and it looks like it happens when the GPU version isn't used, so I assume it's falling back to the CPU version and the schema is coming out empty, which our HostToGpuCoalesceIterator isn't expecting. The reason it wasn't using the GPU version is that the class instanceOf match didn't happen properly: it was looking for a class with a different package name than the one the shim version was exposing. I fixed the shim issue, but I think we want to test the fallback case here; something doesn't seem to be happening properly there. |
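A self-contained illustration of that kind of mismatch (the class names below are stand-ins, not the plugin's real ones): a type test compiled against one package returns false when the loaded object comes from a shim-specific package, so the code silently takes the CPU fallback path.

```scala
object PackageMatchSketch {
  // Stand-ins for the same serializer compiled under two different packages.
  class ParquetCachedBatchSerializer     // "common" package version
  class ShimParquetCachedBatchSerializer // shim-specific package version

  def main(args: Array[String]): Unit = {
    // The session was configured with the shim-package class...
    val loaded: AnyRef = new ShimParquetCachedBatchSerializer
    // ...but the check looks for the common-package class, so it fails and
    // the GPU path is skipped even though a PCBS is actually in use.
    val gpuPath = loaded.isInstanceOf[ParquetCachedBatchSerializer]
    println(s"GPU path taken: $gpuPath") // prints: GPU path taken: false
  }
}
```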
Note: PR https://github.com/NVIDIA/spark-rapids/pull/3473/files makes it work, but we still need to investigate the fallback case. |
Testing this out more: if we don't specify --conf spark.sql.cache.serializer=com.nvidia.spark.ParquetCachedBatchSerializer at all, then it works properly. So it seems it only breaks if you specify the serializer but it doesn't actually load, which normally shouldn't be the case, so this might not be as high priority. |
In the case where PCBS isn't loaded for any reason, we should fail gracefully. |
This is happening only when InMemoryTableScanExec isn't replaced with GpuInMemoryTableScanExec, which cannot happen if we are using PCBS; and if we aren't using PCBS, we won't run into this problem, because the default serializer doesn't show it. I don't think this is a P1, as I haven't been able to repro it without changing the code. |
This was marked as a P1 for 21.12 rather than 21.10 just so that we didn't lose it; the ask is simply that we output a useful message. If this configuration is not going to work, then we should log a useful message and potentially quit if we can recognize the situation. |
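One possible shape for that "log a useful message and quit" idea, as a sketch only (the helper and its placement are hypothetical, not plugin code): verify at startup that the class named in spark.sql.cache.serializer can actually be loaded, and fail with an actionable error instead of a confusing exception much later.

```scala
import org.apache.spark.sql.SparkSession

object CacheSerializerCheck {
  // Hypothetical startup check, not the plugin's actual API.
  def validate(spark: SparkSession): Unit = {
    val expected = "com.nvidia.spark.ParquetCachedBatchSerializer"
    spark.conf.getOption("spark.sql.cache.serializer")
      .filter(_ == expected)
      .foreach { name =>
        try {
          // If this classloader can't see the plugin jars, the cache would
          // otherwise fall back silently and fail much later.
          Class.forName(name)
        } catch {
          case e: ClassNotFoundException =>
            throw new IllegalStateException(
              s"spark.sql.cache.serializer is set to $name but the class " +
              "could not be loaded; check that the RAPIDS jars are on the " +
              "classpath or unset the conf.", e)
        }
      }
  }
}
```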
Describe the bug
Nightly tests failed in Spark 3.1.2 with:
java.lang.ArrayIndexOutOfBoundsException: 0
Attachments: detailed log, executor log, failure list.
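For reference, an "ArrayIndexOutOfBoundsException: 0" is consistent with the empty-schema observation in the comments above: code that unconditionally reads column 0 of a batch throws exactly this when the batch has zero columns. A trivial illustration, not the plugin's code:

```scala
object EmptySchemaSketch {
  def main(args: Array[String]): Unit = {
    // Stand-in for a cached batch whose schema came back empty.
    val columns: Array[Int] = Array.empty
    // Throws java.lang.ArrayIndexOutOfBoundsException (message "0" on JDK 8).
    println(columns(0))
  }
}
```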