[BUG] Spark 3.3 IT test cache_test.py::test_passing_gpuExpr_as_Expr fails with IllegalArgumentException #4931

tgravescs · 2022-03-10T17:47:36Z

Describe the bug
Spark 3.3 integration test build fails:

FAILED ../../src/main/python/cache_test.py::test_passing_gpuExpr_as_Expr[{'spark.sql.inMemoryColumnarStorage.enableVectorizedReader': 'true'}][ALLOW_NON_GPU(CollectLimitExec)]
FAILED ../../src/main/python/cache_test.py::test_passing_gpuExpr_as_Expr[{'spark.sql.inMemoryColumnarStorage.enableVectorizedReader': 'false'}][ALLOW_NON_GPU(CollectLimitExec)]

Caused by: java.lang.IllegalArgumentException: For input string: "null"
	at scala.collection.immutable.StringLike.parseBoolean(StringLike.scala:330)
	at scala.collection.immutable.StringLike.toBoolean(StringLike.scala:289)
	at scala.collection.immutable.StringLike.toBoolean$(StringLike.scala:289)
	at scala.collection.immutable.StringOps.toBoolean(StringOps.scala:33)
	at org.apache.spark.sql.execution.datasources.parquet.SparkToParquetSchemaConverter.<init>(ParquetSchemaConverter.scala:455)
	at org.apache.spark.sql.execution.datasources.parquet.ParquetWriteSupport.init(ParquetWriteSupport.scala:114)
	at com.nvidia.spark.rapids.shims.ParquetOutputFileFormat.getRecordWriter(ParquetCachedBatchSerializer.scala:1505)
	at com.nvidia.spark.rapids.shims.ParquetCachedBatchSerializer$CachedBatchIteratorProducer$InternalRowToCachedBatchIterator.$anonfun$next$1(ParquetCachedBatchSerializer.scala:1247)
	at org.apache.spark.sql.internal.SQLConf$.withExistingConf(SQLConf.scala:158)
	at com.nvidia.spark.rapids.shims.ParquetCachedBatchSerializer$CachedBatchIteratorProducer$InternalRowToCachedBatchIterator.next(ParquetCachedBatchSerializer.scala:1247)

firestarman · 2022-03-11T07:41:14Z

This happens because the Parquet field ID setting is missing from the Configuration that PCBS (ParquetCachedBatchSerializer) uses for Parquet writing.

I added the setting in PR #4926.

Details are here: https://github.com/NVIDIA/spark-rapids/pull/4926/files#diff-d2170a624b05030a6b93a827792ae5ee35d9d870bab6e86823cc4264f32bee47R1442
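For readers following the trace, here is a minimal Python model of the failure mode described above. It is a sketch under assumptions, not the code from PR #4926: a plain dict stands in for the Hadoop Configuration, the key name `spark.sql.parquet.fieldId.write.enabled` is assumed to be the field ID setting in question, and `strict_to_boolean` and `init_schema_converter` are hypothetical helpers that only imitate the strict boolean parse seen in the stack trace.

```python
# Hypothetical model of the failure; a plain dict stands in for the Hadoop Configuration.
FIELD_ID_WRITE_KEY = "spark.sql.parquet.fieldId.write.enabled"  # assumed key name


def strict_to_boolean(value):
    """Imitate a strict string-to-boolean parse that rejects a missing value."""
    if value is not None:
        lowered = value.lower()
        if lowered == "true":
            return True
        if lowered == "false":
            return False
        raise ValueError(f'For input string: "{value}"')
    raise ValueError('For input string: "null"')


def init_schema_converter(conf):
    """Stand-in for a converter constructor that reads the flag from the configuration."""
    return {"write_field_id": strict_to_boolean(conf.get(FIELD_ID_WRITE_KEY))}


# Writer configuration without the field ID key: the lookup yields None and the parse fails.
conf_without_key = {"spark.sql.parquet.writeLegacyFormat": "false"}
try:
    init_schema_converter(conf_without_key)
except ValueError as err:
    failure_message = str(err)

# Setting the key explicitly before the writer is initialized avoids the failure.
conf_with_key = dict(conf_without_key)
conf_with_key[FIELD_ID_WRITE_KEY] = "false"
converter = init_schema_converter(conf_with_key)
```

The point of the model is that the reader of the configuration does not fall back to a default, so the writer's configuration has to carry every key the newer Spark version expects.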

tgravescs added the bug (Something isn't working), ? - Needs Triage (Need team to review and classify), and P0 (Must have for release) labels Mar 10, 2022

firestarman self-assigned this Mar 11, 2022

firestarman mentioned this issue Mar 11, 2022

Support DayTimeIntervalType in ParquetCachedBatchSerializer[databricks] #4926

Merged

sameerz added this to the Feb 28 - Mar 18 milestone Mar 11, 2022

razajafri closed this as completed in #4926 Mar 15, 2022

sameerz removed the ? - Needs Triage (Need team to review and classify) label Mar 15, 2022
