Is your feature request related to a problem? Please describe.
I would like the plugin to support caching Array-of-Struct columns with PCBS (ParquetCachedBatchSerializer).
Describe the solution you'd like
Currently, if PCBS is enabled and we cache a DataFrame that contains an Array-of-Struct column, the in-memory table scan falls back to the CPU.
Minimum repro:
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

// Sample rows: a nested "name" struct followed by three flat columns.
val data = Seq(
  Row(Row("Adam ", "", "Green"), 1, "M", 10),
  Row(Row("Bob ", "Middle", "Green"), 2, "M", 20),
  Row(Row("Cathy ", "", "Green"), 3, "F", 30)
)

// The outer parentheses keep the chained .add calls paste-safe in spark-shell.
val schema = (new StructType()
  .add("name", new StructType()
    .add("firstname", StringType)
    .add("middlename", StringType)
    .add("lastname", StringType))
  .add("low", IntegerType)
  .add("gender", StringType)
  .add("high", IntegerType))

// Write the source data to Parquet and read it back.
val df = spark.createDataFrame(spark.sparkContext.parallelize(data), schema)
df.write.format("parquet").mode("overwrite").save("/tmp/testparquet")
val df2 = spark.read.parquet("/tmp/testparquet")
df2.createOrReplaceTempView("df2")

// Build a single array<struct<low:int,high:int>> column and round-trip it through Parquet.
val array_of_struct = spark.sql("SELECT array(struct(low, high)) AS arrayofstruct FROM df2")
array_of_struct.write.format("parquet").mode("overwrite").save("/tmp/testparquet2")
val array_of_struct2 = spark.read.parquet("/tmp/testparquet2")

// Caching and then collecting triggers the InMemoryTableScanExec fallback.
array_of_struct2.cache()
array_of_struct2.collect()
Driver log:
!Exec <InMemoryTableScanExec> cannot run on GPU because unsupported data types in output: ArrayType(StructType(StructField(low,IntegerType,true), StructField(high,IntegerType,true)),true) [arrayofstruct#85]
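The repro above runs in a session where the plugin and PCBS are already enabled, but the issue does not show that setup. The following is a minimal sketch of how such a session might be configured in a standalone application. It assumes the RAPIDS Accelerator jar is on the classpath; the serializer class name shown is the one used by recent plugin releases and may differ in older versions, and the application name is purely illustrative. Because `spark.sql.cache.serializer` is a static configuration, it has to be set before the session is created; in spark-shell the same keys would be passed with `--conf` at launch instead.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: the RAPIDS Accelerator jar is on the driver and executor classpath.
val spark = SparkSession.builder()
  .appName("pcbs-array-of-struct-repro") // hypothetical name, for illustration only
  // Load the RAPIDS Accelerator SQL plugin.
  .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
  // Use PCBS for df.cache(); static conf, so it must be set at startup.
  // Assumption: class name as in recent plugin releases.
  .config("spark.sql.cache.serializer", "com.nvidia.spark.ParquetCachedBatchSerializer")
  // Log the operators that cannot run on the GPU, with the reason.
  .config("spark.rapids.sql.explain", "NOT_ON_GPU")
  .getOrCreate()
```

With `spark.rapids.sql.explain` set this way, a fallback such as the one in the driver log above should be reported with the reason the operator was not placed on the GPU.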