
[BUG] test_group_apply_udf and test_group_apply_udf_more_types hang on Databricks 9.1 #4599

Closed
jlowe opened this issue Jan 21, 2022 · 1 comment · Fixed by #4618


jlowe commented Jan 21, 2022

Running test_group_apply_udf or test_group_apply_udf_more_types hangs in the Databricks 9.1 environment. There is no CPU utilization, so it is not an infinite loop. A stack trace shows the code waiting for data from Python that never arrives:

"Executor task launch worker for task 1.0 in stage 6.0 (TID 45)" #79 daemon prio=5 os_prio=0 tid=0x00007fab6c140000 nid=0x5876 runnable [0x00007faaa43be000]
   java.lang.Thread.State: RUNNABLE
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
	at java.net.SocketInputStream.read(SocketInputStream.java:171)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
	- locked <0x00000000fa10ad40> (a java.io.BufferedInputStream)
	at java.io.DataInputStream.readInt(DataInputStream.java:387)
	at org.apache.spark.sql.rapids.execution.python.GpuPythonArrowOutput$$anon$1.read(GpuArrowEvalPythonExec.scala:328)
	at org.apache.spark.sql.rapids.execution.python.GpuPythonArrowOutput$$anon$1.read(GpuArrowEvalPythonExec.scala:285)
@jlowe added the `bug` and `? - Needs Triage` labels Jan 21, 2022

firestarman commented Jan 24, 2022

There is a Databricks-specific config, spark.databricks.execution.pandasZeroConfConversion.groupbyApply.enabled, which is false by default. The test passes after setting this config to true.

It seems DB 9.1 supports disabling this 'zero-conf-conversion' feature and has it disabled by default, while the plugin does not yet support running with it disabled.
That is to say, the correct Python runner (the grouped Python runner or the base Arrow Python runner) should be picked according to this config when the runner is created in the GpuFlatMapGroupInPandas operator.
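The fix described above could be sketched roughly as below. This is only an illustration of the idea, not plugin code: the object name `RunnerPicker`, the method `pickRunner`, the runner name strings, and the mapping of true/false to the grouped vs. base Arrow runner are all assumptions for the sketch.

```scala
// Illustrative sketch: choose a Python runner based on the Databricks
// zero-conf-conversion config. All names here are hypothetical.
object RunnerPicker {
  val ZeroConfKey =
    "spark.databricks.execution.pandasZeroConfConversion.groupbyApply.enabled"

  // Returns which runner the operator should build for the given conf.
  // The config defaults to false on DB 9.1, so the base Arrow runner is
  // the assumed default path.
  def pickRunner(conf: Map[String, String]): String = {
    val zeroConfEnabled = conf.getOrElse(ZeroConfKey, "false").toBoolean
    if (zeroConfEnabled) "GroupedPythonRunner"  // zero-conf conversion path
    else "BaseArrowPythonRunner"                // default on DB 9.1
  }
}
```

With no config set, the sketch selects the base Arrow runner, matching the DB 9.1 default that the plugin was not accounting for.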
