[BUG] test_groupby_std_variance_partial_replace_fallback failed #4744

Closed
jlowe opened this issue Feb 10, 2022 · 0 comments · Fixed by #4792
Labels
bug (Something isn't working), P0 (Must have for release)

Comments

jlowe (Member) commented Feb 10, 2022

test_groupby_std_variance_partial_replace_fallback failed in recent nightly tests.
Excerpt from one of the failures:

[2022-02-10T14:57:45.236Z] _ test_groupby_std_variance_partial_replace_fallback[false-partial-{'spark.rapids.sql.variableFloatAgg.enabled': 'true', 'spark.rapids.sql.hasNans': 'false', 'spark.rapids.sql.castStringToFloat.enabled': 'true', 'spark.rapids.sql.batchSizeBytes': '250'}-[('a', RepeatSeq(String)), ('b', Integer), ('c', Long)]] _
[2022-02-10T14:57:45.236Z] 
[2022-02-10T14:57:45.236Z] data_gen = [('a', RepeatSeq(String)), ('b', Integer), ('c', Long)]
[2022-02-10T14:57:45.236Z] conf = {'spark.rapids.sql.batchSizeBytes': '250', 'spark.rapids.sql.castStringToFloat.enabled': 'true', 'spark.rapids.sql.hasNans': 'false', 'spark.rapids.sql.variableFloatAgg.enabled': 'true'}
[2022-02-10T14:57:45.236Z] replace_mode = 'partial', aqe_enabled = 'false'
[2022-02-10T14:57:45.236Z] 
[2022-02-10T14:57:45.236Z]     @ignore_order(local=True)
[2022-02-10T14:57:45.236Z]     @approximate_float
[2022-02-10T14:57:45.236Z]     @allow_non_gpu('KnownFloatingPointNormalized', 'NormalizeNaNAndZero',
[2022-02-10T14:57:45.236Z]                    'HashAggregateExec', 'SortAggregateExec',
[2022-02-10T14:57:45.236Z]                    'Cast',
[2022-02-10T14:57:45.236Z]                    'ShuffleExchangeExec', 'HashPartitioning', 'SortExec',
[2022-02-10T14:57:45.236Z]                    'StddevPop', 'StddevSamp', 'VariancePop', 'VarianceSamp',
[2022-02-10T14:57:45.236Z]                    'SortArray', 'Alias', 'Literal', 'Count',
[2022-02-10T14:57:45.236Z]                    'GpuToCpuCollectBufferTransition', 'CpuToGpuCollectBufferTransition',
[2022-02-10T14:57:45.236Z]                    'AggregateExpression')
[2022-02-10T14:57:45.236Z]     @pytest.mark.parametrize('data_gen', _init_list_with_nans_and_no_nans, ids=idfn)
[2022-02-10T14:57:45.236Z]     @pytest.mark.parametrize('conf', get_params(_confs, params_markers_for_confs), ids=idfn)
[2022-02-10T14:57:45.236Z]     @pytest.mark.parametrize('replace_mode', _replace_modes_non_distinct, ids=idfn)
[2022-02-10T14:57:45.236Z]     @pytest.mark.parametrize('aqe_enabled', ['false', 'true'], ids=idfn)
[2022-02-10T14:57:45.236Z]     def test_groupby_std_variance_partial_replace_fallback(data_gen,
[2022-02-10T14:57:45.236Z]                                                            conf,
[2022-02-10T14:57:45.236Z]                                                            replace_mode,
[2022-02-10T14:57:45.236Z]                                                            aqe_enabled):
[2022-02-10T14:57:45.236Z]         local_conf = copy_and_update(conf, {'spark.rapids.sql.hashAgg.replaceMode': replace_mode,
[2022-02-10T14:57:45.236Z]                                             'spark.sql.adaptive.enabled': aqe_enabled})
[2022-02-10T14:57:45.236Z]     
[2022-02-10T14:57:45.236Z]         exist_clz = ['StddevPop', 'StddevSamp', 'VariancePop', 'VarianceSamp',
[2022-02-10T14:57:45.236Z]                      'GpuStddevPop', 'GpuStddevSamp', 'GpuVariancePop', 'GpuVarianceSamp']
[2022-02-10T14:57:45.236Z]         non_exist_clz = []
[2022-02-10T14:57:45.236Z]     
[2022-02-10T14:57:45.236Z] >       assert_cpu_and_gpu_are_equal_collect_with_capture(
[2022-02-10T14:57:45.236Z]             lambda spark: gen_df(spark, data_gen, length=1000)
[2022-02-10T14:57:45.236Z]                 .groupby('a')
[2022-02-10T14:57:45.236Z]                 .agg(
[2022-02-10T14:57:45.236Z]                     f.stddev('b'),
[2022-02-10T14:57:45.236Z]                     f.stddev_pop('b'),
[2022-02-10T14:57:45.236Z]                     f.stddev_samp('b'),
[2022-02-10T14:57:45.236Z]                     f.variance('b'),
[2022-02-10T14:57:45.236Z]                     f.var_pop('b'),
[2022-02-10T14:57:45.236Z]                     f.var_samp('b')
[2022-02-10T14:57:45.236Z]                 ),
[2022-02-10T14:57:45.236Z]             exist_classes=','.join(exist_clz),
[2022-02-10T14:57:45.236Z]             non_exist_classes=','.join(non_exist_clz),
[2022-02-10T14:57:45.236Z]             conf=local_conf)
[2022-02-10T14:57:45.236Z] 
[2022-02-10T14:57:45.236Z] ../../src/main/python/hash_aggregate_test.py:1686: 
[2022-02-10T14:57:45.236Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2022-02-10T14:57:45.236Z] ../../src/main/python/asserts.py:326: in assert_cpu_and_gpu_are_equal_collect_with_capture
[2022-02-10T14:57:45.236Z]     from_gpu, gpu_df = with_gpu_session(bring_back, conf=conf)
[2022-02-10T14:57:45.236Z] ../../src/main/python/spark_session.py:103: in with_gpu_session
[2022-02-10T14:57:45.236Z]     return with_spark_session(func, conf=copy)
[2022-02-10T14:57:45.236Z] ../../src/main/python/spark_session.py:70: in with_spark_session
[2022-02-10T14:57:45.236Z]     ret = func(_spark)
[2022-02-10T14:57:45.236Z] ../../src/main/python/asserts.py:206: in bring_back
[2022-02-10T14:57:45.236Z]     return (df.collect(), df)
[2022-02-10T14:57:45.236Z] /var/lib/jenkins/spark/spark-3.1.2-bin-hadoop3.2/python/lib/pyspark.zip/pyspark/sql/dataframe.py:677: in collect
[2022-02-10T14:57:45.236Z]     sock_info = self._jdf.collectToPython()
[2022-02-10T14:57:45.236Z] /var/lib/jenkins/spark/spark-3.1.2-bin-hadoop3.2/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py:1304: in __call__
[2022-02-10T14:57:45.236Z]     return_value = get_return_value(
[2022-02-10T14:57:45.236Z] /var/lib/jenkins/spark/spark-3.1.2-bin-hadoop3.2/python/lib/pyspark.zip/pyspark/sql/utils.py:111: in deco
[2022-02-10T14:57:45.236Z]     return f(*a, **kw)
[2022-02-10T14:57:45.236Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2022-02-10T14:57:45.236Z] 
[2022-02-10T14:57:45.236Z] answer = 'xro1744804'
[2022-02-10T14:57:45.236Z] gateway_client = <py4j.java_gateway.GatewayClient object at 0x7fa5ead3b790>
[2022-02-10T14:57:45.236Z] target_id = 'o1744803', name = 'collectToPython'
[2022-02-10T14:57:45.236Z] 
[2022-02-10T14:57:45.236Z]     def get_return_value(answer, gateway_client, target_id=None, name=None):
[2022-02-10T14:57:45.236Z]         """Converts an answer received from the Java gateway into a Python object.
[2022-02-10T14:57:45.236Z]     
[2022-02-10T14:57:45.236Z]         For example, string representation of integers are converted to Python
[2022-02-10T14:57:45.236Z]         integer, string representation of objects are converted to JavaObject
[2022-02-10T14:57:45.236Z]         instances, etc.
[2022-02-10T14:57:45.236Z]     
[2022-02-10T14:57:45.236Z]         :param answer: the string returned by the Java gateway
[2022-02-10T14:57:45.236Z]         :param gateway_client: the gateway client used to communicate with the Java
[2022-02-10T14:57:45.236Z]             Gateway. Only necessary if the answer is a reference (e.g., object,
[2022-02-10T14:57:45.236Z]             list, map)
[2022-02-10T14:57:45.236Z]         :param target_id: the name of the object from which the answer comes from
[2022-02-10T14:57:45.236Z]             (e.g., *object1* in `object1.hello()`). Optional.
[2022-02-10T14:57:45.236Z]         :param name: the name of the member from which the answer comes from
[2022-02-10T14:57:45.236Z]             (e.g., *hello* in `object1.hello()`). Optional.
[2022-02-10T14:57:45.236Z]         """
[2022-02-10T14:57:45.236Z]         if is_error(answer)[0]:
[2022-02-10T14:57:45.236Z]             if len(answer) > 1:
[2022-02-10T14:57:45.236Z]                 type = answer[1]
[2022-02-10T14:57:45.236Z]                 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
[2022-02-10T14:57:45.236Z]                 if answer[1] == REFERENCE_TYPE:
[2022-02-10T14:57:45.236Z] >                   raise Py4JJavaError(
[2022-02-10T14:57:45.236Z]                         "An error occurred while calling {0}{1}{2}.\n".
[2022-02-10T14:57:45.236Z]                         format(target_id, ".", name), value)
[2022-02-10T14:57:45.236Z] E                   py4j.protocol.Py4JJavaError: An error occurred while calling o1744803.collectToPython.
[2022-02-10T14:57:45.236Z] E                   : org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 38959.0 failed 1 times, most recent failure: Lost task 2.0 in stage 38959.0 (TID 933569) (10.136.6.4 executor 3): java.lang.AssertionError:  value at 0 is null
[2022-02-10T14:57:45.236Z] E                   	at ai.rapids.cudf.HostColumnVectorCore.assertsForGet(HostColumnVectorCore.java:230)
[2022-02-10T14:57:45.236Z] E                   	at ai.rapids.cudf.HostColumnVectorCore.getDouble(HostColumnVectorCore.java:321)
[2022-02-10T14:57:45.236Z] E                   	at com.nvidia.spark.rapids.RapidsHostColumnVectorCore.getDouble(RapidsHostColumnVectorCore.java:124)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.vectorized.ColumnarBatchRow.getDouble(ColumnarBatch.java:211)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.writeFields_0_0$(Unknown Source)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
[2022-02-10T14:57:45.236Z] E                   	at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.agg_doAggregateWithKeys_0$(Unknown Source)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:755)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:345)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.scheduler.Task.run(Task.scala:131)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
[2022-02-10T14:57:45.236Z] E                   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[2022-02-10T14:57:45.236Z] E                   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[2022-02-10T14:57:45.236Z] E                   	at java.lang.Thread.run(Thread.java:748)
[2022-02-10T14:57:45.236Z] E                   
[2022-02-10T14:57:45.236Z] E                   Driver stacktrace:
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2258)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2207)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2206)
[2022-02-10T14:57:45.236Z] E                   	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
[2022-02-10T14:57:45.236Z] E                   	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
[2022-02-10T14:57:45.236Z] E                   	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2206)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1079)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1079)
[2022-02-10T14:57:45.236Z] E                   	at scala.Option.foreach(Option.scala:407)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1079)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2445)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2387)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2376)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:868)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2196)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2217)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2236)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2261)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1030)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.rdd.RDD.withScope(RDD.scala:414)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.rdd.RDD.collect(RDD.scala:1029)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:390)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.Dataset.$anonfun$collectToPython$1(Dataset.scala:3519)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3687)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3685)
[2022-02-10T14:57:45.236Z] E                   	at org.apache.spark.sql.Dataset.collectToPython(Dataset.scala:3516)
[2022-02-10T14:57:45.236Z] E                   	at sun.reflect.GeneratedMethodAccessor85.invoke(Unknown Source)
[2022-02-10T14:57:45.236Z] E                   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2022-02-10T14:57:45.236Z] E                   	at java.lang.reflect.Method.invoke(Method.java:498)
[2022-02-10T14:57:45.236Z] E                   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
[2022-02-10T14:57:45.237Z] E                   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
[2022-02-10T14:57:45.237Z] E                   	at py4j.Gateway.invoke(Gateway.java:282)
[2022-02-10T14:57:45.237Z] E                   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
[2022-02-10T14:57:45.237Z] E                   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
[2022-02-10T14:57:45.237Z] E                   	at py4j.GatewayConnection.run(GatewayConnection.java:238)
[2022-02-10T14:57:45.237Z] E                   	at java.lang.Thread.run(Thread.java:748)
[2022-02-10T14:57:45.237Z] E                   Caused by: java.lang.AssertionError:  value at 0 is null
[2022-02-10T14:57:45.237Z] E                   	at ai.rapids.cudf.HostColumnVectorCore.assertsForGet(HostColumnVectorCore.java:230)
[2022-02-10T14:57:45.237Z] E                   	at ai.rapids.cudf.HostColumnVectorCore.getDouble(HostColumnVectorCore.java:321)
[2022-02-10T14:57:45.237Z] E                   	at com.nvidia.spark.rapids.RapidsHostColumnVectorCore.getDouble(RapidsHostColumnVectorCore.java:124)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.sql.vectorized.ColumnarBatchRow.getDouble(ColumnarBatch.java:211)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.writeFields_0_0$(Unknown Source)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
[2022-02-10T14:57:45.237Z] E                   	at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.agg_doAggregateWithKeys_0$(Unknown Source)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:755)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:345)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.scheduler.Task.run(Task.scala:131)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
[2022-02-10T14:57:45.237Z] E                   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
[2022-02-10T14:57:45.237Z] E                   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[2022-02-10T14:57:45.237Z] E                   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[2022-02-10T14:57:45.237Z] E                   	... 1 more
[2022-02-10T14:57:45.237Z] 
[2022-02-10T14:57:45.237Z] /var/lib/jenkins/spark/spark-3.1.2-bin-hadoop3.2/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py:326: Py4JJavaError
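
For reference, a minimal standalone sketch of the query the failing test runs, outside the integration-test harness. The RAPIDS configs mirror the failing parametrization (partial replace mode, tiny batch size, AQE off); it assumes a Spark session with the RAPIDS plugin already on the classpath, and the data generator is a simplified stand-in for gen_df rather than the test's RepeatSeq(String)/Integer/Long generators:

```python
# Hypothetical reproduction sketch, not the integration test itself.
from pyspark.sql import SparkSession
import pyspark.sql.functions as f

spark = (SparkSession.builder
         .appName("groupby-std-variance-repro-sketch")
         # Configs from the failing parametrization; they only take effect
         # if the RAPIDS plugin is actually installed in the session.
         .config("spark.rapids.sql.batchSizeBytes", "250")
         .config("spark.rapids.sql.hashAgg.replaceMode", "partial")
         .config("spark.sql.adaptive.enabled", "false")
         .getOrCreate())

# Simplified stand-in for gen_df(spark, data_gen, length=1000): a repeating
# string key column plus integer/long value columns, with some nulls.
rows = [(str(i % 10) if i % 7 else None,
         i if i % 5 else None,
         i * 10) for i in range(1000)]
df = spark.createDataFrame(rows, "a string, b int, c long")

# Same aggregation the test issues on the grouped data.
result = (df.groupby("a")
            .agg(f.stddev("b"),
                 f.stddev_pop("b"),
                 f.stddev_samp("b"),
                 f.variance("b"),
                 f.var_pop("b"),
                 f.var_samp("b")))
result.collect()
```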
jlowe added the bug, ? - Needs Triage, and P0 labels on Feb 10, 2022
sameerz removed the ? - Needs Triage label on Feb 11, 2022
sameerz added this to the Feb 14 - Feb 25 milestone on Feb 11, 2022