[BUG] GC/OOM on parquet_test.py::test_small_file_memory #9135

Open
mythrocks opened this issue Aug 29, 2023 · 14 comments
Labels: bug (Something isn't working)

@mythrocks (Collaborator) commented Aug 29, 2023:

The parquet_test.py::test_small_file_memory test runs out of memory on CDH (at least on the equivalent of Spark 3.3):

$ SPARK_HOME=$SPARK_HOME ./run_pyspark_from_build.sh -k test_small_file_memory
...
23/08/29 21:17:50 ERROR executor.Executor: [Executor task launch worker for task 2.0 in stage 5.0 (TID 2015)]: Exception in task 2.0 in stage 5.0 (TID 2015)
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
	at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_322]
	at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[?:1.8.0_322]
	at com.nvidia.spark.rapids.MultiFileCoalescingPartitionReaderBase.$anonfun$readPartFiles$7(GpuMultiFileReader.scala:1216) ~[spark3xx-common/:?]
	at com.nvidia.spark.rapids.MultiFileCoalescingPartitionReaderBase.$anonfun$readPartFiles$7$adapted(GpuMultiFileReader.scala:1215) ~[spark3xx-common/:?]
	at scala.collection.Iterator.foreach(Iterator.scala:943) ~[scala-library-2.12.15.jar:?]
	at scala.collection.Iterator.foreach$(Iterator.scala:943) ~[scala-library-2.12.15.jar:?]
...

The failing test takes about a minute to run on my workstation, but considerably longer on CDH before it fails.

As indicated above, this test was run manually on CDH. As a control, I was able to run window_function_test.py on the same setup without failures.
Additionally, parquet_test.py::test_small_file_memory runs properly on Apache Spark 3.2.x:

$ SPARK_HOME=$SPARK_HOME ./run_pyspark_from_build.sh -k test_small_file_memory
...
../../src/main/python/parquet_test.py::test_small_file_memory[] 23/08/29 21:20:32 WARN SQLConf: The SQL config 'spark.sql.legacy.parquet.datetimeRebaseModeInWrite' has been deprecated in Spark v3.2 and may be removed in the future. Use 'spark.sql.parquet.datetimeRebaseModeInWrite' instead.
23/08/29 21:20:32 WARN SQLConf: The SQL config 'spark.sql.legacy.parquet.int96RebaseModeInWrite' has been deprecated in Spark v3.2 and may be removed in the future. Use 'spark.sql.parquet.int96RebaseModeInWrite' instead.
23/08/29 21:20:42 WARN SQLConf: The SQL config 'spark.sql.legacy.parquet.int96RebaseModeInWrite' has been deprecated in Spark v3.2 and may be removed in the future. Use 'spark.sql.parquet.int96RebaseModeInWrite' instead.
23/08/29 21:20:42 WARN SQLConf: The SQL config 'spark.sql.legacy.parquet.datetimeRebaseModeInWrite' has been deprecated in Spark v3.2 and may be removed in the future. Use 'spark.sql.parquet.datetimeRebaseModeInWrite' instead.
PASSED   [ 50%]
../../src/main/python/parquet_test.py::test_small_file_memory[parquet] 23/08/29 21:20:45 WARN SQLConf: The SQL config 'spark.sql.legacy.parquet.datetimeRebaseModeInWrite' has been deprecated in Spark v3.2 and may be removed in the future. Use 'spark.sql.parquet.datetimeRebaseModeInWrite' instead.
23/08/29 21:20:45 WARN SQLConf: The SQL config 'spark.sql.legacy.parquet.int96RebaseModeInWrite' has been deprecated in Spark v3.2 and may be removed in the future. Use 'spark.sql.parquet.int96RebaseModeInWrite' instead.
23/08/29 21:20:53 WARN SQLConf: The SQL config 'spark.sql.legacy.parquet.int96RebaseModeInWrite' has been deprecated in Spark v3.2 and may be removed in the future. Use 'spark.sql.parquet.int96RebaseModeInWrite' instead.
23/08/29 21:20:53 WARN SQLConf: The SQL config 'spark.sql.legacy.parquet.datetimeRebaseModeInWrite' has been deprecated in Spark v3.2 and may be removed in the future. Use 'spark.sql.parquet.datetimeRebaseModeInWrite' instead.
PASSED [100%]

It appears that something is indeed amiss.

@mythrocks mythrocks added bug Something isn't working ? - Needs Triage Need team to review and classify and removed ? - Needs Triage Need team to review and classify labels Aug 29, 2023
@mythrocks (Collaborator, Author) commented:

I should point out that this is a pre-existing test (added to stress-test coalesced reads in Parquet), introduced literally 3 years ago: 7ac919b.

@mythrocks (Collaborator, Author) commented:

I'm wondering if this message has anything to do with the problem:

23/08/29 22:00:16 WARN  rapids.MultiFileReaderThreadPool: [Executor task launch worker for task 2.0 in stage 5.0 (TID 2015)]: Configuring the file reader thread pool with a max of 128 threads instead of spark.rapids.sql.multiThreadedRead.numThreads = 20
23/08/29 22:00:16 WARN  rapids.MultiFileReaderThreadPool: [Executor task launch worker for task 3.0 in stage 5.0 (TID 2016)]: Configuring the file reader thread pool with a max of 128 threads instead of spark.rapids.sql.multiThreadedRead.numThreads = 20
23/08/29 22:00:16 WARN  rapids.MultiFileReaderThreadPool: [Executor task launch worker for task 0.0 in stage 5.0 (TID 2013)]: Configuring the file reader thread pool with a max of 128 threads instead of spark.rapids.sql.multiThreadedRead.numThreads = 20
23/08/29 22:00:16 WARN  rapids.MultiFileReaderThreadPool: [Executor task launch worker for task 1.0 in stage 5.0 (TID 2014)]: Configuring the file reader thread pool with a max of 128 threads instead of spark.rapids.sql.multiThreadedRead.numThreads = 20

@mythrocks (Collaborator, Author) commented Aug 29, 2023:

I've done some more digging. I think this might have to do with how the thread pool size is set (or reset).

Here is an easy repro:

spark-shell --jars /home/cloudera/mithunr/spark-rapids/dist/target/rapids-4-spark_2.12-23.10.0-SNAPSHOT-cuda11.jar --conf spark.plugins=com.nvidia.spark.SQLPlugin --conf spark.rapids.sql.explain=ALL --conf spark.kryo.registrator=com.nvidia.spark.rapids.GpuKryoRegistrator --conf fs.defaultFS="file:///" --conf spark.rapids.sql.format.parquet.reader.type=COALESCING --conf spark.sql.sources.useV1SourceList="" --conf spark.sql.files.maxPartitionBytes=1g --master local[1]
// Write corpus.
spark.conf.set("spark.rapids.sql.enabled", false)
(0 to 2048).toDF.repartition(2000).write.mode("overwrite").parquet("file:///tmp/myth_parq_ints")

// Read for boom.
spark.conf.set("spark.rapids.sql.enabled", true)
spark.read.parquet("file:///tmp/myth_parq_ints").show
  1. It does not matter whether the data resides on HDFS or on local disk.
  2. The PERFILE and MULTITHREADED readers are fine; only the COALESCING reader fails.

I'm not yet convinced that this behaviour is specific to CDH. I wonder if this could be an artifact of having a large number of threads:

$ lscpu | fgrep CPU\(s\) | head -1
CPU(s):                128

@mythrocks (Collaborator, Author) commented Aug 30, 2023:

More experiments:

  1. Reproduced the same crash with Apache Spark 3.3.0 on the same node, so this looks to be independent of CDH.
  2. The read succeeds if Spark is started with the PERFILE reader and then switched to COALESCING.

@jlowe (Member) commented Aug 30, 2023:

What is the executor core count set to on the cluster? Is the test running in local mode or in cluster mode?

@jlowe (Member) commented Aug 30, 2023:

I think this is related to #9051. I suspect we were incorrectly using the driver core count instead of the (likely larger) executor core count when setting up the multithreaded reader pool. The easiest fix is to give the executors more heap space on the Cloudera cluster setup, if that's possible. The long-term fix is budgeting host memory usage; that initiative is underway for off-heap memory, but this is a heap OOM. It would be interesting to get a heap dump on OOM to see what's taking up all the heap space.

@mythrocks (Collaborator, Author) commented:

> What is the executor core count set to on the cluster? Is the test running in local mode or in cluster mode?

I've run it in local mode, for both CDH and Apache Spark. I have set --master local[1]. That should imply a single core, no?

I've captured a heap-dump on OOM. Analyzing it now.

@mythrocks (Collaborator, Author) commented Aug 30, 2023:

The heap dump indicates that there are 128 java.lang.Threads, each holding a local byte[] of about 8 MB. That exhausts the default 1 GB heap in my runs. IMO this implies that the large core count might be part of the problem.

I see that the COALESCING multi-threaded Parquet reader decides to use 128 threads in its thread pool, thereby exhausting the heap:

23/08/30 22:09:33 WARN MultiFileReaderThreadPool: Configuring the file reader thread pool with a max of 128 threads instead of spark.rapids.sql.multiThreadedRead.numThreads = 20

This is with spark.executor.cores not set to anything, which probably causes the pool to consume all available CPUs.
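
As a back-of-the-envelope check, those numbers line up exactly with the default heap (a REPL-style sketch; the 8 MB-per-thread and 1 GB figures are the ones observed in this run):

// Rough heap-budget check for the COALESCING reader's thread pool (scala REPL style).
val threadPoolSize    = 128                      // one reader thread per visible core
val perThreadBuffer   = 8L * 1024 * 1024         // ~8 MB temporary byte[] per thread
val defaultDriverHeap = 1L * 1024 * 1024 * 1024  // 1 GB default heap in these runs

val totalBufferBytes = threadPoolSize * perThreadBuffer
println(s"${totalBufferBytes >> 20} MB of buffers vs. a ${defaultDriverHeap >> 20} MB heap")
// prints: 1024 MB of buffers vs. a 1024 MB heap -- the buffers alone fill the default heap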

There might be a couple of ways around the OOM:

  1. As @jlowe suggested already, we could bump the heap for the test. (In my run, --driver-memory 8g worked, even with 128 threads.)
  2. We could set spark.executor.cores to a specific, smaller value.

I'm inclined to try the latter. I'll update here with results.

Edit: There might be a longer-term solution here: sizing the GpuMultiFileReader's thread pool as a function of both the number of cores and the available memory.
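
For illustration, that kind of sizing heuristic might look something like the sketch below (the function name, signature, and the 25% heap fraction are hypothetical assumptions, not anything in GpuMultiFileReader today):

// Hypothetical sketch: cap the reader thread pool by both core count and heap budget,
// instead of by the raw core count alone.
def chooseReaderPoolSize(requestedThreads: Int,
                         perThreadBufferBytes: Long,
                         heapFraction: Double = 0.25): Int = {
  val cores    = Runtime.getRuntime.availableProcessors()
  val maxHeap  = Runtime.getRuntime.maxMemory()
  val byMemory = ((maxHeap * heapFraction) / perThreadBufferBytes).toInt
  math.max(1, math.min(requestedThreads, math.min(cores, byMemory)))
}

// e.g. on a 128-core box with a 1 GB heap and 8 MB buffers:
// chooseReaderPoolSize(128, 8L << 20)  ==> 32 threads instead of 128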

@jlowe (Member) commented Aug 30, 2023:

We may want to revisit the 8 MB heap buffer that is being used. This was copied from some existing Parquet reading code, and it could make sense for producing large, chunky reads, especially for cloud applications, but asking for 8 MB per thread when there are hundreds of threads is problematic. We may want to automatically scale that buffer size based on the number of threads, or simply use a smaller value for the temporary buffer.
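
For illustration, scaling the temporary buffer by the thread count might look something like this sketch (the names, the 256 MB aggregate budget, and the 64 KB floor are assumptions; only the 8 MB ceiling comes from the discussion above):

// Hypothetical sketch: shrink the per-thread copy buffer as the thread count grows,
// keeping the aggregate buffer footprint within a fixed budget.
def chooseCopyBufferSize(numThreads: Int,
                         maxPerThreadBytes: Long = 8L << 20,     // current 8 MB ceiling
                         aggregateBudgetBytes: Long = 256L << 20): Long = {
  val scaled = aggregateBudgetBytes / math.max(1, numThreads)
  math.max(64L << 10, math.min(maxPerThreadBytes, scaled))       // never below 64 KB
}

// chooseCopyBufferSize(4)   ==> 8 MB (unchanged for small pools)
// chooseCopyBufferSize(128) ==> 2 MB (keeps 128 threads within the 256 MB budget)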

@mythrocks (Collaborator, Author) commented Aug 30, 2023:

Ah, it turns out that one can't simply set spark.executor.cores for a specific test (test_small_file_memory); the Spark application is already up by that point.

One option is to set spark.executor.cores in run_pyspark_from_build.sh. I don't know if that would be acceptable, but I did verify that it allows this test to run.

mythrocks added a commit to mythrocks/spark-rapids that referenced this issue Sep 1, 2023
Fixes NVIDIA#9135. (By workaround.)

This change sets `spark.executor.cores` to `10`, if it is unset. This allows
integration tests to work around the failure seen in `parquet_test.py:test_small_file_memory`,
where the `COALESCING` Parquet reader's thread pool accidentally uses 128 threads with 8MB memory
each, thus consuming the entire heap.

Note that this is a bit of a workaround.  A more robust solution would be to scale the Parquet
reader's buffers based on the amount of available memory, and the number of threads.

Signed-off-by: MithunR <mythrocks@gmail.com>
@mythrocks (Collaborator, Author) commented:

There is a tentative workaround to get this test to run on the large CDH nodes here:
#9177.

The more robust fix (to size the Parquet reader's buffers "appropriately") may be tackled later on.

@mythrocks (Collaborator, Author) commented:

I have raised #9269 and #9271 to mitigate this problem.

@pxLi pxLi changed the title [BUG] GC/OOM on parquet_test.py::test_small_file_memory on CDH [BUG] GC/OOM on parquet_test.py::test_small_file_memory Sep 27, 2023
@pxLi (Collaborator) commented Sep 27, 2023:

We also saw this case fail constantly with OOM while testing in an ARM environment
(CI HW specs: GPU: A30; CPU: 128 cores, which is similar to the CDH node):

    def test_small_file_memory(spark_tmp_path, v1_enabled_list):
        # stress the memory usage by creating a lot of small files.
        # The more files we combine the more the offsets will be different which will cause
        # footer size to change.
        # Without the addition of extraMemory in GpuParquetScan this would cause reallocations
        # of the host memory buffers.
        cols = [string_gen] * 4
        gen_list = [('_c' + str(i), gen ) for i, gen in enumerate(cols)]
        first_data_path = spark_tmp_path + '/PARQUET_DATA'
        with_cpu_session(
                lambda spark : gen_df(spark, gen_list).repartition(2000).write.parquet(first_data_path),
                conf=rebase_write_corrected_conf)
        data_path = spark_tmp_path + '/PARQUET_DATA'
>       assert_gpu_and_cpu_are_equal_collect(
            lambda spark : spark.read.parquet(data_path),
            conf={'spark.rapids.sql.format.parquet.reader.type': 'COALESCING',
                  'spark.sql.sources.useV1SourceList': v1_enabled_list,
                  'spark.sql.files.maxPartitionBytes': "1g"})

org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 19511.0 failed 1 times, most recent failure: Lost task 3.0 in stage 19511.0 (TID 77495) (verify-pxli-mvn-verify-github-jtzwk-6nhg1 executor driver): java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space

00:28:11  E                   Driver stacktrace:
00:28:11  E                   	at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2253)
00:28:11  E                   	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2202)
00:28:11  E                   	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2201)
00:28:11  E                   	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
00:28:11  E                   	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
00:28:11  E                   	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
00:28:11  E                   	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2201)
00:28:11  E                   	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1078)
00:28:11  E                   	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1078)
00:28:11  E                   	at scala.Option.foreach(Option.scala:407)
00:28:11  E                   	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1078)
00:28:11  E                   	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2440)
00:28:11  E                   	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2382)
00:28:11  E                   	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2371)
00:28:11  E                   	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
00:28:11  E                   	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:868)
00:28:11  E                   	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2202)
00:28:11  E                   	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2223)
00:28:11  E                   	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2242)
00:28:11  E                   	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2267)
00:28:11  E                   	at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1030)
00:28:11  E                   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
00:28:11  E                   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
00:28:11  E                   	at org.apache.spark.rdd.RDD.withScope(RDD.scala:414)
00:28:11  E                   	at org.apache.spark.rdd.RDD.collect(RDD.scala:1029)
00:28:11  E                   	at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:390)
00:28:11  E                   	at org.apache.spark.sql.Dataset.$anonfun$collectToPython$1(Dataset.scala:3519)
00:28:11  E                   	at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3687)
00:28:11  E                   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
00:28:11  E                   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
00:28:11  E                   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
00:28:11  E                   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
00:28:11  E                   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
00:28:11  E                   	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3685)
00:28:11  E                   	at org.apache.spark.sql.Dataset.collectToPython(Dataset.scala:3516)
00:28:11  E                   	at sun.reflect.GeneratedMethodAccessor142.invoke(Unknown Source)
00:28:11  E                   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
00:28:11  E                   	at java.lang.reflect.Method.invoke(Method.java:498)
00:28:11  E                   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
00:28:11  E                   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
00:28:11  E                   	at py4j.Gateway.invoke(Gateway.java:282)
00:28:11  E                   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
00:28:11  E                   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
00:28:11  E                   	at py4j.GatewayConnection.run(GatewayConnection.java:238)
00:28:11  E                   	at java.lang.Thread.run(Thread.java:750)
00:28:11  E                   Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
00:28:11  E                   	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
00:28:11  E                   	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
00:28:11  E                   	at com.nvidia.spark.rapids.MultiFileCoalescingPartitionReaderBase.$anonfun$readPartFiles$7(GpuMultiFileReader.scala:1216)
00:28:11  E                   	at com.nvidia.spark.rapids.MultiFileCoalescingPartitionReaderBase.$anonfun$readPartFiles$7$adapted(GpuMultiFileReader.scala:1215)
00:28:11  E                   	at scala.collection.Iterator.foreach(Iterator.scala:941)
00:28:11  E                   	at scala.collection.Iterator.foreach$(Iterator.scala:941)
00:28:11  E                   	at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
00:28:11  E                   	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
00:28:11  E                   	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
00:28:11  E                   	at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
00:28:11  E                   	at com.nvidia.spark.rapids.MultiFileCoalescingPartitionReaderBase.$anonfun$readPartFiles$4(GpuMultiFileReader.scala:1215)
00:28:11  E                   	at com.nvidia.spark.rapids.Arm$.closeOnExcept(Arm.scala:88)
00:28:11  E                   	at com.nvidia.spark.rapids.MultiFileCoalescingPartitionReaderBase.$anonfun$readPartFiles$1(GpuMultiFileReader.scala:1198)
00:28:11  E                   	at com.nvidia.spark.rapids.Arm$.withResource(Arm.scala:29)
00:28:11  E                   	at com.nvidia.spark.rapids.MultiFileCoalescingPartitionReaderBase.readPartFiles(GpuMultiFileReader.scala:1185)
00:28:11  E                   	at com.nvidia.spark.rapids.MultiFileCoalescingPartitionReaderBase.$anonfun$readBatch$1(GpuMultiFileReader.scala:1146)
00:28:11  E                   	at com.nvidia.spark.rapids.Arm$.withResource(Arm.scala:29)
00:28:11  E                   	at com.nvidia.spark.rapids.MultiFileCoalescingPartitionReaderBase.readBatch(GpuMultiFileReader.scala:1125)
00:28:11  E                   	at com.nvidia.spark.rapids.MultiFileCoalescingPartitionReaderBase.next(GpuMultiFileReader.scala:1098)
00:28:11  E                   	at com.nvidia.spark.rapids.PartitionIterator.hasNext(dataSourceUtil.scala:29)
00:28:11  E                   	at com.nvidia.spark.rapids.MetricsBatchIterator.hasNext(dataSourceUtil.scala:46)
00:28:11  E                   	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
00:28:11  E                   	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
00:28:11  E                   	at com.nvidia.spark.rapids.ColumnarToRowIterator.$anonfun$fetchNextBatch$3(GpuColumnarToRowExec.scala:285)
00:28:11  E                   	at com.nvidia.spark.rapids.Arm$.withResource(Arm.scala:29)
00:28:11  E                   	at com.nvidia.spark.rapids.ColumnarToRowIterator.fetchNextBatch(GpuColumnarToRowExec.scala:284)
00:28:11  E                   	at com.nvidia.spark.rapids.ColumnarToRowIterator.loadNextBatch(GpuColumnarToRowExec.scala:257)
00:28:11  E                   	at com.nvidia.spark.rapids.ColumnarToRowIterator.hasNext(GpuColumnarToRowExec.scala:301)
00:28:11  E                   	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
00:28:11  E                   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:345)
00:28:11  E                   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
00:28:11  E                   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
00:28:11  E                   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
00:28:11  E                   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
00:28:11  E                   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
00:28:11  E                   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
00:28:11  E                   	at org.apache.spark.scheduler.Task.run(Task.scala:131)
00:28:11  E                   	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
00:28:11  E                   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
00:28:11  E                   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
00:28:11  E                   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
00:28:11  E                   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
00:28:11  E                   	... 1 more
00:28:11  E                   Caused by: java.lang.OutOfMemoryError: Java heap space

I will re-verify after #9269 and #9271 are resolved.

@mythrocks (Collaborator, Author) commented:

It stands to reason that it fails: a high core count combined with a low heap allocation. :/
