[BUG] AcceleratedColumnarToRowIterator queue empty #1195

Closed
abellina opened this issue Nov 24, 2020 · 1 comment · Fixed by #1204
Labels: bug (Something isn't working), P0 (Must have for release)

Comments

@abellina (Collaborator)

I noticed this while running Q51 from TPCDS at 3TB. It's the only Exception seen in the executors.

It is not easy to reproduce outside of Q51, but I think adding some debug prints may help here.

20/11/18 07:22:19 ERROR Executor: Exception in task 123.0 in stage 79.0 (TID 2407)
java.util.NoSuchElementException: queue empty
        at scala.collection.mutable.Queue.dequeue(Queue.scala:67)
        at com.nvidia.spark.rapids.AcceleratedColumnarToRowIterator.$anonfun$loadNextBatch$4(GpuColumnarToRowExec.scala:124)
        at com.nvidia.spark.rapids.AcceleratedColumnarToRowIterator.$anonfun$loadNextBatch$4$adapted(GpuColumnarToRowExec.scala:120)
        at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:46)
        at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:44)
        at com.nvidia.spark.rapids.AcceleratedColumnarToRowIterator.withResource(GpuColumnarToRowExec.scala:38)
        at com.nvidia.spark.rapids.AcceleratedColumnarToRowIterator.$anonfun$loadNextBatch$3(GpuColumnarToRowExec.scala:120)
        at com.nvidia.spark.rapids.AcceleratedColumnarToRowIterator.$anonfun$loadNextBatch$3$adapted(GpuColumnarToRowExec.scala:119)
        at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:28)
        at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:26)
        at com.nvidia.spark.rapids.AcceleratedColumnarToRowIterator.withResource(GpuColumnarToRowExec.scala:38)
        at com.nvidia.spark.rapids.AcceleratedColumnarToRowIterator.$anonfun$loadNextBatch$2(GpuColumnarToRowExec.scala:119)
        at com.nvidia.spark.rapids.AcceleratedColumnarToRowIterator.$anonfun$loadNextBatch$2$adapted(GpuColumnarToRowExec.scala:118)
        at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:28)
        at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:26)
        at com.nvidia.spark.rapids.AcceleratedColumnarToRowIterator.withResource(GpuColumnarToRowExec.scala:38)
        at com.nvidia.spark.rapids.AcceleratedColumnarToRowIterator.$anonfun$loadNextBatch$1(GpuColumnarToRowExec.scala:118)
        at com.nvidia.spark.rapids.AcceleratedColumnarToRowIterator.$anonfun$loadNextBatch$1$adapted(GpuColumnarToRowExec.scala:102)
        at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:28)
        at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:26)
        at com.nvidia.spark.rapids.AcceleratedColumnarToRowIterator.withResource(GpuColumnarToRowExec.scala:38)
        at com.nvidia.spark.rapids.AcceleratedColumnarToRowIterator.loadNextBatch(GpuColumnarToRowExec.scala:102)
        at com.nvidia.spark.rapids.AcceleratedColumnarToRowIterator.hasNext(GpuColumnarToRowExec.scala:136)
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
        at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:31)
        at org.sparkproject.guava.collect.Ordering.leastOf(Ordering.java:628)
        at org.apache.spark.util.collection.Utils$.takeOrdered(Utils.scala:37)
        at org.apache.spark.rdd.RDD.$anonfun$takeOrdered$2(RDD.scala:1492)
        at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:837)
        at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:837)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:313)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:127)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
@abellina added the bug (Something isn't working) and ? - Needs Triage (Need team to review and classify) labels Nov 24, 2020
@abellina (Collaborator, Author)

More info here. @revans2 suspected this was caused by an empty batch; the 0 in the last "will dequeue" line below indicates a 0-row batch.

20/11/24 15:46:25 WARN AcceleratedColumnarToRowIterator: will dequeue: 6 9
20/11/24 15:46:25 WARN AcceleratedColumnarToRowIterator: will dequeue: 6 4
20/11/24 15:46:25 WARN AcceleratedColumnarToRowIterator: will dequeue: 6 7
20/11/24 15:46:25 WARN AcceleratedColumnarToRowIterator: will dequeue: 6 0
20/11/24 15:46:25 ERROR Executor: Exception in task 123.0 in stage 51.0 (TID 1213)
java.util.NoSuchElementException: queue empty
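
A minimal sketch of the suspected failure mode, not the plugin's actual code and not the fix from #1204. It assumes a 0-row batch produces nothing to enqueue, so an unconditional `dequeue()` hits an empty `scala.collection.mutable.Queue`. `Batch`, `convert`, `loadUnguarded` and `loadGuarded` are hypothetical names made up for illustration.

```scala
import scala.collection.mutable

object EmptyBatchSketch {
  // Hypothetical stand-in for a columnar batch; only the row count matters here.
  final case class Batch(numRows: Int)

  // Assumption: conversion yields one chunk of rows per non-empty batch
  // and nothing at all for a 0-row batch.
  def convert(batch: Batch): Seq[Seq[Int]] =
    if (batch.numRows == 0) Seq.empty
    else Seq((0 until batch.numRows).toVector)

  // Unguarded: dequeues without checking that conversion enqueued anything.
  // Throws java.util.NoSuchElementException when given a 0-row batch.
  def loadUnguarded(batch: Batch): Seq[Int] = {
    val pending = mutable.Queue[Seq[Int]]()
    convert(batch).foreach(chunk => pending.enqueue(chunk))
    pending.dequeue()
  }

  // Guarded: keeps pulling batches until one actually yields rows,
  // and returns None when the input is exhausted.
  def loadGuarded(batches: Iterator[Batch]): Option[Seq[Int]] = {
    val pending = mutable.Queue[Seq[Int]]()
    while (pending.isEmpty && batches.hasNext) {
      convert(batches.next()).foreach(chunk => pending.enqueue(chunk))
    }
    if (pending.nonEmpty) Some(pending.dequeue()) else None
  }
}
```

Under these assumptions, `EmptyBatchSketch.loadUnguarded(EmptyBatchSketch.Batch(0))` throws the same exception type as in the trace above, while `loadGuarded` skips the empty batch and moves on to the next one.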

@sameerz added the P0 (Must have for release) label and removed the ? - Needs Triage (Need team to review and classify) label Nov 24, 2020
@sameerz added this to the Nov 23 - Dec 4 milestone Nov 24, 2020