Stop double closing SerializeBatchDeserializeHostBuffer host buffers when running with Spark 3.2 #3422

Merged · 1 commit into NVIDIA:branch-21.10 on Sep 10, 2021

Conversation

@andygrove (Contributor) commented:

Closes #3416

This PR fixes a bug that we don't seem to have been hitting until Spark 3.2.0.

SerializeConcatHostBuffersDeserializeBatch.close closes all of its SerializeBatchDeserializeHostBuffer instances, which in turn closes the underlying HostMemoryBuffer instances. SerializeConcatHostBuffersDeserializeBatch.close then also tries to close the underlying HostMemoryBuffer instances directly, causing the "Close called too many times" error.
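As a rough illustration of the ownership problem (a minimal Scala sketch with simplified stand-in classes, not the actual plugin code):

class RefCountedBuffer extends AutoCloseable {
  // Stand-in for HostMemoryBuffer: tolerates exactly one close()
  private var refCount = 1
  override def close(): Unit = {
    if (refCount <= 0) {
      throw new IllegalStateException("Close called too many times")
    }
    refCount -= 1
  }
}

class ChildWrapper(val buffer: RefCountedBuffer) extends AutoCloseable {
  // Stand-in for SerializeBatchDeserializeHostBuffer: owns its buffer
  override def close(): Unit = buffer.close()
}

class ParentBatch(children: Seq[ChildWrapper]) extends AutoCloseable {
  // Stand-in for SerializeConcatHostBuffersDeserializeBatch
  private val buffers = children.map(_.buffer)
  override def close(): Unit = {
    children.foreach(_.close()) // closes each underlying buffer once...
    buffers.foreach(_.close())  // ...then closes the same buffers again: the bug
  }
}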

I added debug logging to both close methods and did not see them get called with earlier Spark versions, so maybe this code path was only being exercised in contexts where the existing code was safe? I feel like I may be missing something here.

Here is the output with Spark 3.2.0 demonstrating that the problem described in #3416 is now resolved.

I ran mvn verify with the default Spark version and did not see any additional host memory leaks reported.

[BENCHMARK RUNNER] [q58] Iteration 0 took 17242 msec. Status: Completed.
[BENCHMARK RUNNER] [q58] Saving benchmark report to tpcds-adhoc-q58-1631195118537.json
21/09/09 13:45:36 INFO RapidsBufferCatalog: Closing storage
21/09/09 13:45:36 INFO BlockManagerInfo: Removed broadcast_48_piece0 on 192.168.0.6:39771 in memory (size: 72.5 KiB, free: 34.0 GiB)
21/09/09 13:45:38 INFO BlockManagerInfo: Removed broadcast_25_piece0 on 192.168.0.6:39771 in memory (size: 34.1 KiB, free: 34.0 GiB)
21/09/09 13:45:38 INFO BlockManagerInfo: Removed broadcast_42_piece0 on 192.168.0.6:39771 in memory (size: 34.1 KiB, free: 34.0 GiB)
21/09/09 13:45:38 INFO BlockManagerInfo: Removed broadcast_32_piece0 on 192.168.0.6:39771 in memory (size: 34.1 KiB, free: 34.0 GiB)
21/09/09 13:45:38 INFO BlockManagerInfo: Removed broadcast_28_piece0 on 192.168.0.6:39771 in memory (size: 2.1 MiB, free: 34.0 GiB)
21/09/09 13:45:38 INFO BlockManagerInfo: Removed broadcast_39_piece0 on 192.168.0.6:39771 in memory (size: 34.1 KiB, free: 34.0 GiB)
21/09/09 13:45:38 INFO BlockManagerInfo: Removed broadcast_36_piece0 on 192.168.0.6:39771 in memory (size: 354.0 B, free: 34.0 GiB)
21/09/09 13:45:38 INFO BlockManagerInfo: Removed broadcast_37_piece0 on 192.168.0.6:39771 in memory (size: 354.0 B, free: 34.0 GiB)
21/09/09 13:45:38 INFO BlockManagerInfo: Removed broadcast_24_piece0 on 192.168.0.6:39771 in memory (size: 34.1 KiB, free: 34.0 GiB)
21/09/09 13:45:38 INFO BlockManagerInfo: Removed broadcast_30_piece0 on 192.168.0.6:39771 in memory (size: 34.1 KiB, free: 34.0 GiB)
21/09/09 13:45:38 INFO BlockManagerInfo: Removed broadcast_41_piece0 on 192.168.0.6:39771 in memory (size: 361.0 B, free: 34.0 GiB)
21/09/09 13:45:38 INFO BlockManagerInfo: Removed broadcast_43_piece0 on 192.168.0.6:39771 in memory (size: 34.1 KiB, free: 34.0 GiB)
21/09/09 13:45:38 INFO BlockManagerInfo: Removed broadcast_45_piece0 on 192.168.0.6:39771 in memory (size: 34.1 KiB, free: 34.0 GiB)
21/09/09 13:45:38 INFO BlockManagerInfo: Removed broadcast_31_piece0 on 192.168.0.6:39771 in memory (size: 34.1 KiB, free: 34.0 GiB)
21/09/09 13:45:38 INFO BlockManagerInfo: Removed broadcast_38_piece0 on 192.168.0.6:39771 in memory (size: 354.0 B, free: 34.0 GiB)
21/09/09 13:45:38 INFO SparkContext: Invoking stop() from shutdown hook
21/09/09 13:45:38 INFO SparkUI: Stopped Spark web UI at http://192.168.0.6:4040
21/09/09 13:45:38 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
21/09/09 13:45:38 INFO MemoryStore: MemoryStore cleared
21/09/09 13:45:38 INFO BlockManager: BlockManager stopped
21/09/09 13:45:38 INFO BlockManagerMaster: BlockManagerMaster stopped
21/09/09 13:45:38 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
21/09/09 13:45:38 INFO SparkContext: Successfully stopped SparkContext
21/09/09 13:45:38 INFO DiskBlockManager: Shutdown hook called
21/09/09 13:45:38 INFO ShutdownHookManager: Shutdown hook called
21/09/09 13:45:38 INFO ShutdownHookManager: Deleting directory /tmp/spark-447dbf7f-547f-4bfe-bc0b-15ffb7224840
21/09/09 13:45:38 INFO ShutdownHookManager: Deleting directory /tmp/spark-b0257414-42d3-4325-b675-724ab6bc2ae2

@andygrove andygrove self-assigned this Sep 9, 2021
@andygrove andygrove added the bug (Something isn't working) and Spark 3.2+ labels Sep 9, 2021
@andygrove (Contributor, Author) commented:

@revans2 Could you review this when you get a chance? This PR does fix the issue I am seeing, but I am nervous that there is something I am not understanding here that caused us to write the original code the way it was.

Signed-off-by: Andy Grove <andygrove@nvidia.com>
@revans2 (Collaborator) left a comment:


This looks correct.

GpuBroadcastExchangeExec will convert ColumnarBatch instances to SerializeBatchDeserializeHostBuffer instances before calling collect on them.

val data = child.executeColumnar().map(cb => try {
  new SerializeBatchDeserializeHostBuffer(cb)
} finally {
  cb.close()
})

Within each task, SerializeBatchDeserializeHostBuffer copies the data to the host when it is created, so the GPU version can be closed without leaking. The host version is closed when it is serialized out.
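A hedged sketch of that task-side lifecycle (hypothetical names and a plain byte array standing in for device memory, not the plugin's real API):

import java.io.ObjectOutputStream

class HostCopyBuffer(deviceData: Array[Byte]) extends AutoCloseable {
  // The device-to-host copy happens at construction, so the caller can
  // close the GPU-side batch immediately without leaking device memory.
  private var hostCopy: Array[Byte] = deviceData.clone()

  def serializeTo(out: ObjectOutputStream): Unit = {
    out.writeObject(hostCopy)
    close() // the host copy is released once it has been written out
  }

  override def close(): Unit = { hostCopy = null }
}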

The data is read back in on the driver, but it is read into header and buffer, so the data is never on the GPU, only ever on the host. SerializeConcatHostBuffersDeserializeBatch takes ownership of the batches passed into it as data, and then pulls out headers and buffers from data to access the underlying data more conveniently. Because buffers is owned by data, closing data should close them as well.
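In other words (a sketch of that ownership rule, reusing the stand-in classes from the earlier sketch):

class ConcatDeserializeBatch(data: Seq[ChildWrapper]) extends AutoCloseable {
  // Convenience view over buffers that are owned by `data`, not by this class
  private val buffers = data.map(_.buffer)

  override def close(): Unit = {
    data.foreach(_.close()) // releases each underlying buffer exactly once
    // buffers.foreach(_.close()) // the removed second pass that double-closed
  }
}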

@andygrove (Contributor, Author) commented:

build

@jlowe jlowe merged commit 7e0cccb into NVIDIA:branch-21.10 Sep 10, 2021
@andygrove andygrove deleted the fix-double-close-320 branch September 10, 2021 15:09
Labels: bug (Something isn't working), Spark 3.2+
Successfully merging this pull request may close the following issue:
[BUG] Resource cleanup issues with Spark 3.2 (#3416)
3 participants