This repository has been archived by the owner on Feb 5, 2024. It is now read-only.

Add stream support to enable overlap between copy and compute #66

Open
mlxd opened this issue Oct 20, 2022 · 0 comments
Labels
enhancement New feature or request

Comments

mlxd commented Oct 20, 2022

Issue description


  • Expected behavior: The adjoint pipeline often requires explicit memory copies between GPU buffers as well as explicit compute calls, so it would be beneficial to add explicit support for CUDA streams and to re-enable OpenMP threading support. This would let CUDA API calls that can overlap do so, reducing the number of existing synchronization points. This will involve updates to the DataBuffer class, as well as to the StateVectorCudaBase and AdjointJacobianGPU classes, so that they make better use of the new functionality.
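
To make the request concrete, here is a minimal sketch of the kind of overlap streams allow. It is not taken from the repository: `scale_kernel`, `run_overlapped_pipeline`, and the `CUDA_TRY` macro are hypothetical names, and the raw `cudaMalloc` buffers only stand in for whatever DataBuffer would manage. Each chunk gets its own stream, so the device-to-device copy of one chunk may overlap with the kernel of another. Whether they actually run concurrently depends on the hardware and driver.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: report a CUDA error and bail out of the calling function.
// Early returns leak resources; acceptable for a sketch, not for library code.
#define CUDA_TRY(call)                                                    \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error: %s\n",                      \
                         cudaGetErrorString(err_));                       \
            return false;                                                 \
        }                                                                 \
    } while (0)

// Hypothetical stand-in for a compute call: scales each element in place.
__global__ void scale_kernel(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Copies each chunk between two GPU buffers and then scales it, using one
// stream per chunk so independent chunks are free to overlap.
bool run_overlapped_pipeline(int n_chunks, int chunk_elems) {
    const size_t total = static_cast<size_t>(n_chunks) * chunk_elems;
    const size_t total_bytes = total * sizeof(float);
    const size_t chunk_bytes = chunk_elems * sizeof(float);

    std::vector<float> h_in(total), h_out(total);
    for (size_t i = 0; i < total; ++i) h_in[i] = static_cast<float>(i % 1024);

    float *d_src = nullptr, *d_work = nullptr;
    CUDA_TRY(cudaMalloc(reinterpret_cast<void **>(&d_src), total_bytes));
    CUDA_TRY(cudaMalloc(reinterpret_cast<void **>(&d_work), total_bytes));
    CUDA_TRY(cudaMemcpy(d_src, h_in.data(), total_bytes, cudaMemcpyHostToDevice));

    std::vector<cudaStream_t> streams(n_chunks);
    for (auto &s : streams) CUDA_TRY(cudaStreamCreate(&s));

    const int threads = 256;
    const int blocks = (chunk_elems + threads - 1) / threads;
    for (int c = 0; c < n_chunks; ++c) {
        const size_t off = static_cast<size_t>(c) * chunk_elems;
        // Within a stream, the copy completes before the kernel starts;
        // across streams, no ordering is imposed.
        CUDA_TRY(cudaMemcpyAsync(d_work + off, d_src + off, chunk_bytes,
                                 cudaMemcpyDeviceToDevice, streams[c]));
        scale_kernel<<<blocks, threads, 0, streams[c]>>>(d_work + off,
                                                         chunk_elems, 2.0f);
        CUDA_TRY(cudaGetLastError());
    }

    // A single synchronization point per stream, after all work is queued.
    for (auto &s : streams) CUDA_TRY(cudaStreamSynchronize(s));

    CUDA_TRY(cudaMemcpy(h_out.data(), d_work, total_bytes, cudaMemcpyDeviceToHost));

    bool ok = true;
    for (size_t i = 0; i < total; ++i) {
        if (h_out[i] != 2.0f * h_in[i]) { ok = false; break; }
    }

    for (auto &s : streams) CUDA_TRY(cudaStreamDestroy(s));
    CUDA_TRY(cudaFree(d_src));
    CUDA_TRY(cudaFree(d_work));
    return ok;
}
```

Calling `run_overlapped_pipeline(4, 1 << 16)` from a `main` compiled with `nvcc` should return `true` on a working device. Because copy and compute for a chunk share a stream, they need no explicit synchronization between them, which is the reduction in synchronization points described above.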
mlxd added the enhancement label on Oct 20, 2022