UCX/RNDV/CUDA: RNDV protocol improvements for CUDA - v1.9.x #5648

Merged
merged 2 commits into from
Sep 10, 2020

Conversation

bureddy

@bureddy bureddy commented Sep 2, 2020

porting #5473 from master

@yosefe yosefe added this to the v1.9.0 milestone Sep 2, 2020
@yosefe yosefe changed the title UCX/RNDV/CUDA: RNDV protocol improvements for CUDA -v1.9.x UCX/RNDV/CUDA: RNDV protocol improvements for CUDA - v1.9.x Sep 2, 2020
@shamisp shamisp self-requested a review September 2, 2020 18:11
@yosefe yosefe self-assigned this Sep 2, 2020
@yosefe yosefe requested a review from brminich September 4, 2020 14:18

@brminich brminich left a comment


@hoopoepg, can you please take a look as well?

@@ -61,6 +61,8 @@ typedef struct ucp_context_config {
size_t seg_size;
/** RNDV pipeline fragment size */
size_t rndv_frag_size;
/** RNDV pipline send threshold */

pipeline

}

if (ucs_popcount(lane_map) > 1) {
/* remove lanes if bandwidth is too less compare to best lane */

remove lanes if bandwidth is too low comparing to the best lane
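
For readers unfamiliar with the idea behind this hunk, here is a minimal sketch of dropping lanes whose bandwidth is too low compared to the best lane. It is not the UCX code: `filter_slow_lanes`, `MAX_LANES`, the `lane_bw` array and the `min_ratio` parameter are all hypothetical names chosen for illustration, and the real threshold and data structures may differ.

```c
#include <stdint.h>

#define MAX_LANES 8 /* hypothetical upper bound on the number of lanes */

/*
 * Hypothetical helper, not part of UCX: return lane_map with every lane
 * removed whose bandwidth is below min_ratio * (bandwidth of the best lane).
 */
static uint8_t filter_slow_lanes(uint8_t lane_map, const double *lane_bw,
                                 double min_ratio)
{
    double  best_bw = 0.0;
    uint8_t result  = lane_map;
    int     lane;

    /* First pass: find the best bandwidth among the selected lanes */
    for (lane = 0; lane < MAX_LANES; ++lane) {
        if ((lane_map & (1u << lane)) && (lane_bw[lane] > best_bw)) {
            best_bw = lane_bw[lane];
        }
    }

    /* Second pass: clear the bit of every lane that is too slow */
    for (lane = 0; lane < MAX_LANES; ++lane) {
        if ((lane_map & (1u << lane)) &&
            (lane_bw[lane] < (best_bw * min_ratio))) {
            result &= (uint8_t)~(1u << lane);
        }
    }

    return result;
}
```

With a single lane in the map, the lane is always kept, because it is its own best lane; this matches the hunk only filtering when more than one lane is selected.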

rndv_rts_hdr->size);

if ((rndv_mode == UCP_RNDV_MODE_PUT_ZCOPY) ||
UCP_MEM_IS_CUDA(rreq->recv.mem_type)) {

Why do we need to check for CUDA here?

Contributor Author


In the default UCP_RNDV_MODE_AUTO mode, we try to use the PUT protocol for the CUDA IPC case.
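
As a rough illustration of the condition under discussion, the sketch below mirrors only the two checks visible in the hunk: the PUT path is taken when the mode is explicitly PUT zcopy, or when the receive buffer is CUDA memory regardless of mode. The enum and function names (`rndv_mode_t`, `mem_type_t`, `use_put_protocol`) are hypothetical stand-ins, not UCX identifiers, and the real code likely applies further checks that are not shown in this thread.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the UCX rendezvous mode and memory type */
typedef enum {
    RNDV_MODE_AUTO,
    RNDV_MODE_GET_ZCOPY,
    RNDV_MODE_PUT_ZCOPY
} rndv_mode_t;

typedef enum {
    MEM_TYPE_HOST,
    MEM_TYPE_CUDA
} mem_type_t;

/*
 * Mirrors the shape of the reviewed condition: choose the PUT path when
 * PUT zcopy is requested explicitly, or when the receive buffer is CUDA
 * memory (which covers the default AUTO mode for the CUDA IPC case).
 */
static bool use_put_protocol(rndv_mode_t mode, mem_type_t recv_mem_type)
{
    return (mode == RNDV_MODE_PUT_ZCOPY) || (recv_mem_type == MEM_TYPE_CUDA);
}
```

Under these assumptions, `use_put_protocol(RNDV_MODE_AUTO, MEM_TYPE_CUDA)` is true while `use_put_protocol(RNDV_MODE_AUTO, MEM_TYPE_HOST)` is false, which is the behavior the reply describes.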

@yosefe

yosefe commented Sep 9, 2020

bot:pipe:retest

@yosefe yosefe merged commit 18c5ab4 into openucx:v1.9.x Sep 10, 2020