
Correctly use dtype when computing shared memory requirements of separable convolution #409

Merged

Conversation

grlee77
Contributor

@grlee77 grlee77 commented Sep 7, 2022

closes #408

An error in the keyword arguments passed to the shared memory calculation is fixed in the first commit here. The second commit adds a second fallback to handle any other unanticipated compilation failure.

calling code was accidentally passing image_dtype to the unused anchor argument,
which caused the default cp.float32 dtype to always be used in the shared memory
calculation. This could result in ptxas errors about too much shared data when
the itemsize of the dtype was larger than that of float32.
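The failure mode above can be sketched in isolation. The function and parameter names below (`shared_memory_bytes`, `anchor`) are hypothetical stand-ins for the cuCIM internals, not the actual API; the point is only that a dtype routed into an unused parameter silently leaves the sizing at the float32 default.

```python
def shared_memory_bytes(shape, dtype="float32", anchor=None):
    """Estimate per-block shared memory for a convolution tile.

    ``anchor`` is accepted but never used in the size calculation,
    mirroring the unused argument described in the commit message.
    """
    itemsize = {"float32": 4, "float64": 8}[dtype]
    n = 1
    for s in shape:
        n *= s
    return n * itemsize

image_dtype = "float64"
tile = (32, 32)

# Buggy call: the image dtype lands in the unused ``anchor`` slot, so the
# float32 default is used and the estimate is half the real requirement.
buggy = shared_memory_bytes(tile, anchor=image_dtype)

# Fixed call: the dtype reaches the parameter that actually affects sizing.
fixed = shared_memory_bytes(tile, dtype=image_dtype)
```

With a 32x32 tile, the buggy call reports 4096 bytes while the correct float64 requirement is 8192 bytes; requesting the smaller amount is what let kernels exceed the device limit at launch and trip ptxas errors.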
…eption occurred

The ResourceLimitError can be checked very quickly, so we want to keep it as well.
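The two-level fallback described above can be sketched as follows. `ResourceLimitError`, `compile_shared_kernel`, and `elementwise_kernel` are hypothetical stand-ins, not cuCIM or CuPy APIs; the shape to note is that the cheap resource-limit check is kept as the first handler, with a broad handler added behind it for any other compilation failure.

```python
class ResourceLimitError(RuntimeError):
    """Raised when the requested shared memory exceeds the device limit."""


def compile_shared_kernel(shared_bytes, limit=48 * 1024):
    # The limit comparison is a quick check that needs no compiler invocation.
    if shared_bytes > limit:
        raise ResourceLimitError("too much shared data")
    return "shared_memory_kernel"


def elementwise_kernel():
    return "elementwise_kernel"


def get_kernel(shared_bytes):
    try:
        return compile_shared_kernel(shared_bytes)
    except ResourceLimitError:
        # First fallback: the fast pre-compilation resource check.
        return elementwise_kernel()
    except Exception:
        # Second fallback: any other compilation failure (e.g. a ptxas error).
        return elementwise_kernel()
```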
@grlee77 grlee77 added bug Something isn't working non-breaking Introduces a non-breaking change labels Sep 7, 2022
@grlee77 grlee77 added this to the v22.10.00 milestone Sep 7, 2022
@grlee77 grlee77 requested a review from a team as a code owner September 7, 2022 01:30
Only warn on shared memory limit if algorithm='shared_memory'
was explicitly requested.
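The warning behavior in that commit can be sketched like this; `choose_algorithm` and its parameters are hypothetical names, not the cuCIM signature. The idea is that a silent fallback is fine by default, but a user who explicitly requested `algorithm='shared_memory'` should be told it could not be honored.

```python
import warnings


def choose_algorithm(shared_bytes, algorithm=None, limit=48 * 1024):
    """Pick a kernel flavor, warning only on an explicit, unmet request."""
    if shared_bytes <= limit:
        return "shared_memory"
    if algorithm == "shared_memory":
        # Only surface the limitation when the user asked for it by name.
        warnings.warn("shared memory limit exceeded; falling back to elementwise")
    return "elementwise"
```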
@jakirkham
Member

rerun tests

@jakirkham
Member

Thanks Greg! 🙏

Happy with merging. Only question would be if we can include a test for the error. Could happen in this PR or after

@gigony gigony modified the milestones: v22.10.00, v22.12.00 Oct 4, 2022
Contributor

@gigony gigony left a comment


looks good to me. Thank you Greg!

prior to the fix, test failures occurred on my system for float64 here
for the sigma values 16, 21, 26, 31
@grlee77
Contributor Author

grlee77 commented Oct 5, 2022

Only question would be if we can include a test for the error. Could happen in this PR or after

A test case has now been added (values aren't checked; it just verifies that no compilation error occurs). I verified that without the fix in this PR, the dtype=float64 case fails for the parameterized sigmas 16, 21, 26 and 31 on my system.
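A smoke test of that shape might look like the sketch below. It uses a plain NumPy separable convolution as a CPU stand-in for the GPU filter under test (the real test would call the cuCIM/CuPy implementation); the large-sigma float64 cases from the report are exercised and only "runs without raising, preserves dtype" is asserted.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at 4 sigma."""
    radius = int(4 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()


def separable_gaussian(image, sigma):
    # Stand-in for the separable convolution under test.
    return np.convolve(image, gaussian_kernel(sigma), mode="same")


def test_large_sigma_float64():
    image = np.ones(512, dtype=np.float64)
    for sigma in (16, 21, 26, 31):
        # Output values are not checked; the point is that no error is raised
        # for float64 inputs at the sigmas that previously failed.
        out = separable_gaussian(image, sigma)
        assert out.dtype == np.float64
        assert out.shape == image.shape
```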

@gigony
Contributor

gigony commented Oct 5, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit f047273 into rapidsai:branch-22.10 Oct 5, 2022
Successfully merging this pull request may close these issues.

[BUG] shared memory error during separable kernel compilation edge cases