Correctly use dtype when computing shared memory requirements of separable convolution #409
Conversation
The calling code was accidentally passing image_dtype to the unused anchor argument, which caused the default cp.float32 dtype to always be used in the shared memory calculation. This could result in ptxas errors about too much shared data when the itemsize of the dtype was greater than that of float32.
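A minimal sketch of the mix-up described above (the helper name and signature here are assumptions for illustration, not cucim's actual API): when the dtype is passed positionally, it silently lands in the unused `anchor` slot and the float32 default is used instead.

```python
import cupy as cp

def _shared_memory_bytes(kernel_size, block_dim, anchor=None,
                         image_dtype=cp.float32):
    # Dynamic shared memory per block: the tile plus its halo, times the
    # itemsize of the image dtype.
    halo = kernel_size - 1
    return (block_dim + halo) * cp.dtype(image_dtype).itemsize

# Buggy call: the dtype lands in the unused `anchor` slot, so the default
# float32 itemsize (4 bytes) is always used. The estimate is then too
# small for 8-byte dtypes, the limit check passes, and ptxas can later
# fail with "too much shared data".
print(_shared_memory_bytes(31, 128, cp.float64))              # 632 (wrong)

# Fixed call: pass the dtype by keyword so float64's 8-byte itemsize is
# actually used in the calculation.
print(_shared_memory_bytes(31, 128, image_dtype=cp.float64))  # 1264
```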
…eption occurred. The ResourceLimitError can be checked very quickly, so we want to keep it as well.
Only warn on shared memory limit if algorithm='shared_memory' was explicitly requested.
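A sketch of the fallback-and-warn logic these two commits describe (all names here are assumptions, not cucim's actual API): the cheap ResourceLimitError check runs first, a broader except clause catches any other compilation failure, and the warning is only emitted when the caller explicitly requested algorithm='shared_memory'.

```python
import warnings
import cupy as cp


class ResourceLimitError(RuntimeError):
    """Assumed error type: raised when the shared-memory estimate exceeds
    the device limit, before any compilation is attempted."""


def _convolve_shared_memory(image, kernel):
    raise ResourceLimitError("too much shared data")  # stand-in for the real kernel


def _convolve_elementwise(image, kernel):
    return image  # stand-in for the slower, always-available fallback


def convolve_separable(image, kernel, algorithm=None):
    if algorithm in (None, 'shared_memory'):
        try:
            return _convolve_shared_memory(image, kernel)
        except (ResourceLimitError, cp.cuda.compiler.CompileException):
            # Warn only if shared memory was explicitly requested; fall
            # back silently when the algorithm was auto-selected.
            if algorithm == 'shared_memory':
                warnings.warn("shared memory limit exceeded; falling back")
    return _convolve_elementwise(image, kernel)
```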
rerun tests
Thanks Greg! 🙏 Happy with merging. Only question would be whether we can include a test for the error. Could happen in this PR or after.
Looks good to me. Thank you, Greg!
Prior to the fix, test failures occurred on my system for float64 here for the sigma values 16, 21, 26, and 31.
A test case has now been added (values aren't checked; it just verifies that no compilation error occurs). I verified that without the fix in this PR, the dtype=float64 case fails for the parameterized sigmas 16, 21, 26, and 31 on my system.
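A hedged sketch of the kind of regression test described (the exact entry point is an assumption; cucim.skimage.filters.gaussian is one consumer of the separable convolution). Output values are not checked; the test only verifies that no compilation error is raised.

```python
import cupy as cp
import pytest
from cucim.skimage.filters import gaussian


@pytest.mark.parametrize('sigma', [16, 21, 26, 31])
@pytest.mark.parametrize('dtype', [cp.float32, cp.float64])
def test_large_sigma_compiles(sigma, dtype):
    image = cp.zeros((128, 128), dtype=dtype)
    # Before the fix, the float64 cases failed here with a ptxas error
    # about exceeding the shared-memory limit.
    gaussian(image, sigma=sigma)
```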
@gpucibot merge
closes #408
The first commit here fixes an error in the keyword arguments passed to the shared memory calculation. The second commit adds a further fallback in case of any other anticipated compilation failure.