Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ARM] Fix NCHWc int8 dot product schedule lowering #10773

Merged
merged 3 commits into from
Mar 28, 2022

Conversation

masahi
Copy link
Member

@masahi masahi commented Mar 24, 2022

The code path

if is_dotprod_available():
intrin = dot_int8_int8_int32_neon_82(int32_lanes=4)

is broken because (1) we are not passing dtype and (2) the intrin description has a bug in the number of arguments to sdot/udot. For example I get this error if I try to run it:

  File "/home/masa/projects/dev/tvm/src/target/llvm/codegen_llvm.cc", line 981
TVMError: 
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: (f) is false: Cannot find intrinsic declaration, possible type mismatch: llvm.aarch64.neon.udot

The PR that added this intrin, #3978, didn't add a test so this code path was never tested. The PR added something under apps/topi_recipe/conv, https://github.com/apache/tvm/blob/main/apps/topi_recipe/conv/test_conv_int8_arm.py, but this script is broken and needs fixing too.

@tkonolige

name="conv2d_nchw_spatial_pack.arm_cpu",
plevel=10,
)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

schedule_conv2d_nchw_spatial_pack has tophub entries, so they are always picked up based on "time cost" even if topi.arm_cpu.is_int8_hw_support path is taken and the NCHWc int8 schedule is surely faster.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like another good argument for removing tophub. You can disable it for the test by putting it all in an empty ApplyHistoryBest.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The annoying thing is this happens on only specific input or filter sizes. It took me some time to figure out why the NCHWc schedule is not used when the input shape is (56, 56) and filter size is 3x3. If I use (128, 128) shape or 1x1 filter, I didn't see this issue.

Copy link
Contributor

@tkonolige tkonolige left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @masahi

@junrushao
Copy link
Member

@vinx13 would you mind taking a look? Thanks a lot!

Copy link
Contributor

@u99127 u99127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to add this test - thank you for finding and fixing this. ]

@Mousius
Copy link
Member

Mousius commented Mar 28, 2022

LGTM, I'll wait for @vinx13 to take a look 😸

@vinx13 vinx13 merged commit 7896108 into apache:main Mar 28, 2022
pfk-beta pushed a commit to pfk-beta/tvm that referenced this pull request Apr 11, 2022
* [ARM] Fix NCHWc int8 dot product schedule lowering

* fix arm task extraction test not running

* skip test on i386
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants