
[Conv2DTransposed] Fix wrong shape check and add new TOPI module to support groups #9465

Merged: 29 commits merged into apache:main on Nov 12, 2021

Conversation

@Lyken17 (Contributor) commented Nov 6, 2021

This PR includes two bug fixes. The first is a wrong shape check in src/relay/op/nn/convolution.h; the second enables groups > 1 in the current Relay generic strategy as well as its specializations.

Shape check

The default layout in TVM is N x C x iH x iW for the input and I x O x kH x kW for the weight, so the proper shapes for Conv2dTranspose are (sketched in plain Python after this list):

  • input: (batch, in_channels, iH, iW)
  • weight: (in_channels, out_channels // groups, kH, kW)
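
As a minimal plain-Python sketch of this layout contract (the helper name is hypothetical, for illustration only):

def conv2d_transpose_weight_shape(in_channels, out_channels, groups, kh, kw):
    """Expected IOHW weight shape for a (possibly grouped) conv2d_transpose."""
    assert in_channels % groups == 0 and out_channels % groups == 0
    # in_channels sits on axis 0; out_channels // groups sits on axis 1
    return (in_channels, out_channels // groups, kh, kw)

# The depthwise-style case from the repro below: 32 -> 32 channels, groups=32
assert conv2d_transpose_weight_shape(32, 32, 32, 3, 3) == (32, 1, 3, 3)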

Therefore the original channel check

ICHECK(reporter->AssertEQ(param->channels, wshape[1]))

is wrong: the comparison should use wshape[0] rather than wshape[1].

Likewise, the original groups check

ICHECK(reporter->AssertEQ(indexdiv(dshape_nchw[1], param->groups), wshape[0]));

is also wrong: the proper comparison dimension is wshape[1] rather than wshape[0].
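
To make the index fix concrete, here are the repro's shapes (from the example below) run through both versions of the checks in plain Python:

dshape, wshape = (1, 32, 224, 224), (32, 1, 3, 3)
groups, channels = 32, 32

# Original (wrong) checks both reject this valid grouped transposed conv:
assert channels != wshape[1]             # 32 != 1  -> channel check fails
assert dshape[1] // groups != wshape[0]  # 32 // 32 = 1, but wshape[0] = 32

# Checks with the corrected axes both pass:
assert channels == wshape[0]             # 32 == 32
assert dshape[1] // groups == wshape[1]  # 32 // 32 = 1 == 1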

Besides, the debug name is not correct either: all logging messages use conv2d rather than conv2d_transpose, which is confusing.

Example that triggers the error in the current implementation

import tvm
from tvm import relay  # used by the (currently failing) build step below


SEMVER = '#[version = "0.0.5"]\n'

def assert_graph_equal(lhs, rhs):
    tvm.ir.assert_structural_equal(lhs, rhs, map_free_vars=True)

def roundtrip(expr):
    x = tvm.parser.fromtext(expr.astext())
    assert_graph_equal(x, expr)

# Testing Utilities for full modules.
def parse_module(code):
    mod = tvm.parser.parse(SEMVER + code)
    roundtrip(mod)
    return mod


program = """
def @main(%input0: Tensor[(1, 32, 224, 224), float32], 
        %v0_0_weight: Tensor[(32, 1, 3, 3), float32]) -> Tensor[(1, 32, 224, 224), float32] {
  /* test comment */
  %0 = nn.conv2d_transpose(%input0, %v0_0_weight, strides=[1, 1], padding=[1, 1, 1, 1], groups=32, channels=32, kernel_size=[3, 3]);
  %0
}
"""

mod = parse_module(program)
print(mod)

# This module cannot be built yet; there are bugs in TOPI that need to be fixed as well.
# target = "llvm"
# lib = relay.build(mod, target=target, params=None)
# print("build [fwd] pass successful")
"""
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: (false) is false: [18:09:32] /home/ligeng/Workspace/tvm/src/relay/op/nn/convolution.h:1111: 
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: (reporter->AssertEQ(indexdiv(dshape_nchw[1], param->groups), wshape[0])) is false: Conv2DTransposed: data.shape[1] // groups != weight.shape[0],  data.shape= [1, 32, 224, 224] groups= 32 weight.shape= [32, 2, 3, 3]
"""

Enable groups > 1 for Conv2dTranspose

This bug was originally raised in #8182 and #8799. Since both issues concern Conv2dTranspose, I merged them into one PR for better testing.

In the current python/tvm/relay/op/strategy/generic.py, groups for Conv2dTranspose is forced to be one:

@override_native_generic_func("conv2d_transpose_strategy")
def conv2d_transpose_strategy(attrs, inputs, out_type, target):
    """conv2d_transpose generic strategy"""
    logger.warning("conv2d_transpose is not optimized for this platform.")
    layout = attrs.data_layout
    dilation = get_const_tuple(attrs.dilation)
    groups = attrs.groups
    assert layout == "NCHW", "only support nchw for now"
    assert dilation == (1, 1), "not support dilate now"
    assert groups == 1, "only support groups == 1 for now"

This PR enables groups > 1 for this op on multiple platforms; a sketch of the resulting dispatch is below.
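
A minimal sketch of how the generic strategy can dispatch once the assertion is lifted (the group_conv2d_transpose_nchw compute/schedule names and the has_groups flag follow this PR's TOPI additions; treat the exact signatures as assumptions rather than the merged diff):

@override_native_generic_func("conv2d_transpose_strategy")
def conv2d_transpose_strategy(attrs, inputs, out_type, target):
    """conv2d_transpose generic strategy (sketch)"""
    layout = attrs.data_layout
    dilation = get_const_tuple(attrs.dilation)
    groups = attrs.groups
    assert layout == "NCHW", "only support nchw for now"
    assert dilation == (1, 1), "not support dilate now"
    strategy = _op.OpStrategy()
    if groups == 1:
        strategy.add_implementation(
            wrap_compute_conv2d_transpose(topi.nn.conv2d_transpose_nchw),
            wrap_topi_schedule(topi.generic.schedule_conv2d_transpose_nchw),
            name="conv2d_transpose_nchw.generic",
        )
    else:  # groups > 1: fall back to the new grouped TOPI compute
        strategy.add_implementation(
            wrap_compute_conv2d_transpose(topi.nn.group_conv2d_transpose_nchw, has_groups=True),
            wrap_topi_schedule(topi.generic.schedule_group_conv2d_transpose_nchw),
            name="group_conv2d_transpose_nchw.generic",
        )
    return strategy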

@junrushao (Member) left a comment

Thank you so much for the quick fix! Would you like to include your minimal reproducible example as a unit-test case so that this regression is always tested properly?

@Hzfengsy (Member) left a comment

Please add a minimal test and fix the lint.

@Lyken17 (Contributor, Author) commented Nov 8, 2021

@Hzfengsy @junrushao1994 I've linted the code and added the test cases. However, the current TOPI does not support groups in conv2d_transpose, so I have temporarily marked the test as skipped. This relies on another PR to fix #8799; I will come back to this PR after that is fixed.
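
For reference, the temporarily skipped test looks roughly like this (a sketch with a hypothetical test name, not the exact diff):

import pytest

@pytest.mark.skip(reason="TOPI lacks grouped conv2d_transpose support; see #8799")
def test_conv2d_transpose_groups():
    ...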

@junrushao (Member)

Thank you all for the effort! @Lyken17, let's do the last mile: fix the merge conflict and get the CI green :-)

@Lyken17 (Contributor, Author) commented Nov 12, 2021

@junrushao1994 Resolved, please take a look.

@junrushao (Member)

Looks good! Please fix the lint, by the way.

@Lyken17 (Contributor, Author) commented Nov 12, 2021

Lint fixed. Let's wait for the green CI!

@junrushao merged commit 3ad7c4a into apache:main on Nov 12, 2021
@junrushao (Member)

Thank you SO much @Lyken17 @alicja-SiMa-ai @vinx13 @AndrewZhaoLuo @Hzfengsy and many others!

mehrdadh pushed a commit to mehrdadh/tvm that referenced this pull request Dec 1, 2021
[Conv2DTransposed] Fix wrong shape check and add new TOPI module to support groups (apache#9465)

* f wrong type check in conv2d_transpose

* add test case for conv2d transpose

* add groups support for conv2d_transpose

* add naive implementation and schedule for conv2d with groups

* enable tests for cpu and arm_cpu, raise error for cuda platform

* revert the cuda and generic strategy

* revert back the x86 strategy

* revert back the arm_cpu strategy

* revert back the arm_cpu strategy

* revert back the arm_cpu strategy

* fix EOF of x86

* clang lint updated c++ code

* update topi implementation

* Revert test

* Revert test

* add generic/x86/arm specialization for conv2d_transpose with groups > 1

* remove commentted codes

* fix lint

* fix lint

* fix c++ lint

* fix lint

* fix python lint

* remove comments and reformat

* lint file

* lint code

* fix lint

* update logging information in convolution.h

Co-authored-by: Alicja Kwasniewska <alicja.kwasniewska@sima.ai>
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 7, 2022
yangulei pushed a commit to yangulei/tvm that referenced this pull request Jan 11, 2022
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 13, 2022