Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Vulkan] Remove some interface block decoration #8102

Merged
merged 3 commits into from
May 22, 2021
Merged

[Vulkan] Remove some interface block decoration #8102

merged 3 commits into from
May 22, 2021

Conversation

llehtahw
Copy link
Contributor

From spir-v decoration

Block
Apply only to a structure type to establish it is a non-SSBO-like shader-interface block.

And interface block(GLSL)

An Interface Block is a group of GLSL input, output, uniform, or storage buffer variables. These blocks have special syntax and semantics that can be applied to them.

I guess that StructArrayTypes of shared memory (or shared variables in GLSL) and local memory are not required to be decorated with Block.
After applying this patch, this issue seems solved to me.

@llehtahw
Copy link
Contributor Author

@masahi @tqchen

@masahi
Copy link
Member

masahi commented May 21, 2021

Wow!! I can confirm that gemm, conv2d, and sorting all work on linux + nv vulkan with this patch, thank you very much!!

I've been thinking that the error on nv was a driver issue, very glad to know that this is our bug 🙂

I'm going to test on intel and RADV mesa driver, both of which have had problems with TVM vk.

Just curious, how did you figure out this?

@llehtahw
Copy link
Contributor Author

Just curious, how did you figure out this?

It's good luck.

I found that, if I do not use shared read cache, the gemm test case works properly. So I suspect there may be something wrong with shared memory.

I also translated TVM generated spv code to glsl code with spirv-cross, but the glsl code could not be compiled back to spv because that the type definitions of shared variables were missed. Then I corrected the glsl code and compiled it to spv.

So I got two spv files, one from TVM and the other from valid glsl. I compared their spirv IR, especially parts related to shared memory (workgroup variable). Conclusively, glsl-generated code lacks the Block decoration.


I'm not familiar with Vulkan, so I'm not sure if this patch is good enough. @masahi do you have any suggestions?

@Lunderberg
Copy link
Contributor

One suggestion, I think we should need to apply the DecorationArrayStride only when interface_buffer is false (see below). Everything else good to me. Fantastic debugging, and thank you very much!

General notes from reading over changes, and re-reading documentation for similar issues:

  • Do issues arise if the same struct is declared both as an interface and as a non-interface? No, since you added the interface_block to the cache lookup, they would be treated as separate types.

  • Are there any other uses of DecorationBlock or DecorationBufferBlock that should be removed? No, the only other use is in DeclareStorageType, which is only used when declaring I/O by push constants or uniform buffers, so it is fine. That said, in the future we might want to rename DeclareStorageType to DeclareIOStruct, or add the same interface_block argument to it, to avoid implying that it could be used to declare a struct that isn't used for IO.

  • Are there any other decorations that should only be used conditionally? Yes, DecorationArrayStride.

    • DecorationArrayStride - Looks like if an array contains a structure decorated by Buffer or BufferBlock, then the ArrayStride decoration must be omitted (ref). Therefore, in GetStructArrayType, the DecorationArrayStride and should only be applied if interface_block is false.
    • DecorationOffset - Should be used for each structure member, whether or not it is an interface type, no issues.
    • DecorationDescriptorSet, DecorationBinding - Used for buffer inputs/outputs, no issues.
    • DecorationBuiltIn - Used for accessing WorkgroupId and LocalInvocationId, no issues.

@masahi masahi self-assigned this May 21, 2021
@masahi
Copy link
Member

masahi commented May 21, 2021

Thanks @llehtahw, that's really great debugging technique!

And thanks @Lunderberg on array stride. Indeed the spec says very explicitly that Each array type must have an ArrayStride decoration, unless it is an array that contains a structure decorated with Block or BufferBlock, in which case it must not have an ArrayStride decoration. @llehtahw Can you update this PR according to this rule?

@masahi
Copy link
Member

masahi commented May 21, 2021

@Lunderberg If I make this change,

-  // decorate the array type.
-  this->Decorate(spv::OpDecorate, arr_type, spv::DecorationArrayStride, nbytes);
+  if (!interface_block) {
+    // decorate the array type.
+    this->Decorate(spv::OpDecorate, arr_type, spv::DecorationArrayStride, nbytes);
+  }

I get this error

  1: tvm::codegen::BuildSPIRV(tvm::IRModule, tvm::Target, bool)
  0: tvm::codegen::SPIRVTools::ValidateShader(std::vector<unsigned int, std::allocator<unsigned int> > const&)
  File "/home/masa/projects/dev/tvm/src/target/spirv/build_vulkan.cc", line 68
TVMError: 
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------

  Check failed: res == SPV_SUCCESS (-10 vs. 0) :  index=33 error:Structure id 10 decorated as Block for variable in StorageBuffer storage class must follow standard storage 
buffer layout rules: member 0 contains an array with stride 0, but with an element size of 4
  %_struct_10 = OpTypeStruct %_runtimearr_float

Looking at the spec sentence, I wonder what exactly array that contains a structure means. Isn't array itself a structure? What does it mean to have an array that contains Block?

@Lunderberg
Copy link
Contributor

I get reproduce the same validation error, and I'll look into it on my end. I think that an "array that contains a structure" would be the a type declared with OpTypeArray whose element type is declared with OpTypeStruct, the equivalent of a vector<struct>. From my reading of the spec, if that internal struct is decorated with Block, then it shouldn't be decorated with ArrayStride. The validator is clearly finding something mismatched, so I'll check again on it.

@Lunderberg
Copy link
Contributor

Lunderberg commented May 21, 2021

Found it! I was mistaken on the ordering. The output is a struct that contains an array, not an array consisting of structs. The struct receives DecorationBlock, but the sentence in the spec refers to the array's elements. Since the value_type passed to this function is only ever a primitive value, and is not itself a Block-decorated struct, that sentence in the spec doesn't apply.

Sorry for accidentally sending you down a rabbit hole, and everything in the PR looks good to me.

@masahi masahi merged commit a2bf07f into apache:main May 22, 2021
@masahi
Copy link
Member

masahi commented May 22, 2021

Thanks @llehtahw again for the great work, and @Lunderberg for the review!!

@llehtahw
Copy link
Contributor Author

Thanks @masahi for testing.

And thanks @Lunderberg for comments, learned a lot from your notes!

@llehtahw llehtahw deleted the fix-interface-block-decorate branch May 22, 2021 03:38
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Jun 17, 2021
* Remove block decorator for shared/local variables

* Fix lint
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Jun 17, 2021
* Remove block decorator for shared/local variables

* Fix lint
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants