Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JIT] CodeGen verification testing #75102

Merged
merged 110 commits into from
Sep 23, 2022
Merged

[JIT] CodeGen verification testing #75102

merged 110 commits into from
Sep 23, 2022

Conversation

TIHan
Copy link
Contributor

@TIHan TIHan commented Sep 6, 2022

Description

@markples and I have been working hard on having the ability to write tests that verify code-gen for X64 and ARM64. Our goal is to make writing these tests easy as possible and be able to incorporate them with already existing tests.

The tool we are using for verifying code-gen is LLVM's FileCheck. We created a new NuGet package in dotnet/llvm-project called Microsoft.NETCore.Runtime.JIT.Tools that contains LLVM's FileCheck - we consume this package now.
See https://llvm.org/docs/CommandGuide/FileCheck.html for more information on how FileCheck is used.

We have also created a tool called, SuperFileCheck, that wraps FileCheck and uses Roslyn to allow us to write tests more easily and look readable in C#. For example:

    [MethodImpl(MethodImplOptions.NoInlining)]
    static Vector64<byte> AdvSimd_CompareEqual_Vector64_Byte_Zero(Vector64<byte> left)
    {
        // ARM64-FULL-LINE: cmeq v0.8b, v0.8b, #0
        return AdvSimd.CompareEqual(left, Vector64<byte>.Zero);
    }

The test above verifies the method call, AdvSimd.CompareEqual, will emit the correct ARM64 instruction, cmeq with two operands and the zero. <prefix>-FULL-LINE: <instr> is a new construct in SuperFileCheck that translates to FileCheck 's syntax, <prefix>: {{* ^}}<instr>{{$}} - this allows to match a single line exactly without having to use {{* ^}} or {{$}}.

Things to consider:

  • DOTNET_JitDisasm will need to be a bit more reliable in its output as we create more of these kinds of tests - especially start and end anchors for methods. I think this is fine considering we are making DOTNET_JitDisasm public in release versions.

Acceptance Criteria

  • CI disasmchecks are run and passing
  • Write documentation
  • Write a few more tests
  • Cleanup

@TIHan TIHan marked this pull request as ready for review September 22, 2022 08:18
@TIHan
Copy link
Contributor Author

TIHan commented Sep 22, 2022

@dotnet/jit-contrib This is ready for review.

<BatchCLRTestPostCommands><![CDATA[
$(BatchDisasmCheckPostCommands)
$(CLRTestBatchPostCommands)
]]></BatchCLRTestPostCommands>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking nit - names of the properties CLRTestBatchPostCommands and BatchCLRTestPostCommands are confusingly similar, please consider making them somewhat "more different".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea they are confusing. We want to clean this up but in a separate PR as we want to make this PR be focused and not just general refactorings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can expand on this reply as I think I introduced this. There seemed to be a general naming convention that the csproj file for each test could specify CLRTestXXX and then they would get transformed here (if needed) into something not starting with CLRTest, mostly by moving the CLR later in the name. I could take a shot at changing all of the names (probably leave the csproj alone and then do something to make these internal/implementation ones more obviously internal/implementation?), but as Will said, I didn't want to deviate from that or add diffs to unrelated code in this PR.

Copy link
Member

@trylek trylek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me, thank you!

src/coreclr/tools/SuperFileCheck/Program.cs Show resolved Hide resolved
src/coreclr/tools/SuperFileCheck/Program.cs Show resolved Hide resolved
@@ -301,6 +301,7 @@ $(BatchIlrtTestLaunchCmds)
if defined RunCrossGen2 (
call :TakeLock
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - spurious diff, probably from me making/undoing changes

It's not worth another test cycle, but if you end up pushing other changes anyway, you may as well undo this.

@@ -8,143 +8,167 @@

class Program
{
// CompareEqual
// CompareEqual
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another nit - looks like a spurious whitespace change

for (var i = 0; i < 4; i++)
{
result = AdvSimd.CompareEqual(left, asVar);
result = AdvSimd.CompareEqual(left, asVar);
result = AdvSimd.CompareEqual(left, asVar);
result = AdvSimd.CompareEqual(left, asVar);
// ARM64: blt
// ARM64-NOT: fcmeq
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be interesting seeing how our FileCheck usage evolves as we learn.

In this case, are you actually concerned about the blt or are you using them as markers? If not, then you could simply put an ARM64-NOT: fcmeq after the FULL-LINE. BUT, if you care about the blt or if some other positive check as added, then you would need the multiple NOTs just like you have.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are concerned about blt as we want to ensure cmeq is not inside the loop, which is why there is a ARM64-NOT right after it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it is also checking for no fcmeq before the loop. So:

The difference between

// ARM64-FULL-LINE: fcmeq {{v[0-9]+}}.4s, {{v[0-9]+}}.4s, #0.0
// ARM64: blt
// ARM64-NOT: fcmeq
// ARM64: blt
// ARM64-NOT: fcmeq

and

// ARM64-FULL-LINE: fcmeq {{v[0-9]+}}.4s, {{v[0-9]+}}.4s, #0.0
// ARM64: blt
// ARM64-NOT: fcmeq

is only that there must be a second blt in the method.

And then the differences between that and

// ARM64-FULL-LINE: fcmeq {{v[0-9]+}}.4s, {{v[0-9]+}}.4s, #0.0
// ARM64-NOT: fcmeq

are

  • The previous one allows a fcmeq between the required fcmeq and the initial blt
  • The previous one requires a blt

Put another way, the current checks require two blt instructions and no fcmeq after the first. Are those exactly the conditions that you want to check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are almost the conditions. I want to check to make sure there is no fcmeq after any of the two blt instructions.

// X64-FULL-LINE-NEXT: and [[REG0]], 15
// X64-FULL-LINE-NEXT: add [[REG0]], [[REG1]]
// X64-FULL-LINE-NEXT: and [[REG0]], -16
// X64-WINDOWS-FULL-LINE-NEXT: mov [[REG2:[a-z]+]], [[REG1]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to follow up on this - is it an expected codegen deficiency?

Copy link
Contributor Author

@TIHan TIHan Sep 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should ask.

@jakobbotsch @kunalspathak - I wrote a test here that showed a case where the codegen is different in windows in that it adds an extra mov instruction. Is this a deficiency or is there a reason for it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me, seems like maybe there is some suboptimal preferencing in LSRA? The argument is in a different register on these two ABIs so seems like that somehow ends up affecting it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During register assignment, if we have more than 1 register available to assign, we try to narrow them down using various heuristics until we are left with just 1 register that we assign. However, if there is a tie-breaker, we just pick a register in the register order.

At the problematic site, both the registers were assigned using REG_ORDER meaning we found more than 1 register to be assigned and we decided based on the ordering of register.

image

It so happened that for Unix/x64, EDI (which was the original register) comes first, but for windows, EDX comes before ECX and we prefer it.

// TEMPORARY ORDER TO AVOID CALLEE-SAVES
// TODO-CQ: Review this and set appropriately
#ifdef UNIX_AMD64_ABI
#define REG_VAR_ORDER REG_EAX,REG_EDI,REG_ESI, \
REG_EDX,REG_ECX,REG_R8,REG_R9, \
REG_R10,REG_R11,REG_EBX,REG_ETW_FRAMED_EBP_LIST \
REG_R14,REG_R15,REG_R12,REG_R13
#else // !UNIX_AMD64_ABI
#define REG_VAR_ORDER REG_EAX,REG_EDX,REG_ECX, \
REG_R8,REG_R9,REG_R10,REG_R11, \
REG_ESI,REG_EDI,REG_EBX,REG_ETW_FRAMED_EBP_LIST \
REG_R14,REG_R15,REG_R12,REG_R13
#endif // !UNIX_AMD64_ABI

We can probably do better (add a new heuristic probably) by reutilizing existing assigned register if they won't be used any further, instead of assigning a new register.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that is starting to make a lot more sense. Thank you for the detailed information @kunalspathak .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kunalspathak Are those heuristics mostly/all correctness based (as in, don't reuse a register that already has a value), or is it possible that one of the heuristics should have helped here but didn't for some reason?

Another possible heuristic would be trying to use eax in order to feed the ret without a mov. However, that would require -not- using it above, so maybe it already exists but couldn't be used. In this case, it's not as good because it would lose the short encodings for and.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kunalspathak Are those heuristics mostly/all correctness based (as in, don't reuse a register that already has a value), or is it possible that one of the heuristics should have helped here but didn't for some reason?

Well, I would say they are for performant reasons. If you end of reusing same sets of registers that already hold a live variable you end up spilling the live value to the stack (not good) or if we reuse the registers from inactive interval, we end up copying/reloading (again, not good).

src/coreclr/tools/SuperFileCheck/Program.cs Show resolved Hide resolved
<BatchCLRTestPostCommands><![CDATA[
$(BatchDisasmCheckPostCommands)
$(CLRTestBatchPostCommands)
]]></BatchCLRTestPostCommands>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can expand on this reply as I think I introduced this. There seemed to be a general naming convention that the csproj file for each test could specify CLRTestXXX and then they would get transformed here (if needed) into something not starting with CLRTest, mostly by moving the CLR later in the name. I could take a shot at changing all of the names (probably leave the csproj alone and then do something to make these internal/implementation ones more obviously internal/implementation?), but as Will said, I didn't want to deviate from that or add diffs to unrelated code in this PR.

@trylek
Copy link
Member

trylek commented Sep 22, 2022

There seemed to be a general naming convention that the csproj file for each test could specify CLRTestXXX and then they would get transformed here (if needed) into something not starting with CLRTest, mostly by moving the CLR later in the name.

I see, I didn't realize that CLRTestBatchPostCommands is supposed to come from the individual test projects. Frankly speaking I haven't found a single test using this property and there's apparently no equivalent Unix version CLRTestBashPostCommands either, perhaps the easiest way to clean this up would be by deleting the unused property.

@TIHan
Copy link
Contributor Author

TIHan commented Sep 23, 2022

Current CI failures look unrelated to these changes.

@TIHan
Copy link
Contributor Author

TIHan commented Sep 23, 2022

Current CI failures are from: #76041

There are no failures related to this PR - merging this in.

@TIHan TIHan merged commit 7681692 into dotnet:main Sep 23, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Oct 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants