-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[JIT] CodeGen verification testing #75102
Conversation
…follow SuperFileCheck rules.
…leCheck test for mod optimization
@dotnet/jit-contrib This is ready for review. |
<BatchCLRTestPostCommands><![CDATA[ | ||
$(BatchDisasmCheckPostCommands) | ||
$(CLRTestBatchPostCommands) | ||
]]></BatchCLRTestPostCommands> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-blocking nit - names of the properties CLRTestBatchPostCommands
and BatchCLRTestPostCommands
are confusingly similar, please consider making them somewhat "more different".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea they are confusing. We want to clean this up but in a separate PR as we want to make this PR be focused and not just general refactorings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can expand on this reply as I think I introduced this. There seemed to be a general naming convention that the csproj file for each test could specify CLRTestXXX and then they would get transformed here (if needed) into something not starting with CLRTest, mostly by moving the CLR later in the name. I could take a shot at changing all of the names (probably leave the csproj alone and then do something to make these internal/implementation ones more obviously internal/implementation?), but as Will said, I didn't want to deviate from that or add diffs to unrelated code in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me, thank you!
@@ -301,6 +301,7 @@ $(BatchIlrtTestLaunchCmds) | |||
if defined RunCrossGen2 ( | |||
call :TakeLock | |||
) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - spurious diff, probably from me making/undoing changes
It's not worth another test cycle, but if you end up pushing other changes anyway, you may as well undo this.
@@ -8,143 +8,167 @@ | |||
|
|||
class Program | |||
{ | |||
// CompareEqual | |||
// CompareEqual |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another nit - looks like a spurious whitespace change
for (var i = 0; i < 4; i++) | ||
{ | ||
result = AdvSimd.CompareEqual(left, asVar); | ||
result = AdvSimd.CompareEqual(left, asVar); | ||
result = AdvSimd.CompareEqual(left, asVar); | ||
result = AdvSimd.CompareEqual(left, asVar); | ||
// ARM64: blt | ||
// ARM64-NOT: fcmeq |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be interesting seeing how our FileCheck usage evolves as we learn.
In this case, are you actually concerned about the blt
or are you using them as markers? If not, then you could simply put an ARM64-NOT: fcmeq
after the FULL-LINE
. BUT, if you care about the blt
or if some other positive check as added, then you would need the multiple NOT
s just like you have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are concerned about blt
as we want to ensure cmeq
is not inside the loop, which is why there is a ARM64-NOT
right after it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but it is also checking for no fcmeq
before the loop. So:
The difference between
// ARM64-FULL-LINE: fcmeq {{v[0-9]+}}.4s, {{v[0-9]+}}.4s, #0.0
// ARM64: blt
// ARM64-NOT: fcmeq
// ARM64: blt
// ARM64-NOT: fcmeq
and
// ARM64-FULL-LINE: fcmeq {{v[0-9]+}}.4s, {{v[0-9]+}}.4s, #0.0
// ARM64: blt
// ARM64-NOT: fcmeq
is only that there must be a second blt
in the method.
And then the differences between that and
// ARM64-FULL-LINE: fcmeq {{v[0-9]+}}.4s, {{v[0-9]+}}.4s, #0.0
// ARM64-NOT: fcmeq
are
- The previous one allows a
fcmeq
between the requiredfcmeq
and the initialblt
- The previous one requires a
blt
Put another way, the current checks require two blt
instructions and no fcmeq
after the first. Are those exactly the conditions that you want to check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are almost the conditions. I want to check to make sure there is no fcmeq
after any of the two blt
instructions.
src/tests/JIT/Regression/JitBlue/Runtime_34937/Runtime_34937.cs
Outdated
Show resolved
Hide resolved
src/tests/JIT/Regression/JitBlue/Runtime_34937/Runtime_34937.cs
Outdated
Show resolved
Hide resolved
// X64-FULL-LINE-NEXT: and [[REG0]], 15 | ||
// X64-FULL-LINE-NEXT: add [[REG0]], [[REG1]] | ||
// X64-FULL-LINE-NEXT: and [[REG0]], -16 | ||
// X64-WINDOWS-FULL-LINE-NEXT: mov [[REG2:[a-z]+]], [[REG1]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to follow up on this - is it an expected codegen deficiency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should ask.
@jakobbotsch @kunalspathak - I wrote a test here that showed a case where the codegen is different in windows in that it adds an extra mov
instruction. Is this a deficiency or is there a reason for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me, seems like maybe there is some suboptimal preferencing in LSRA? The argument is in a different register on these two ABIs so seems like that somehow ends up affecting it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During register assignment, if we have more than 1 register available to assign, we try to narrow them down using various heuristics until we are left with just 1 register that we assign. However, if there is a tie-breaker, we just pick a register in the register order.
At the problematic site, both the registers were assigned using REG_ORDER
meaning we found more than 1 register to be assigned and we decided based on the ordering of register.
It so happened that for Unix/x64, EDI
(which was the original register) comes first, but for windows, EDX
comes before ECX
and we prefer it.
runtime/src/coreclr/jit/targetamd64.h
Lines 191 to 203 in d0559e6
// TEMPORARY ORDER TO AVOID CALLEE-SAVES | |
// TODO-CQ: Review this and set appropriately | |
#ifdef UNIX_AMD64_ABI | |
#define REG_VAR_ORDER REG_EAX,REG_EDI,REG_ESI, \ | |
REG_EDX,REG_ECX,REG_R8,REG_R9, \ | |
REG_R10,REG_R11,REG_EBX,REG_ETW_FRAMED_EBP_LIST \ | |
REG_R14,REG_R15,REG_R12,REG_R13 | |
#else // !UNIX_AMD64_ABI | |
#define REG_VAR_ORDER REG_EAX,REG_EDX,REG_ECX, \ | |
REG_R8,REG_R9,REG_R10,REG_R11, \ | |
REG_ESI,REG_EDI,REG_EBX,REG_ETW_FRAMED_EBP_LIST \ | |
REG_R14,REG_R15,REG_R12,REG_R13 | |
#endif // !UNIX_AMD64_ABI |
We can probably do better (add a new heuristic probably) by reutilizing existing assigned register if they won't be used any further, instead of assigning a new register.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that is starting to make a lot more sense. Thank you for the detailed information @kunalspathak .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kunalspathak Are those heuristics mostly/all correctness based (as in, don't reuse a register that already has a value), or is it possible that one of the heuristics should have helped here but didn't for some reason?
Another possible heuristic would be trying to use eax
in order to feed the ret
without a mov
. However, that would require -not- using it above, so maybe it already exists but couldn't be used. In this case, it's not as good because it would lose the short encodings for and
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kunalspathak Are those heuristics mostly/all correctness based (as in, don't reuse a register that already has a value), or is it possible that one of the heuristics should have helped here but didn't for some reason?
Well, I would say they are for performant reasons. If you end of reusing same sets of registers that already hold a live variable you end up spilling the live value to the stack (not good) or if we reuse the registers from inactive interval, we end up copying/reloading (again, not good).
<BatchCLRTestPostCommands><![CDATA[ | ||
$(BatchDisasmCheckPostCommands) | ||
$(CLRTestBatchPostCommands) | ||
]]></BatchCLRTestPostCommands> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can expand on this reply as I think I introduced this. There seemed to be a general naming convention that the csproj file for each test could specify CLRTestXXX and then they would get transformed here (if needed) into something not starting with CLRTest, mostly by moving the CLR later in the name. I could take a shot at changing all of the names (probably leave the csproj alone and then do something to make these internal/implementation ones more obviously internal/implementation?), but as Will said, I didn't want to deviate from that or add diffs to unrelated code in this PR.
I see, I didn't realize that |
Current CI failures look unrelated to these changes. |
Current CI failures are from: #76041 There are no failures related to this PR - merging this in. |
Description
@markples and I have been working hard on having the ability to write tests that verify code-gen for X64 and ARM64. Our goal is to make writing these tests easy as possible and be able to incorporate them with already existing tests.
The tool we are using for verifying code-gen is LLVM's
FileCheck
. We created a new NuGet package in dotnet/llvm-project calledMicrosoft.NETCore.Runtime.JIT.Tools
that contains LLVM'sFileCheck
- we consume this package now.See https://llvm.org/docs/CommandGuide/FileCheck.html for more information on how
FileCheck
is used.We have also created a tool called,
SuperFileCheck
, that wrapsFileCheck
and uses Roslyn to allow us to write tests more easily and look readable in C#. For example:The test above verifies the method call,
AdvSimd.CompareEqual
, will emit the correct ARM64 instruction,cmeq
with two operands and the zero.<prefix>-FULL-LINE: <instr>
is a new construct inSuperFileCheck
that translates toFileCheck
's syntax,<prefix>: {{* ^}}<instr>{{$}}
- this allows to match a single line exactly without having to use{{* ^}}
or{{$}}
.Things to consider:
DOTNET_JitDisasm
will need to be a bit more reliable in its output as we create more of these kinds of tests - especially start and end anchors for methods. I think this is fine considering we are makingDOTNET_JitDisasm
public in release versions.Acceptance Criteria