-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Emit armv8.4 ldapur* for volatile loads with contained offsets #89681
Conversation
…ldar # Conflicts: # src/coreclr/scripts/superpmi_diffs.py
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue Detailsvolatile int a;
volatile int b;
volatile int c;
volatile int d;
[Benchmark]
public int Test()
{
return a + b + c + d;
} Current codegen for ; Assembly listing for method P:Test():int:this (Tier1)
stp fp, lr, [sp, #-0x10]!
mov fp, sp
add x1, x0, #8
ldapr w1, [x1]
add x2, x0, #12
ldapr w2, [x2]
add w1, w1, w2
add x2, x0, #16
ldapr w2, [x2]
add w1, w1, w2
add x0, x0, #20
ldapr w0, [x0]
add w0, w1, w0
ldp fp, lr, [sp], #0x10
ret lr
; Total bytes of code 60 New codegen for ; Assembly listing for method P:Test():int:this (Tier1)
stp fp, lr, [sp, #-0x10]!
mov fp, sp
ldapur w1, [x0, #0x08]
ldapur w2, [x0, #0x0C]
add w1, w1, w2
ldapur w2, [x0, #0x10]
add w1, w1, w2
ldapur w0, [x0, #0x14]
add w0, w1, w0
ldp fp, lr, [sp], #0x10
ret lr
; Total bytes of code 44
Perf difference is not super exciting but it is what it is 🙂
|
static_assert_no_msg(INS_count <= 512); | ||
instruction _idIns : 9; | ||
static_assert_no_msg(INS_count <= 1024); | ||
instruction _idIns : 10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a bit here since we now have more than 512 instructions (and that for sure will be increasing in future) but I also removed an used (on arm) _idCallRegPtr
bit so the total size of the struct remains the same - 16b
src/coreclr/tools/Common/JitInterface/ThunkGenerator/InstructionSetDesc.txt
Outdated
Show resolved
Hide resolved
@TIHan @dotnet/jit-contrib PTAL, simple change, emits |
@@ -8813,8 +8837,6 @@ void emitter::emitIns_Call(EmitCallType callType, | |||
{ | |||
/* This is an indirect call (either a virtual call or func ptr call) */ | |||
|
|||
id->idSetIsCallRegPtr(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this removed for arm32?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mentioned it in https://github.com/dotnet/runtime/pull/89681/files#r1278410873 comment, basically, it sets a bit that is unused for arm, so I removed that bit because I needed to extend _idIns
's width
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, only one comment on the idSetIsCallingRegPtr.
Contributes to #64457
Current codegen for
Test
:New codegen for
Test
:Apple M2 Max:
Perf difference is not super exciting but it is what it is 🙂