ARM64-SVE: refactor lsra buildHWIntrinsic #107459

Open: 34 commits to merge into main

a74nh
Contributor

@a74nh a74nh commented Sep 6, 2024

Fixes #104842

The logic for hwintrinsics has become convoluted. Refactor it, for both SVE and AdvSimd.

Add functions to get the operand (if any) for each requirement - delay slot, consecutive registers, address, etc.

Then use a simple for loop to iterate through each operand and build depending on which requirements match for that operand.
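
As a rough illustration of that shape (not the actual JIT code), here is a minimal, self-contained sketch. The `Operand` and `Requirements` types and the `classifyOperands` function are stand-ins invented for this example, and the order of the checks is an assumption. Only the build function names in the returned strings come from this PR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a GenTree operand; only identity matters here.
struct Operand
{
    int index; // 1-based operand number
};

// Hypothetical result of the per-requirement query functions: each field is
// the operand that has that requirement, or nullptr if no operand does.
struct Requirements
{
    const Operand* delayFreeOp   = nullptr;
    const Operand* addrOp        = nullptr;
    const Operand* consecutiveOp = nullptr;
};

// One loop over the operands. The step chosen for each operand depends only
// on which requirement (if any) matches it. This returns the name of the
// chosen step instead of building real RefPositions.
std::vector<std::string> classifyOperands(const std::vector<Operand>& ops, const Requirements& req)
{
    std::vector<std::string> steps;
    for (const Operand& op : ops)
    {
        if (&op == req.addrOp)
            steps.push_back("BuildAddrUses");
        else if (&op == req.consecutiveOp)
            steps.push_back("BuildConsecutiveRegistersForUse");
        else if (&op == req.delayFreeOp)
            steps.push_back("BuildUse"); // the delay-free operand itself is a plain use
        else if (req.delayFreeOp != nullptr)
            steps.push_back("BuildDelayFreeUses");
        else
            steps.push_back("BuildOperandUses");
    }
    return steps;
}
```

With delay free = op1 and consecutive = op2, this sketch yields BuildUse, BuildConsecutiveRegistersForUse, BuildDelayFreeUses, which is the order given for NI_AdvSimd_VectorTableLookupExtension later in this conversation.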

Tested by using stress_test.py on the entire HardwareIntrinsics_Arm set.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 6, 2024
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Sep 6, 2024
@a74nh
Contributor Author

a74nh commented Sep 6, 2024

@kunalspathak - Still WIP. For now ignore anything outside of lsraarm64.cpp and lsra.hpp. All the other changes are from other PRs and will be removed once those have been merged.

@a74nh a74nh marked this pull request as ready for review September 9, 2024 15:41
@a74nh
Contributor Author

a74nh commented Sep 9, 2024

This PR is ready now.

Requires #107084, #107180 and a workaround for #107537 in order for all the hwintrinsic tests to pass.

Apologies, this is a large change to review, and the GitHub diff is confused about functions I haven't touched. It is probably best to start the review from the new version of BuildHWIntrinsic().

I recommend this is not merged until after we've gone past the .NET 9 RC2 deadline.

@dotnet/arm64-contrib

I'll do a spmidiff next.

@kunalspathak
Member

I expected this to produce no diffs, but it looks like it does. Can you please double check the source of the differences?

@a74nh
Contributor Author

a74nh commented Sep 11, 2024

I expected this to produce no diffs, but it looks like it does. Can you please double check the source of the differences?

The diffs all appear to be in LoadAndInsertScalar.

tldr: there is a bug in HEAD where getVectorAddrOperand() is not used for op3

Long version:

There are multiple versions of LoadAndInsertScalar because it needs to handle variants with multiple op1/return values.
All of them have an address operand in op3:

public static unsafe (Vector128<byte> Value1, Vector128<byte> Value2) LoadAndInsertScalar((Vector128<byte>, Vector128<byte>) values, [ConstantExpected(Max = (byte)(15))] byte index, byte* address);

public static unsafe Vector128<uint> LoadAndInsertScalar(Vector128<uint> value, [ConstantExpected(Max = (byte)(3))] byte index, uint* address);

In HEAD, the multiple-register versions, NI_AdvSimd_LoadAndInsertScalarVectorXxX, have special handling:

else if (HWIntrinsicInfo::NeedsConsecutiveRegisters(intrin.id))
....
            case NI_AdvSimd_LoadAndInsertScalarVector64x2:
            case NI_AdvSimd_LoadAndInsertScalarVector64x3:
            case NI_AdvSimd_LoadAndInsertScalarVector64x4:
            case NI_AdvSimd_Arm64_LoadAndInsertScalarVector128x2:
            case NI_AdvSimd_Arm64_LoadAndInsertScalarVector128x3:
            case NI_AdvSimd_Arm64_LoadAndInsertScalarVector128x4:
            {
                assert(intrin.op2 != nullptr);
                assert(intrin.op3 != nullptr);
                assert(isRMW);
                if (!intrin.op2->isContainedIntOrIImmed())
                {
                    srcCount += BuildOperandUses(intrin.op2);
                }

                assert(intrinsicTree->OperIsMemoryLoadOrStore());
                srcCount += BuildAddrUses(intrin.op3);
                buildInternalRegisterUses();
                FALLTHROUGH;
            }

Note that BuildAddrUses() is used for op3

The single-register variant, NI_AdvSimd_LoadAndInsertScalar, doesn't have NeedsConsecutiveRegisters, so it falls into the generic op2 handling code and then into the generic op3 handling code, which does:

        if (intrin.op3 != nullptr)
        {
            SingleTypeRegSet candidates = lowVectorOperandNum == 3 ? lowVectorCandidates : RBM_NONE;

            if (isRMW)
            {
                srcCount += BuildDelayFreeUses(intrin.op3, (tgtPrefOp2 ? intrin.op2 : intrin.op1), candidates);
            }
            else
            {
                srcCount += BuildOperandUses(intrin.op3, candidates);
            }

This is wrong - it should be using BuildAddrUses() for op3.

In my PR, getVectorAddrOperand() will correctly return op3 for all LoadAndInsertScalar variants, and then the main for loop in BuildHWIntrinsic() will correctly call BuildAddrUses().
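
A minimal sketch of that idea, under stated assumptions: the `Intrinsic` enum, `addrOperandNumber` and `usesBuildAddrUses` are hypothetical names for this example, and the real getVectorAddrOperand() works on GenTree nodes rather than operand numbers. The point is only that the single-register and multi-register variants get the same answer from one place.

```cpp
#include <cassert>

// Hypothetical, reduced set of intrinsic ids for this sketch only.
enum class Intrinsic
{
    LoadAndInsertScalar,           // single-register variant
    LoadAndInsertScalarVector64x2, // one of the multi-register variants
    Add,                           // an intrinsic with no address operand
};

// Report which operand (1-based) holds the address, or 0 if the intrinsic
// has no address operand. Using operand numbers is an assumption made here.
int addrOperandNumber(Intrinsic id)
{
    switch (id)
    {
        case Intrinsic::LoadAndInsertScalar:
        case Intrinsic::LoadAndInsertScalarVector64x2:
            return 3; // (value(s), index, address): the address is op3
        default:
            return 0;
    }
}

// True if operand `opNum` of `id` should be built with BuildAddrUses().
bool usesBuildAddrUses(Intrinsic id, int opNum)
{
    return opNum != 0 && addrOperandNumber(id) == opNum;
}
```

Because the decision is keyed on the intrinsic rather than on which code path it happens to fall into, the single-register variant can no longer miss the address handling for op3.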

@a74nh
Contributor Author

a74nh commented Sep 11, 2024

tldr: there is a bug in HEAD where getVectorAddrOperand() is not used for op3

In addition, BuildAddrUses() is called for op1, which is also wrong.

That happens in:

else if (intrinsicTree->OperIsMemoryLoadOrStore())
        {
            srcCount += BuildAddrUses(intrin.op1);
        }

Member

@kunalspathak kunalspathak left a comment

Overall looks much cleaner. We should also run jitstress and other outerloop legs before merging. I can do it once we are done with addressing the feedback.

src/coreclr/jit/lsraarm64.cpp (outdated)
assert(candidates == RBM_NONE);

// Some operands have a consecutive op which is also a delay free op
srcCount += BuildConsecutiveRegistersForUse(operand, delayFreeOp);
Member

We also seem to call buildInternalRegisterUses() for consecutive registers. Are we missing it here?

Member

Also for NI_AdvSimd_VectorTableLookupExtension(), we call like this. We should double check that logic in new code.

  • BuildConsecutiveRegistersForUse
  • buildInternalRegisterUses
  • BuildDef
  • buildInternalRegisterUses
  • BuildDef

Contributor Author

We also seem to call buildInternalRegisterUses() for consecutive registers. Are we missing it here?

All intrinsics will call buildInternalRegisterUses() after the for loop, before building the destination.

Contributor Author

Also for NI_AdvSimd_VectorTableLookupExtension(), we call like this. We should double check that logic in new code.

  • BuildConsecutiveRegistersForUse
  • buildInternalRegisterUses
  • BuildDef
  • buildInternalRegisterUses
  • BuildDef

In old code:

  • op1 - BuildUse (because tgtPrefUse is set, which is because isRMW)
  • op2 - BuildConsecutiveRegistersForUse
  • op3 - BuildDelayFreeUses
  • buildInternalRegisterUses
  • BuildDef

In new code:

  • delay free = op1 (because isRMW)
  • addr = nullptr
  • consecutive = op2
  • dest consecutive = false
  • embedded = nullptr
  • BuildHWIntrinsicImmediate (which is a nop)
  • op1 - BuildUse (because delayFreeOp == op1)
  • op2 - BuildConsecutiveRegistersForUse (because consecutive == op2)
  • op3 - BuildDelayFreeUses (because delay free != nullptr)
  • buildInternalRegisterUses
  • BuildDef

Contributor Author

You're missing the thing I always miss....

Also for NI_AdvSimd_VectorTableLookupExtension(), we call like this. We should double check that logic in new code.

  • BuildConsecutiveRegistersForUse
  • buildInternalRegisterUses
  • BuildDef

There is a return srcCount; here (line 1934), which means it never does:

  • buildInternalRegisterUses
  • BuildDef
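
To make the control flow concrete, here is a minimal sketch of why the early return matters. It is not the HEAD source: the function name `oldStyleSteps` and its boolean flag are hypothetical, and it records step names instead of building anything. The step names are the ones listed above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the build steps taken. `takesSpecialCase` is a hypothetical flag
// standing in for "this intrinsic is handled by the special-case branch".
std::vector<std::string> oldStyleSteps(bool takesSpecialCase)
{
    std::vector<std::string> steps;
    if (takesSpecialCase)
    {
        steps.push_back("BuildConsecutiveRegistersForUse");
        steps.push_back("buildInternalRegisterUses");
        steps.push_back("BuildDef");
        return steps; // early return: the shared tail below is never reached
    }

    // Shared tail, reached only by intrinsics that did not return above.
    steps.push_back("buildInternalRegisterUses");
    steps.push_back("BuildDef");
    return steps;
}
```

Reading the function top to bottom without noticing the return makes it look as if buildInternalRegisterUses and BuildDef run twice; in this sketch each path records them exactly once.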

@kunalspathak kunalspathak added the arm-sve Work related to arm64 SVE/SVE2 support label Sep 12, 2024
@a74nh
Contributor Author

a74nh commented Sep 13, 2024

Got some asmdiffs for the SVE tests. Spotted two differences, and one of them is due to issues in HEAD.

I'll raise PRs to fix these (plus one for LoadAndInsertScalar), and then rebase this once merged. I'd like there to be no asmdiff differences in this PR

./4546.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_BitwiseClear_long RunClassFldScenario() this (FullOpts)
./4000.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_AddSaturate_byte RunBasicScenario_Load() this (FullOpts)
./4130.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_And_sbyte RunClassFldScenario() this (FullOpts)
./27034.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Xor_byte RunBasicScenario_Load() this (FullOpts)
./22619.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Or_sbyte RunClassFldScenario() this (FullOpts)
./22615.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Or_sbyte RunBasicScenario_Load() this (FullOpts)
./4170.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_And_int RunBasicScenario_Load() this (FullOpts)
./26818.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_SubtractSaturate_int RunClassFldScenario() this (FullOpts)
./26730.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Subtract_uint RunClassFldScenario() this (FullOpts)
./4026.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_AddSaturate_ushort RunClassFldScenario() this (FullOpts)
./22659.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Or_int RunBasicScenario_Load() this (FullOpts)
./4280.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_And_ulong RunBasicScenario_Load() this (FullOpts)
./4258.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_And_uint RunBasicScenario_Load() this (FullOpts)
./4192.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_And_long RunBasicScenario_Load() this (FullOpts)
./26726.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Subtract_uint RunBasicScenario_Load() this (FullOpts)
./26906.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_SubtractSaturate_uint RunClassFldScenario() this (FullOpts)
./3481.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Add_long RunBasicScenario_Load() this (FullOpts)
./26642.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Subtract_int RunClassFldScenario() this (FullOpts)
./3547.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Add_uint RunBasicScenario_Load() this (FullOpts)
./4608.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_BitwiseClear_uint RunBasicScenario_Load() this (FullOpts)
./26946.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Xor_sbyte RunBasicScenario_Load() this (FullOpts)
./27082.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Xor_uint RunClassFldScenario() this (FullOpts)
./3529.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Add_ushort RunClassFldScenario() this (FullOpts)
./26968.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Xor_short RunBasicScenario_Load() this (FullOpts)
./27012.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Xor_long RunBasicScenario_Load() this (FullOpts)
./22681.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Or_long RunBasicScenario_Load() this (FullOpts)
./22769.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Or_ulong RunBasicScenario_Load() this (FullOpts)
./4218.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_And_byte RunClassFldScenario() this (FullOpts)
./27056.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Xor_ushort RunBasicScenario_Load() this (FullOpts)
./4520.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_BitwiseClear_int RunBasicScenario_Load() this (FullOpts)
./26704.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Subtract_ushort RunBasicScenario_Load() this (FullOpts)
./26994.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Xor_int RunClassFldScenario() this (FullOpts)
./4214.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_And_byte RunBasicScenario_Load() this (FullOpts)
./22747.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Or_uint RunBasicScenario_Load() this (FullOpts)
./22729.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Or_ushort RunClassFldScenario() this (FullOpts)
./26792.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_SubtractSaturate_short RunBasicScenario_Load() this (FullOpts)
./3415.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Add_sbyte RunBasicScenario_Load() this (FullOpts)
./3441.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Add_short RunClassFldScenario() this (FullOpts)
./26554.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Subtract_float RunClassFldScenario() this (FullOpts)
./26902.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_SubtractSaturate_uint RunBasicScenario_Load() this (FullOpts)
./4634.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_BitwiseClear_ulong RunClassFldScenario() this (FullOpts)
./22637.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Or_short RunBasicScenario_Load() this (FullOpts)
./3507.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Add_byte RunClassFldScenario() this (FullOpts)
./26616.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Subtract_short RunBasicScenario_Load() this (FullOpts)
./26880.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_SubtractSaturate_ushort RunBasicScenario_Load() this (FullOpts)
./22707.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Or_byte RunClassFldScenario() this (FullOpts)
./22703.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Or_byte RunBasicScenario_Load() this (FullOpts)
./3459.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Add_int RunBasicScenario_Load() this (FullOpts)
./22641.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Or_short RunClassFldScenario() this (FullOpts)
./3503.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Add_byte RunBasicScenario_Load() this (FullOpts)
./26990.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Xor_int RunBasicScenario_Load() this (FullOpts)
./3419.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Add_sbyte RunClassFldScenario() this (FullOpts)
./26814.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_SubtractSaturate_int RunBasicScenario_Load() this (FullOpts)
./27078.dasm:1 :   ; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_Xor_uint RunBasicScenario_Load() this (FullOpts)

@a74nh
Contributor Author

a74nh commented Sep 13, 2024

Latest version fixes up the diffs that were caused by this PR.
Once #107786 and #107791 are merged there should be no remaining diffs in this PR.

@kunalspathak
Member

Latest version fixes up the diffs that were caused by this PR. Once #107786 and #107791 are merged there should be no remaining diffs in this PR.

Let's rebase this PR once the above mentioned PRs are merged to confirm there is zero asmdiff.

@a74nh
Contributor Author

a74nh commented Sep 26, 2024

Rebased on top of the other fixes. As mentioned in #107786, I fixed it so that BuildDelayFreeUses() is only called for matching register types. I still need to confirm that there are no SPMI diffs.
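
One possible reading of "matching register types", sketched under assumptions: the `RegType` enum and `chooseUseBuilder` are hypothetical names, and the real check in the PR may look different. The reasoning is that an operand in a different register file can never be given the same register as the RMW target, so marking it delay-free adds nothing.

```cpp
#include <cassert>
#include <string>

// Hypothetical register classes for this sketch.
enum class RegType
{
    General,
    Vector,
    Predicate,
};

// Pick the use-building step for one operand of an intrinsic.
// `delayFreeOpType` is the register type of the operand the target is tied to.
std::string chooseUseBuilder(RegType operandType, RegType delayFreeOpType, bool isRMW)
{
    if (isRMW && operandType == delayFreeOpType)
    {
        // Same register file as the RMW operand: a conflict is possible.
        return "BuildDelayFreeUses";
    }
    // Different register file, or not RMW at all: a plain use is enough.
    return "BuildOperandUses";
}
```

For example, under this reading a predicate operand of an RMW vector intrinsic would be built as a plain use, while a second vector operand would still be delay-free.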

@a74nh
Contributor Author

a74nh commented Sep 27, 2024

No asm diffs now:

❯ python3 ./src/coreclr/scripts/superpmi.py collect $CORE_ROOT/corerun "./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll"
[08:27:14] ================ Logging to /home/alahay01/dotnet/runtime_sve_api/artifacts/spmi/superpmi.log
[08:27:14] SuperPMI collect
[08:27:14] SuperPMI JIT Path: /home/alahay01/dotnet/runtime_sve_api/artifacts/tests/coreclr/linux.arm64.Checked/Tests/Core_Root/libclrjit.so
[08:27:14] Collecting using command:
[08:27:14]   /home/alahay01/dotnet/runtime_sve_api/artifacts/tests/coreclr/linux.arm64.Checked/Tests/Core_Root/corerun ./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll
[08:30:53] Merging MC files
[08:30:57] Copy base MCH file to final MCH file
[08:31:08] Creating TOC file
[08:31:10] Generated MCH file: /home/alahay01/dotnet/runtime_sve_api/linux.arm64.Checked.mch

❯ python3 ./src/coreclr/scripts/superpmi.py asmdiffs -mch_files /home/alahay01/dotnet/runtime_sve_api/linux.arm64.Checked.mch
[08:35:57] ================ Logging to /home/alahay01/dotnet/runtime_sve_api/artifacts/spmi/superpmi.1.log
[08:35:57] Using JIT/EE Version from jiteeversionguid.h: b75a5475-ff22-4078-9551-2024ce03d383
[08:35:58] Baseline hash: cdc8418a7f4e51b771db2ae7ee5cde5f479cde7e
[08:35:58] Download: https://clrjit2.blob.core.windows.net/jitrollingbuild/builds/f1bcbeb5fa2fe84698b62d88dd35199f0d7fbedb/linux/arm64/Checked/libclrjit.so -> /home/alahay01/dotnet/runtime_sve_api/artifacts/spmi/basejit/f1bcbeb5fa2fe84698b62d88dd35199f0d7fbedb.linux.arm64.Checked/libclrjit.so
Downloading 5.6/5.6 MB...
[08:35:59] Downloaded https://clrjit2.blob.core.windows.net/jitrollingbuild/builds/f1bcbeb5fa2fe84698b62d88dd35199f0d7fbedb/linux/arm64/Checked/libclrjit.so
[08:35:59] Using baseline /home/alahay01/dotnet/runtime_sve_api/artifacts/spmi/basejit/f1bcbeb5fa2fe84698b62d88dd35199f0d7fbedb.linux.arm64.Checked/libclrjit.so
[08:35:59] Using coredistools found at /home/alahay01/dotnet/runtime_sve_api/artifacts/tests/coreclr/linux.arm64.Checked/Tests/Core_Root/libcoredistools.so
[08:35:59] SuperPMI ASM diffs
[08:35:59] Base JIT Path: /home/alahay01/dotnet/runtime_sve_api/artifacts/spmi/basejit/f1bcbeb5fa2fe84698b62d88dd35199f0d7fbedb.linux.arm64.Checked/libclrjit.so
[08:35:59] Diff JIT Path: /home/alahay01/dotnet/runtime_sve_api/artifacts/tests/coreclr/linux.arm64.Checked/Tests/Core_Root/libclrjit.so
[08:35:59] Using MCH files:
[08:35:59]   /home/alahay01/dotnet/runtime_sve_api/linux.arm64.Checked.mch
[08:35:59] Running asm diffs of /home/alahay01/dotnet/runtime_sve_api/linux.arm64.Checked.mch
[08:36:39] Clean SuperPMI diff (72927 contexts processed)
[08:36:39] Asm diffs summary:
[08:36:39]   Summary Markdown file: /home/alahay01/dotnet/runtime_sve_api/artifacts/spmi/diff_summary.md
[08:36:39]   Short Summary Markdown file: /home/alahay01/dotnet/runtime_sve_api/artifacts/spmi/diff_short_summary.md
[08:36:39]   No asm diffs

❯ cat /home/alahay01/dotnet/runtime_sve_api/artifacts/spmi/diff_summary.md
Diffs are based on <span style="color:#1460aa">72,927</span> contexts (<span style="color:#1460aa">1</span> MinOpts, <span style="color:#1460aa">72,926</span> FullOpts).

No diffs found.

<details>
<summary>Details</summary>
<div style="margin-left:1em">

#### Context information

|Collection|Diffed contexts|MinOpts|FullOpts|Missed, base|Missed, diff|
|---|--:|--:|--:|--:|--:|
|linux.arm64.Checked.mch|72,927|1|72,926|0 (0.00%)|0 (0.00%)|




</div></details>

Labels: arch-arm64, area-CodeGen-coreclr, arm-sve, community-contribution
Linked issue (may be closed by merging this pull request): JIT: SVE Cleanup - Simplify handling of RMW intrinsics in LSRA
2 participants