[mono] Enable more HardwareIntrinsics tests and fix more amd64 intrinsics bugs #54127

Merged: 13 commits, Jun 17, 2021

@imhameed imhameed commented Jun 13, 2021

Enables nearly all disabled JIT/HardwareIntrinsics tests.
StoreNonTemporal_{r,ro} and Sse42.X64/Crc32_{r,ro} remain disabled due to
issues that are out of scope for this PR.

Changes:

  • Enable pclmul and aes when AOT compiling runtime tests.

  • Add an immediate_unroll_unreachable_default, used to mark the default case
    in an unrolled immediate switch as unreachable.

  • Fix OP_MULX_HL{32,64}: the low-word address is sometimes an integer produced
    by ptrtoint

    Fixes compilation of JIT/HardwareIntrinsics/X86/Bmi2/Bmi2_{r,ro}/**
    and JIT/HardwareIntrinsics/X86/Bmi2.X64/Bmi2.X64_{r,ro}/**.

  • Implement immediate unrolling for Ssse3.AlignRight

    Fixes JIT/HardwareIntrinsics/X86/Ssse3/Ssse3_{r,ro}/**.

  • Implement the 3-argument overload of Bmi1.BitFieldExtract

    Fixes JIT/HardwareIntrinsics/X86/Bmi1/Bmi1_{r,ro}/** and
    JIT/HardwareIntrinsics/X86/Bmi1.X64/Bmi1.X64_{r,ro}/**.

  • Implement immediate unrolling for Aes.KeygenAssist

    Fixes JIT/HardwareIntrinsics/X86/Aes/Aes_{r,ro}/**.

  • Implement immediate unrolling for Sse41.MultipleSumAbsoluteDifferences

    Fixes
    JIT/HardwareIntrinsics/X86/Sse41/MultipleSumAbsoluteDifferences_{r,ro}/**.

  • Mask vector selection index in OP_XEXTRACT_* and OP_XINSERT_*

    LLVM insertelement and extractelement yield poison values for lane
    indices that are out of bounds, but the underlying instructions on amd64
    only look at the lower few bits of the immediate octet. The
    Extract.UInt64.129 etc. tests in
    JIT/HardwareIntrinsics/X86/Sse41.X64/Sse41.X64_r test for this.

    Also, rename OP_SSE41_INSERT to OP_SSE41_INSERTPS and specialize this to
    the insertps overload of Sse41.Insert.

    Implement immediate unrolling for OP_SSE41_INSERTPS.

    Fixes JIT/HardwareIntrinsics/X86/Sse41.X64/Sse41.X64_{r,ro}/** and
    JIT/HardwareIntrinsics/X86/Sse41/Sse41_{r,ro}/**.

  • Copy the upper lanes over to the destination in OP_SSE_CMPSS and OP_SSE2_CMPSD

    Also fix the overloaded types associated with SSE saturating arithmetic LLVM
    intrinsic functions.

    Fixes JIT/HardwareIntrinsics/X86/Sse2/Sse2_{r,ro}/** and
    JIT/HardwareIntrinsics/X86/Sse/Sse_{r,ro}/**.

  • Implement immediate unrolling for Pclmulqdq.CarrylessMultiply

    Fixes JIT/HardwareIntrinsics/X86/Pclmulqdq/Pclmulqdq_{r,ro}.
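
The lane-index masking described in the OP_XEXTRACT_* / OP_XINSERT_* item can be illustrated in plain C. This is only a minimal sketch of the idea, not the code in this PR: `mask_lane_index` and `extract_u64_lane` are hypothetical names, and a 128-bit vector is modeled as an array of two `uint64_t` values. It assumes the lane count is a power of two, so keeping the low bits of the immediate always yields an in-bounds index.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reduce an arbitrary immediate to a valid lane index.
 * lane_count is assumed to be a nonzero power of two, so the mask keeps
 * only the low bits of the immediate. */
static inline unsigned mask_lane_index(unsigned imm, unsigned lane_count)
{
    assert(lane_count != 0 && (lane_count & (lane_count - 1u)) == 0);
    return imm & (lane_count - 1u);
}

/* Hypothetical model of extracting one 64-bit lane from a 2-lane vector.
 * Without the mask, an immediate such as 129 would index out of bounds;
 * with it, only bit 0 of the immediate selects the lane. */
static inline uint64_t extract_u64_lane(const uint64_t v[2], uint8_t imm)
{
    return v[mask_lane_index(imm, 2)];
}
```

Under these assumptions, `extract_u64_lane(v, 129)` reads the same lane as `extract_u64_lane(v, 1)`, which is the behavior the `Extract.UInt64.129` test name suggests is being checked.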

@imhameed imhameed added the runtime-mono (specific to the Mono runtime) and area-Codegen-LLVM-mono labels Jun 13, 2021
@imhameed imhameed force-pushed the monoamd64intrintests2 branch 2 times, most recently from 4507af6 to 5727994 Compare June 14, 2021 17:54
@imhameed imhameed force-pushed the monoamd64intrintests2 branch 2 times, most recently from 50eeb0e to f8a871e Compare June 14, 2021 22:29
@imhameed imhameed changed the title [mono] Enable more HardwareIntrinsics tests [mono] Enable more HardwareIntrinsics tests and fix more amd64 intrinsics bugs Jun 15, 2021
@imhameed imhameed marked this pull request as ready for review June 15, 2021 02:39
@SamMonoRT (Member)

@imhameed - please confirm whether the failures in the runtime staging lanes are unrelated to your changes: https://github.com/dotnet/runtime/pull/54127/checks?check_run_id=2830483108

@imhameed (Contributor, Author)

I don't think they're related and those lanes don't look like they've had a successful run in a while.

@imhameed imhameed merged commit eb7b3db into dotnet:main Jun 17, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Jul 17, 2021