Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update the CPUID and XSAVE logics for APX #104637

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

Ruihan-Yin
Copy link
Contributor

@Ruihan-Yin Ruihan-Yin commented Jul 9, 2024

Previously #103019

Overview on the changes:

  1. Infrastructural changes on XArchIntrinsicConstants: Compress all the Avx512 related flags into 1 - Avx512f+bw+cd+dq+vl to Avx512, this saves more space in XArchIntrinsicConstants so that we can hold more x86 ISAs, like here, APX.
  1. CPUID check updates for APX: CR4[XSAVE](existing) -> XCR0[APX_F] -> CPUID(7,1).EDX[APX_F] - the current status is that due to the missing macro definition for APX on the OS level, the second check will fail anyways, and it may break the build on CI (to be verified).
  • Edit: made some workaround for this part in the code.
  1. APX introduced 16 EGPRs, which are XSTATE enabled, this PR provides the corresponding changes for XSAVE behaviors.

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jul 9, 2024
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jul 9, 2024
@Ruihan-Yin
Copy link
Contributor Author

Hi @tannergooding,

I reopened the PR for APX CPUID updates here, I will resolve the conflict on guid, let CI start soon and fix fails popping up.

I understand this PR is targeting on the next release cycle, say .Net 10, so when your schedule allows, I wonder if you could review this PR for the first round, thanks!

@Ruihan-Yin
Copy link
Contributor Author

resolved the conflict.

@am11 am11 added area-PAL-coreclr and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jul 19, 2024
@Ruihan-Yin Ruihan-Yin changed the title [NO REVIEW] Update the CPUID and XSAVE logics for APX [NO MERGE] Update the CPUID and XSAVE logics for APX Jul 19, 2024
@Ruihan-Yin
Copy link
Contributor Author

The REX2 enabling PR (#106557) is there, that PR is based on this one, since now main is accepting .net 10 changes, I will rebase the changes to latest main and continue the works.

@Ruihan-Yin
Copy link
Contributor Author

resolved conflicts with main

@Ruihan-Yin Ruihan-Yin changed the title [NO MERGE] Update the CPUID and XSAVE logics for APX Update the CPUID and XSAVE logics for APX Aug 28, 2024
@Ruihan-Yin
Copy link
Contributor Author

@tannergooding This PR is ready for review.

Build failures are related to the changes in https://github.com/dotnet/runtime/blob/main/src/coreclr/pal/src/arch/amd64/context2.S

the native compiler seems to not recognize EGPRs, e.g. r16 as a legal operand. I'm not sure how we can resolve this or if this part is needed for now. And for the changes to accommodate the XSTATE changes for EGPRs, I would be very willing to taking some suggestions from the high level, there might be some changes missing or not reasonable at all. it will be much appreciated if some advice can be shared.

@Ruihan-Yin Ruihan-Yin marked this pull request as ready for review August 29, 2024 20:33
Copy link
Member

@janvorli janvorli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will also need to update the CONTEXT& CONTEXT::operator=(const CONTEXT& ctx) to properly copy the new registers if the XStateFeaturesMask has a mask for this feature. And also make sure that when it is not set, we copy only the necessary part of the context.

@@ -183,6 +183,29 @@ LOCAL_LABEL(Done_Restore_CONTEXT_FLOATING_POINT):
kmovq k6, qword ptr [rdi + (CONTEXT_KMask0 + 6 * 8)]
kmovq k7, qword ptr [rdi + (CONTEXT_KMask0 + 7 * 8)]

// TODO-xarch-apx: the definition of XSTATE mask value for APX is now missing on the OS level,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is Unix only code, so the XSTATE mask is not coming from the OS headers, but rather pal.h for C/C++ code and src/coreclr/pal/src/arch/amd64/asmconstants.h for asm. This comment can be removed.

// TODO-xarch-apx: the definition of XSTATE mask value for APX is now missing on the OS level,
// we are currently using bare value to hack it through the build process, and test the implementation through CI.
// those changes will be removed when we have the OS support for APX.
test BYTE PTR [rdi + CONTEXT_XStateFeaturesMask], 524288
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add symbolic definition for the mask to src/coreclr/pal/src/arch/amd64/asmconstants.h next to the other XSTATE_xxx definitions?

@@ -3069,7 +3079,7 @@ BOOL Thread::RedirectCurrentThreadAtHandledJITCase(PFN_REDIRECTTARGET pTgt, CONT
if (srcFeatures != 0)
{
#if defined(TARGET_X86) || defined(TARGET_AMD64)
const DWORD64 xStateFeatureMask = XSTATE_MASK_AVX | XSTATE_MASK_AVX512;
const DWORD64 xStateFeatureMask = XSTATE_MASK_AVX | XSTATE_MASK_AVX512 | XSTATE_MASK_APX;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is Windows only code. I think that SetXStateFeaturesMask will fail if we pass in an unknown mask.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It explicitly allows unknown masks, so that you don't need to do version/build specific checks (https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-setxstatefeaturesmask):

The system silently ignores any feature specified in the FeatureMask which is not enabled on the processor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

public const int SIZEOF__REGDISPLAY = 0x1a90;
public const int OFFSETOF__REGDISPLAY__SP = 0x1a78;
public const int OFFSETOF__REGDISPLAY__ControlPC = 0x1a80;
public const int SIZEOF__REGDISPLAY = 0x1b90;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how these values are calculated, the values are updated based on the error massage, so want to double check with the reviewers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These constants are what the sizeof(...) / offsetof(...) would get during C/C++ compilation. The offsets above are verified to match with the real values during the C/C++ compilation phase, so if the runtime build passes, you are safe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation!

@tannergooding
Copy link
Member

-- This is still on my backlog to review, just dealing with some remaining .NET 9 RC2 items before I can get to it.

@BruceForstall BruceForstall added the apx Related to the Intel Advanced Performance Extensions (APX) label Sep 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
apx Related to the Intel Advanced Performance Extensions (APX) area-PAL-coreclr community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants