Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle more than 64 registers - Part 4 #102921

Merged
merged 51 commits into from
Jun 5, 2024
Merged

Conversation

kunalspathak
Copy link
Member

@kunalspathak kunalspathak commented May 31, 2024

This adds high field in regMaskTP under HAS_MORE_THAN_64_REGISTERS and update various code paths to should take into account querying or updating this field.

I have made commits that are isolated, so putting here for reference for ease review.

[ 6f07fc3 ] (HEAD -> part4, kp/part4) jit format
[ 95ca449 ] genRegNumFromMask() to take SingleTypeRegSet
[ 7de39e7 ] Add assert in genFindLowestBit() and genMaxOneBit() to be only applicable for gpr/float
[ fb0c4ca ] bug fix
[ c72147f ] Initialize regMaskTP in genRegMask()
[ edccae2 ] Change some more methods to return SingleTypeRegMask
[ cedb079 ] Add various variants of genRegNumFromMask() and use them
[ 8b7084a ] Move registerType in Referencable, add getRegisterType()
[ 3741ac6 ] Move mapTypeToRegTypeIndex() and mapRegNumtoRegTypeIndex() in compiler.hpp
[ 687ed43 ] Add operator[]
[ b2d9b1c ] jit format
[ e56dc9b ] fix some other build failures
[ 5f63e60 ] misc cleanup
[ 7542417 ] operators() that operate on regNum
[ 2d1d2bb ] GetRegSetForType() defintion
[ d14f56a ] RemoveRegsetForType()
[ 147a88b ] AddRegsetForType()
[ 535c2e1 ] AddGprRegs()
[ 1d0f492 ] IsRegNumInMask() and IsRegNumPresent()
[ f370cf1 ] Use IsEmpty() and IsNonEmpty()
[ fd18c7f ] RemoveRegNumFromMask() and RemoveRegNum() and consume them
[ ed203f3 ] Use AddRegNumInMask() and AddRegNum()
[ c192d95 ] AddRegNumInMask/AddRegNum
[ a38460a ] change the structure of regMaskTP for performance
[ 4fe1caa ] Add regTypeTag in register*.h files
[ 27b8170 ] Use HAS_MORE_THAN_64_REGISTERS macro
[ 6ffccb3 ] Make high only to arm64
[ fed5c2a ] use genSingleTypeRegMask()
[ b4b08fa ] Add high to all platforms
[ 53c29d1 ] jit fornat
[ ccddf1c ] Furture reduce the TP cost
[ b3218e1 ] Revert "Do not use const regMaskTP& as parameter"
[ 78b34fa ] Do not use const regMaskTP& as parameter
[ e895528 ] Add high field

This reverts commit d53b620.

By not passing `regMaskTP` using constant reference, there is a small cost
we have to pay:

Without constant reference:
Overall: 1.80%
MinOpts: 2.05%
FullOpts: 1.62%

With constant reference:
Overall: 1.74%
MinOpts: 1.94%
FullOpts: 1.6%
Moved the *= operators as instance of `regMaskTP` so the `.low`
private field can directly be manipulated instead of converting
the `64-bit` value in `regMaskTP` before doing any operation.

Overall: 0.74%
MinOpts: 0.82%
FullOpts: 0.68%
@kunalspathak
Copy link
Member Author

TP diffs

image

@kunalspathak kunalspathak marked this pull request as ready for review June 3, 2024 03:42
@kunalspathak
Copy link
Member Author

@dotnet/jit-contrib @jakobbotsch PTAL

#endif // TARGET_XARCH
}
#ifdef TARGET_ARM
if (call->IsVirtualStub())
{
killMask |= compiler->virtualStubParamInfo->GetRegMask();
killMask.AddGprRegs(compiler->virtualStubParamInfo->GetRegMask());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this using AddGprRegs? Isn't virtualStubParamInfo->GetRegMask() returning a regMaskTP?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GetRegMask() returns _regMask_enum which is uint64_t. So this was the simplest thing to do ATM

@@ -911,15 +919,15 @@ regMaskTP LinearScan::getKillSetForBlockStore(GenTreeBlk* blkNode)
if (isCopyBlk)
{
// rep movs kills RCX, RDI and RSI
killMask = RBM_RCX | RBM_RDI | RBM_RSI;
killMask.AddGprRegs(RBM_RCX | RBM_RDI | RBM_RSI);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't RBM_* regMaskTP already? I don't really understand changes here (in multiple places it seems like we shouldn't need to switch to SingleRegTypeMask overloads)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, all the RBM_* are still uint64_t except for allAvailableRegs. Here is what I am planning to do after this PR:

  1. Send a PR that convert the killMask's registerAssignment to regMaskTP (by unioning existing field or equivalent)
  2. I am still deciding if we should convert all or just subset of RBM_*, but most likely I will convert just some RBM_* to regMaskTP that contains mask. All the remaining can continue to be uint64_t. In future, if we feel a need for TP improvement reason, we can convert them all to regMaskTP (once predicate register work is complete).

Comment on lines +13 to +22
SingleTypeRegSet value = genSingleTypeRegMask(reg);
#ifdef HAS_MORE_THAN_64_REGISTERS
if (reg < 64)
{
low |= value;
}
else
{
high |= value;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be more efficient to inline the computation of value in the "else" case as reg - 64?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mask value for registers > 63, will anyway start from 0x0, so it should be fine, the way we are doing currently.

@@ -305,56 +364,69 @@ struct regMaskTP

SingleTypeRegSet GetPredicateRegSet() const
{
#ifdef HAS_MORE_THAN_64_REGISTERS
return getHigh();
#else
return getLow();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be getLow() >> REG_P0 or something like that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For x64, when we create the RBM_ALLMASK, we already have bits set to 0 except the ones that represent mask registers, hence should be safe to just return getLow().

};

static regMaskTP operator^(regMaskTP first, regMaskTP second)
static regMaskTP operator^(const regMaskTP& first, const regMaskTP& second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this make a difference?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't think so. I just wanted to make it consistent like other methods.

@jakobbotsch
Copy link
Member

Do you know what's up with the linux-x64 TP diffs? We don't define MORE_THAN_64_REGISTERS there right?

Can you post the detailed TP diff after your latest changes?

@kunalspathak
Copy link
Member Author

Do you know what's up with the linux-x64 TP diffs? We don't define MORE_THAN_64_REGISTERS there right?

yes, that's a good question. Let me see the numbers with latest round and debug from there.

Can you post the detailed TP diff after your latest changes?

Will do.

@kunalspathak
Copy link
Member Author

It's similar to what I posted in #102921 (comment). This is for overall collection, but for MinOpts, the *Minimal gets amplified and contributes more.

?allocateRegistersMinimal@LinearScan@@QEAAXXZ                                                                                                               : 182568724 : +23.14%    : 16.30% : +0.6469%
?mergeRegisterPreferences@Interval@@QEAAX_K@Z                                                                                                               : 116124486 : +19063.90% : 10.36% : +0.4115%
?processBlockStartLocations@LinearScan@@AEAAXPEAUBasicBlock@@@Z                                                                                             : 86667378  : +41.47%    : 7.74%  : +0.3071%
??$allocateRegisters@$0A@@LinearScan@@QEAAXXZ                                                                                                               : 77258469  : +20.03%    : 6.90%  : +0.2737%
?genConsumeReg@CodeGen@@IEAA?AW4_regNumber_enum@@PEAUGenTree@@@Z                                                                                            : 74002956  : +38.26%    : 6.61%  : +0.2622%
?freeRegisters@LinearScan@@AEAAXUregMaskTP@@@Z                                                                                                              : 72465408  : +55.77%    : 6.47%  : +0.2568%
?gcMarkRegPtrVal@GCInfo@@QEAAXW4_regNumber_enum@@W4var_types@@@Z                                                                                            : 65520075  : +190.07%   : 5.85%  : +0.2322%
?emitGCregDeadUpd@emitter@@QEAAXW4_regNumber_enum@@PEAE@Z                                                                                                   : 46105722  : +103.71%   : 4.12%  : +0.1634%
?newRefPosition@LinearScan@@AEAAPEAVRefPosition@@PEAVInterval@@IW4RefType@@PEAUGenTree@@_KI@Z                                                               : 40611925  : +10.51%    : 3.62%  : +0.1439%
?genCodeForBBlist@CodeGen@@IEAAXXZ                                                                                                                          : 30589663  : +14.30%    : 2.73%  : +0.1084%
?gtGetRegMask@GenTree@@QEBA?AUregMaskTP@@XZ                                                                                                                 : 25739940  : +49.48%    : 2.30%  : +0.0912%
??$select@$0A@@RegisterSelection@LinearScan@@QEAA_KPEAVInterval@@PEAVRefPosition@@@Z                                                                        : 22097126  : +3.89%     : 1.97%  : +0.0783%
?updateAssignedInterval@LinearScan@@AEAAXPEAVRegRecord@@PEAVInterval@@@Z                                                                                    : 21170913  : +19.62%    : 1.89%  : +0.0750%
?genCodeForIndir@CodeGen@@IEAAXPEAUGenTreeIndir@@@Z                                                                                                         : 18902549  : NA         : 1.69%  : +0.0670%
?assignPhysReg@LinearScan@@AEAAXPEAVRegRecord@@PEAVInterval@@@Z                                                                                             : 18616711  : +21.15%    : 1.66%  : +0.0660%
?allocateRegMinimal@LinearScan@@AEAA?AW4_regNumber_enum@@PEAVInterval@@PEAVRefPosition@@@Z                                                                  : 18457247  : +3.07%     : 1.65%  : +0.0654%
?BuildNode@LinearScan@@AEAAHPEAUGenTree@@@Z                                                                                                                 : 15926110  : +6.54%     : 1.42%  : +0.0564%
?associateRefPosWithInterval@LinearScan@@AEAAXPEAVRefPosition@@@Z                                                                                           : 13404784  : +6.49%     : 1.20%  : +0.0475%
?emitUpdateLiveGCregs@emitter@@QEAAXW4GCtype@@UregMaskTP@@PEAE@Z                                                                                            : 12141813  : NA         : 1.08%  : +0.0430%
?buildPhysRegRecords@LinearScan@@AEAAXXZ                                                                                                                    : 10030600  : +15.82%    : 0.90%  : +0.0355%
?buildKillPositionsForNode@LinearScan@@AEAA_NPEAUGenTree@@IUregMaskTP@@@Z                                                                                   : 7948467   : +26.95%    : 0.71%  : +0.0282%
?genUpdateRegLife@CodeGenInterface@@QEAAXPEBVLclVarDsc@@_N1@Z                                                                                               : 7850748   : +238.74%   : 0.70%  : +0.0278%
?unassignPhysReg@LinearScan@@AEAAXPEAVRegRecord@@PEAVRefPosition@@@Z                                                                                        : 7525635   : +6.50%     : 0.67%  : +0.0267%
?genProduceReg@CodeGen@@IEAAXPEAUGenTree@@@Z                                                                                                                : 7359582   : +4.72%     : 0.66%  : +0.0261%
?emitGCregLiveUpd@emitter@@QEAAXW4GCtype@@W4_regNumber_enum@@PEAE@Z                                                                                         : 7229355   : +26.09%    : 0.65%  : +0.0256%
??$resolveRegisters@$0A@@LinearScan@@QEAAXXZ                                                                                                                : 6133697   : +3.32%     : 0.55%  : +0.0217%
?emitIns_Call@emitter@@QEAAXW4EmitCallType@1@PEAUCORINFO_METHOD_STRUCT_@@PEAX_JW4emitAttr@@4AEBQEA_KUregMaskTP@@6AEBVDebugInfo@@W4_regNumber_enum@@8I3_N9@Z : 5219317   : +11.53%    : 0.47%  : +0.0185%
??0CodeGen@@QEAA@PEAVCompiler@@@Z                                                                                                                           : 4814688   : NA         : 0.43%  : +0.0171%
?genSetRegToConst@CodeGen@@IEAAXW4_regNumber_enum@@W4var_types@@PEAUGenTree@@@Z                                                                             : 3313331   : +22.93%    : 0.30%  : +0.0117%
??$compChangeLife@$00@Compiler@@QEAAXAEBQEA_K@Z                                                                                                             : 3312382   : +25.51%    : 0.30%  : +0.0117%
??0LinearScan@@QEAA@PEAVCompiler@@@Z                                                                                                                        : 3310098   : +10.63%    : 0.30%  : +0.0117%
??$resolveRegisters@$00@LinearScan@@QEAAXXZ                                                                                                                 : 3214729   : +3.26%     : 0.29%  : +0.0114%
?instGen_Set_Reg_To_Imm@CodeGen@@QEAAXW4emitAttr@@W4_regNumber_enum@@_JW4insFlags@@@Z                                                                       : 2852398   : +5.24%     : 0.25%  : +0.0101%
?emitOutputCall@emitter@@QEAAIPEAUinsGroup@@PEAEPEAUinstrDesc@1@I@Z                                                                                         : 2041454   : +5.22%     : 0.18%  : +0.0072%
?BuildCall@LinearScan@@AEAAHPEAUGenTreeCall@@@Z                                                                                                             : 1996388   : +3.67%     : 0.18%  : +0.0071%
?genEmitCall@CodeGen@@IEAAXHPEAUCORINFO_METHOD_STRUCT_@@PEAXW4emitAttr@@2AEBVDebugInfo@@W4_regNumber_enum@@_N5@Z                                            : 1749883   : +28.61%    : 0.16%  : +0.0062%
?resolveLocalRef@LinearScan@@AEAAXPEAUBasicBlock@@PEAUGenTreeLclVar@@PEAVRefPosition@@@Z                                                                    : 1747318   : +2.74%     : 0.16%  : +0.0062%
?genRestoreCalleeSavedRegistersHelp@CodeGen@@IEAAXUregMaskTP@@HH@Z                                                                                          : 1631108   : +50.21%    : 0.15%  : +0.0058%
?addKillForRegs@LinearScan@@AEAAXUregMaskTP@@I@Z                                                                                                            : 1589200   : +12.20%    : 0.14%  : +0.0056%
?BuildKills@LinearScan@@AEAAXPEAUGenTree@@UregMaskTP@@@Z                                                                                                    : 1420222   : +10852.99% : 0.13%  : +0.0050%
?genBuildRegPairsStack@CodeGen@@KAXUregMaskTP@@PEAV?$ArrayStack@URegPair@CodeGen@@@@@Z                                                                      : 1419987   : +28.03%    : 0.13%  : +0.0050%
?genPushCalleeSavedRegisters@CodeGen@@IEAAXW4_regNumber_enum@@PEA_N@Z                                                                                       : 1404012   : +20.71%    : 0.13%  : +0.0050%
??$buildIntervals@$0A@@LinearScan@@QEAAXXZ                                                                                                                  : 1347566   : +2.36%     : 0.12%  : +0.0048%
?emitEndCodeGen@emitter@@QEAAIPEAVCompiler@@_N11IPEAI2PEAPEAX33333@Z                                                                                        : -1232953  : -0.86%     : 0.11%  : -0.0044%
?BuildUse@LinearScan@@AEAAPEAVRefPosition@@PEAUGenTree@@_KH@Z                                                                                               : -1950613  : -0.94%     : 0.17%  : -0.0069%
?emitEncodeCallGCregs@emitter@@CAXUregMaskTP@@PEAUinstrDesc@1@@Z                                                                                            : -2006375  : -11.29%    : 0.18%  : -0.0071%
??$buildIntervals@$00@LinearScan@@QEAAXXZ                                                                                                                   : -2877829  : -3.81%     : 0.26%  : -0.0102%
??$processBlockEndAllocation@$00@LinearScan@@AEAAXPEAUBasicBlock@@@Z                                                                                        : -4108692  : -99.97%    : 0.37%  : -0.0146%
?getCodeGenerator@@YAPEAVCodeGenInterface@@PEAVCompiler@@@Z                                                                                                 : -4363311  : -100.00%   : 0.39%  : -0.0155%
?resetAllRegistersState@LinearScan@@AEAAXXZ                                                                                                                 : -6253521  : -11.16%    : 0.56%  : -0.0222%
?genCodeForTreeNode@CodeGen@@IEAAXPEAUGenTree@@@Z                                                                                                           : -10523135 : -4.06%     : 0.94%  : -0.0373%
?updateRegisterPreferences@Interval@@QEAAX_K@Z                                                                                                              : -14742851 : -100.00%   : 1.32%  : -0.0522%

@kunalspathak
Copy link
Member Author

yes, that's a good question. Let me see the numbers with latest round and debug from there.

as we saw offline, this is the fact that our linux builds are not link time optimization enabled and hence give the difference because of new methods like AddRegNumInMask(), etc. getting called. If I enable LTO for baseline/this PR (locally), I see may be 0.07% TP diff.

@kunalspathak
Copy link
Member Author

allocateRegistersMinimal

bulk of difference is coming from the fact that we are updating high field as well as the fact that we need to check the TYP_MASK before touching the low or high field. All of the code below runs in a for loop for RefPosition so no surprise that it increases the TP of such methods.

image

image

image

@kunalspathak
Copy link
Member Author

If I remove FORCEINLINE from selectMinimal as well as from try_* methods, they do not show up in the TP diffs much. So the overall cost of allocateRegistersMinimal is self contained in that method itself. Here is the only diff I can spot with selectMinimal no inline:

image

Base: 28258643176, Diff: 29277964749, +3.6071%

?allocateRegistersMinimal@LinearScan@@QEAAXXZ                                                                                                               : 182568724 : +23.14%    : 16.30% : +0.6461%
?mergeRegisterPreferences@Interval@@QEAAX_K@Z                                                                                                               : 116124486 : +19063.90% : 10.37% : +0.4109%
?processBlockStartLocations@LinearScan@@AEAAXPEAUBasicBlock@@@Z                                                                                             : 86667378  : +41.47%    : 7.74%  : +0.3067%
??$allocateRegisters@$0A@@LinearScan@@QEAAXXZ                                                                                                               : 77258469  : +20.03%    : 6.90%  : +0.2734%
?genConsumeReg@CodeGen@@IEAA?AW4_regNumber_enum@@PEAUGenTree@@@Z                                                                                            : 74002956  : +38.26%    : 6.61%  : +0.2619%
?freeRegisters@LinearScan@@AEAAXUregMaskTP@@@Z                                                                                                              : 72465408  : +55.77%    : 6.47%  : +0.2564%
?gcMarkRegPtrVal@GCInfo@@QEAAXW4_regNumber_enum@@W4var_types@@@Z                                                                                            : 65520075  : +190.07%   : 5.85%  : +0.2319%
?emitGCregDeadUpd@emitter@@QEAAXW4_regNumber_enum@@PEAE@Z                                                                                                   : 46105722  : +103.71%   : 4.12%  : +0.1632%
?newRefPosition@LinearScan@@AEAAPEAVRefPosition@@PEAVInterval@@IW4RefType@@PEAUGenTree@@_KI@Z                                                               : 40611925  : +10.51%    : 3.63%  : +0.1437%
?genCodeForBBlist@CodeGen@@IEAAXXZ                                                                                                                          : 30589663  : +14.30%    : 2.73%  : +0.1082%
?gtGetRegMask@GenTree@@QEBA?AUregMaskTP@@XZ                                                                                                                 : 25739940  : +49.48%    : 2.30%  : +0.0911%
??$select@$0A@@RegisterSelection@LinearScan@@QEAA_KPEAVInterval@@PEAVRefPosition@@@Z                                                                        : 23154175  : +4.08%     : 2.07%  : +0.0819%
?updateAssignedInterval@LinearScan@@AEAAXPEAVRegRecord@@PEAVInterval@@@Z                                                                                    : 21170913  : +19.62%    : 1.89%  : +0.0749%
?genCodeForIndir@CodeGen@@IEAAXPEAUGenTreeIndir@@@Z                                                                                                         : 18902539  : NA         : 1.69%  : +0.0669%
?assignPhysReg@LinearScan@@AEAAXPEAVRegRecord@@PEAVInterval@@@Z                                                                                             : 18616711  : +21.15%    : 1.66%  : +0.0659%
?BuildNode@LinearScan@@AEAAHPEAUGenTree@@@Z                                                                                                                 : 15926110  : +6.54%     : 1.42%  : +0.0564%
?associateRefPosWithInterval@LinearScan@@AEAAXPEAVRefPosition@@@Z                                                                                           : 13404784  : +6.49%     : 1.20%  : +0.0474%
?emitUpdateLiveGCregs@emitter@@QEAAXW4GCtype@@UregMaskTP@@PEAE@Z                                                                                            : 12141813  : NA         : 1.08%  : +0.0430%
?selectMinimal@RegisterSelection@LinearScan@@QEAA_KPEAVInterval@@PEAVRefPosition@@@Z                                                                        : 12134074  : +4.71%     : 1.08%  : +0.0429%
?buildPhysRegRecords@LinearScan@@AEAAXXZ                                                                                                                    : 10030600  : +15.82%    : 0.90%  : +0.0355%
?buildKillPositionsForNode@LinearScan@@AEAA_NPEAUGenTree@@IUregMaskTP@@@Z                                                                                   : 7948467   : +26.95%    : 0.71%  : +0.0281%
?genUpdateRegLife@CodeGenInterface@@QEAAXPEBVLclVarDsc@@_N1@Z                                                                                               : 7850748   : +238.74%   : 0.70%  : +0.0278%
?unassignPhysReg@LinearScan@@AEAAXPEAVRegRecord@@PEAVRefPosition@@@Z                                                                                        : 7525635   : +6.50%     : 0.67%  : +0.0266%
?genProduceReg@CodeGen@@IEAAXPEAUGenTree@@@Z                                                                                                                : 7359582   : +4.72%     : 0.66%  : +0.0260%
?emitGCregLiveUpd@emitter@@QEAAXW4GCtype@@W4_regNumber_enum@@PEAE@Z                                                                                         : 7229355   : +26.09%    : 0.65%  : +0.0256%
??$resolveRegisters@$0A@@LinearScan@@QEAAXXZ                                                                                                                : 6133697   : +3.32%     : 0.55%  : +0.0217%
?allocateRegMinimal@LinearScan@@AEAA?AW4_regNumber_enum@@PEAVInterval@@PEAVRefPosition@@@Z                                                                  : 5615586   : +7.80%     : 0.50%  : +0.0199%
?emitIns_Call@emitter@@QEAAXW4EmitCallType@1@PEAUCORINFO_METHOD_STRUCT_@@PEAX_JW4emitAttr@@4AEBQEA_KUregMaskTP@@6AEBVDebugInfo@@W4_regNumber_enum@@8I3_N9@Z : 5219317   : +11.53%    : 0.47%  : +0.0185%
??0CodeGen@@QEAA@PEAVCompiler@@@Z                                                                                                                           : 4814688   : NA         : 0.43%  : +0.0170%
?genSetRegToConst@CodeGen@@IEAAXW4_regNumber_enum@@W4var_types@@PEAUGenTree@@@Z                                                                             : 3313331   : +22.93%    : 0.30%  : +0.0117%
??$compChangeLife@$00@Compiler@@QEAAXAEBQEA_K@Z                                                                                                             : 3312382   : +25.51%    : 0.30%  : +0.0117%
??0LinearScan@@QEAA@PEAVCompiler@@@Z                                                                                                                        : 3310098   : +10.63%    : 0.30%  : +0.0117%
??$resolveRegisters@$00@LinearScan@@QEAAXXZ                                                                                                                 : 3214729   : +3.26%     : 0.29%  : +0.0114%
?instGen_Set_Reg_To_Imm@CodeGen@@QEAAXW4emitAttr@@W4_regNumber_enum@@_JW4insFlags@@@Z                                                                       : 2852398   : +5.24%     : 0.25%  : +0.0101%
?emitOutputCall@emitter@@QEAAIPEAUinsGroup@@PEAEPEAUinstrDesc@1@I@Z                                                                                         : 2041454   : +5.22%     : 0.18%  : +0.0072%
?BuildCall@LinearScan@@AEAAHPEAUGenTreeCall@@@Z                                                                                                             : 1996388   : +3.67%     : 0.18%  : +0.0071%
?genEmitCall@CodeGen@@IEAAXHPEAUCORINFO_METHOD_STRUCT_@@PEAXW4emitAttr@@2AEBVDebugInfo@@W4_regNumber_enum@@_N5@Z                                            : 1749883   : +28.61%    : 0.16%  : +0.0062%
?resolveLocalRef@LinearScan@@AEAAXPEAUBasicBlock@@PEAUGenTreeLclVar@@PEAVRefPosition@@@Z                                                                    : 1747318   : +2.74%     : 0.16%  : +0.0062%
?genRestoreCalleeSavedRegistersHelp@CodeGen@@IEAAXUregMaskTP@@HH@Z                                                                                          : 1631108   : +50.21%    : 0.15%  : +0.0058%
?addKillForRegs@LinearScan@@AEAAXUregMaskTP@@I@Z                                                                                                            : 1589200   : +12.20%    : 0.14%  : +0.0056%
?BuildKills@LinearScan@@AEAAXPEAUGenTree@@UregMaskTP@@@Z                                                                                                    : 1420222   : +10852.99% : 0.13%  : +0.0050%
?genBuildRegPairsStack@CodeGen@@KAXUregMaskTP@@PEAV?$ArrayStack@URegPair@CodeGen@@@@@Z                                                                      : 1419987   : +28.03%    : 0.13%  : +0.0050%
?genPushCalleeSavedRegisters@CodeGen@@IEAAXW4_regNumber_enum@@PEA_N@Z                                                                                       : 1404012   : +20.71%    : 0.13%  : +0.0050%
??$buildIntervals@$0A@@LinearScan@@QEAAXXZ                                                                                                                  : 1347566   : +2.36%     : 0.12%  : +0.0048%
?emitEndCodeGen@emitter@@QEAAIPEAVCompiler@@_N11IPEAI2PEAPEAX33333@Z                                                                                        : -1232953  : -0.86%     : 0.11%  : -0.0044%
?BuildUse@LinearScan@@AEAAPEAVRefPosition@@PEAUGenTree@@_KH@Z                                                                                               : -1950613  : -0.94%     : 0.17%  : -0.0069%
?emitEncodeCallGCregs@emitter@@CAXUregMaskTP@@PEAUinstrDesc@1@@Z                                                                                            : -2006375  : -11.29%    : 0.18%  : -0.0071%
??$buildIntervals@$00@LinearScan@@QEAAXXZ                                                                                                                   : -2881021  : -3.81%     : 0.26%  : -0.0102%
??$processBlockEndAllocation@$00@LinearScan@@AEAAXPEAUBasicBlock@@@Z                                                                                        : -4108692  : -99.97%    : 0.37%  : -0.0145%
?getCodeGenerator@@YAPEAVCodeGenInterface@@PEAVCompiler@@@Z                                                                                                 : -4363311  : -100.00%   : 0.39%  : -0.0154%
?resetAllRegistersState@LinearScan@@AEAAXXZ                                                                                                                 : -6253521  : -11.16%    : 0.56%  : -0.0221%
?genCodeForTreeNode@CodeGen@@IEAAXPEAUGenTree@@@Z                                                                                                           : -10523135 : -4.06%     : 0.94%  : -0.0372%
?updateRegisterPreferences@Interval@@QEAAX_K@Z                                                                                                              : -14742851 : -100.00%   : 1.32%  : -0.0522%

@kunalspathak
Copy link
Member Author

our linux builds are not link time optimization enabled

#103058 should fix that. Let me know if you have more comments or is this good to go.

Comment on lines 853 to +864
#if defined(TARGET_XARCH)
killMask &= ~(RBM_FLT_CALLEE_TRASH | RBM_MSK_CALLEE_TRASH);

#ifdef TARGET_AMD64
killMask.RemoveRegsetForType(RBM_FLT_CALLEE_TRASH.getLow(), FloatRegisterType);
killMask.RemoveRegsetForType(RBM_MSK_CALLEE_TRASH.getLow(), MaskRegisterType);
#else
killMask &= ~RBM_FLT_CALLEE_TRASH;
killMask.RemoveRegsetForType(RBM_FLT_CALLEE_TRASH, FloatRegisterType);
killMask &= ~RBM_MSK_CALLEE_TRASH;
#endif // TARGET_AMD64

#else
killMask.RemoveRegsetForType(RBM_FLT_CALLEE_TRASH, FloatRegisterType);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this will change back to what it was before once we change RBM_* to be regMaskTP?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to block on it, but I guess it would be nice to make the RBM_* change first, since it seems a lot of the changes of this PR end up having to deal with the fact that RBM_* aren't regMaskTP.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, there are few places that we have to do, but at this point I think I will go ahead and merge this PR as-is and probably start working on RBM_ change next.

Copy link
Member

@jakobbotsch jakobbotsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. We should open an issue about doing a more detailed investigation of where the throughput cost inside allocateRegistersMinimal comes from since at this point we do not have a good understanding of that.

@kunalspathak
Copy link
Member Author

LGTM. We should open an issue about doing a more detailed investigation of where the throughput cost inside allocateRegistersMinimal comes from since at this point we do not have a good understanding of that.

Opened #103079

@kunalspathak kunalspathak merged commit afc2df0 into dotnet:main Jun 5, 2024
107 checks passed
@kunalspathak kunalspathak deleted the part4 branch June 5, 2024 16:55
@github-actions github-actions bot locked and limited conversation to collaborators Jul 6, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants