[JitDiff X64] [Daniel-Svensson] Some System.Decimal performance improvements #528

Open
MihuBot opened this issue Jul 15, 2024 · 3 comments

MihuBot commented Jul 15, 2024

Job completed in 14 minutes.
dotnet/runtime#99212

Diffs

Found 260 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 39178912
Total bytes of diff: 39178748
Total bytes of delta: -164 (-0.00 % of base)
Total relative delta: -1.62
    diff is an improvement.
    relative diff is an improvement.


Total byte diff includes 179 bytes from reconciling methods
	Base had    2 unique methods,      139 unique bytes
	Diff had    7 unique methods,      318 unique bytes
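
The summary figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this report. The variable names are illustrative, and the formulas are an assumption about how the totals relate to each other, not a copy of the diff tool's implementation.

```python
# Figures copied from the summary above.
base_bytes = 39_178_912
diff_bytes = 39_178_748

# A negative delta means the diff build produced less code than base.
delta = diff_bytes - base_bytes
delta_pct = 100.0 * delta / base_bytes

# "Reconciling" methods exist on only one side of the comparison
# (assumed meaning: present only in base, or only in diff).
base_unique_bytes = 139  # 2 methods found only in base
diff_unique_bytes = 318  # 7 methods found only in diff
reconciling_bytes = diff_unique_bytes - base_unique_bytes

print(f"delta = {delta} bytes ({delta_pct:.2f} % of base)")
print(f"reconciling bytes = {reconciling_bytes}")
```

The percentage is about -0.0004 %, which is why the report shows it as -0.00 % after rounding to two decimals.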

Top file improvements (bytes):
        -164 : System.Private.CoreLib.dasm (-0.00 % of base)

1 total files with Code Size differences (1 improved, 0 regressed), 258 unchanged.

Top method regressions (bytes):
          95 (5.21 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:VarDecDiv(byref,byref) (FullOpts)
          94 (Infinity of base) : System.Private.CoreLib.dasm - System.Math:<BigMul>g__SoftwareFallback|49_0(ulong,ulong,byref):ulong (FullOpts) (0 base, 1 diff methods)
          45 (Infinity of base) : System.Private.CoreLib.dasm - System.Math:<CopySign>g__SoftwareFallback|55_0(double,double):double (FullOpts) (0 base, 1 diff methods)
          45 (Infinity of base) : System.Private.CoreLib.dasm - System.Math:BigMul(uint,ulong,byref):ulong (FullOpts) (0 base, 1 diff methods)
          45 (Infinity of base) : System.Private.CoreLib.dasm - System.Math:BigMul(ulong,uint,byref):ulong (FullOpts) (0 base, 1 diff methods)
          44 (Infinity of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:IncreaseScale(byref,uint) (FullOpts) (0 base, 1 diff methods)
          28 (Infinity of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:Div64By32(ulong,uint):System.ValueTuple`2[uint,uint] (FullOpts) (0 base, 1 diff methods)
          17 (Infinity of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:Div128By64(ulong,ulong):ulong (FullOpts) (0 base, 1 diff methods)
           4 (13.79 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:IncreaseScale64(byref,uint) (FullOpts)

Top method improvements (bytes):
        -158 (-19.73 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:VarDecMul(byref,byref) (FullOpts)
        -142 (-88.75 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:Div96By64(byref,ulong):uint (FullOpts)
         -94 (-100.00 % of base) : System.Private.CoreLib.dasm - System.Math:<BigMul>g__SoftwareFallback|47_0(ulong,ulong,byref):ulong (FullOpts) (1 base, 0 diff methods)
         -45 (-100.00 % of base) : System.Private.CoreLib.dasm - System.Math:<CopySign>g__SoftwareFallback|53_0(double,double):double (FullOpts) (1 base, 0 diff methods)
         -40 (-39.22 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:Div96By32(byref,uint):uint (FullOpts)
         -26 (-11.21 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:VarDecCmpSub(byref,byref):int (FullOpts)
         -23 (-3.58 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:VarDecFromR4(float,byref) (FullOpts)
         -21 (-1.63 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:DecAddSub(byref,byref,ubyte) (FullOpts)
         -12 (-7.23 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:Div128By96(byref,byref):uint (FullOpts)
          -9 (-4.07 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:VarCyFromDec(byref):long (FullOpts)
          -9 (-1.46 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:VarDecMod(byref,byref) (FullOpts)
          -2 (-4.08 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:IncreaseScale(byref,uint):uint (FullOpts)

Top method regressions (percentages):
          17 (Infinity of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:Div128By64(ulong,ulong):ulong (FullOpts) (0 base, 1 diff methods)
          28 (Infinity of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:Div64By32(ulong,uint):System.ValueTuple`2[uint,uint] (FullOpts) (0 base, 1 diff methods)
          44 (Infinity of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:IncreaseScale(byref,uint) (FullOpts) (0 base, 1 diff methods)
          94 (Infinity of base) : System.Private.CoreLib.dasm - System.Math:<BigMul>g__SoftwareFallback|49_0(ulong,ulong,byref):ulong (FullOpts) (0 base, 1 diff methods)
          45 (Infinity of base) : System.Private.CoreLib.dasm - System.Math:<CopySign>g__SoftwareFallback|55_0(double,double):double (FullOpts) (0 base, 1 diff methods)
          45 (Infinity of base) : System.Private.CoreLib.dasm - System.Math:BigMul(uint,ulong,byref):ulong (FullOpts) (0 base, 1 diff methods)
          45 (Infinity of base) : System.Private.CoreLib.dasm - System.Math:BigMul(ulong,uint,byref):ulong (FullOpts) (0 base, 1 diff methods)
           4 (13.79 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:IncreaseScale64(byref,uint) (FullOpts)
          95 (5.21 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:VarDecDiv(byref,byref) (FullOpts)

Top method improvements (percentages):
         -94 (-100.00 % of base) : System.Private.CoreLib.dasm - System.Math:<BigMul>g__SoftwareFallback|47_0(ulong,ulong,byref):ulong (FullOpts) (1 base, 0 diff methods)
         -45 (-100.00 % of base) : System.Private.CoreLib.dasm - System.Math:<CopySign>g__SoftwareFallback|53_0(double,double):double (FullOpts) (1 base, 0 diff methods)
        -142 (-88.75 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:Div96By64(byref,ulong):uint (FullOpts)
         -40 (-39.22 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:Div96By32(byref,uint):uint (FullOpts)
        -158 (-19.73 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:VarDecMul(byref,byref) (FullOpts)
         -26 (-11.21 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:VarDecCmpSub(byref,byref):int (FullOpts)
         -12 (-7.23 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:Div128By96(byref,byref):uint (FullOpts)
          -2 (-4.08 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:IncreaseScale(byref,uint):uint (FullOpts)
          -9 (-4.07 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:VarCyFromDec(byref):long (FullOpts)
         -23 (-3.58 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:VarDecFromR4(float,byref) (FullOpts)
         -21 (-1.63 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:DecAddSub(byref,byref,ubyte) (FullOpts)
          -9 (-1.46 % of base) : System.Private.CoreLib.dasm - System.Decimal+DecCalc:VarDecMod(byref,byref) (FullOpts)

21 total methods with Code Size differences (12 improved, 9 regressed), 230455 unchanged.
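
The per-method lists above can be reconciled with the summary totals. The following sketch re-enters the byte deltas and percentages from the two "bytes" lists and sums them. Entries marked `None` are the methods the report flags as existing on one side only ("0 base, 1 diff" or "1 base, 0 diff"). Reading "Total relative delta" as the sum of the per-method percentages for methods present on both sides is an inference from the numbers, not something the report states.

```python
# (byte delta, percent of base) per method, copied from the lists above.
# None marks a method that exists only in base or only in diff.
regressions = [
    (95, 5.21), (94, None), (45, None), (45, None), (45, None),
    (44, None), (28, None), (17, None), (4, 13.79),
]
improvements = [
    (-158, -19.73), (-142, -88.75), (-94, None), (-45, None),
    (-40, -39.22), (-26, -11.21), (-23, -3.58), (-21, -1.63),
    (-12, -7.23), (-9, -4.07), (-9, -1.46), (-2, -4.08),
]
all_methods = regressions + improvements

# Net code size change across every listed method.
total_delta = sum(d for d, _ in all_methods)

# Bytes belonging to methods found on one side only.
diff_only_bytes = sum(d for d, p in regressions if p is None)
base_only_bytes = -sum(d for d, p in improvements if p is None)

# Sum of relative deltas for methods present in both base and diff.
relative_delta = sum(p for _, p in all_methods if p is not None) / 100.0

print(total_delta, diff_only_bytes, base_only_bytes, round(relative_delta, 2))
```

Under these assumptions the sums agree with the summary: a net delta of -164 bytes, 318 diff-only and 139 base-only bytes, and a relative delta of about -1.62. The `SoftwareFallback` entries appear on both lists with equal and opposite sizes because only the compiler-generated suffix in their names changed (`|47_0` to `|49_0`, `|53_0` to `|55_0`).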

--------------------------------------------------------------------------------

MihuBot commented Jul 15, 2024

Top method regressions

95 (5.21 % of base) - System.Decimal+DecCalc:VarDecDiv(byref,byref)
 ; Assembly listing for method System.Decimal+DecCalc:VarDecDiv(byref,byref) (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX512 - Unix
 ; FullOpts code
 ; optimized code
 ; rbp based frame
 ; fully interruptible
 ; No PGO data
-; 0 inlinees with PGO data; 54 single block inlinees; 11 inlinees without PGO data
+; 0 inlinees with PGO data; 74 single block inlinees; 18 inlinees without PGO data
 ; Final local variable assignments
 ;
-;  V00 arg0         [V00,T12] ( 15, 10   )   byref  ->  rbx         single-def
-;  V01 arg1         [V01,T19] ( 13,  9.50)   byref  ->  r15         single-def
-;  V02 loc0         [V02    ] ( 41, 65.50)  struct (16) [rbp-0x38]  do-not-enreg[XSF] addr-exposed ld-addr-op <System.Decimal+DecCalc+Buf12>
-;  V03 loc1         [V03,T02] ( 10, 40   )     int  ->  [rbp-0x3C] 
-;  V04 loc2         [V04,T01] ( 22, 56   )     int  ->  registers  
+;  V00 arg0         [V00,T17] ( 15, 10   )   byref  ->  rbx         single-def
+;  V01 arg1         [V01,T26] ( 13,  9.50)   byref  ->  r15         single-def
+;  V02 loc0         [V02    ] ( 55, 93.50)  struct (16) [rbp-0x38]  do-not-enreg[XSF] addr-exposed ld-addr-op <System.Decimal+DecCalc+Buf12>
+;  V03 loc1         [V03,T02] (  8, 32   )     int  ->  [rbp-0x3C] 
+;  V04 loc2         [V04,T01] ( 22, 56   )     int  ->   r8        
 ;  V05 loc3         [V05,T00] ( 40, 70.50)     int  ->  r14         ld-addr-op
-;  V06 loc4         [V06,T21] (  5,  8   )   ubyte  ->  r13        
+;  V06 loc4         [V06,T29] (  5,  8   )   ubyte  ->  r13        
 ;  V07 loc5         [V07,T03] ( 16, 29   )     int  ->  rcx        
-;  V08 loc6         [V08,T20] (  7, 10.50)     int  ->  r12        
-;  V09 loc7         [V09,T11] (  7, 14   )     int  ->  [rbp-0x40] 
-;  V10 loc8         [V10,T10] (  4, 16   )     int  ->  rax        
+;  V08 loc6         [V08,T28] (  9,  8   )     int  ->  r12        
+;  V09 loc7         [V09,T16] (  6, 13.50)     int  ->  [rbp-0x40] 
+;* V10 loc8         [V10    ] (  0,  0   )     int  ->  zero-ref   
 ;* V11 loc9         [V11    ] (  0,  0   )  struct (16) zero-ref    ld-addr-op <System.ReadOnlySpan`1[uint]>
-;  V12 loc10        [V12    ] ( 30, 53.50)  struct (16) [rbp-0x50]  do-not-enreg[XSF] addr-exposed ld-addr-op <System.Decimal+DecCalc+Buf16>
-;  V13 loc11        [V13,T25] (  7,  7   )    long  ->  r12        
-;  V14 loc12        [V14,T36] (  3,  1.50)    long  ->  rax        
+;  V12 loc10        [V12    ] ( 32, 58   )  struct (16) [rbp-0x50]  do-not-enreg[XSF] addr-exposed ld-addr-op <System.Decimal+DecCalc+Buf16>
+;  V13 loc11        [V13,T37] (  6,  6.50)    long  ->  r12        
+;  V14 loc12        [V14,T49] (  3,  1.50)    long  ->   r8        
 ;  V15 loc13        [V15    ] (  8,  7.50)  struct (16) [rbp-0x60]  do-not-enreg[XSF] addr-exposed ld-addr-op <System.Decimal+DecCalc+Buf12>
 ;  V16 loc14        [V16,T08] ( 18, 23   )     int  ->  rcx         ld-addr-op
 ;  V17 loc15        [V17,T09] ( 15, 18   )    long  ->  rdi         ld-addr-op
-;  V18 loc16        [V18,T37] (  3,  1.50)    long  ->  rdi        
-;  V19 loc17        [V19,T38] (  3,  1.50)     int  ->  rdi        
+;  V18 loc16        [V18,T50] (  3,  1.50)    long  ->  rdi        
+;  V19 loc17        [V19,T51] (  3,  1.50)     int  ->  rdi        
 ;# V20 OutArgs      [V20    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
 ;* V21 tmp1         [V21    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
 ;* V22 tmp2         [V22    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
 ;* V23 tmp3         [V23    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
 ;* V24 tmp4         [V24    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
 ;* V25 tmp5         [V25    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
-;  V26 tmp6         [V26,T22] (  2,  8   )     int  ->  rax         "dup spill"
+;  V26 tmp6         [V26,T30] (  2,  8   )     int  ->   r8         "dup spill"
 ;* V27 tmp7         [V27    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
 ;* V28 tmp8         [V28    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
-;  V29 tmp9         [V29,T32] (  3,  3   )    long  ->  rax         "dup spill"
-;  V30 tmp10        [V30,T23] (  2,  8   )     int  ->  rax         "dup spill"
+;  V29 tmp9         [V29,T47] (  3,  3   )    long  ->   r8         "dup spill"
+;  V30 tmp10        [V30,T31] (  2,  8   )     int  ->   r8         "dup spill"
 ;* V31 tmp11        [V31    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;  V32 tmp12        [V32,T24] (  2,  8   )     int  ->  rax         "dup spill"
-;  V33 tmp13        [V33,T04] (  3, 24   )    long  ->  rdi         "dup spill"
-;  V34 tmp14        [V34,T46] (  3,  0   )     ref  ->  rbx         class-hnd exact single-def "NewObj constructor temp" <System.DivideByZeroException>
+;  V32 tmp12        [V32,T32] (  2,  8   )     int  ->   r8         "dup spill"
+;* V33 tmp13        [V33    ] (  0,  0   )  struct ( 8) zero-ref    multireg-ret "dup spill" <System.ValueTuple`2[uint,uint]>
+;  V34 tmp14        [V34,T61] (  3,  0   )     ref  ->  rbx         class-hnd exact single-def "NewObj constructor temp" <System.DivideByZeroException>
 ;* V35 tmp15        [V35    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
-;  V36 tmp16        [V36,T26] (  3,  6   )     int  ->  rcx         "Inline return value spill temp"
-;  V37 tmp17        [V37,T13] (  3, 12   )     int  ->  rcx         "Inlining Arg"
-;* V38 tmp18        [V38    ] (  0,  0   )  struct (16) zero-ref    "ReadOnlySpan<T> for CreateSpan<T>" <System.ReadOnlySpan`1[uint]>
-;* V39 tmp19        [V39    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
-;  V40 tmp20        [V40,T05] (  3, 24   )    long  ->  rdi         "dup spill"
-;* V41 tmp21        [V41    ] (  0,  0   )    long  ->  zero-ref    "Inline stloc first use temp"
-;  V42 tmp22        [V42,T27] (  3,  6   )     int  ->  rdx         "Inline stloc first use temp"
-;* V43 tmp23        [V43    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
-;  V44 tmp24        [V44,T33] (  2,  2   )    long  ->  rdi         "Inlining Arg"
-;  V45 tmp25        [V45,T28] (  3,  6   )     int  ->  rax         "Inline return value spill temp"
-;  V46 tmp26        [V46,T14] (  3, 12   )     int  ->  rax         "Inlining Arg"
-;* V47 tmp27        [V47    ] (  0,  0   )  struct (16) zero-ref    "ReadOnlySpan<T> for CreateSpan<T>" <System.ReadOnlySpan`1[uint]>
-;  V48 tmp28        [V48,T15] (  3, 12   )    long  ->  rdi         "Inline stloc first use temp"
-;* V49 tmp29        [V49    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V50 tmp30        [V50    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V51 tmp31        [V51    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
-;  V52 tmp32        [V52,T06] (  3, 24   )    long  ->  rdx         "dup spill"
-;* V53 tmp33        [V53    ] (  0,  0   )    long  ->  zero-ref    "Inline stloc first use temp"
-;  V54 tmp34        [V54,T29] (  3,  6   )     int  ->  rdx         "Inline stloc first use temp"
-;* V55 tmp35        [V55    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
-;  V56 tmp36        [V56,T30] (  3,  6   )     int  ->  rax         "Inline return value spill temp"
-;  V57 tmp37        [V57,T16] (  3, 12   )     int  ->  rax         "Inlining Arg"
-;* V58 tmp38        [V58    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
-;* V59 tmp39        [V59    ] (  0,  0   )  struct (16) zero-ref    "ReadOnlySpan<T> for CreateSpan<T>" <System.ReadOnlySpan`1[uint]>
-;* V60 tmp40        [V60    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
-;  V61 tmp41        [V61,T07] (  3, 24   )    long  ->  rdx         "dup spill"
-;* V62 tmp42        [V62    ] (  0,  0   )    long  ->  zero-ref    "Inline stloc first use temp"
-;  V63 tmp43        [V63,T31] (  3,  6   )     int  ->  rdx         "Inline stloc first use temp"
-;* V64 tmp44        [V64    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
-;  V65 tmp45        [V65,T17] (  3, 12   )    long  ->  rsi         "Inline stloc first use temp"
-;  V66 tmp46        [V66,T18] (  3, 12   )     int  ->  rdx         "Inline stloc first use temp"
-;* V67 tmp47        [V67    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
-;  V68 tmp48        [V68,T39] (  3,  1.50)    long  ->  rsi         "Inline stloc first use temp"
-;  V69 tmp49        [V69,T40] (  3,  1.50)     int  ->  rdx         "Inline stloc first use temp"
-;* V70 tmp50        [V70    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
-;  V71 tmp51        [V71,T41] (  3,  1.50)    long  ->  rsi         "Inline stloc first use temp"
-;  V72 tmp52        [V72,T42] (  3,  1.50)     int  ->  rdx         "Inline stloc first use temp"
-;* V73 tmp53        [V73    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
-;  V74 tmp54        [V74,T43] (  3,  1.50)    long  ->  rsi         "Inline stloc first use temp"
-;  V75 tmp55        [V75,T44] (  3,  1.50)     int  ->  rdx         "Inline stloc first use temp"
-;* V76 tmp56        [V76    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V77 tmp57        [V77    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V78 tmp58        [V78    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
+;* V36 tmp16        [V36    ] (  0,  0   )     int  ->  zero-ref    "Inline return value spill temp"
+;  V37 tmp17        [V37,T44] (  6,  3   )     int  ->  rdx         "Inline stloc first use temp"
+;* V38 tmp18        [V38    ] (  0,  0   )  struct ( 8) zero-ref    do-not-enreg[SR] multireg-ret "Return value temp for multireg return" <System.ValueTuple`2[uint,uint]>
+;* V39 tmp19        [V39    ] (  0,  0   )  struct ( 8) zero-ref    multireg-ret "Inline stloc first use temp" <System.ValueTuple`2[uint,uint]>
+;* V40 tmp20        [V40    ] (  0,  0   )  struct ( 8) zero-ref    do-not-enreg[SR] multireg-ret "Return value temp for multireg return" <System.ValueTuple`2[uint,uint]>
+;* V41 tmp21        [V41    ] (  0,  0   )  struct ( 8) zero-ref    do-not-enreg[SR] multireg-ret "Return value temp for multireg return" <System.ValueTuple`2[uint,uint]>
+;  V42 tmp22        [V42,T38] (  3,  6   )     int  ->   r8         "Inline return value spill temp"
+;  V43 tmp23        [V43,T18] (  3, 12   )     int  ->   r8         "Inlining Arg"
+;* V44 tmp24        [V44    ] (  0,  0   )  struct (16) zero-ref    "ReadOnlySpan<T> for CreateSpan<T>" <System.ReadOnlySpan`1[uint]>
+;  V45 tmp25        [V45,T33] (  2,  8   )    long  ->  rsi         ld-addr-op "Inline ldloca(s) first use temp"
+;  V46 tmp26        [V46,T19] (  3, 12   )    long  ->  rdx         "Inline stloc first use temp"
+;* V47 tmp27        [V47    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V48 tmp28        [V48    ] (  0,  0   )    long  ->  zero-ref    "Inline return value spill temp"
+;* V49 tmp29        [V49    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;  V50 tmp30        [V50    ] (  2,  8   )    long  ->  [rbp-0x68]  do-not-enreg[X] addr-exposed ld-addr-op "Inline ldloca(s) first use temp"
+;  V51 tmp31        [V51,T12] (  2, 16   )    long  ->  rdx         "impAppendStmt"
+;* V52 tmp32        [V52    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V53 tmp33        [V53    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
+;* V54 tmp34        [V54    ] (  0,  0   )  struct ( 8) zero-ref    do-not-enreg[SBR] multireg-ret "Inline return value spill temp" <System.ValueTuple`2[uint,uint]>
+;  V55 tmp35        [V55,T04] (  3, 24   )    long  ->  rax         "Inlining Arg"
+;* V56 tmp36        [V56    ] (  0,  0   )  struct ( 8) zero-ref    do-not-enreg[SR] multireg-ret "Return value temp for multireg return" <System.ValueTuple`2[uint,uint]>
+;* V57 tmp37        [V57    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
+;  V58 tmp38        [V58,T05] (  3, 24   )    long  ->  rsi         "dup spill"
+;* V59 tmp39        [V59    ] (  0,  0   )    long  ->  zero-ref    "Inline stloc first use temp"
+;  V60 tmp40        [V60,T39] (  3,  6   )     int  ->  rdi         "Inline stloc first use temp"
+;* V61 tmp41        [V61    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;  V62 tmp42        [V62,T48] (  2,  2   )    long  ->  rax         "Inlining Arg"
+;* V63 tmp43        [V63    ] (  0,  0   )    long  ->  zero-ref    "Inline return value spill temp"
+;* V64 tmp44        [V64    ] (  0,  0   )   byref  ->  zero-ref    "Inline stloc first use temp"
+;* V65 tmp45        [V65    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
+;* V66 tmp46        [V66    ] (  0,  0   )  struct (16) zero-ref    do-not-enreg[SBR] multireg-ret "Return value temp for multireg return" <System.ValueTuple`2[ulong,ulong]>
+;* V67 tmp47        [V67    ] (  0,  0   )  struct (16) zero-ref    multireg-ret "Inline stloc first use temp" <System.ValueTuple`2[ulong,ulong]>
+;* V68 tmp48        [V68    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;  V69 tmp49        [V69,T40] (  3,  6   )     int  ->   r8         "Inline return value spill temp"
+;  V70 tmp50        [V70,T20] (  3, 12   )     int  ->   r8         "Inlining Arg"
+;* V71 tmp51        [V71    ] (  0,  0   )  struct (16) zero-ref    "ReadOnlySpan<T> for CreateSpan<T>" <System.ReadOnlySpan`1[uint]>
+;  V72 tmp52        [V72,T34] (  2,  8   )    long  ->  rsi         ld-addr-op "Inline ldloca(s) first use temp"
+;  V73 tmp53        [V73,T21] (  3, 12   )    long  ->  rdx         "Inline stloc first use temp"
+;* V74 tmp54        [V74    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V75 tmp55        [V75    ] (  0,  0   )    long  ->  zero-ref    "Inline return value spill temp"
+;* V76 tmp56        [V76    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;  V77 tmp57        [V77    ] (  2,  8   )    long  ->  [rbp-0x70]  do-not-enreg[X] addr-exposed ld-addr-op "Inline ldloca(s) first use temp"
+;  V78 tmp58        [V78,T13] (  2, 16   )    long  ->  rdx         "impAppendStmt"
 ;* V79 tmp59        [V79    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
 ;* V80 tmp60        [V80    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V81 tmp61        [V81    ] (  0,  0   )   byref  ->  zero-ref    "field V11._reference (fldOffset=0x0)" P-INDEP
-;* V82 tmp62        [V82    ] (  0,  0   )     int  ->  zero-ref    "field V11._length (fldOffset=0x8)" P-INDEP
-;* V83 tmp63        [V83    ] (  0,  0   )   byref  ->  zero-ref    "field V38._reference (fldOffset=0x0)" P-INDEP
-;* V84 tmp64        [V84    ] (  0,  0   )     int  ->  zero-ref    "field V38._length (fldOffset=0x8)" P-INDEP
-;* V85 tmp65        [V85    ] (  0,  0   )   byref  ->  zero-ref    "field V47._reference (fldOffset=0x0)" P-INDEP
-;* V86 tmp66        [V86    ] (  0,  0   )     int  ->  zero-ref    "field V47._length (fldOffset=0x8)" P-INDEP
-;* V87 tmp67        [V87    ] (  0,  0   )   byref  ->  zero-ref    "field V59._reference (fldOffset=0x0)" P-INDEP
-;* V88 tmp68        [V88    ] (  0,  0   )     int  ->  zero-ref    "field V59._length (fldOffset=0x8)" P-INDEP
-;  V89 tmp69        [V89,T34] (  2,  2   )    long  ->  rdi         "Cast away GC"
-;  V90 tmp70        [V90,T35] (  2,  2   )    long  ->  rdi         "argument with side effect"
-;  V91 cse0         [V91,T45] (  3,  1.50)     int  ->  rcx         "CSE #11: conservative"
+;  V81 tmp61        [V81,T35] (  2,  8   )    long  ->  rsi         ld-addr-op "Inline ldloca(s) first use temp"
+;* V82 tmp62        [V82    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V83 tmp63        [V83    ] (  0,  0   )    long  ->  zero-ref    "Inline return value spill temp"
+;* V84 tmp64        [V84    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;  V85 tmp65        [V85    ] (  2,  8   )    long  ->  [rbp-0x78]  do-not-enreg[X] addr-exposed ld-addr-op "Inline ldloca(s) first use temp"
+;  V86 tmp66        [V86,T14] (  2, 16   )    long  ->  rdi         "impAppendStmt"
+;* V87 tmp67        [V87    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V88 tmp68        [V88    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
+;  V89 tmp69        [V89,T06] (  3, 24   )    long  ->  rdx         "dup spill"
+;* V90 tmp70        [V90    ] (  0,  0   )    long  ->  zero-ref    "Inline stloc first use temp"
+;  V91 tmp71        [V91,T41] (  3,  6   )     int  ->  rdx         "Inline stloc first use temp"
+;* V92 tmp72        [V92    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;  V93 tmp73        [V93,T42] (  3,  6   )     int  ->   r8         "Inline return value spill temp"
+;  V94 tmp74        [V94,T22] (  3, 12   )     int  ->   r8         "Inlining Arg"
+;* V95 tmp75        [V95    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V96 tmp76        [V96    ] (  0,  0   )  struct (16) zero-ref    "ReadOnlySpan<T> for CreateSpan<T>" <System.ReadOnlySpan`1[uint]>
+;  V97 tmp77        [V97,T36] (  2,  8   )    long  ->  rsi         ld-addr-op "Inline ldloca(s) first use temp"
+;* V98 tmp78        [V98    ] (  0,  0   )    long  ->  zero-ref    "Inline stloc first use temp"
+;* V99 tmp79        [V99    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V100 tmp80       [V100    ] (  0,  0   )    long  ->  zero-ref    "Inline return value spill temp"
+;* V101 tmp81       [V101    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;  V102 tmp82       [V102    ] (  2,  8   )    long  ->  [rbp-0x80]  do-not-enreg[X] addr-exposed ld-addr-op "Inline ldloca(s) first use temp"
+;  V103 tmp83       [V103,T15] (  2, 16   )    long  ->  rdi         "impAppendStmt"
+;* V104 tmp84       [V104    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V105 tmp85       [V105    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
+;* V106 tmp86       [V106    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V107 tmp87       [V107    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
+;  V108 tmp88       [V108,T07] (  3, 24   )    long  ->  rdx         "dup spill"
+;* V109 tmp89       [V109    ] (  0,  0   )    long  ->  zero-ref    "Inline stloc first use temp"
+;  V110 tmp90       [V110,T43] (  3,  6   )     int  ->  rdx         "Inline stloc first use temp"
+;* V111 tmp91       [V111    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
+;  V112 tmp92       [V112,T23] (  3, 12   )    long  ->  rsi         "Inline stloc first use temp"
+;  V113 tmp93       [V113,T24] (  3, 12   )     int  ->  rdx         "Inline stloc first use temp"
+;* V114 tmp94       [V114    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
+;  V115 tmp95       [V115,T52] (  3,  1.50)    long  ->  rsi         "Inline stloc first use temp"
+;  V116 tmp96       [V116,T53] (  3,  1.50)     int  ->  rdx         "Inline stloc first use temp"
+;* V117 tmp97       [V117    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
+;  V118 tmp98       [V118,T54] (  3,  1.50)    long  ->  rsi         "Inline stloc first use temp"
+;  V119 tmp99       [V119,T55] (  3,  1.50)     int  ->  rdx         "Inline stloc first use temp"
+;* V120 tmp100      [V120    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
+;  V121 tmp101      [V121,T56] (  3,  1.50)    long  ->  rsi         "Inline stloc first use temp"
+;  V122 tmp102      [V122,T57] (  3,  1.50)     int  ->  rdx         "Inline stloc first use temp"
+;* V123 tmp103      [V123    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
+;* V124 tmp104      [V124    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
+;* V125 tmp105      [V125    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
+;* V126 tmp106      [V126    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V127 tmp107      [V127    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
+;* V128 tmp108      [V128    ] (  0,  0   )   byref  ->  zero-ref    "field V11._reference (fldOffset=0x0)" P-INDEP
+;* V129 tmp109      [V129    ] (  0,  0   )     int  ->  zero-ref    "field V11._length (fldOffset=0x8)" P-INDEP
+;  V130 tmp110      [V130,T25] (  3, 12   )     int  ->  rax         "field V33.Item1 (fldOffset=0x0)" P-INDEP
+;  V131 tmp111      [V131,T27] (  3,  8.50)     int  ->  rdx         "field V33.Item2 (fldOffset=0x4)" P-INDEP
+;  V132 tmp112      [V132,T45] (  6,  3   )     int  ->  rax         "field V39.Item1 (fldOffset=0x0)" P-INDEP
+;  V133 tmp113      [V133,T46] (  6,  3   )     int  ->  rdx         "field V39.Item2 (fldOffset=0x4)" P-INDEP
+;* V134 tmp114      [V134    ] (  0,  0   )   byref  ->  zero-ref    "field V44._reference (fldOffset=0x0)" P-INDEP
+;* V135 tmp115      [V135    ] (  0,  0   )     int  ->  zero-ref    "field V44._length (fldOffset=0x8)" P-INDEP
+;* V136 tmp116      [V136    ] (  0,  0   )     int  ->  zero-ref    "field V54.Item1 (fldOffset=0x0)" P-DEP
+;* V137 tmp117      [V137    ] (  0,  0   )     int  ->  zero-ref    "field V54.Item2 (fldOffset=0x4)" P-DEP
+;* V138 tmp118      [V138    ] (  0,  0   )    long  ->  zero-ref    "field V66.Item1 (fldOffset=0x0)" P-DEP
+;* V139 tmp119      [V139    ] (  0,  0   )    long  ->  zero-ref    "field V66.Item2 (fldOffset=0x8)" P-DEP
+;  V140 tmp120      [V140,T59] (  2,  1   )    long  ->  rax         "field V67.Item1 (fldOffset=0x0)" P-INDEP
+;  V141 tmp121      [V141,T60] (  2,  1   )    long  ->  rdx         "field V67.Item2 (fldOffset=0x8)" P-INDEP
+;* V142 tmp122      [V142    ] (  0,  0   )   byref  ->  zero-ref    "field V71._reference (fldOffset=0x0)" P-INDEP
+;* V143 tmp123      [V143    ] (  0,  0   )     int  ->  zero-ref    "field V71._length (fldOffset=0x8)" P-INDEP
+;* V144 tmp124      [V144    ] (  0,  0   )   byref  ->  zero-ref    "field V96._reference (fldOffset=0x0)" P-INDEP
+;* V145 tmp125      [V145    ] (  0,  0   )     int  ->  zero-ref    "field V96._length (fldOffset=0x8)" P-INDEP
+;  V146 cse0        [V146,T58] (  3,  1.50)     int  ->  rdi         "CSE #14: conservative"
+;  V147 cse1        [V147,T10] (  4, 16   )    long  ->  rdi         "CSE #03: moderate"
+;  V148 cse2        [V148,T11] (  4, 16   )    long  ->  rdi         "CSE #10: moderate"
 ;
-; Lcl frame size = 56
+; Lcl frame size = 88
 
 G_M21333_IG01:
        push     rbp
        push     r15
        push     r14
        push     r13
        push     r12
        push     rbx
-       sub      rsp, 56
-       lea      rbp, [rsp+0x60]
+       sub      rsp, 88
+       lea      rbp, [rsp+0x80]
        mov      rbx, rdi
        mov      r15, rsi
-						;; size=25 bbWeight=1 PerfScore 7.25
+						;; size=28 bbWeight=1 PerfScore 7.25
 G_M21333_IG02:
-       mov      edi, dword ptr [rbx]
-       sub      edi, dword ptr [r15]
-       shr      edi, 16
-       movsx    r14, dil
+       mov      eax, dword ptr [rbx]
+       sub      eax, dword ptr [r15]
+       shr      eax, 16
+       movsx    r14, al
        xor      r13d, r13d
-       mov      edi, dword ptr [r15+0x04]
-       or       edi, dword ptr [r15+0x0C]
-       jne      SHORT G_M21333_IG06
+       mov      eax, dword ptr [r15+0x04]
+       or       eax, dword ptr [r15+0x0C]
+       jne      SHORT G_M21333_IG04
 						;; size=25 bbWeight=1 PerfScore 12.00
 G_M21333_IG03:
        mov      r12d, dword ptr [r15+0x08]
        test     r12d, r12d
-       je       G_M21333_IG45
-       mov      rdi, qword ptr [rbx+0x08]
-       mov      qword ptr [rbp-0x38], rdi
-       mov      edi, dword ptr [rbx+0x04]
-       mov      dword ptr [rbp-0x30], edi
-       lea      rdi, [rbp-0x38]
-       mov      esi, r12d
-       mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:Div96By32(byref,uint):uint
-       call     [rax]System.Decimal+DecCalc:Div96By32(byref,uint):uint
-						;; size=46 bbWeight=0.50 PerfScore 6.62
+       je       G_M21333_IG47
+       mov      rax, qword ptr [rbx+0x08]
+       mov      qword ptr [rbp-0x38], rax
+       mov      eax, dword ptr [rbx+0x04]
+       mov      dword ptr [rbp-0x30], eax
+       xor      edx, edx
+       cmp      dword ptr [rbp-0x30], 0
+       jne      G_M21333_IG23
+       cmp      dword ptr [rbp-0x34], r12d
+       jae      G_M21333_IG24
+       mov      edx, dword ptr [rbp-0x34]
+       xor      eax, eax
+       mov      dword ptr [rbp-0x34], eax
+       jmp      G_M21333_IG25
+       align    [0 bytes for IG37]
+						;; size=62 bbWeight=0.50 PerfScore 9.88
 G_M21333_IG04:
-       mov      dword ptr [rbp-0x40], eax
-       test     eax, eax
-       jne      G_M21333_IG25
-						;; size=11 bbWeight=4 PerfScore 9.00
-G_M21333_IG05:
-       test     r14d, r14d
-       jge      G_M21333_IG33
-       mov      ecx, r14d
-       neg      ecx
-       cmp      ecx, 9
-       jge      G_M21333_IG28
-       jmp      G_M21333_IG29
-						;; size=28 bbWeight=2 PerfScore 10.00
-G_M21333_IG06:
        mov      ecx, dword ptr [r15+0x04]
        test     ecx, ecx
-       jne      SHORT G_M21333_IG07
+       jne      SHORT G_M21333_IG05
        mov      ecx, dword ptr [r15+0x0C]
 						;; size=12 bbWeight=0.50 PerfScore 2.62
-G_M21333_IG07:
-       xor      eax, eax
-       lzcnt    eax, ecx
-       mov      rdi, qword ptr [rbx+0x08]
-       shlx     rdi, rdi, rax
-       mov      qword ptr [rbp-0x50], rdi
-       mov      edi, dword ptr [rbx+0x0C]
-       mov      esi, dword ptr [rbx+0x04]
-       shl      rsi, 32
-       add      rdi, rsi
-       mov      ecx, eax
-       neg      ecx
-       add      ecx, 32
-       and      ecx, 63
-       shrx     rdi, rdi, rcx
-       mov      qword ptr [rbp-0x48], rdi
-       mov      rdi, qword ptr [r15+0x08]
-       shlx     r12, rdi, rax
+G_M21333_IG05:
+       xor      r8d, r8d
+       lzcnt    r8d, ecx
+       mov      rax, qword ptr [rbx+0x08]
+       shlx     rax, rax, r8
+       mov      qword ptr [rbp-0x50], rax
+       mov      eax, dword ptr [rbx+0x0C]
+       mov      edx, dword ptr [rbx+0x04]
+       shl      rdx, 32
+       add      rax, rdx
+       mov      edi, r8d
+       neg      edi
+       add      edi, 32
+       and      edi, 63
+       shrx     rax, rax, rdi
+       mov      qword ptr [rbp-0x48], rax
+       mov      rax, qword ptr [r15+0x08]
+       shlx     r12, rax, r8
        cmp      dword ptr [r15+0x04], 0
-       jne      SHORT G_M21333_IG10
-       xor      edi, edi
-       mov      dword ptr [rbp-0x30], edi
-       lea      rdi, bword ptr [rbp-0x4C]
-       mov      rsi, r12
-       mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:Div96By64(byref,ulong):uint
-       call     [rax]System.Decimal+DecCalc:Div96By64(byref,ulong):uint
-       mov      dword ptr [rbp-0x34], eax
-       lea      rdi, [rbp-0x50]
-       mov      rsi, r12
-       mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:Div96By64(byref,ulong):uint
-       call     [rax]System.Decimal+DecCalc:Div96By64(byref,ulong):uint
-       mov      dword ptr [rbp-0x38], eax
-						;; size=116 bbWeight=0.50 PerfScore 15.38
-G_M21333_IG08:
+       jne      SHORT G_M21333_IG08
+       xor      eax, eax
+       mov      dword ptr [rbp-0x30], eax
+       mov      rax, qword ptr [rbp-0x50]
+       mov      rdx, qword ptr [rbp-0x48]
+       div      rdx:rax, r12
+       mov      qword ptr [rbp-0x50], rdx
+       mov      qword ptr [rbp-0x38], rax
+						;; size=94 bbWeight=0.50 PerfScore 42.88
+G_M21333_IG06:
        cmp      qword ptr [rbp-0x50], 0
-       jne      G_M21333_IG19
+       jne      G_M21333_IG17
 						;; size=11 bbWeight=4 PerfScore 12.00
-G_M21333_IG09:
+G_M21333_IG07:
        test     r14d, r14d
-       jge      G_M21333_IG33
-       mov      eax, r14d
-       neg      eax
-       cmp      eax, 9
-       jge      G_M21333_IG21
-       jmp      G_M21333_IG22
-						;; size=28 bbWeight=2 PerfScore 10.00
-G_M21333_IG10:
+       jge      G_M21333_IG35
+       mov      r8d, r14d
+       neg      r8d
+       cmp      r8d, 9
+       jge      G_M21333_IG19
+       jmp      G_M21333_IG20
+						;; size=30 bbWeight=2 PerfScore 10.00
+G_M21333_IG08:
        mov      qword ptr [rbp-0x60], r12
-       mov      edi, dword ptr [r15+0x0C]
-       mov      esi, dword ptr [r15+0x04]
-       shl      rsi, 32
-       add      rdi, rsi
-       shrx     rdi, rdi, rcx
+       mov      esi, dword ptr [r15+0x0C]
+       mov      eax, dword ptr [r15+0x04]
+       shl      rax, 32
+       add      rsi, rax
+       shrx     rdi, rsi, rdi
        mov      dword ptr [rbp-0x58], edi
        lea      rdi, [rbp-0x50]
        lea      rsi, [rbp-0x60]
        mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:Div128By96(byref,byref):uint
        call     [rax]System.Decimal+DecCalc:Div128By96(byref,byref):uint
        mov      edi, eax
        mov      qword ptr [rbp-0x38], rdi
        xor      edi, edi
        mov      dword ptr [rbp-0x30], edi
 						;; size=58 bbWeight=0.50 PerfScore 7.00
-G_M21333_IG11:
+G_M21333_IG09:
        mov      edi, dword ptr [rbp-0x48]
        or       rdi, qword ptr [rbp-0x50]
-       jne      SHORT G_M21333_IG13
+       jne      SHORT G_M21333_IG11
 						;; size=9 bbWeight=4 PerfScore 16.00
-G_M21333_IG12:
+G_M21333_IG10:
        test     r14d, r14d
-       jge      G_M21333_IG33
-       mov      eax, r14d
-       neg      eax
-       cmp      eax, 9
-       jge      G_M21333_IG15
-       jmp      G_M21333_IG16
-						;; size=28 bbWeight=2 PerfScore 10.00
-G_M21333_IG13:
+       jge      G_M21333_IG35
+       mov      r8d, r14d
+       neg      r8d
+       cmp      r8d, 9
+       jge      G_M21333_IG13
+       jmp      G_M21333_IG14
+						;; size=30 bbWeight=2 PerfScore 10.00
+G_M21333_IG11:
        mov      r13d, 1
        cmp      r14d, 28
-       je       SHORT G_M21333_IG14
+       je       SHORT G_M21333_IG12
        lea      rdi, [rbp-0x38]
        mov      esi, r14d
        mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:SearchScale(byref,int):int
        call     [rax]System.Decimal+DecCalc:SearchScale(byref,int):int
-       test     eax, eax
-       jne      SHORT G_M21333_IG16
-						;; size=35 bbWeight=2 PerfScore 13.50
-G_M21333_IG14:
+       mov      r8d, eax
+       test     r8d, r8d
+       jne      SHORT G_M21333_IG14
+						;; size=39 bbWeight=2 PerfScore 14.00
+G_M21333_IG12:
        cmp      dword ptr [rbp-0x48], 0
-       jl       G_M21333_IG27
+       jl       G_M21333_IG30
        mov      ecx, dword ptr [rbp-0x4C]
        shr      ecx, 31
-       mov      rax, qword ptr [rbp-0x50]
-       add      rax, rax
-       mov      qword ptr [rbp-0x50], rax
+       mov      r8, qword ptr [rbp-0x50]
+       add      r8, r8
+       mov      qword ptr [rbp-0x50], r8
        mov      edi, dword ptr [rbp-0x48]
        lea      edi, [rcx+2*rdi]
        mov      dword ptr [rbp-0x48], edi
        mov      edi, dword ptr [rbp-0x48]
        cmp      edi, dword ptr [rbp-0x58]
-       ja       G_M21333_IG27
+       ja       G_M21333_IG30
        mov      edi, dword ptr [rbp-0x48]
        cmp      edi, dword ptr [rbp-0x58]
-       jne      G_M21333_IG33
+       jne      G_M21333_IG35
        mov      rdi, qword ptr [rbp-0x50]
        cmp      rdi, qword ptr [rbp-0x60]
-       ja       G_M21333_IG27
+       ja       G_M21333_IG30
        mov      rdi, qword ptr [rbp-0x50]
        cmp      rdi, qword ptr [rbp-0x60]
-       jne      G_M21333_IG33
+       jne      G_M21333_IG35
        test     byte  ptr [rbp-0x38], 1
-       je       G_M21333_IG33
-       jmp      G_M21333_IG27
+       je       G_M21333_IG35
+       jmp      G_M21333_IG30
 						;; size=103 bbWeight=0.50 PerfScore 15.12
-G_M21333_IG15:
-       mov      eax, 9
-						;; size=5 bbWeight=2 PerfScore 0.50
-G_M21333_IG16:
-       cmp      eax, 10
-       jae      G_M21333_IG46
-       mov      edi, eax
+G_M21333_IG13:
+       mov      r8d, 9
+						;; size=6 bbWeight=2 PerfScore 0.50
+G_M21333_IG14:
+       cmp      r8d, 10
+       jae      G_M21333_IG48
+       mov      edi, r8d
        mov      rsi, 0xD1FFAB1E      ; static handle
        mov      r12d, dword ptr [rsi+4*rdi]
-       add      r14d, eax
+       add      r14d, r8d
        lea      rdi, [rbp-0x38]
        mov      esi, r12d
        mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:IncreaseScale(byref,uint):uint
        call     [rax]System.Decimal+DecCalc:IncreaseScale(byref,uint):uint
        test     eax, eax
-       jne      G_M21333_IG44
-       lea      rdi, [rbp-0x50]
-       mov      esi, r12d
-       mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:IncreaseScale(byref,uint):uint
-       call     [rax]System.Decimal+DecCalc:IncreaseScale(byref,uint):uint
-       mov      dword ptr [rbp-0x44], eax
+       jne      G_M21333_IG46
+       mov      rdx, qword ptr [rbp-0x50]
+       mov      edi, r12d
+       lea      rsi, [rbp-0x80]
+       mulx     rdi, rax, rdi
+       mov      qword ptr [rsi], rax
+       mov      rsi, qword ptr [rbp-0x80]
+       mov      qword ptr [rbp-0x50], rsi
+       mov      esi, dword ptr [rbp-0x48]
+       mov      eax, r12d
+       imul     rsi, rax
+       add      rdi, rsi
+       mov      qword ptr [rbp-0x48], rdi
        lea      rdi, [rbp-0x50]
        lea      rsi, [rbp-0x60]
        mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:Div128By96(byref,byref):uint
        call     [rax]System.Decimal+DecCalc:Div128By96(byref,byref):uint
        mov      ecx, eax
        mov      edx, ecx
        add      rdx, qword ptr [rbp-0x38]
        mov      qword ptr [rbp-0x38], rdx
        mov      edi, ecx
        cmp      rdx, rdi
-       jae      G_M21333_IG11
-						;; size=120 bbWeight=4 PerfScore 94.00
-G_M21333_IG17:
+       jae      G_M21333_IG09
+						;; size=144 bbWeight=4 PerfScore 123.00
+G_M21333_IG15:
        mov      edx, dword ptr [rbp-0x30]
        inc      edx
        mov      dword ptr [rbp-0x30], edx
        test     edx, edx
-       jne      G_M21333_IG11
+       jne      G_M21333_IG09
 						;; size=16 bbWeight=2 PerfScore 7.00
-G_M21333_IG18:
+G_M21333_IG16:
        mov      rdx, qword ptr [rbp-0x50]
        or       rdx, qword ptr [rbp-0x48]
        setne    dl
        movzx    rdx, dl
        lea      rdi, [rbp-0x38]
        mov      esi, r14d
        mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:OverflowUnscale(byref,int,ubyte):int
        call     [rax]System.Decimal+DecCalc:OverflowUnscale(byref,int,ubyte):int
        mov      r14d, eax
-       jmp      G_M21333_IG33
-       align    [0 bytes for IG35]
+       jmp      G_M21333_IG35
 						;; size=41 bbWeight=0.50 PerfScore 5.25
-G_M21333_IG19:
+G_M21333_IG17:
        mov      r13d, 1
        cmp      r14d, 28
-       je       SHORT G_M21333_IG20
+       je       SHORT G_M21333_IG18
        lea      rdi, [rbp-0x38]
        mov      esi, r14d
        mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:SearchScale(byref,int):int
        call     [rax]System.Decimal+DecCalc:SearchScale(byref,int):int
-       test     eax, eax
-       jne      SHORT G_M21333_IG22
-						;; size=35 bbWeight=2 PerfScore 13.50
-G_M21333_IG20:
-       mov      rax, qword ptr [rbp-0x50]
-       test     rax, rax
-       jl       G_M21333_IG27
-       add      rax, rax
-       cmp      rax, r12
-       ja       G_M21333_IG27
-       jne      G_M21333_IG33
+       mov      r8d, eax
+       test     r8d, r8d
+       jne      SHORT G_M21333_IG20
+						;; size=39 bbWeight=2 PerfScore 14.00
+G_M21333_IG18:
+       mov      r8, qword ptr [rbp-0x50]
+       test     r8, r8
+       jl       G_M21333_IG30
+       add      r8, r8
+       cmp      r8, r12
+       ja       G_M21333_IG30
+       jne      G_M21333_IG35
        test     byte  ptr [rbp-0x38], 1
-       je       G_M21333_IG33
-       jmp      G_M21333_IG27
+       je       G_M21333_IG35
+       jmp      G_M21333_IG30
 						;; size=46 bbWeight=0.50 PerfScore 4.88
-G_M21333_IG21:
-       mov      eax, 9
-						;; size=5 bbWeight=2 PerfScore 0.50
-G_M21333_IG22:
-       cmp      eax, 10
-       jae      G_M21333_IG46
+G_M21333_IG19:
+       mov      r8d, 9
+						;; size=6 bbWeight=2 PerfScore 0.50
+G_M21333_IG20:
+       cmp      r8d, 10
+       jae      G_M21333_IG48
+       mov      edx, r8d
+       mov      rdi, 0xD1FFAB1E      ; static handle
+       mov      edx, dword ptr [rdi+4*rdx]
+       mov      eax, edx
+       add      r14d, r8d
+       mov      rdx, qword ptr [rbp-0x38]
        mov      edi, eax
-       mov      rsi, 0xD1FFAB1E      ; static handle
-       mov      edi, dword ptr [rsi+4*rdi]
-       mov      ecx, edi
-       add      r14d, eax
-       lea      rdi, [rbp-0x38]
-       mov      dword ptr [rbp-0x3C], ecx
-       mov      esi, ecx
-       mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:IncreaseScale(byref,uint):uint
-       call     [rax]System.Decimal+DecCalc:IncreaseScale(byref,uint):uint
-       test     eax, eax
-       jne      G_M21333_IG44
-       mov      edi, dword ptr [rbp-0x50]
-       mov      eax, dword ptr [rbp-0x3C]
-       mov      esi, eax
-       imul     rdi, rsi
-       mov      dword ptr [rbp-0x50], edi
-       mov      esi, dword ptr [rbp-0x4C]
-       imul     rsi, rax
-       shr      rdi, 32
-       add      rdi, rsi
-       mov      qword ptr [rbp-0x4C], rdi
+       lea      rsi, [rbp-0x70]
+       mulx     rdx, rax, rdi
+       mov      qword ptr [rsi], rax
+       mov      rsi, qword ptr [rbp-0x70]
+       mov      qword ptr [rbp-0x38], rsi
+       mov      esi, dword ptr [rbp-0x30]
+       imul     rsi, rdi
+       add      rdx, rsi
+       mov      dword ptr [rbp-0x30], edx
+       shr      rdx, 32
+       test     edx, edx
+       jne      G_M21333_IG46
+       mov      rdx, qword ptr [rbp-0x50]
+       lea      rsi, [rbp-0x78]
+       mulx     rdi, rax, rdi
+       mov      qword ptr [rsi], rax
+       mov      rsi, qword ptr [rbp-0x78]
+       mov      dword ptr [rbp-0x48], edi
+       mov      qword ptr [rbp-0x50], rsi
        lea      rdi, [rbp-0x50]
        mov      rsi, r12
        mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:Div96By64(byref,ulong):uint
        call     [rax]System.Decimal+DecCalc:Div96By64(byref,ulong):uint
        mov      ecx, eax
        mov      edx, ecx
        add      rdx, qword ptr [rbp-0x38]
        mov      qword ptr [rbp-0x38], rdx
        mov      edi, ecx
        cmp      rdx, rdi
-       jae      G_M21333_IG08
-						;; size=133 bbWeight=4 PerfScore 118.00
-G_M21333_IG23:
+       jae      G_M21333_IG06
+						;; size=151 bbWeight=4 PerfScore 142.00
+G_M21333_IG21:
        mov      edx, dword ptr [rbp-0x30]
        inc      edx
        mov      dword ptr [rbp-0x30], edx
        test     edx, edx
-       jne      G_M21333_IG08
+       jne      G_M21333_IG06
 						;; size=16 bbWeight=2 PerfScore 7.00
-G_M21333_IG24:
+G_M21333_IG22:
        cmp      qword ptr [rbp-0x50], 0
        setne    dl
        movzx    rdx, dl
        lea      rdi, [rbp-0x38]
        mov      esi, r14d
        mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:OverflowUnscale(byref,int,ubyte):int
        call     [rax]System.Decimal+DecCalc:OverflowUnscale(byref,int,ubyte):int
        mov      r14d, eax
-       jmp      G_M21333_IG33
+       jmp      G_M21333_IG35
 						;; size=38 bbWeight=0.50 PerfScore 4.75
+G_M21333_IG23:
+       mov      eax, dword ptr [rbp-0x30]
+       xor      edx, edx
+       div      edx:eax, r12d
+       mov      dword ptr [rbp-0x30], eax
+						;; size=11 bbWeight=0.50 PerfScore 13.62
+G_M21333_IG24:
+       mov      eax, dword ptr [rbp-0x34]
+       div      edx:eax, r12d
+       mov      dword ptr [rbp-0x34], eax
+						;; size=9 bbWeight=0.50 PerfScore 13.50
 G_M21333_IG25:
+       mov      eax, dword ptr [rbp-0x38]
+       div      edx:eax, r12d
+       mov      dword ptr [rbp-0x38], eax
+       mov      eax, edx
+						;; size=11 bbWeight=0.50 PerfScore 13.62
+G_M21333_IG26:
+       mov      dword ptr [rbp-0x40], eax
+       test     eax, eax
+       jne      SHORT G_M21333_IG28
+						;; size=7 bbWeight=4 PerfScore 9.00
+G_M21333_IG27:
+       test     r14d, r14d
+       jge      G_M21333_IG35
+       mov      r8d, r14d
+       neg      r8d
+       cmp      r8d, 9
+       jge      G_M21333_IG31
+       jmp      G_M21333_IG32
+						;; size=30 bbWeight=2 PerfScore 10.00
+G_M21333_IG28:
        mov      r13d, 1
        cmp      r14d, 28
-       je       SHORT G_M21333_IG26
+       je       SHORT G_M21333_IG29
        lea      rdi, [rbp-0x38]
        mov      esi, r14d
        mov      rcx, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:SearchScale(byref,int):int
        call     [rcx]System.Decimal+DecCalc:SearchScale(byref,int):int
-       test     eax, eax
-       jne      SHORT G_M21333_IG30
-						;; size=35 bbWeight=2 PerfScore 13.50
-G_M21333_IG26:
+       mov      r8d, eax
+       test     r8d, r8d
+       jne      SHORT G_M21333_IG32
+						;; size=39 bbWeight=2 PerfScore 14.00
+G_M21333_IG29:
        mov      eax, dword ptr [rbp-0x40]
        lea      ecx, [rax+rax]
        cmp      ecx, eax
-       jb       SHORT G_M21333_IG27
+       jb       SHORT G_M21333_IG30
        cmp      ecx, r12d
-       jb       G_M21333_IG33
-       ja       SHORT G_M21333_IG27
+       jb       G_M21333_IG35
+       ja       SHORT G_M21333_IG30
        test     byte  ptr [rbp-0x38], 1
-       je       G_M21333_IG33
+       je       G_M21333_IG35
 						;; size=31 bbWeight=0.50 PerfScore 4.00
-G_M21333_IG27:
+G_M21333_IG30:
        mov      rdi, qword ptr [rbp-0x38]
        inc      rdi
        mov      qword ptr [rbp-0x38], rdi
        test     rdi, rdi
-       jne      G_M21333_IG33
+       jne      G_M21333_IG35
        mov      edi, dword ptr [rbp-0x30]
        inc      edi
        mov      dword ptr [rbp-0x30], edi
        test     edi, edi
-       jne      G_M21333_IG33
+       jne      G_M21333_IG35
        lea      rdi, [rbp-0x38]
        mov      esi, r14d
        mov      edx, 1
        mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:OverflowUnscale(byref,int,ubyte):int
        call     [rax]System.Decimal+DecCalc:OverflowUnscale(byref,int,ubyte):int
        mov      r14d, eax
-       jmp      G_M21333_IG33
+       jmp      G_M21333_IG35
 						;; size=68 bbWeight=0.50 PerfScore 6.75
-G_M21333_IG28:
-       mov      ecx, 9
-						;; size=5 bbWeight=2 PerfScore 0.50
-G_M21333_IG29:
-       mov      edi, ecx
-       mov      eax, edi
-						;; size=4 bbWeight=2 PerfScore 1.00
-G_M21333_IG30:
-       cmp      eax, 10
-       jae      G_M21333_IG46
-       mov      edi, eax
-       mov      rsi, 0xD1FFAB1E      ; static handle
-       mov      edi, dword ptr [rsi+4*rdi]
-       mov      ecx, edi
-       add      r14d, eax
-       lea      rdi, [rbp-0x38]
-       mov      dword ptr [rbp-0x3C], ecx
-       mov      esi, ecx
-       mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:IncreaseScale(byref,uint):uint
-       call     [rax]System.Decimal+DecCalc:IncreaseScale(byref,uint):uint
-       test     eax, eax
-       jne      G_M21333_IG44
-       mov      edi, dword ptr [rbp-0x40]
-       mov      esi, dword ptr [rbp-0x3C]
-       imul     rdi, rsi
-       mov      esi, r12d
-       mov      rax, rdi
-       xor      edx, edx
-       div      rdx:rax, rsi
-       mov      edx, eax
-       imul     edx, r12d
-       sub      edi, edx
-       mov      edx, edi
-       mov      edi, eax
-       add      rdi, qword ptr [rbp-0x38]
-       mov      qword ptr [rbp-0x38], rdi
-       mov      esi, eax
-       cmp      rdi, rsi
-       mov      eax, edx
-       jae      G_M21333_IG04
-						;; size=112 bbWeight=4 PerfScore 336.00
 G_M21333_IG31:
-       mov      edx, dword ptr [rbp-0x30]
-       inc      edx
+       mov      r8d, 9
+						;; size=6 bbWeight=2 PerfScore 0.50
+G_M21333_IG32:
+       cmp      r8d, 10
+       jae      G_M21333_IG48
+       mov      edx, r8d
+       mov      rdi, 0xD1FFAB1E      ; static handle
+       mov      edx, dword ptr [rdi+4*rdx]
+       mov      dword ptr [rbp-0x3C], edx
+       add      r14d, r8d
+       mov      rdx, qword ptr [rbp-0x38]
+       mov      edi, dword ptr [rbp-0x3C]
+       lea      rsi, [rbp-0x68]
+       mulx     rdx, rcx, rdi
+       mov      qword ptr [rsi], rcx
+       mov      rsi, qword ptr [rbp-0x68]
+       mov      qword ptr [rbp-0x38], rsi
+       mov      esi, dword ptr [rbp-0x30]
+       imul     rsi, rdi
+       add      rdx, rsi
        mov      dword ptr [rbp-0x30], edx
+       shr      rdx, 32
        test     edx, edx
-       jne      G_M21333_IG04
+       jne      G_M21333_IG46
+       mov      eax, dword ptr [rbp-0x40]
+       imul     rax, rdi
+       mov      rdx, rax
+       shr      rdx, 32
+       div      edx:eax, r12d
+       mov      edi, edx
+       mov      esi, eax
+       add      rsi, qword ptr [rbp-0x38]
+       mov      qword ptr [rbp-0x38], rsi
+       cmp      rsi, rax
+       mov      eax, edi
+       jae      G_M21333_IG26
+						;; size=124 bbWeight=4 PerfScore 213.00
+G_M21333_IG33:
+       mov      edi, dword ptr [rbp-0x30]
+       inc      edi
+       mov      dword ptr [rbp-0x30], edi
+       test     edi, edi
+       jne      G_M21333_IG26
 						;; size=16 bbWeight=2 PerfScore 7.00
-G_M21333_IG32:
-       test     eax, eax
+G_M21333_IG34:
+       test     edx, edx
        setne    dl
        movzx    rdx, dl
        lea      rdi, [rbp-0x38]
        mov      esi, r14d
        mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:OverflowUnscale(byref,int,ubyte):int
        call     [rax]System.Decimal+DecCalc:OverflowUnscale(byref,int,ubyte):int
        mov      r14d, eax
 						;; size=30 bbWeight=0.50 PerfScore 2.88
-G_M21333_IG33:
+G_M21333_IG35:
        test     r13d, r13d
-       je       SHORT G_M21333_IG37
+       je       SHORT G_M21333_IG39
 						;; size=5 bbWeight=1 PerfScore 1.25
-G_M21333_IG34:
+G_M21333_IG36:
        mov      ecx, dword ptr [rbp-0x38]
        mov      rdi, qword ptr [rbp-0x34]
-       jmp      SHORT G_M21333_IG36
+       jmp      SHORT G_M21333_IG38
 						;; size=9 bbWeight=0.50 PerfScore 2.00
-G_M21333_IG35:
+G_M21333_IG37:
        mov      rdi, rsi
        mov      ecx, edx
        add      r14d, -8
 						;; size=9 bbWeight=4 PerfScore 3.00
-G_M21333_IG36:
+G_M21333_IG38:
        test     cl, cl
-       jne      SHORT G_M21333_IG38
+       jne      SHORT G_M21333_IG40
        cmp      r14d, 8
-       jl       SHORT G_M21333_IG38
+       jl       SHORT G_M21333_IG40
        mov      rdx, 0xD1FFAB1E
        mov      rax, rdi
        mul      rdx:rax, rdx
        mov      rsi, rdx
        shr      rsi, 26
        imul     rax, rsi, 0xD1FFAB1E
        mov      rdx, rdi
        sub      rdx, rax
        shl      rdx, 32
        mov      eax, ecx
        add      rdx, rax
        mov      r8, 0xD1FFAB1E
        mov      rax, rdx
        mul      rdx:rax, r8
        shr      rdx, 26
        imul     eax, edx, 0xD1FFAB1E
        cmp      ecx, eax
-       jne      SHORT G_M21333_IG38
-       jmp      SHORT G_M21333_IG35
+       jne      SHORT G_M21333_IG40
+       jmp      SHORT G_M21333_IG37
 						;; size=87 bbWeight=4 PerfScore 78.00
-G_M21333_IG37:
+G_M21333_IG39:
        mov      rcx, qword ptr [rbp-0x38]
        mov      qword ptr [rbx+0x08], rcx
        mov      edi, dword ptr [rbp-0x30]
        mov      dword ptr [rbx+0x04], edi
-       jmp      G_M21333_IG42
+       jmp      G_M21333_IG44
 						;; size=19 bbWeight=0.50 PerfScore 3.00
-G_M21333_IG38:
+G_M21333_IG40:
        test     cl, 15
-       jne      SHORT G_M21333_IG39
+       jne      SHORT G_M21333_IG41
        cmp      r14d, 4
-       jl       SHORT G_M21333_IG39
+       jl       SHORT G_M21333_IG41
        mov      rdx, 0xD1FFAB1E
        mov      rax, rdi
        mul      rdx:rax, rdx
        mov      rsi, rdx
        shr      rsi, 11
        imul     rax, rsi, 0x2710
        mov      rdx, rdi
        sub      rdx, rax
        shl      rdx, 32
        mov      eax, ecx
        add      rdx, rax
        mov      r8, 0xD1FFAB1E
        mov      rax, rdx
        mul      rdx:rax, r8
        shr      rdx, 11
        imul     eax, edx, 0x2710
        cmp      ecx, eax
-       jne      SHORT G_M21333_IG39
+       jne      SHORT G_M21333_IG41
        mov      rdi, rsi
        mov      ecx, edx
        add      r14d, -4
 						;; size=95 bbWeight=0.50 PerfScore 9.12
-G_M21333_IG39:
+G_M21333_IG41:
        test     cl, 3
-       jne      SHORT G_M21333_IG40
+       jne      SHORT G_M21333_IG42
        cmp      r14d, 2
-       jl       SHORT G_M21333_IG40
+       jl       SHORT G_M21333_IG42
        mov      rdx, 0xD1FFAB1E
        mov      rax, rdi
        shr      rax, 2
        mul      rdx:rax, rdx
        mov      rsi, rdx
        shr      rsi, 2
        imul     rax, rsi, 100
        mov      rdx, rdi
        sub      rdx, rax
        shl      rdx, 32
        mov      eax, ecx
        add      rax, rdx
        mov      rdx, 0xD1FFAB1E
        shr      rax, 2
        mul      rdx:rax, rdx
        shr      rdx, 2
        imul     eax, edx, 100
        cmp      ecx, eax
-       jne      SHORT G_M21333_IG40
+       jne      SHORT G_M21333_IG42
        mov      rdi, rsi
        mov      ecx, edx
        add      r14d, -2
 						;; size=94 bbWeight=0.50 PerfScore 9.50
-G_M21333_IG40:
+G_M21333_IG42:
        test     cl, 1
-       jne      SHORT G_M21333_IG41
+       jne      SHORT G_M21333_IG43
        test     r14d, r14d
-       jle      SHORT G_M21333_IG41
+       jle      SHORT G_M21333_IG43
        mov      rdx, 0xD1FFAB1E
        mov      rax, rdi
        mul      rdx:rax, rdx
        mov      rsi, rdx
        shr      rsi, 3
        lea      rax, [rsi+4*rsi]
        add      rax, rax
        mov      rdx, rdi
        sub      rdx, rax
        shl      rdx, 32
        mov      eax, ecx
        add      rdx, rax
        mov      r8, 0xD1FFAB1E
        mov      rax, rdx
        mul      rdx:rax, r8
        shr      rdx, 3
        lea      eax, [rdx+4*rdx]
        add      eax, eax
        cmp      ecx, eax
-       jne      SHORT G_M21333_IG41
+       jne      SHORT G_M21333_IG43
        mov      rdi, rsi
        mov      ecx, edx
        dec      r14d
 						;; size=92 bbWeight=0.50 PerfScore 7.88
-G_M21333_IG41:
+G_M21333_IG43:
        mov      dword ptr [rbx+0x08], ecx
        mov      dword ptr [rbx+0x0C], edi
        shr      rdi, 32
        mov      dword ptr [rbx+0x04], edi
 						;; size=13 bbWeight=0.50 PerfScore 1.75
-G_M21333_IG42:
+G_M21333_IG44:
        mov      eax, dword ptr [rbx]
        xor      eax, dword ptr [r15]
        and      eax, 0xD1FFAB1E
        mov      ecx, r14d
        shl      ecx, 16
        or       eax, ecx
        mov      dword ptr [rbx], eax
 						;; size=20 bbWeight=1 PerfScore 7.25
-G_M21333_IG43:
-       add      rsp, 56
+G_M21333_IG45:
+       add      rsp, 88
        pop      rbx
        pop      r12
        pop      r13
        pop      r14
        pop      r15
        pop      rbp
        ret      
 						;; size=15 bbWeight=1 PerfScore 4.25
-G_M21333_IG44:
+G_M21333_IG46:
        mov      rax, 0xD1FFAB1E      ; code for System.SR:get_Overflow_Decimal():System.String
        call     [rax]System.SR:get_Overflow_Decimal():System.String
        mov      rdi, rax
        mov      rax, 0xD1FFAB1E      ; code for System.Number:ThrowOverflowException(System.String)
        call     [rax]System.Number:ThrowOverflowException(System.String)
        int3     
 						;; size=28 bbWeight=0 PerfScore 0.00
-G_M21333_IG45:
+G_M21333_IG47:
        mov      rdi, 0xD1FFAB1E      ; System.DivideByZeroException
        call     CORINFO_HELP_NEWSFAST
        mov      rbx, rax
        mov      rdi, rbx
        mov      rax, 0xD1FFAB1E      ; code for System.DivideByZeroException:.ctor():this
        call     [rax]System.DivideByZeroException:.ctor():this
        mov      rdi, rbx
        call     CORINFO_HELP_THROW
        int3     
 						;; size=42 bbWeight=0 PerfScore 0.00
-G_M21333_IG46:
+G_M21333_IG48:
        call     CORINFO_HELP_RNGCHKFAIL
        int3     
 						;; size=6 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 1825, prolog size 25, PerfScore 900.50, instruction count 489, allocated bytes for code 1825 (MethodHash=142cacaa) for method System.Decimal+DecCalc:VarDecDiv(byref,byref) (FullOpts)
+; Total bytes of code 1920, prolog size 28, PerfScore 902.50, instruction count 513, allocated bytes for code 1920 (MethodHash=142cacaa) for method System.Decimal+DecCalc:VarDecDiv(byref,byref) (FullOpts)
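
Reading note on the diff above: the call to `Div96By32` in IG03 is gone in the diff version, and IG23 to IG25 contain three chained `div edx:eax, r12d` instructions instead, which looks like the helper being inlined as schoolbook long division over 32-bit words. The sketch below is a Python model of that instruction sequence only, not the C# source. The function name `div96_by_32` and the `[lo, mid, hi]` list layout are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF

def div96_by_32(words, den):
    """Model of the inlined sequence (hypothetical helper, not the runtime's code).

    words is [lo, mid, hi], three 32-bit words of a 96-bit value.
    The quotient is written back in place and the remainder is returned.
    den is assumed to be a nonzero 32-bit value.
    """
    lo, mid, hi = words
    rem = 0
    if hi != 0:
        # IG23 then IG24: divide the high word, carry the remainder down.
        hi, rem = divmod(hi, den)
        mid, rem = divmod((rem << 32) | mid, den)
    elif mid >= den:
        # IG24 entered directly: high word is zero, remainder starts at 0.
        mid, rem = divmod(mid, den)
    else:
        # Fast path in IG03: mid < den, so its quotient word is 0
        # and mid itself becomes the running remainder.
        rem, mid = mid, 0
    # IG25: final step over the low word.
    lo, rem = divmod((rem << 32) | lo, den)
    words[:] = [lo, mid, hi]
    return rem
```

Because each running remainder is smaller than `den`, every partial quotient fits in 32 bits, which is what allows the 64-by-32 `div` form to be used without overflow. Under this reading, the per-method size grows (1825 to 1920 bytes) because the division is emitted in place instead of being called.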
4 (13.79 % of base) - System.Decimal+DecCalc:IncreaseScale64(byref,uint)
 ; Assembly listing for method System.Decimal+DecCalc:IncreaseScale64(byref,uint) (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX512 - Unix
 ; FullOpts code
 ; optimized code
 ; rsp based frame
 ; partially interruptible
 ; No PGO data
-; 0 inlinees with PGO data; 3 single block inlinees; 0 inlinees without PGO data
+; 0 inlinees with PGO data; 3 single block inlinees; 1 inlinees without PGO data
 ; Final local variable assignments
 ;
-;  V00 arg0         [V00,T00] (  6,  6   )   byref  ->  rdi         single-def
+;  V00 arg0         [V00,T00] (  5,  5   )   byref  ->  rdi         single-def
 ;  V01 arg1         [V01,T01] (  3,  3   )     int  ->  rsi         single-def
-;  V02 loc0         [V02,T02] (  3,  3   )    long  ->  rax        
+;  V02 loc0         [V02,T03] (  2,  2   )    long  ->  rcx         ld-addr-op
 ;# V03 OutArgs      [V03    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
-;* V04 tmp1         [V04    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V05 tmp2         [V05    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;  V06 cse0         [V06,T03] (  3,  3   )    long  ->  rcx         "CSE #01: aggressive"
+;* V04 tmp1         [V04    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V05 tmp2         [V05    ] (  0,  0   )    long  ->  zero-ref    "Inline return value spill temp"
+;* V06 tmp3         [V06    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;  V07 tmp4         [V07    ] (  2,  2   )    long  ->  [rsp+0x00]  do-not-enreg[X] addr-exposed ld-addr-op "Inline ldloca(s) first use temp"
+;  V08 tmp5         [V08,T02] (  2,  4   )    long  ->  rax         "impAppendStmt"
+;* V09 tmp6         [V09    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
 ;
-; Lcl frame size = 0
+; Lcl frame size = 8
 
 G_M64539_IG01:
-						;; size=0 bbWeight=1 PerfScore 0.00
+       push     rax
+						;; size=1 bbWeight=1 PerfScore 1.00
 G_M64539_IG02:
-       mov      eax, dword ptr [rdi]
-       mov      ecx, esi
-       imul     rax, rcx
-       mov      dword ptr [rdi], eax
-       mov      edx, dword ptr [rdi+0x04]
-       imul     rcx, rdx
-       shr      rax, 32
-       add      rax, rcx
-       mov      qword ptr [rdi+0x04], rax
-						;; size=28 bbWeight=1 PerfScore 11.00
+       mov      rdx, qword ptr [rdi]
+       mov      eax, esi
+       lea      rcx, [rsp]
+       mulx     rax, rsi, rax
+       mov      qword ptr [rcx], rsi
+       mov      rcx, qword ptr [rsp]
+       mov      dword ptr [rdi+0x08], eax
+       mov      qword ptr [rdi], rcx
+						;; size=27 bbWeight=1 PerfScore 9.75
 G_M64539_IG03:
+       add      rsp, 8
        ret      
-						;; size=1 bbWeight=1 PerfScore 1.00
+						;; size=5 bbWeight=1 PerfScore 1.25
 
-; Total bytes of code 29, prolog size 0, PerfScore 12.00, instruction count 10, allocated bytes for code 29 (MethodHash=beae03e4) for method System.Decimal+DecCalc:IncreaseScale64(byref,uint) (FullOpts)
+; Total bytes of code 33, prolog size 1, PerfScore 12.00, instruction count 11, allocated bytes for code 33 (MethodHash=beae03e4) for method System.Decimal+DecCalc:IncreaseScale64(byref,uint) (FullOpts)
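
Reading note on the `IncreaseScale64` diff: the base version multiplies the low and middle 32-bit words separately (two `imul` instructions with a manual carry via `shr rax, 32`), while the diff version loads the low 64 bits at once and uses a single `mulx` to get a 96-bit product. The Python model below is a sketch of both instruction sequences, assuming a `[lo, mid, hi]` word layout and hypothetical function names. It is not the C# implementation.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def increase_scale64_base(words, power):
    """Base listing: two 32x32->64 multiplies with the carry added by hand."""
    lo, mid, _ = words
    t = lo * power                    # imul rax, rcx
    new_lo = t & MASK32               # mov dword ptr [rdi], eax
    t = (t >> 32) + mid * power       # shr rax, 32 / add rax, rcx (fits in 64 bits)
    return [new_lo, t & MASK32, t >> 32]   # qword store at [rdi+0x04]

def increase_scale64_diff(words, power):
    """Diff listing: one 64x32 multiply producing a high:low pair (mulx)."""
    low64 = words[0] | (words[1] << 32)    # mov rdx, qword ptr [rdi]
    prod = low64 * power                   # mulx rax, rsi, rax
    lo64 = prod & MASK64                   # low half, stored as qword at [rdi]
    hi = prod >> 64                        # high half, stored as dword at [rdi+0x08]
    return [lo64 & MASK32, lo64 >> 32, hi & MASK32]
```

Both models ignore the incoming high word, as both listings do, and produce the same three words for 32-bit inputs. In the listings, the diff version is 4 bytes larger (29 to 33) because the product's low half is spilled through a stack temp (`lea rcx, [rsp]`), while the reported PerfScore stays at 12.00.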


MihuBot commented Jul 15, 2024

Top method improvements

-158 (-19.73 % of base) - System.Decimal+DecCalc:VarDecMul(byref,byref)
 ; Assembly listing for method System.Decimal+DecCalc:VarDecMul(byref,byref) (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX512 - Unix
 ; FullOpts code
 ; optimized code
 ; rbp based frame
 ; fully interruptible
 ; No PGO data
-; 0 inlinees with PGO data; 69 single block inlinees; 0 inlinees without PGO data
+; 0 inlinees with PGO data; 52 single block inlinees; 5 inlinees without PGO data
 ; Final local variable assignments
 ;
-;  V00 arg0         [V00,T02] ( 29, 17   )   byref  ->  rbx         single-def
-;  V01 arg1         [V01,T03] ( 25, 14   )   byref  ->  r15         single-def
-;  V02 loc0         [V02,T06] ( 13,  7   )     int  ->  registers  
-;  V03 loc1         [V03,T00] ( 47, 23.50)    long  ->  registers  
-;  V04 loc2         [V04,T01] ( 11, 19.50)     int  ->  rsi        
-;  V05 loc3         [V05    ] ( 17, 12   )  struct (24) [rbp-0x28]  do-not-enreg[XSF] addr-exposed ld-addr-op <System.Decimal+DecCalc+Buf24>
-;* V06 loc4         [V06,T16] (  0,  0   )    long  ->  zero-ref   
-;  V07 loc5         [V07,T08] (  6,  3   )    long  ->  registers  
-;  V08 loc6         [V08,T07] (  7,  3.50)    long  ->  rsi        
+;  V00 arg0         [V00,T01] ( 22, 13.50)   byref  ->  rbx         single-def
+;  V01 arg1         [V01,T02] ( 17, 10   )   byref  ->  r15         single-def
+;  V02 loc0         [V02,T04] ( 13,  7   )     int  ->  [rbp-0x14] 
+;  V03 loc1         [V03,T03] ( 16,  8   )    long  ->  registers   ld-addr-op
+;  V04 loc2         [V04,T00] ( 11, 19.50)     int  ->  rsi        
+;  V05 loc3         [V05    ] ( 13, 10   )  struct (24) [rbp-0x30]  do-not-enreg[XSF] addr-exposed ld-addr-op <System.Decimal+DecCalc+Buf24>
+;* V06 loc4         [V06,T23] (  0,  0   )    long  ->  zero-ref   
+;  V07 loc5         [V07,T07] (  6,  3   )    long  ->  registers  
+;  V08 loc6         [V08,T06] (  7,  3.50)    long  ->  rsi        
 ;* V09 loc7         [V09    ] (  0,  0   )    long  ->  zero-ref   
 ;* V10 loc8         [V10    ] (  0,  0   )  struct (16) zero-ref    ld-addr-op <System.ReadOnlySpan`1[ulong]>
-;  V11 loc9         [V11,T04] ( 17,  8.50)    long  ->  rdi        
-;  V12 loc10        [V12,T05] ( 14,  7   )     int  ->  rsi        
-;# V13 OutArgs      [V13    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
-;* V14 tmp1         [V14    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V15 tmp2         [V15    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V16 tmp3         [V16    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V17 tmp4         [V17    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V18 tmp5         [V18    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V19 tmp6         [V19    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V20 tmp7         [V20    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V21 tmp8         [V21    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V22 tmp9         [V22    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V23 tmp10        [V23    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V24 tmp11        [V24    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V25 tmp12        [V25    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V26 tmp13        [V26    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V27 tmp14        [V27    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V28 tmp15        [V28    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V29 tmp16        [V29    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V30 tmp17        [V30    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V31 tmp18        [V31    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V32 tmp19        [V32    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V33 tmp20        [V33    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
-;* V34 tmp21        [V34    ] (  0,  0   )  struct (16) zero-ref    "dup spill" <System.ValueTuple`2[ulong,ulong]>
-;* V35 tmp22        [V35    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V36 tmp23        [V36    ] (  0,  0   )  struct (16) zero-ref    "ReadOnlySpan<T> for CreateSpan<T>" <System.ReadOnlySpan`1[ulong]>
-;  V37 tmp24        [V37,T09] (  5,  2.50)    long  ->  rax         "Inline stloc first use temp"
-;* V38 tmp25        [V38    ] (  0,  0   )  struct (16) zero-ref    ld-addr-op "NewObj constructor temp" <System.ValueTuple`2[ulong,ulong]>
-;* V39 tmp26        [V39    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
-;* V40 tmp27        [V40    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V41 tmp28        [V41    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V42 tmp29        [V42    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V43 tmp30        [V43    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V44 tmp31        [V44    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V45 tmp32        [V45    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V46 tmp33        [V46    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V47 tmp34        [V47    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V48 tmp35        [V48    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V49 tmp36        [V49    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V50 tmp37        [V50    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V51 tmp38        [V51    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V52 tmp39        [V52    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V53 tmp40        [V53    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V54 tmp41        [V54    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;  V55 tmp42        [V55,T10] (  2,  2   )    long  ->  rax         "Inlining Arg"
-;* V56 tmp43        [V56    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
-;* V57 tmp44        [V57    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V58 tmp45        [V58    ] (  0,  0   )   byref  ->  zero-ref    single-def "field V10._reference (fldOffset=0x0)" P-INDEP
-;* V59 tmp46        [V59    ] (  0,  0   )     int  ->  zero-ref    "field V10._length (fldOffset=0x8)" P-INDEP
-;* V60 tmp47        [V60    ] (  0,  0   )    long  ->  zero-ref    "field V34.Item1 (fldOffset=0x0)" P-INDEP
-;* V61 tmp48        [V61    ] (  0,  0   )    long  ->  zero-ref    "field V34.Item2 (fldOffset=0x8)" P-INDEP
-;* V62 tmp49        [V62    ] (  0,  0   )   byref  ->  zero-ref    single-def "field V36._reference (fldOffset=0x0)" P-INDEP
-;* V63 tmp50        [V63    ] (  0,  0   )     int  ->  zero-ref    "field V36._length (fldOffset=0x8)" P-INDEP
-;* V64 tmp51        [V64    ] (  0,  0   )    long  ->  zero-ref    "field V38.Item1 (fldOffset=0x0)" P-INDEP
-;  V65 tmp52        [V65,T11] (  3,  1.50)    long  ->  rdi         "field V38.Item2 (fldOffset=0x8)" P-INDEP
-;  V66 cse0         [V66,T12] (  3,  1.50)     int  ->  rcx         "CSE #06: moderate"
-;  V67 cse1         [V67,T13] (  3,  1.50)     int  ->  rsi         "CSE #11: moderate"
-;  V68 cse2         [V68,T14] (  3,  1.50)     int  ->  rdi         "CSE #13: moderate"
-;  V69 cse3         [V69,T15] (  3,  1.50)     int  ->  rcx         "CSE #14: moderate"
+;  V11 loc9         [V11,T21] (  2,  1   )    long  ->  rsi         ld-addr-op
+;  V12 loc10        [V12,T22] (  2,  1   )    long  ->  rdi         ld-addr-op
+;  V13 loc11        [V13,T08] (  6,  3   )    long  ->  rax        
+;  V14 loc12        [V14,T05] ( 12,  6   )    long  ->  rsi        
+;# V15 OutArgs      [V15    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;* V16 tmp1         [V16    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
+;* V17 tmp2         [V17    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
+;* V18 tmp3         [V18    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
+;* V19 tmp4         [V19    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
+;* V20 tmp5         [V20    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
+;* V21 tmp6         [V21    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
+;* V22 tmp7         [V22    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
+;* V23 tmp8         [V23    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
+;* V24 tmp9         [V24    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
+;* V25 tmp10        [V25    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
+;* V26 tmp11        [V26    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
+;* V27 tmp12        [V27    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
+;* V28 tmp13        [V28    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
+;* V29 tmp14        [V29    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
+;* V30 tmp15        [V30    ] (  0,  0   )     int  ->  zero-ref    "impAppendStmt"
+;* V31 tmp16        [V31    ] (  0,  0   )  struct (16) zero-ref    "dup spill" <System.ValueTuple`2[ulong,ulong]>
+;* V32 tmp17        [V32    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
+;* V33 tmp18        [V33    ] (  0,  0   )  struct (16) zero-ref    "ReadOnlySpan<T> for CreateSpan<T>" <System.ReadOnlySpan`1[ulong]>
+;  V34 tmp19        [V34,T12] (  5,  2.50)    long  ->  rax         "Inline stloc first use temp"
+;* V35 tmp20        [V35    ] (  0,  0   )  struct (16) zero-ref    ld-addr-op "NewObj constructor temp" <System.ValueTuple`2[ulong,ulong]>
+;* V36 tmp21        [V36    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V37 tmp22        [V37    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V38 tmp23        [V38    ] (  0,  0   )    long  ->  zero-ref    "Inline return value spill temp"
+;* V39 tmp24        [V39    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;  V40 tmp25        [V40    ] (  2,  1   )    long  ->  [rbp-0x38]  do-not-enreg[X] addr-exposed ld-addr-op "Inline ldloca(s) first use temp"
+;  V41 tmp26        [V41,T09] (  3,  3   )    long  ->  rdi         "impAppendStmt"
+;* V42 tmp27        [V42    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V43 tmp28        [V43    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
+;* V44 tmp29        [V44    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V45 tmp30        [V45    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
+;* V46 tmp31        [V46    ] (  0,  0   )    long  ->  zero-ref    "Inline return value spill temp"
+;* V47 tmp32        [V47    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;  V48 tmp33        [V48    ] (  2,  1   )    long  ->  [rbp-0x40]  do-not-enreg[X] addr-exposed ld-addr-op "Inline ldloca(s) first use temp"
+;  V49 tmp34        [V49,T10] (  3,  3   )    long  ->  rdx         "impAppendStmt"
+;* V50 tmp35        [V50    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V51 tmp36        [V51    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
+;* V52 tmp37        [V52    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V53 tmp38        [V53    ] (  0,  0   )    long  ->  zero-ref    "Inline return value spill temp"
+;* V54 tmp39        [V54    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;  V55 tmp40        [V55    ] (  2,  1   )    long  ->  [rbp-0x48]  do-not-enreg[X] addr-exposed ld-addr-op "Inline ldloca(s) first use temp"
+;  V56 tmp41        [V56,T11] (  3,  3   )    long  ->  rax         "impAppendStmt"
+;* V57 tmp42        [V57    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V58 tmp43        [V58    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
+;* V59 tmp44        [V59    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
+;* V60 tmp45        [V60    ] (  0,  0   )    long  ->  zero-ref    "Inline return value spill temp"
+;* V61 tmp46        [V61    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;  V62 tmp47        [V62    ] (  2,  1   )    long  ->  [rbp-0x50]  do-not-enreg[X] addr-exposed ld-addr-op "Inline ldloca(s) first use temp"
+;  V63 tmp48        [V63,T13] (  2,  2   )    long  ->  rdx         "impAppendStmt"
+;* V64 tmp49        [V64    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
+;* V65 tmp50        [V65    ] (  0,  0   )    long  ->  zero-ref    "Inline return value spill temp"
+;* V66 tmp51        [V66    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;  V67 tmp52        [V67    ] (  2,  1   )    long  ->  [rbp-0x58]  do-not-enreg[X] addr-exposed ld-addr-op "Inline ldloca(s) first use temp"
+;  V68 tmp53        [V68,T14] (  2,  2   )    long  ->  rdx         "impAppendStmt"
+;* V69 tmp54        [V69    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V70 tmp55        [V70    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
+;* V71 tmp56        [V71    ] (  0,  0   )   byref  ->  zero-ref    single-def "field V10._reference (fldOffset=0x0)" P-INDEP
+;* V72 tmp57        [V72    ] (  0,  0   )     int  ->  zero-ref    "field V10._length (fldOffset=0x8)" P-INDEP
+;* V73 tmp58        [V73    ] (  0,  0   )    long  ->  zero-ref    "field V31.Item1 (fldOffset=0x0)" P-INDEP
+;* V74 tmp59        [V74    ] (  0,  0   )    long  ->  zero-ref    "field V31.Item2 (fldOffset=0x8)" P-INDEP
+;* V75 tmp60        [V75    ] (  0,  0   )   byref  ->  zero-ref    single-def "field V33._reference (fldOffset=0x0)" P-INDEP
+;* V76 tmp61        [V76    ] (  0,  0   )     int  ->  zero-ref    "field V33._length (fldOffset=0x8)" P-INDEP
+;* V77 tmp62        [V77    ] (  0,  0   )    long  ->  zero-ref    "field V35.Item1 (fldOffset=0x0)" P-INDEP
+;  V78 tmp63        [V78,T15] (  3,  1.50)    long  ->  rdi         "field V35.Item2 (fldOffset=0x8)" P-INDEP
+;  V79 cse0         [V79,T16] (  3,  1.50)    long  ->  rdi         "CSE #14: moderate"
+;  V80 cse1         [V80,T17] (  3,  1.50)     int  ->  rdi         "CSE #13: moderate"
+;  V81 cse2         [V81,T18] (  3,  1.50)     int  ->  rsi         "CSE #03: moderate"
+;  V82 cse3         [V82,T19] (  3,  1.50)     int  ->  rdi         "CSE #09: moderate"
+;  V83 cse4         [V83,T20] (  3,  1.50)     int  ->  rdx         "CSE #12: moderate"
 ;
-; Lcl frame size = 32
+; Lcl frame size = 80
 
 G_M55226_IG01:
        push     rbp
        push     r15
        push     rbx
-       sub      rsp, 32
-       lea      rbp, [rsp+0x30]
+       sub      rsp, 80
+       lea      rbp, [rsp+0x60]
        mov      rbx, rdi
        mov      r15, rsi
 						;; size=19 bbWeight=1 PerfScore 4.25
 G_M55226_IG02:
        mov      eax, dword ptr [rbx]
        add      eax, dword ptr [r15]
        shr      eax, 16
        movzx    rdx, al
        mov      eax, dword ptr [rbx+0x04]
        or       eax, dword ptr [rbx+0x0C]
        jne      G_M55226_IG08
 						;; size=23 bbWeight=1 PerfScore 11.75
 G_M55226_IG03:
        mov      eax, dword ptr [r15+0x04]
        or       eax, dword ptr [r15+0x0C]
-       jne      G_M55226_IG16
+       jne      G_M55226_IG11
        mov      edi, dword ptr [rbx+0x08]
        mov      esi, dword ptr [r15+0x08]
        imul     rdi, rsi
        cmp      edx, 28
        jle      SHORT G_M55226_IG06
        cmp      edx, 47
-       jg       G_M55226_IG22
+       jg       G_M55226_IG21
        add      edx, -29
        cmp      edx, 19
-       jae      G_M55226_IG24
+       jae      G_M55226_IG23
        mov      eax, edx
        mov      rdx, 0xD1FFAB1E      ; static handle
        mov      rsi, qword ptr [rdx+8*rax]
        mov      rax, rdi
        xor      edx, edx
        div      rdx:rax, rsi
        mov      rdx, rax
        imul     rdx, rsi
        sub      rdi, rdx
        mov      rdx, rax
        shr      rsi, 1
        cmp      rdi, rsi
        jb       SHORT G_M55226_IG05
        ja       SHORT G_M55226_IG04
        test     al, 1
        je       SHORT G_M55226_IG05
 						;; size=102 bbWeight=0.50 PerfScore 43.38
 G_M55226_IG04:
        lea      rdx, [rax+0x01]
 						;; size=4 bbWeight=0.50 PerfScore 0.25
 G_M55226_IG05:
        mov      eax, 28
        mov      rdi, rdx
        mov      edx, eax
 						;; size=10 bbWeight=0.50 PerfScore 0.38
 G_M55226_IG06:
        mov      qword ptr [rbx+0x08], rdi
        mov      edi, dword ptr [r15]
        xor      edi, dword ptr [rbx]
        and      edi, 0xD1FFAB1E
-       mov      esi, edx
-       shl      esi, 16
-       or       edi, esi
-       mov      dword ptr [rbx], edi
-						;; size=24 bbWeight=0.50 PerfScore 4.12
+       shl      edx, 16
+       or       edx, edi
+       mov      dword ptr [rbx], edx
+						;; size=22 bbWeight=0.50 PerfScore 4.00
 G_M55226_IG07:
-       add      rsp, 32
+       add      rsp, 80
        pop      rbx
        pop      r15
        pop      rbp
        ret      
 						;; size=9 bbWeight=0.50 PerfScore 1.38
 G_M55226_IG08:
        mov      edi, dword ptr [r15+0x04]
        or       edi, dword ptr [r15+0x0C]
-       jne      G_M55226_IG13
+       jne      SHORT G_M55226_IG09
+       mov      dword ptr [rbp-0x14], edx
+       mov      rdx, qword ptr [rbx+0x08]
        mov      edi, dword ptr [r15+0x08]
-       mov      esi, dword ptr [rbx+0x08]
-       imul     rdi, rsi
-       mov      dword ptr [rbp-0x28], edi
-       mov      esi, dword ptr [r15+0x08]
-       mov      eax, dword ptr [rbx+0x0C]
-       imul     rsi, rax
-       shr      rdi, 32
-       add      rdi, rsi
-       mov      dword ptr [rbp-0x24], edi
-       shr      rdi, 32
-       mov      esi, dword ptr [rbx+0x04]
-       test     esi, esi
-       je       G_M55226_IG18
+       lea      rsi, [rbp-0x40]
+       mulx     rdx, rax, rdi
+       mov      qword ptr [rsi], rax
+       mov      rdi, qword ptr [rbp-0x40]
+       mov      rax, rdx
+       mov      qword ptr [rbp-0x30], rdi
+       mov      edi, dword ptr [rbx+0x04]
+       test     edi, edi
+       je       G_M55226_IG16
        mov      eax, dword ptr [r15+0x08]
-       imul     rsi, rax
-       add      rsi, rdi
-       mov      rdi, rsi
-       mov      esi, 0xD1FFAB1E
-       cmp      rdi, rsi
-       jbe      G_M55226_IG18
-						;; size=92 bbWeight=0.50 PerfScore 15.25
+       imul     rax, rdi
+       add      rax, rdx
+       mov      edx, 0xD1FFAB1E
+       cmp      rax, rdx
+       ja       G_M55226_IG12
+       jmp      G_M55226_IG16
+						;; size=85 bbWeight=0.50 PerfScore 14.38
 G_M55226_IG09:
-       mov      qword ptr [rbp-0x20], rdi
+       mov      dword ptr [rbp-0x14], edx
+       mov      rdx, qword ptr [rbx+0x08]
+       mov      rax, qword ptr [r15+0x08]
+       lea      rdi, [rbp-0x48]
+       mulx     rax, rsi, rax
+       mov      qword ptr [rdi], rsi
+       mov      rdx, qword ptr [rbp-0x48]
+       mov      qword ptr [rbp-0x30], rdx
+       mov      edx, dword ptr [rbx+0x04]
+       mov      edi, dword ptr [r15+0x04]
+       mov      esi, edx
+       or       esi, edi
+       je       SHORT G_M55226_IG10
+       mov      esi, edx
+       imul     rsi, rdi
+       mov      rdx, qword ptr [rbx+0x08]
+       lea      rcx, [rbp-0x50]
+       mulx     rdx, r8, rdi
+       mov      qword ptr [rcx], r8
+       mov      rdi, qword ptr [rbp-0x50]
+       add      rsi, rdx
+       add      rax, rdi
+       lea      rdx, [rsi+0x01]
+       cmp      rax, rdi
+       cmovb    rsi, rdx
+       mov      rdx, qword ptr [r15+0x08]
+       mov      edi, dword ptr [rbx+0x04]
+       lea      rcx, [rbp-0x58]
+       mulx     rdx, r8, rdi
+       mov      qword ptr [rcx], r8
+       mov      rdi, qword ptr [rbp-0x58]
+       add      rsi, rdx
+       add      rax, rdi
+       lea      rdx, [rsi+0x01]
+       cmp      rax, rdi
+       cmovb    rsi, rdx
+       mov      qword ptr [rbp-0x28], rax
+       mov      qword ptr [rbp-0x20], rsi
+       mov      esi, 5
+       jmp      G_M55226_IG17
+       align    [0 bytes for IG18]
+						;; size=145 bbWeight=0.50 PerfScore 21.75
+G_M55226_IG10:
+       mov      qword ptr [rbp-0x28], rax
+       mov      esi, 3
+       jmp      G_M55226_IG17
+						;; size=14 bbWeight=0.50 PerfScore 1.62
+G_M55226_IG11:
+       mov      dword ptr [rbp-0x14], edx
+       mov      rdx, qword ptr [r15+0x08]
+       mov      edi, dword ptr [rbx+0x08]
+       lea      rsi, [rbp-0x38]
+       mulx     rdi, rax, rdi
+       mov      qword ptr [rsi], rax
+       mov      rsi, qword ptr [rbp-0x38]
+       mov      rax, rdi
+       mov      qword ptr [rbp-0x30], rsi
+       mov      esi, dword ptr [r15+0x04]
+       test     esi, esi
+       je       SHORT G_M55226_IG16
+       mov      eax, dword ptr [rbx+0x08]
+       imul     rax, rsi
+       add      rax, rdi
+       mov      edi, 0xD1FFAB1E
+       cmp      rax, rdi
+       jbe      SHORT G_M55226_IG16
+						;; size=61 bbWeight=0.50 PerfScore 10.38
+G_M55226_IG12:
+       mov      qword ptr [rbp-0x28], rax
        mov      esi, 3
 						;; size=9 bbWeight=0.50 PerfScore 0.62
-G_M55226_IG10:
-       lea      rdi, [rbp-0x28]
+G_M55226_IG13:
+       lea      rdi, [rbp-0x30]
+       mov      edx, dword ptr [rbp-0x14]
        mov      rax, 0xD1FFAB1E      ; code for System.Decimal+DecCalc:ScaleResult(ulong,uint,int):int
        call     [rax]System.Decimal+DecCalc:ScaleResult(ulong,uint,int):int
        mov      edx, eax
-						;; size=18 bbWeight=0.50 PerfScore 2.00
-G_M55226_IG11:
-       mov      rax, qword ptr [rbp-0x28]
+						;; size=21 bbWeight=0.50 PerfScore 2.50
+G_M55226_IG14:
+       mov      rax, qword ptr [rbp-0x30]
        mov      qword ptr [rbx+0x08], rax
-       mov      eax, dword ptr [rbp-0x20]
+       mov      eax, dword ptr [rbp-0x28]
        mov      dword ptr [rbx+0x04], eax
        mov      eax, dword ptr [r15]
        xor      eax, dword ptr [rbx]
        and      eax, 0xD1FFAB1E
        shl      edx, 16
        or       eax, edx
        mov      dword ptr [rbx], eax
 						;; size=31 bbWeight=0.50 PerfScore 5.50
-G_M55226_IG12:
-       add      rsp, 32
+G_M55226_IG15:
+       add      rsp, 80
        pop      rbx
        pop      r15
        pop      rbp
        ret      
 						;; size=9 bbWeight=0.50 PerfScore 1.38
-G_M55226_IG13:
-       mov      edi, dword ptr [rbx+0x08]
-       mov      eax, dword ptr [r15+0x08]
-       imul     rdi, rax
-       mov      dword ptr [rbp-0x28], edi
-       mov      eax, dword ptr [rbx+0x08]
-       mov      ecx, dword ptr [r15+0x0C]
-       imul     rax, rcx
-       shr      rdi, 32
-       add      rdi, rax
-       mov      eax, dword ptr [rbx+0x0C]
-       mov      ecx, dword ptr [r15+0x08]
-       imul     rax, rcx
-       add      rax, rdi
-       mov      dword ptr [rbp-0x24], eax
-       mov      rcx, rax
-       shr      rcx, 32
-       mov      rsi, rax
-       shr      rsi, 32
-       mov      r8, 0xD1FFAB1E
-       or       rsi, r8
-       cmp      rax, rdi
-       mov      rdi, rsi
-       cmovae   rdi, rcx
-       mov      eax, dword ptr [rbx+0x0C]
-       mov      ecx, dword ptr [r15+0x0C]
-       imul     rax, rcx
-       add      rax, rdi
-       mov      ecx, dword ptr [rbx+0x04]
-       mov      edi, dword ptr [r15+0x04]
-       mov      esi, ecx
-       or       esi, edi
-       je       G_M55226_IG15
-       mov      esi, dword ptr [rbx+0x08]
-       imul     rdi, rsi
-       add      rax, rdi
-       xor      esi, esi
-       mov      r8d, 1
-       cmp      rax, rdi
-       cmovb    esi, r8d
-       mov      edi, ecx
-       mov      ecx, dword ptr [r15+0x08]
-       imul     rdi, rcx
-       add      rax, rdi
-       mov      dword ptr [rbp-0x20], eax
-       lea      ecx, [rsi+0x01]
-       cmp      rax, rdi
-       cmovb    esi, ecx
-       mov      edi, esi
-       shl      rdi, 32
-       shr      rax, 32
-       or       rdi, rax
-       mov      eax, dword ptr [rbx+0x0C]
-       mov      esi, dword ptr [r15+0x04]
-       imul     rax, rsi
-       add      rax, rdi
-       xor      esi, esi
-       cmp      rax, rdi
-       cmovb    esi, r8d
-       mov      edi, dword ptr [rbx+0x04]
-       mov      ecx, dword ptr [r15+0x0C]
-       imul     rdi, rcx
-       add      rax, rdi
-       mov      dword ptr [rbp-0x1C], eax
-       lea      ecx, [rsi+0x01]
-       cmp      rax, rdi
-       cmovb    esi, ecx
-       mov      ecx, dword ptr [rbx+0x04]
-       mov      edi, dword ptr [r15+0x04]
-       imul     rcx, rdi
-       mov      edi, esi
-       shl      rdi, 32
-       shr      rax, 32
-       or       rax, rdi
-						;; size=253 bbWeight=0.50 PerfScore 35.75
-G_M55226_IG14:
-       add      rax, rcx
-       mov      qword ptr [rbp-0x18], rax
-       mov      esi, 5
-       jmp      SHORT G_M55226_IG19
-       align    [6 bytes for IG20]
-						;; size=20 bbWeight=0.50 PerfScore 1.75
-G_M55226_IG15:
-       mov      qword ptr [rbp-0x20], rax
-       mov      esi, 3
-       jmp      SHORT G_M55226_IG19
-						;; size=11 bbWeight=0.50 PerfScore 1.62
 G_M55226_IG16:
-       mov      eax, dword ptr [rbx+0x08]
-       mov      esi, dword ptr [r15+0x08]
-       imul     rax, rsi
        mov      dword ptr [rbp-0x28], eax
-       mov      ecx, dword ptr [rbx+0x08]
-       mov      edi, dword ptr [r15+0x0C]
-       imul     rcx, rdi
-       shr      rax, 32
-       add      rax, rcx
-       mov      dword ptr [rbp-0x24], eax
-       shr      rax, 32
-       mov      ecx, dword ptr [r15+0x04]
-       test     ecx, ecx
-       je       SHORT G_M55226_IG17
-       mov      edi, dword ptr [rbx+0x08]
-       imul     rcx, rdi
-       add      rcx, rax
-       mov      rax, rcx
-       mov      ecx, 0xD1FFAB1E
-       cmp      rax, rcx
-       mov      rdi, rax
-       jbe      SHORT G_M55226_IG18
-       jmp      G_M55226_IG09
-						;; size=78 bbWeight=0.50 PerfScore 13.38
-G_M55226_IG17:
-       mov      rdi, rax
-						;; size=3 bbWeight=0.25 PerfScore 0.06
-G_M55226_IG18:
-       mov      dword ptr [rbp-0x20], edi
        mov      esi, 2
 						;; size=8 bbWeight=0.50 PerfScore 0.62
-G_M55226_IG19:
-       lea      rax, [rbp-0x28]
-       movsxd   rcx, esi
+G_M55226_IG17:
+       lea      rax, [rbp-0x30]
+       mov      ecx, esi
        cmp      dword ptr [rax+4*rcx], 0
-       jne      SHORT G_M55226_IG21
-						;; size=13 bbWeight=0.50 PerfScore 2.38
-G_M55226_IG20:
+       jne      SHORT G_M55226_IG19
+						;; size=12 bbWeight=0.50 PerfScore 2.38
+G_M55226_IG18:
        test     esi, esi
-       je       SHORT G_M55226_IG22
+       je       SHORT G_M55226_IG21
        dec      esi
-       lea      rax, [rbp-0x28]
-       movsxd   rcx, esi
+       lea      rax, [rbp-0x30]
+       mov      ecx, esi
        cmp      dword ptr [rax+4*rcx], 0
-       je       SHORT G_M55226_IG20
-						;; size=19 bbWeight=4 PerfScore 25.00
-G_M55226_IG21:
+       je       SHORT G_M55226_IG18
+						;; size=18 bbWeight=4 PerfScore 25.00
+G_M55226_IG19:
        cmp      esi, 2
-       ja       G_M55226_IG10
+       ja       SHORT G_M55226_IG13
+       mov      edx, dword ptr [rbp-0x14]
        cmp      edx, 28
-       jle      G_M55226_IG11
-       jmp      G_M55226_IG10
-						;; size=23 bbWeight=0.50 PerfScore 2.25
-G_M55226_IG22:
+       jle      SHORT G_M55226_IG14
+						;; size=13 bbWeight=0.50 PerfScore 1.75
+G_M55226_IG20:
+       mov      dword ptr [rbp-0x14], edx
+       jmp      SHORT G_M55226_IG13
+						;; size=5 bbWeight=0.25 PerfScore 0.75
+G_M55226_IG21:
        vxorps   xmm0, xmm0, xmm0
        vmovdqu  xmmword ptr [rbx], xmm0
 						;; size=8 bbWeight=0.50 PerfScore 1.17
-G_M55226_IG23:
-       add      rsp, 32
+G_M55226_IG22:
+       add      rsp, 80
        pop      rbx
        pop      r15
        pop      rbp
        ret      
 						;; size=9 bbWeight=0.50 PerfScore 1.38
-G_M55226_IG24:
+G_M55226_IG23:
        call     CORINFO_HELP_RNGCHKFAIL
        int3     
 						;; size=6 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 801, prolog size 19, PerfScore 175.60, instruction count 237, allocated bytes for code 801 (MethodHash=34b92845) for method System.Decimal+DecCalc:VarDecMul(byref,byref) (FullOpts)
+; Total bytes of code 643, prolog size 19, PerfScore 156.54, instruction count 194, allocated bytes for code 643 (MethodHash=34b92845) for method System.Decimal+DecCalc:VarDecMul(byref,byref) (FullOpts)
-142 (-88.75 % of base) - System.Decimal+DecCalc:Div96By64(byref,ulong):uint
 ; Assembly listing for method System.Decimal+DecCalc:Div96By64(byref,ulong):uint (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX512 - Unix
 ; FullOpts code
 ; optimized code
 ; rbp based frame
-; fully interruptible
+; partially interruptible
 ; No PGO data
-; 0 inlinees with PGO data; 9 single block inlinees; 0 inlinees without PGO data
+; 0 inlinees with PGO data; 2 single block inlinees; 0 inlinees without PGO data
 ; Final local variable assignments
 ;
-;  V00 arg0         [V00,T03] (  9,  6   )   byref  ->  rdi         single-def
-;  V01 arg1         [V01,T01] ( 12, 21   )    long  ->  rsi         single-def
-;  V02 loc0         [V02,T00] ( 16, 29   )    long  ->   r8        
-;  V03 loc1         [V03,T05] (  3,  2.50)     int  ->  rcx        
-;  V04 loc2         [V04,T02] (  9, 18.50)     int  ->  rcx        
-;  V05 loc3         [V05,T06] (  4,  2   )     int  ->   r8        
-;  V06 loc4         [V06,T07] (  4,  2   )    long  ->   r9        
-;  V07 loc5         [V07,T08] (  3,  1.50)    long  ->  rax        
-;* V08 loc6         [V08    ] (  0,  0   )   byref  ->  zero-ref    single-def
-;* V09 loc7         [V09    ] (  0,  0   )  struct (16) zero-ref    <System.ValueTuple`2[ulong,ulong]>
-;# V10 OutArgs      [V10    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
-;  V11 tmp1         [V11,T09] (  3,  1.50)    long  ->  rax         "Inline stloc first use temp"
-;* V12 tmp2         [V12    ] (  0,  0   )  struct (16) zero-ref    ld-addr-op "NewObj constructor temp" <System.ValueTuple`2[ulong,ulong]>
-;* V13 tmp3         [V13    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
-;* V14 tmp4         [V14    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
-;* V15 tmp5         [V15    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;  V16 tmp6         [V16,T04] (  3,  3   )     int  ->  rax         "Single return block return value"
-;* V17 tmp7         [V17    ] (  0,  0   )    long  ->  zero-ref    "field V09.Item1 (fldOffset=0x0)" P-INDEP
-;* V18 tmp8         [V18    ] (  0,  0   )    long  ->  zero-ref    "field V09.Item2 (fldOffset=0x8)" P-INDEP
-;* V19 tmp9         [V19    ] (  0,  0   )    long  ->  zero-ref    "field V12.Item1 (fldOffset=0x0)" P-INDEP
-;  V20 tmp10        [V20,T11] (  2,  1   )    long  ->   r8         "field V12.Item2 (fldOffset=0x8)" P-INDEP
-;  V21 cse0         [V21,T10] (  3,  1.50)    long  ->  rcx         "CSE #02: moderate"
+;  V00 arg0         [V00,T00] (  5,  5   )   byref  ->  rdi         single-def
+;  V01 arg1         [V01,T01] (  3,  3   )    long  ->  rsi         single-def
+;* V02 loc0         [V02    ] (  0,  0   )    long  ->  zero-ref   
+;* V03 loc1         [V03    ] (  0,  0   )     int  ->  zero-ref   
+;* V04 loc2         [V04    ] (  0,  0   )     int  ->  zero-ref   
+;* V05 loc3         [V05    ] (  0,  0   )     int  ->  zero-ref   
+;* V06 loc4         [V06    ] (  0,  0   )    long  ->  zero-ref   
+;* V07 loc5         [V07    ] (  0,  0   )     int  ->  zero-ref   
+;* V08 loc6         [V08    ] (  0,  0   )    long  ->  zero-ref   
+;* V09 loc7         [V09    ] (  0,  0   )   byref  ->  zero-ref    single-def
+;* V10 loc8         [V10    ] (  0,  0   )  struct (16) zero-ref    multireg-ret <System.ValueTuple`2[ulong,ulong]>
+;# V11 OutArgs      [V11    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;* V12 tmp1         [V12    ] (  0,  0   )  struct (16) zero-ref    do-not-enreg[SBR] multireg-ret "Return value temp for multireg return" <System.ValueTuple`2[ulong,ulong]>
+;* V13 tmp2         [V13    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;  V14 tmp3         [V14,T02] (  2,  2   )    long  ->  rax         "field V10.Item1 (fldOffset=0x0)" P-INDEP
+;  V15 tmp4         [V15,T03] (  2,  2   )    long  ->  rdx         "field V10.Item2 (fldOffset=0x8)" P-INDEP
+;* V16 tmp5         [V16    ] (  0,  0   )    long  ->  zero-ref    "field V12.Item1 (fldOffset=0x0)" P-DEP
+;* V17 tmp6         [V17    ] (  0,  0   )    long  ->  zero-ref    "field V12.Item2 (fldOffset=0x8)" P-DEP
 ;
 ; Lcl frame size = 0
 
 G_M1421_IG01:
        push     rbp
        mov      rbp, rsp
 						;; size=4 bbWeight=1 PerfScore 1.25
 G_M1421_IG02:
-       mov      ecx, dword ptr [rdi+0x08]
-       test     ecx, ecx
-       jne      SHORT G_M1421_IG05
-						;; size=7 bbWeight=1 PerfScore 3.25
-G_M1421_IG03:
-       mov      r8, qword ptr [rdi]
-       cmp      r8, rsi
-       jb       SHORT G_M1421_IG08
-       mov      rax, r8
-       xor      edx, edx
+       mov      rax, qword ptr [rdi]
+       mov      edx, dword ptr [rdi+0x08]
        div      rdx:rax, rsi
-       imul     rsi, rax
-       sub      r8, rsi
-       mov      qword ptr [rdi], r8
-						;; size=26 bbWeight=0.50 PerfScore 34.00
-G_M1421_IG04:
-       pop      rbp
-       ret      
-						;; size=2 bbWeight=0.50 PerfScore 0.75
-G_M1421_IG05:
-       mov      r8, rsi
-       shr      r8, 32
-       cmp      ecx, r8d
-       jb       SHORT G_M1421_IG07
-       mov      r8, qword ptr [rdi]
-       mov      rax, rsi
-       shl      rax, 32
-       sub      r8, rax
-       xor      ecx, ecx
-       align    [0 bytes for IG06]
-						;; size=27 bbWeight=0.50 PerfScore 2.62
-G_M1421_IG06:
-       dec      ecx
-       add      r8, rsi
-       cmp      r8, rsi
-       jae      SHORT G_M1421_IG06
-       jmp      SHORT G_M1421_IG12
-						;; size=12 bbWeight=4 PerfScore 15.00
-G_M1421_IG07:
-       mov      r9, qword ptr [rdi+0x04]
-       mov      ecx, r8d
-       cmp      r9, rcx
-       jae      SHORT G_M1421_IG10
-						;; size=12 bbWeight=0.50 PerfScore 1.75
-G_M1421_IG08:
-       xor      eax, eax
-						;; size=2 bbWeight=0.50 PerfScore 0.12
-G_M1421_IG09:
+       mov      qword ptr [rdi], rdx
+						;; size=12 bbWeight=1 PerfScore 66.00
+G_M1421_IG03:
        pop      rbp
        ret      
-						;; size=2 bbWeight=0.50 PerfScore 0.75
-G_M1421_IG10:
-       mov      rax, r9
-       xor      edx, edx
-       div      rdx:rax, rcx
-       mov      ecx, eax
-       imul     r8d, ecx
-       mov      eax, r8d
-       sub      r9, rax
-       shl      r9, 32
-       mov      r8d, dword ptr [rdi]
-       or       r8, r9
-       mov      eax, ecx
-       mov      edx, esi
-       imul     rax, rdx
-       sub      r8, rax
-       not      rax
-       cmp      rax, r8
-       jae      SHORT G_M1421_IG12
-       align    [0 bytes for IG11]
-						;; size=49 bbWeight=0.50 PerfScore 35.62
-G_M1421_IG11:
-       dec      ecx
-       add      r8, rsi
-       cmp      r8, rsi
-       jae      SHORT G_M1421_IG11
-						;; size=10 bbWeight=4 PerfScore 7.00
-G_M1421_IG12:
-       mov      qword ptr [rdi], r8
-       mov      eax, ecx
-       jmp      SHORT G_M1421_IG04
-						;; size=7 bbWeight=0.50 PerfScore 1.62
+						;; size=2 bbWeight=1 PerfScore 1.50
 
-; Total bytes of code 160, prolog size 4, PerfScore 103.75, instruction count 63, allocated bytes for code 163 (MethodHash=4595fa72) for method System.Decimal+DecCalc:Div96By64(byref,ulong):uint (FullOpts)
+; Total bytes of code 18, prolog size 4, PerfScore 68.75, instruction count 8, allocated bytes for code 18 (MethodHash=4595fa72) for method System.Decimal+DecCalc:Div96By64(byref,ulong):uint (FullOpts)
-40 (-39.22 % of base) - System.Decimal+DecCalc:Div96By32(byref,uint):uint
 ; Assembly listing for method System.Decimal+DecCalc:Div96By32(byref,uint):uint (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX512 - Unix
 ; FullOpts code
 ; optimized code
 ; rbp based frame
 ; partially interruptible
 ; No PGO data
-; 0 inlinees with PGO data; 4 single block inlinees; 0 inlinees without PGO data
 ; Final local variable assignments
 ;
-;  V00 arg0         [V00,T00] (  9,  6   )   byref  ->  rdi         single-def
-;  V01 arg1         [V01,T01] (  6,  4   )     int  ->  rsi         single-def
-;  V02 loc0         [V02,T02] ( 11,  5.50)    long  ->  rcx        
-;  V03 loc1         [V03,T03] (  6,  3   )    long  ->  rax        
-;  V04 loc2         [V04,T05] (  3,  1.50)     int  ->  rax        
-;# V05 OutArgs      [V05    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
-;  V06 cse0         [V06,T04] (  6,  3   )    long  ->   r8         multi-def "CSE #01: aggressive"
+;  V00 arg0         [V00,T00] ( 11,  8   )   byref  ->  rdi         single-def
+;  V01 arg1         [V01,T01] (  6,  4.50)     int  ->  rsi         single-def
+;  V02 loc0         [V02,T02] (  6,  4   )     int  ->  rdx        
+;* V03 loc1         [V03    ] (  0,  0   )  struct ( 8) zero-ref    multireg-ret <System.ValueTuple`2[uint,uint]>
+;* V04 loc2         [V04    ] (  0,  0   )    long  ->  zero-ref   
+;* V05 loc3         [V05    ] (  0,  0   )    long  ->  zero-ref   
+;* V06 loc4         [V06    ] (  0,  0   )  struct (16) zero-ref    <System.ValueTuple`2[ulong,ulong]>
+;# V07 OutArgs      [V07    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;* V08 tmp1         [V08    ] (  0,  0   )  struct ( 8) zero-ref    do-not-enreg[SR] multireg-ret "Return value temp for multireg return" <System.ValueTuple`2[uint,uint]>
+;  V09 tmp2         [V09,T05] (  2,  2   )   byref  ->  rcx         single-def "impAppendStmt"
+;* V10 tmp3         [V10    ] (  0,  0   )  struct ( 8) zero-ref    do-not-enreg[SR] multireg-ret "Return value temp for multireg return" <System.ValueTuple`2[uint,uint]>
+;  V11 tmp4         [V11,T06] (  2,  2   )   byref  ->  rcx         single-def "impAppendStmt"
+;* V12 tmp5         [V12    ] (  0,  0   )  struct ( 8) zero-ref    do-not-enreg[SR] multireg-ret "Return value temp for multireg return" <System.ValueTuple`2[uint,uint]>
+;* V13 tmp6         [V13    ] (  0,  0   )   byref  ->  zero-ref    single-def "impAppendStmt"
+;  V14 tmp7         [V14,T03] (  6,  4   )     int  ->  rax         "field V03.Item1 (fldOffset=0x0)" P-INDEP
+;  V15 tmp8         [V15,T04] (  6,  4   )     int  ->  rdx         "field V03.Item2 (fldOffset=0x4)" P-INDEP
+;* V16 tmp9         [V16    ] (  0,  0   )    long  ->  zero-ref    "field V06.Item1 (fldOffset=0x0)" P-INDEP
+;* V17 tmp10        [V17    ] (  0,  0   )    long  ->  zero-ref    "field V06.Item2 (fldOffset=0x8)" P-INDEP
+;  V18 cse0         [V18,T07] (  3,  1.50)     int  ->  rax         "CSE #01: moderate"
 ;
 ; Lcl frame size = 0
 
 G_M12183_IG01:
        push     rbp
        mov      rbp, rsp
 						;; size=4 bbWeight=1 PerfScore 1.25
 G_M12183_IG02:
+       xor      edx, edx
        cmp      dword ptr [rdi+0x08], 0
-       je       SHORT G_M12183_IG06
-						;; size=6 bbWeight=1 PerfScore 4.00
+       jne      SHORT G_M12183_IG04
+						;; size=8 bbWeight=1 PerfScore 4.25
 G_M12183_IG03:
-       mov      rcx, qword ptr [rdi+0x04]
-       mov      r8d, esi
-       mov      rax, rcx
-       xor      edx, edx
-       div      rdx:rax, r8
-       mov      qword ptr [rdi+0x04], rax
-       imul     eax, esi
-       sub      rcx, rax
-       shl      rcx, 32
-       mov      eax, dword ptr [rdi]
-       or       rcx, rax
-       jne      SHORT G_M12183_IG08
-						;; size=36 bbWeight=0.50 PerfScore 35.38
-G_M12183_IG04:
+       mov      eax, dword ptr [rdi+0x04]
+       cmp      eax, esi
+       jae      SHORT G_M12183_IG05
+       mov      edx, eax
        xor      eax, eax
-						;; size=2 bbWeight=0.50 PerfScore 0.12
+       mov      dword ptr [rdi+0x04], eax
+       jmp      SHORT G_M12183_IG06
+						;; size=16 bbWeight=0.50 PerfScore 3.38
+G_M12183_IG04:
+       lea      rcx, bword ptr [rdi+0x08]
+       mov      eax, dword ptr [rdi+0x08]
+       xor      edx, edx
+       div      edx:eax, esi
+       mov      dword ptr [rcx], eax
+						;; size=13 bbWeight=0.50 PerfScore 14.38
 G_M12183_IG05:
-       pop      rbp
-       ret      
-						;; size=2 bbWeight=0.50 PerfScore 0.75
+       lea      rcx, bword ptr [rdi+0x04]
+       mov      eax, dword ptr [rdi+0x04]
+       div      edx:eax, esi
+       mov      dword ptr [rcx], eax
+						;; size=11 bbWeight=0.50 PerfScore 14.25
 G_M12183_IG06:
-       mov      rcx, qword ptr [rdi]
-       test     rcx, rcx
-       je       SHORT G_M12183_IG04
-       mov      r8d, esi
-       mov      rax, rcx
-       xor      edx, edx
-       div      rdx:rax, r8
-       mov      qword ptr [rdi], rax
-       imul     r8d, eax
-       mov      eax, ecx
-       sub      eax, r8d
-						;; size=31 bbWeight=0.50 PerfScore 34.25
-G_M12183_IG07:
-       pop      rbp
-       ret      
-						;; size=2 bbWeight=0.50 PerfScore 0.75
-G_M12183_IG08:
-       mov      rax, rcx
-       xor      edx, edx
-       div      rdx:rax, r8
+       mov      eax, dword ptr [rdi]
+       div      edx:eax, esi
        mov      dword ptr [rdi], eax
-       imul     esi, eax
-       mov      eax, ecx
-       sub      eax, esi
-						;; size=17 bbWeight=0.50 PerfScore 32.50
-G_M12183_IG09:
+       mov      eax, edx
+						;; size=8 bbWeight=1 PerfScore 28.25
+G_M12183_IG07:
        pop      rbp
        ret      
-						;; size=2 bbWeight=0.50 PerfScore 0.75
+						;; size=2 bbWeight=1 PerfScore 1.50
 
-; Total bytes of code 102, prolog size 4, PerfScore 109.75, instruction count 41, allocated bytes for code 102 (MethodHash=a751d068) for method System.Decimal+DecCalc:Div96By32(byref,uint):uint (FullOpts)
+; Total bytes of code 62, prolog size 4, PerfScore 67.25, instruction count 27, allocated bytes for code 62 (MethodHash=a751d068) for method System.Decimal+DecCalc:Div96By32(byref,uint):uint (FullOpts)
-26 (-11.21 % of base) - System.Decimal+DecCalc:VarDecCmpSub(byref,byref):int
 ; Assembly listing for method System.Decimal+DecCalc:VarDecCmpSub(byref,byref):int (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX512 - Unix
 ; FullOpts code
 ; optimized code
 ; rbp based frame
 ; fully interruptible
 ; No PGO data
-; 0 inlinees with PGO data; 8 single block inlinees; 0 inlinees without PGO data
+; 0 inlinees with PGO data; 7 single block inlinees; 1 inlinees without PGO data
 ; Final local variable assignments
 ;
 ;  V00 arg0         [V00,T08] (  5,  5   )   byref  ->  rdi         single-def
 ;  V01 arg1         [V01,T09] (  5,  5   )   byref  ->  rsi         single-def
-;  V02 loc0         [V02,T12] (  7,  4.50)     int  ->  rcx        
-;  V03 loc1         [V03,T11] (  8,  5.25)     int  ->  rax        
-;  V04 loc2         [V04,T04] (  8, 15   )    long  ->  registers  
+;  V02 loc0         [V02,T12] (  7,  4.50)     int  ->  rax        
+;  V03 loc1         [V03,T11] (  8,  5.25)     int  ->  rdx        
+;  V04 loc2         [V04,T05] (  7, 11   )    long  ->  registers   ld-addr-op
 ;  V05 loc3         [V05,T06] (  7, 11   )     int  ->  registers  
 ;  V06 loc4         [V06,T13] (  4,  2.50)    long  ->  registers  
 ;  V07 loc5         [V07,T14] (  4,  2.50)     int  ->  registers  
-;  V08 loc6         [V08,T17] (  3,  1.50)     int  ->  rax        
-;  V09 loc7         [V09,T18] (  3,  1.50)    long  ->  rax        
+;  V08 loc6         [V08,T17] (  3,  1.50)     int  ->  rcx        
+;  V09 loc7         [V09,T18] (  3,  1.50)    long  ->  rcx        
 ;* V10 loc8         [V10    ] (  0,  0   )     int  ->  zero-ref   
-;  V11 loc9         [V11,T05] (  3, 12   )    long  ->  r10        
-;  V12 loc10        [V12,T00] (  6, 24   )    long  ->  rdx        
-;* V13 loc11        [V13    ] (  0,  0   )  struct (16) zero-ref    ld-addr-op <System.ReadOnlySpan`1[uint]>
-;# V14 OutArgs      [V14    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
-;  V15 tmp1         [V15,T10] (  3,  6   )     int  ->  rax         "dup spill"
-;  V16 tmp2         [V16,T07] (  3,  8   )     int  ->  rax        
-;  V17 tmp3         [V17,T03] (  2, 16   )     int  ->   r9         "dup spill"
-;  V18 tmp4         [V18,T15] (  2,  2   )    long  ->  rdx         "impSpillLclRefs"
-;  V19 tmp5         [V19,T16] (  2,  2   )     int  ->  rdi         "impSpillLclRefs"
-;* V20 tmp6         [V20    ] (  0,  0   )  struct (16) zero-ref    "ReadOnlySpan<T> for CreateSpan<T>" <System.ReadOnlySpan`1[uint]>
-;* V21 tmp7         [V21    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V22 tmp8         [V22    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V23 tmp9         [V23    ] (  0,  0   )   byref  ->  zero-ref    "field V13._reference (fldOffset=0x0)" P-INDEP
-;* V24 tmp10        [V24    ] (  0,  0   )     int  ->  zero-ref    "field V13._length (fldOffset=0x8)" P-INDEP
-;* V25 tmp11        [V25    ] (  0,  0   )   byref  ->  zero-ref    "field V20._reference (fldOffset=0x0)" P-INDEP
-;* V26 tmp12        [V26    ] (  0,  0   )     int  ->  zero-ref    "field V20._length (fldOffset=0x8)" P-INDEP
-;  V27 cse0         [V27,T02] (  4, 16   )    long  ->  rax         "CSE #01: aggressive"
-;  V28 rat0         [V28,T01] (  7, 20.75)    long  ->   r9         "Widened IV V03"
+;  V11 loc9         [V11,T03] (  3, 12   )    long  ->  rdi        
+;* V12 loc10        [V12    ] (  0,  0   )  struct (16) zero-ref    ld-addr-op <System.ReadOnlySpan`1[uint]>
+;# V13 OutArgs      [V13    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;  V14 tmp1         [V14,T10] (  3,  6   )     int  ->  rdx         "dup spill"
+;  V15 tmp2         [V15,T07] (  3,  8   )     int  ->  rdx        
+;  V16 tmp3         [V16,T01] (  2, 16   )     int  ->   r9         "dup spill"
+;  V17 tmp4         [V17,T15] (  2,  2   )    long  ->  rcx         "impAppendStmt"
+;  V18 tmp5         [V18,T16] (  2,  2   )     int  ->  rdi         "impSpillLclRefs"
+;* V19 tmp6         [V19    ] (  0,  0   )  struct (16) zero-ref    "ReadOnlySpan<T> for CreateSpan<T>" <System.ReadOnlySpan`1[uint]>
+;* V20 tmp7         [V20    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V21 tmp8         [V21    ] (  0,  0   )    long  ->  zero-ref    "Inline return value spill temp"
+;* V22 tmp9         [V22    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;  V23 tmp10        [V23    ] (  2,  8   )    long  ->  [rbp-0x10]  do-not-enreg[X] addr-exposed ld-addr-op "Inline ldloca(s) first use temp"
+;  V24 tmp11        [V24,T02] (  2, 16   )    long  ->  rcx         "impAppendStmt"
+;* V25 tmp12        [V25    ] (  0,  0   )   byref  ->  zero-ref    "field V12._reference (fldOffset=0x0)" P-INDEP
+;* V26 tmp13        [V26    ] (  0,  0   )     int  ->  zero-ref    "field V12._length (fldOffset=0x8)" P-INDEP
+;* V27 tmp14        [V27    ] (  0,  0   )   byref  ->  zero-ref    "field V19._reference (fldOffset=0x0)" P-INDEP
+;* V28 tmp15        [V28    ] (  0,  0   )     int  ->  zero-ref    "field V19._length (fldOffset=0x8)" P-INDEP
+;  V29 cse0         [V29,T04] (  3, 12   )    long  ->  r10         "CSE #01: aggressive"
+;  V30 rat0         [V30,T00] (  7, 20.75)    long  ->   r9         "Widened IV V03"
 ;
-; Lcl frame size = 0
+; Lcl frame size = 8
 
 G_M9821_IG01:
        push     rbp
-       mov      rbp, rsp
-						;; size=4 bbWeight=1 PerfScore 1.25
+       push     rbx
+       push     rax
+       lea      rbp, [rsp+0x10]
+						;; size=8 bbWeight=1 PerfScore 3.50
 G_M9821_IG02:
-       mov      eax, dword ptr [rsi]
-       mov      ecx, eax
-       sar      ecx, 31
-       or       ecx, 1
-       sub      eax, dword ptr [rdi]
-       mov      rdx, qword ptr [rdi+0x08]
+       mov      edx, dword ptr [rsi]
+       mov      eax, edx
+       sar      eax, 31
+       or       eax, 1
+       sub      edx, dword ptr [rdi]
+       mov      rcx, qword ptr [rdi+0x08]
        mov      edi, dword ptr [rdi+0x04]
        mov      r8, qword ptr [rsi+0x08]
        mov      esi, dword ptr [rsi+0x04]
-       test     eax, eax
+       test     edx, edx
        je       SHORT G_M9821_IG07
 						;; size=30 bbWeight=1 PerfScore 15.25
 G_M9821_IG03:
-       sar      eax, 16
+       sar      edx, 16
        jns      SHORT G_M9821_IG04
+       neg      edx
        neg      eax
-       neg      ecx
-       mov      r9, rdx
-       mov      rdx, r8
+       mov      r9, rcx
+       mov      rcx, r8
        mov      r8, r9
        mov      r9d, esi
        mov      esi, edi
        mov      edi, r9d
 						;; size=26 bbWeight=0.50 PerfScore 1.75
 G_M9821_IG04:
-       mov      r9d, eax
+       mov      r9d, edx
        align    [0 bytes for IG05]
 						;; size=3 bbWeight=0.75 PerfScore 0.19
 G_M9821_IG05:
        cmp      r9d, 9
        jge      SHORT G_M9821_IG08
 						;; size=6 bbWeight=4 PerfScore 5.00
 G_M9821_IG06:
        cmp      r9d, 10
-       jae      G_M9821_IG18
-       mov      rax, 0xD1FFAB1E      ; static handle
-       mov      eax, dword ptr [rax+4*r9]
+       jae      SHORT G_M9821_IG17
+       mov      rdx, 0xD1FFAB1E      ; static handle
+       mov      edx, dword ptr [rdx+4*r9]
        jmp      SHORT G_M9821_IG09
-						;; size=26 bbWeight=2 PerfScore 11.00
+						;; size=22 bbWeight=2 PerfScore 11.00
 G_M9821_IG07:
-       mov      r11, rdx
-       jmp      SHORT G_M9821_IG13
+       mov      rdx, rcx
+       jmp      SHORT G_M9821_IG12
 						;; size=5 bbWeight=0.50 PerfScore 1.12
 G_M9821_IG08:
-       mov      eax, 0xD1FFAB1E
+       mov      edx, 0xD1FFAB1E
 						;; size=5 bbWeight=2 PerfScore 0.50
 G_M9821_IG09:
        mov      r10d, edx
-       mov      eax, eax
-       imul     r10, rax
-       shr      rdx, 32
-       mov      edx, edx
-       imul     rdx, rax
-       mov      r11, r10
-       shr      r11, 32
-       add      rdx, r11
-       mov      r10d, r10d
-       mov      r11, rdx
-       shl      r11, 32
-       add      r11, r10
+       lea      r11, [rbp-0x10]
+       mov      rdx, rcx
+       mulx     rcx, rbx, r10
+       mov      qword ptr [r11], rbx
+       mov      rdx, qword ptr [rbp-0x10]
        mov      edi, edi
-       imul     rax, rdi
-       shr      rdx, 32
-       add      rdx, rax
-       mov      eax, 0xD1FFAB1E
-       cmp      rdx, rax
-       jbe      SHORT G_M9821_IG12
-						;; size=65 bbWeight=4 PerfScore 48.00
+       imul     rdi, r10
+       add      rdi, rcx
+       mov      ecx, 0xD1FFAB1E
+       cmp      rdi, rcx
+       jbe      SHORT G_M9821_IG11
+						;; size=41 bbWeight=4 PerfScore 40.00
 G_M9821_IG10:
-       mov      eax, ecx
-						;; size=2 bbWeight=1 PerfScore 0.25
-G_M9821_IG11:
+       add      rsp, 8
+       pop      rbx
        pop      rbp
        ret      
-						;; size=2 bbWeight=1 PerfScore 1.50
-G_M9821_IG12:
-       mov      edi, edx
+						;; size=7 bbWeight=1 PerfScore 2.25
+G_M9821_IG11:
        add      r9d, -9
        test     r9d, r9d
-       jg       SHORT G_M9821_IG15
-						;; size=11 bbWeight=4 PerfScore 7.00
-G_M9821_IG13:
-       mov      eax, edi
-       sub      eax, esi
-       je       SHORT G_M9821_IG16
-       cmp      eax, edi
+       jg       SHORT G_M9821_IG14
+						;; size=9 bbWeight=4 PerfScore 6.00
+G_M9821_IG12:
+       mov      ecx, edi
+       sub      ecx, esi
+       je       SHORT G_M9821_IG15
+       cmp      ecx, edi
        jbe      SHORT G_M9821_IG10
 						;; size=10 bbWeight=0.50 PerfScore 1.38
-G_M9821_IG14:
-       neg      ecx
+G_M9821_IG13:
+       neg      eax
        jmp      SHORT G_M9821_IG10
 						;; size=4 bbWeight=0.50 PerfScore 1.12
+G_M9821_IG14:
+       mov      rcx, rdx
+       jmp      SHORT G_M9821_IG05
+						;; size=5 bbWeight=2 PerfScore 4.50
 G_M9821_IG15:
-       mov      rdx, r11
-       jmp      G_M9821_IG05
-						;; size=8 bbWeight=2 PerfScore 4.50
-G_M9821_IG16:
-       mov      rax, r11
-       sub      rax, r8
-       jne      SHORT G_M9821_IG17
-       xor      ecx, ecx
+       mov      rcx, rdx
+       sub      rcx, r8
+       jne      SHORT G_M9821_IG16
+       xor      eax, eax
        jmp      SHORT G_M9821_IG10
 						;; size=12 bbWeight=0.50 PerfScore 1.88
-G_M9821_IG17:
-       cmp      rax, r11
+G_M9821_IG16:
+       cmp      rcx, rdx
        jbe      SHORT G_M9821_IG10
-       jmp      SHORT G_M9821_IG14
+       jmp      SHORT G_M9821_IG13
 						;; size=7 bbWeight=0.50 PerfScore 1.62
-G_M9821_IG18:
+G_M9821_IG17:
        call     CORINFO_HELP_RNGCHKFAIL
        int3     
 						;; size=6 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 232, prolog size 4, PerfScore 103.31, instruction count 81, allocated bytes for code 232 (MethodHash=418ad9a2) for method System.Decimal+DecCalc:VarDecCmpSub(byref,byref):int (FullOpts)
+; Total bytes of code 206, prolog size 8, PerfScore 97.06, instruction count 75, allocated bytes for code 206 (MethodHash=418ad9a2) for method System.Decimal+DecCalc:VarDecCmpSub(byref,byref):int (FullOpts)

Larger list of diffs: https://gist.github.com/MihuBot/6fc131614f94bf73d7408685d0ed9b4a

@MihuBot
Owner Author

MihuBot commented Jul 15, 2024

@xtqqczze
