Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Guarded devirtualization assert with vector type #52864

Closed
AndyAyersMS opened this issue May 17, 2021 · 7 comments · Fixed by #53578
Closed

Guarded devirtualization assert with vector type #52864

AndyAyersMS opened this issue May 17, 2021 · 7 comments · Fixed by #53578
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@AndyAyersMS
Copy link
Member

This modified version of JIT\Regression\JitBlue\GitHub_21546 will fail if run with

COMPlus_TieredPGO=1
COMPlus_TC_QuickJitForLoops=1

with the assert

Assert failure(PID 51956 [0x0000caf4], Thread: 32372 [0x7e74]): Assertion failed '(type == varDsc->lvType) || (lvaIsImplicitByRefLocal(lnum) && fgGlobalMorph && (varDsc->lvType == TYP_BYREF)) || ((varDsc->lvType == TYP_STRUCT) && (genTypeSize(type) == varDsc->lvExactSize))' in 'System.Linq.Enumerable:TryGetLast(System.Collections.Generic.IEnumerable`1[Vector2],byref):System.Numerics.Vector2' during 'Indirect call transform' (IL size 131)

    File: C:\repos\runtime2\src\coreclr\jit\gentree.cpp Line: 6483

Looks like the indirect call transform is perhaps not properly handling vector types

// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System;
using System.Collections.Generic;
using System.Threading;
using System.Linq;
using System.Runtime.CompilerServices;

using Point = System.Numerics.Vector2;

namespace GitHub_21546
{
    public class test
    {
        static Point checkA;
        static Point checkB;
        static Point checkC;
        static int   returnVal;

        public const int DefaultSeed = 20010415;
        public static int Seed = Environment.GetEnvironmentVariable("CORECLR_SEED") switch
        {
            string seedStr when seedStr.Equals("random", StringComparison.OrdinalIgnoreCase) => new Random().Next(),
            string seedStr when int.TryParse(seedStr, out int envSeed) => envSeed,
            _ => DefaultSeed
        };

        [MethodImpl(MethodImplOptions.NoInlining)]
        static void check(Point a, Point b, Point c)
        {
            if (a != checkA)
            {
                Console.WriteLine($"A doesn't match. Should be {checkA} but is {a}");
                returnVal = -1;
            }
            if (b != checkB)
            {
                Console.WriteLine($"B doesn't match. Should be {checkB} but is {b}");
                returnVal = -1;
            }
            if (c != checkC)
            {
                Console.WriteLine($"C doesn't match. Should be {checkC} but is {c}");
                returnVal = -1;
            }
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        static void FailureCase(List<Point> p)
        {
            Point p1 = p[0];
            Point p2 = p.Last();

            check(p1, p[1], p2);
            check(p1, p[1], p2);
            check(p1, p[1], p2);
        }

        static Point NextPoint(Random random)
        {
            return new Point(
                (float)random.NextDouble(),
                (float)random.NextDouble()
            );
        }

        static int Main()
        {
            returnVal     = 100;
            Random random = new Random(Seed);

            for (int i = 0; i < 50; i++)
            {
                List<Point> p = new List<Point>();

                checkA = NextPoint(random);
                p.Add(checkA);
                checkB = NextPoint(random);
                p.Add(checkB);
                checkC = NextPoint(random);
                p.Add(checkC);

                FailureCase(p);

                Thread.Sleep(15);
            }

            for (int i = 0; i < 50; i++)
            {
                List<Point> p = new List<Point>();

                checkA = NextPoint(random);
                p.Add(checkA);
                checkB = NextPoint(random);
                p.Add(checkB);
                checkC = NextPoint(random);
                p.Add(checkC);

                FailureCase(p);
            }

            return returnVal;
        }
    }
}
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label May 17, 2021
@AndyAyersMS AndyAyersMS added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI and removed untriaged New issue has not been triaged by the area owner labels May 17, 2021
@AndyAyersMS AndyAyersMS added this to the 6.0.0 milestone May 17, 2021
@AndyAyersMS
Copy link
Member Author

We have

***** BB07
STMT00022 (IL 0x034...0x03D)
               [000085] &-C-G-------              *  CALLV stub struct System.Collections.Generic.IList`1[Vector2][System.Numerics.Vector2].get_Item (exactContextHnd=0x00007FF7AF0CB041)
               [000081] ------------ this in rcx  +--*  LCL_VAR   ref    V03 loc1         
               [000084] ------------ arg1         \--*  SUB       int   
               [000082] ------------                 +--*  LCL_VAR   int    V04 loc2         
               [000083] ------------                 \--*  CNS_INT   int    1

***** BB07
STMT00023 (IL   ???...  ???)
               [000087] --C---------              *  RETURN    simd8 
               [000086] --C---------              \--*  RET_EXPR  struct(inl return expr [000085])

and we want to expand the call for guarded devirt. This requires cloning the call. Since the call is an inline candidate, before we can clone it, we first need to modify the RET_EXPR since it's normally kept in a 1-1 relationship with the call. We do this by changing the RET_EXPR to a suitable temp. So effectively we want to create:

STMT00022 
   ASG(temp, CALL(...))

STMT00023
   RETURN(temp)

The assert here is that we're not getting the right typed temp for a SIMD8 returning call.

Various changes I've made to try and fix this haven't helped. @sandreenko can you take a look?

@sandreenko
Copy link
Contributor

Will do.

AndyAyersMS added a commit to AndyAyersMS/runtime that referenced this issue May 18, 2021
If the sovler wants to set the edge weight below zero, set it to
zero if within slop, or disallow if not.

Addresses assert seen in dotnet#52864.
@VincentBu
Copy link
Contributor

Failed in runtime-coreclr jitstress-isas-x86 20210522.1

Failed test:

JIT/Regression/JitBlue/GitHub_27405/GitHub_27405/GitHub_27405.sh

Error message:

Assert failure(PID 67729 [0x00010891], Thread: 2682281 [0x28eda9]): Assertion failed '(type == varDsc->lvType) || (lvaIsImplicitByRefLocal(lnum) && fgGlobalMorph && (varDsc->lvType == TYP_BYREF)) || ((varDsc->lvType == TYP_STRUCT) && (genTypeSize(type) == varDsc->lvExactSize))' in 'Program:Test2(System.Numerics.Vector4,System.Numerics.Vector4):System.Numerics.Vector4' during 'Lowering nodeinfo' (IL size 39)

File: /Users/runner/work/1/s/src/coreclr/jit/gentree.cpp Line: 6512
Image: /private/tmp/helix/working/9C4708A3/p/corerun
task_for_pid(67729) FAILED 5 (os/kern) failure
/private/tmp/helix/working/9C4708A3/w/A0B708C8/e/JIT/Regression/JitBlue/GitHub_27405/GitHub_27405/GitHub_27405.sh: line 375: 67729 Abort trap: 6           (core dumped) $LAUNCHER $ExePath "${CLRTestExecutionArguments[@]}"

Return code:      1
Raw output file:      /private/tmp/helix/working/9C4708A3/w/A0B708C8/e/JIT/Regression/Reports/JIT.Regression/JitBlue/GitHub_27405/GitHub_27405/GitHub_27405.output.txt
Raw output:
BEGIN EXECUTION
/tmp/helix/working/9C4708A3/p/corerun GitHub_27405.dll ''
Expected: 100
Actual: 134
END EXECUTION - FAILED
Test Harness Exitcode is : 1
To run the test:

set CORE_ROOT=/tmp/helix/working/9C4708A3/p
/private/tmp/helix/working/9C4708A3/w/A0B708C8/e/JIT/Regression/JitBlue/GitHub_27405/GitHub_27405/GitHub_27405.sh
Expected: True
Actual:   False


Stack trace
   at JIT_Regression._JitBlue_GitHub_27405_GitHub_27405_GitHub_27405_._JitBlue_GitHub_27405_GitHub_27405_GitHub_27405_sh()

@AndyAyersMS
Copy link
Member Author

Suspect the failure above is a different issue? GDV is not exercised much in the normal CI.

@sandreenko sandreenko self-assigned this May 27, 2021
@AndyAyersMS
Copy link
Member Author

Also perhaps related. Windows arm64 failure seen here

set COMPlus_TieredCompilation=1
set COMPlus_ReadyToRun=0
set COMPlus_TC_QuickJitForLoops=1
set COMPlus_TieredPGO=1

    JIT\SIMD\Vector3Interop_ro\Vector3Interop_ro.cmd [FAIL]
      
      Return code:      1
      Raw output file:      D:\h\w\A840097A\w\B0C409C0\e\JIT\SIMD\Reports\JIT.SIMD\Vector3Interop_ro\Vector3Interop_ro.output.txt
      Raw output:
      BEGIN EXECUTION
       "D:\h\w\A840097A\p\corerun.exe"  Vector3Interop_ro.dll 
      nativeCall_PInvoke_CheckVector3Size: sizeof(Vector3) == 12
      nativeCall_PInvoke_Vector3Arg:
          iVal 123
          sumOfEles(1.000000, 2.000000, 3.000000) = 6.000000
          str  abcdefg
          sumOfEles(10.000000, 11.000000, 12.000000) = 33.000000
      nativeCall_PInvoke_Vector3Arg_Unix:
          v3f32_xmm0: 1.000000 2.000000 3.000000
          f32_xmm2 - f32_xmm7: 100.000000 101.000000 102.000000 103.000000 104.000000 105.000000
          f32_mem0: 106.000000
          v3f32_mem1: 10.000000 20.000000 30.000000
          f32_mem2-3: 107.000000 108.000000
          sum = 1002.000000
      nativeCall_PInvoke_Vector3Arg_Unix2:
          v3f32_xmm0: 1.000000 2.000000 3.000000
          f32_xmm2 - f32_xmm7: 100.000000 101.000000 102.000000 103.000000 104.000000 105.000000
          f32_mem0: 106.000000
          v3f32_mem1: 4.000000 5.000000 6.000000
          f32_mem2-3: 107.000000 108.000000
          v3f32_mem4: 7.000000 8.000000 9.000000
          f32_mem5: 109.000000
          sum = 1090.000000
      nativeCall_PInvoke_Vector3Ret:
          Return value: (1.000000, 2.000000, 3.000000)
          Sum of return scalar values = 6.000000
      nativeCall_PInvoke_Vector3Array
          arrEle[0]: 1.000000 2.000000 3.000000
          arrEle[1]: 5.000000 6.000000 7.000000
          Sum = 24.000000
      nativeCall_PInvoke_Vector3InStruct
          First struct memeber: (1.000000 2.000000 3.000000) -> (2.000000 3.000000 4.000000)
          Second struct member: (5.000000 6.000000 7.000000) -> (6.000000 7.000000 8.000000)
          Sum of all return scalar values = 30.000000
      nativeCall_PInvoke_Vector3InComplexStruct
          Arg ival: 99
          Arg Vector3 v1: (1.000000 2.000000 3.000000)
          Arg Vector3 v2: (5.000000 6.000000 7.000000)
          Arg Vector3 v3: (10.000000 20.000000 30.000000)
          Arg string arg: arg_string
          Return ival: 100
          Return Vector3 v1: (2.000000 3.000000 4.000000)
          Return Vector3 v2: (6.000000 7.000000 8.000000)
          Return Vector3 v3: (11.000000 21.000000 31.000000)
          Return string arg: ret_string
          Sum of all return float scalar values = 93.000000
          Managed ival: 100
          Managed Vector3 v1: (2 3 4)
          Managed Vector3 v2: (6 7 8)
          Managed Vector3 v3: (11 21 31)
          Managed string arg: ret_string
          Managed Sum = 735
      All PInvoke testcases passed
      callBack_RPInvoke_Vector3Arg:
          iVal 123
          Sum0,1,2 = 14, 1400, 140
          str abcdefg
      callBack_RPInvoke_Vector3Arg_Unix:
          1, 2, 3
          10, 20, 30
          Sum0,1,2,3 = 14, NaN, NaN, 936
      RPInvoke Vector3Arg_Unix test failed

@sandreenko
Copy link
Contributor

Ok, there are 3 pieces that need to be fixed:

  1. gtNewLclvNode should take struct handle so we do something better then type == varDsc->lvType when structs are involved;
  2. with 1. we can improve (varDsc->lvType == TYP_STRUCT) && (genTypeSize(type) == varDsc->lvExactSize)) to (varTypeIsStruct(varDsc->lvType) && (structSize == varDsc->lvExactSize))
  3. and an alternative solution is to type call node for static Vector2 ReturnSIMD2() as SIMD8 earlier, could be valuable, but note that it will be changed to long for ABI purposes in lowering.

should not take long to fix, will try to compose a PR today.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jun 2, 2021
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jun 4, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Jul 4, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants