How to Debug Crossgen2

Crossgen2 brings with it a number of new challenges for debugging the compilation process. Fortunately, in addition to challenges, crossgen2 is designed to enhance various parts of the debugging experience.

Important concerns to be aware of when debugging Crossgen2

  • Other than the JIT, Crossgen2 is a managed application
  • By default Crossgen2 uses a multi-core compilation strategy
  • A Crossgen2 process will have 2 copies of the JIT in the process at the same time: the one used to compile the target, and the one used to compile Crossgen2 itself.
  • Crossgen2 does not parse environment variables for controlling the JIT (or any other behavior); all behavior is controlled via the command line
  • The Crossgen2 command line as generated by the project system is quite complex

Built in debugging aids in Crossgen2

  • When debugging a multi-threaded component of Crossgen2 and not investigating a multi-threading issue itself, it is generally advisable to disable the use of multiple threads. To do this use the --parallelism 1 switch to specify that the maximum parallelism of the process shall be 1.

  • When debugging the behavior of compiling a single method, the compiler may be instructed to only compile a single method. This is done via the various --singlemethod options

    • These options work by specifying a specific method by type, method name, generic method arguments, and if those are insufficiently descriptive to uniquely identify the method, by index. Types are described using the same format that the managed Type.GetType(string) function uses, which is documented in the normal .NET documentation. As this format can be quite verbose, the compiler provides a --print-repro-instructions switch which will print the arguments necessary to compile a function to the console.
    • --singlemethodindex is used in cases where the method signature is the only distinguishing factor about the method. An index is used instead of a series of descriptive arguments, as specifying a signature exactly is extraordinarily complicated.
    • Repro args will look like the following: --singlemethodtypename "Internal.Runtime.CompilerServices.Unsafe" --singlemethodname As --singlemethodindex 2 --singlemethodgenericarg "System.Runtime.Intrinsics.Vector256`1[[System.SByte]]" --singlemethodgenericarg "System.Runtime.Intrinsics.Vector256`1[[System.Double]]"
  • Since Crossgen2 is by default multi-threaded, it produces results fairly quickly even when compiling using a Debug variant of the JIT. In general, when debugging JIT issues we recommend using the debug JIT regardless of which environment caused a problem.

  • Crossgen2 supports nearly arbitrary cross-targeting, including OS and architecture cross-targeting. The only restriction is that a 32-bit architecture cannot compile code targeting a 64-bit architecture. This allows the use of the debugging environment most convenient to the developer. In particular, if there is an issue which crosses the managed/native boundary, it is often convenient to debug using the mixed mode debugger on Windows X64.

    • If the correct set of assemblies and command line arguments is passed to the compiler, Crossgen2 should produce binary identical output on all platforms.
    • The compiler does not check the OS/Architecture specified for input assemblies, which allows compiling using a non-architecture/OS matched version of the framework to target an arbitrary target. While this isn't useful for diagnosing all issues, it can be cheaply used to identify the general behavior of a change on the full swath of supported architectures.

Control compilation behavior by using the --targetos and --targetarch switches. The default behavior is to target crossgen2's own OS/Arch pair, but all 64-bit versions of crossgen2 are capable of targeting arbitrary OS/Arch combinations. At the time of writing, the supported sets of valid arguments are:

Command line arguments
--targetos windows --targetarch x86
--targetos windows --targetarch x64
--targetos windows --targetarch arm
--targetos windows --targetarch arm64
--targetos linux --targetarch x64
--targetos linux --targetarch arm
--targetos linux --targetarch arm64
--targetos osx --targetarch x64
--targetos osx --targetarch arm64
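As a minimal sketch of the table above, the listed pairs can be captured in a small Python helper that emits the two switches and rejects any combination that is not listed. The names `SUPPORTED_TARGETS` and `target_args` are hypothetical; the set is copied from the table as it stood at the time of writing and is not queried from crossgen2 itself.

```python
# Hypothetical helper: mirrors the table of --targetos/--targetarch pairs above.
# The set is copied from the document, not read from crossgen2.
SUPPORTED_TARGETS = {
    ("windows", "x86"), ("windows", "x64"), ("windows", "arm"), ("windows", "arm64"),
    ("linux", "x64"), ("linux", "arm"), ("linux", "arm64"),
    ("osx", "x64"), ("osx", "arm64"),
}


def target_args(target_os, target_arch):
    """Return the two command line switches for a listed OS/architecture pair."""
    if (target_os, target_arch) not in SUPPORTED_TARGETS:
        raise ValueError(f"pair not in the documented list: {target_os}/{target_arch}")
    return ["--targetos", target_os, "--targetarch", target_arch]


print(" ".join(target_args("linux", "arm64")))
```

Failing early on an unlisted pair keeps a typo in a script from turning into a confusing compiler error later in the run.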
  • Passing special jit behavior flags to the compiler is done via the --codegenopt switch. For example, to turn on tailcall loop optimizations and dump all code compiled, pass the switch twice: --codegenopt NgenDump=* --codegenopt TailCallLoopOpt=1.

  • When using the NgenDump feature of the JIT, disable parallelism as described above or specify a single method to be compiled. Otherwise, output from multiple functions will be interleaved and inscrutable.

  • Since there are 2 jits in the process, when debugging in the JIT, if the source files match up, there is a decent chance that a native debugger will stop at unfortunate and unexpected locations. This is extremely annoying, and to combat this, we generally recommend making a point of using a runtime which doesn't exactly match that of the crossgen2 in use. However, if that isn't feasible, it is also possible to disable symbol loading in most native debuggers. For instance, in Visual Studio, one would use the "Specify excluded modules" feature.

  • Crossgen2 identifies the JIT to use by means of a naming convention. By default it will use a JIT located in the same directory as the crossgen2.dll file. In addition, there is support for a --jitpath switch to use a specific JIT. This option is intended to support A/B testing by the JIT team. The --jitpath option should only be used if the jit interface has not been changed. The JIT specified by the --jitpath switch must be compatible with the current settings of the --targetos and --targetarch switches.

  • In parallel to the crossgen2 project, there is a tool known as r2rdump. This tool can be used to dump the contents of a produced image to examine what was actually produced in the final binary. It has a large number of options to control exactly what is dumped, but in general it is able to dump any image produced by crossgen2 and display its contents in a human readable fashion. Specify --disasm to display disassembly.

  • If there is a need to debug the dependency graph of crossgen2 (which is a very rare need at this time), there is a visualization tool located in the corert repo. https://github.com/dotnet/corert/tree/master/src/ILCompiler.DependencyAnalysisFramework/ILCompiler-DependencyGraph-Viewer To use that tool, get the sources from the CoreRT repo, compile it, and run it on Windows before the crossgen2 compilation begins. It will present a live view of the graph as it is generated and allow for exploration to determine why some node is in the graph. Every node in the graph has a unique id that is visible to this tool, and it can be used in parallel with a debugger to understand what is happening in the crossgen2 process. Changes to move this tool to a more commonly built location and improve the fairly horrible UI are encouraged.

  • When used in the official build system, the set of arguments passed to crossgen2 is extremely complex, especially with regards to the set of reference paths (each assembly is specified individually). To make it easier to use crossgen2 from the command line manually, the tool will accept wildcards in its parsing of references. Please note that on Unix the shell will happily expand these arguments by itself, which will not work correctly. In those situations enclose the argument in quotes to prevent the shell expansion.

  • Crossgen2 supports the --map and --mapcsv arguments to produce map files of the produced output. These are primarily used for diagnosing size issues, as they describe the generated file in fairly high detail, as well as providing a number of interesting statistics about the produced output.

  • Diagnosing why a specific method failed to compile in crossgen2 can be done by passing the --verbose switch to crossgen2. This will print many things, but in particular it will print the reason why a compilation was aborted due to an R2R format limitation.

  • Crossgen2 can use either the version of dotnet that is used to build the product (as found by the dotnet.cmd or dotnet.sh script found in the root of the runtime repo) or it can use a sufficiently recent corerun.exe produced by constructing a test build. If using corerun.exe, it is strongly recommended to use a release build of corerun for this purpose, as crossgen2 runs a very large amount of managed code. The version of corerun used does not need to come from the same build as the crossgen2.dll that is being debugged. In fact, I would recommend using a different enlistment to build that corerun to avoid any confusion.

  • In the runtime testbed, each test can be commanded to compile with crossgen2 by using environment variables. Just set the RunCrossgen2 variable to 1, and optionally set the CompositeBuildMode variable to 1 if you wish to see the R2R behavior with composite image creation. By default this will simply use dotnet to run crossgen2. If you run the test batch script from the root of the enlistment on Windows this will just work; otherwise, you must set the __TestDotNetCmd environment variable to point at a copy of dotnet or corerun that can run crossgen2. This is often the easiest way to run a simple test with crossgen2 for developers practiced in the CoreCLR testbed. See the example below of various techniques to use when diagnosing issues under crossgen2.

  • When attempting to build crossgen2, you must build the clr.tools subset. If rebuilding a component of the JIT and wanting to use that in your inner loop, you must also build either the clr.jit or clr.alljits subset. If the jit interface is changed, the clr.runtime subset must also be rebuilt.

  • After completion of a product build, a functional copy of crossgen2.dll will be located in a bin directory in a path like bin\coreclr\windows.x64.Debug\crossgen2. After creating a test native layout via a command such as src\tests\build generatelayoutonly, there will be a copy of crossgen2 located in the %CORE_ROOT%\crossgen2 directory. The version of crossgen2 in the test core_root directory will have the appropriate files for running under either an x64 dotnet.exe or under the target architecture. This was done to make it somewhat easier to do cross platform development, and assumes the primary development machine is x64.
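The advice above about quoting wildcard references on Unix can be sketched in Python. This is only an illustration: it uses `shlex.quote` from the standard library to produce a token that a POSIX shell passes through unexpanded, so that crossgen2 sees the wildcard itself. The function name `quoted_reference` and the example paths are hypothetical; the `-r:` form is the one shown in the response file later in this document.

```python
import shlex


def quoted_reference(pattern):
    """Return a -r: argument quoted so a POSIX shell leaves the wildcard alone."""
    return shlex.quote("-r:" + pattern)


# Hypothetical paths; substitute the real framework directory.
print(quoted_reference("/path/to/Core_Root/System.*.dll"))
print(quoted_reference("/path/to/Core_Root/mscorlib.dll"))
```

The first argument comes back wrapped in single quotes because it contains `*`, while the second contains only shell-safe characters and is returned unchanged.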

Example of debugging a test application in Crossgen2

This example demonstrates debugging a simple test in the CoreCLR testbed.

The example assumes that CORE_ROOT is set appropriately, and that __TestDotNetCmd is set appropriately. See comments above for details on __TestDotNetCmd.

The test begins by setting RunCrossgen2=1. This instructs the test batch script to run crossgen2 on the input binaries. It will also create a copy of the input binaries, which needs to be deleted if you modify the test. To do so, delete the directory where the test binaries are, and rebuild the test.

C:\git2\runtime>set RunCrossgen2=1

C:\git2\runtime>c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\Complex1.cmd
BEGIN EXECUTION
Complex1.dll
        1 file(s) copied.
Could Not Find c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\Complex1.dll.rsp
Response file: c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll.rsp
c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\IL-CG2\Complex1.dll
-o:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll
--targetarch:x64
--verify-type-and-field-layout
-O
-r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\System.*.dll
-r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\Microsoft.*.dll
-r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\mscorlib.dll
-r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\netstandard.dll
" "dotnet" "c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\crossgen2\crossgen2.dll" @"c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll.rsp"   -r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\IL-CG2\*.dll"
Emitting R2R PE file: c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll
 "c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\corerun.exe" Complex1.dll
Starting...
Everything Worked!
Expected: 100
Actual: 100
END EXECUTION - PASSED
PASSED

From that invocation you can see that crossgen2 was launched with a response file containing a list of arguments including all of the details for references. Then you can manually run the actual crossgen2 command, which is prefixed with the value of the __TestDotNetCmd environment variable. For instance, once I saw the above output, I copied and pasted the last command, and ran it.

C:\git2\runtime>"dotnet" "c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\crossgen2\crossgen2.dll" @"c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll.rsp"   -r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\IL-CG2\*.dll
C:\git2\runtime\.dotnet
Emitting R2R PE file: c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll

I then wanted to debug an individual method compilation, so I ran it with the --print-repro-instructions switch.

C:\git2\runtime>"dotnet" "c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\crossgen2\crossgen2.dll" @"c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll.rsp"   -r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\IL-CG2\*.dll --print-repro-instructions
C:\git2\runtime\.dotnet
Single method repro args:--singlemethodtypename "Complex,Complex1" --singlemethodname mul_em --singlemethodindex 1
Single method repro args:--singlemethodtypename "Complex_Array_Test,Complex1" --singlemethodname .ctor --singlemethodindex 1
Single method repro args:--singlemethodtypename "Complex_Array_Test,Complex1" --singlemethodname Main --singlemethodindex 1
Emitting R2R PE file: c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll
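When a larger assembly prints many of these lines, it can help to collect them with a small script. The following is a sketch only, assuming the output keeps the `Single method repro args:` prefix shown above; the function name `repro_args` is hypothetical. It groups the tokenized arguments by method name, since overloads share a name and differ only by index.

```python
import shlex

PREFIX = "Single method repro args:"

# Sample lines copied from the transcript above.
SAMPLE = '''Single method repro args:--singlemethodtypename "Complex,Complex1" --singlemethodname mul_em --singlemethodindex 1
Single method repro args:--singlemethodtypename "Complex_Array_Test,Complex1" --singlemethodname .ctor --singlemethodindex 1
Single method repro args:--singlemethodtypename "Complex_Array_Test,Complex1" --singlemethodname Main --singlemethodindex 1
Emitting R2R PE file: Complex1.dll'''


def repro_args(output):
    """Map each method name to the list of repro argument lists printed for it."""
    found = {}
    for line in output.splitlines():
        if not line.startswith(PREFIX):
            continue  # skip unrelated compiler output
        tokens = shlex.split(line[len(PREFIX):])
        name = tokens[tokens.index("--singlemethodname") + 1]
        found.setdefault(name, []).append(tokens)
    return found


print(repro_args(SAMPLE)["Main"][0])
```

Each token list can be appended to the original command line as is, which avoids retyping the quoted type names by hand.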

I then wanted to see some more detail from the jit. To keep the size of this example small, I'm just using the NgenOrder=1 switch, but jit developers would more likely use the NgenDump=* switch.

C:\git2\runtime>"dotnet" "c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\crossgen2\crossgen2.dll" @"c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll.rsp"   -r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\IL-CG2\*.dll --print-repro-instructions --singlemethodtypename "Complex_Array_Test,Complex1" --singlemethodname Main --singlemethodindex 1 --codegenopt NgenOrder=1
C:\git2\runtime\.dotnet
Single method repro args:--singlemethodtypename "Complex_Array_Test,Complex1" --singlemethodname Main --singlemethodindex 1
         |  Profiled   | Method   |   Method has    |   calls   | Num |LclV |AProp| CSE |   Perf  |bytes | x64 codesize|
 mdToken |  CNT |  RGN |    Hash  | EH | FRM | LOOP | NRM | IND | BBs | Cnt | Cnt | Cnt |  Score  |  IL  |   HOT | CLD | method name
---------+------+------+----------+----+-----+------+-----+-----+-----+-----+-----+-----+---------+------+-------+-----+
06000002 |      |      | f656934b |    | rsp | LOOP |   3 |   0 |  35 |  56 |  48 |   7 |   43056 |  490 |   761 |   0 | Complex_Array_Test:Main(System.String[]):int
Emitting R2R PE file: c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll
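Rows like the one above are pipe-separated, so a quick script can pull out a single figure for comparison between runs. This sketch assumes the column layout of the sample header, where the last three fields are the hot code size, the cold code size, and the method name; the function name `hot_code_size` is hypothetical and the layout may differ between JIT versions.

```python
# Data row copied from the sample output above.
ROW = ("06000002 |      |      | f656934b |    | rsp | LOOP |   3 |   0 |  35 |"
       "  56 |  48 |   7 |   43056 |  490 |   761 |   0 |"
       " Complex_Array_Test:Main(System.String[]):int")


def hot_code_size(row):
    """Return (method name, hot code size in bytes) from one data row."""
    fields = [field.strip() for field in row.split("|")]
    # Assumed layout: ..., HOT, CLD, method name.
    return fields[-1], int(fields[-3])


print(hot_code_size(ROW))
```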

And finally, as the last --targetarch and --targetos switch is the meaningful one, it is simple to target a different architecture for ad hoc exploration...

C:\git2\runtime>"dotnet" "c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\crossgen2\crossgen2.dll" @"c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll.rsp"   -r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\IL-CG2\*.dll --print-repro-instructions --singlemethodtypename "Complex_Array_Test,Complex1" --singlemethodname Main --singlemethodindex 1 --codegenopt NgenOrder=1 --targetarch arm64
C:\git2\runtime\.dotnet
Single method repro args:--singlemethodtypename "Complex_Array_Test,Complex1" --singlemethodname Main --singlemethodindex 1
         |  Profiled   | Method   |   Method has    |   calls   | Num |LclV |AProp| CSE |   Perf  |bytes | arm64 codesize|
 mdToken |  CNT |  RGN |    Hash  | EH | FRM | LOOP | NRM | IND | BBs | Cnt | Cnt | Cnt |  Score  |  IL  |   HOT | CLD | method name
---------+------+------+----------+----+-----+------+-----+-----+-----+-----+-----+-----+---------+------+-------+-----+
06000002 |      |      | f656934b |    |  fp | LOOP |   3 |   0 |  35 |  59 |  48 |  10 |   63828 |  490 |  1048 |   0 | Complex_Array_Test:Main(System.String[]):int
Emitting R2R PE file: c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll

Note that the only difference in the command line was to pass the --targetarch arm64 switch, and the JIT now compiles the method as arm64.
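Because only the appended switch changes, the same base command can be reused for several architectures. A minimal sketch, assuming (as stated above) that the last --targetarch switch is the meaningful one; the function name `per_arch_commands` and the shortened paths are hypothetical stand-ins for the full ones in the transcript.

```python
def per_arch_commands(base_args, archs):
    """Yield one argument list per architecture, with the override appended last."""
    for arch in archs:
        # Appending relies on the last --targetarch switch being the meaningful one.
        yield list(base_args) + ["--targetarch", arch]


# Shortened, hypothetical stand-ins for the full paths used in the transcript.
BASE = ["dotnet", "crossgen2.dll", "@Complex1.dll.rsp"]

for command in per_arch_commands(BASE, ["x64", "arm64"]):
    print(" ".join(command))
```

Each resulting list can be handed to a process launcher such as `subprocess.run` without going through a shell, which also sidesteps the wildcard expansion issue described earlier.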

Finally, attaching a debugger to crossgen2.

Since this example uses dotnet as the __TestDotNetCmd, you will need to debug the c:\git2\runtime\.dotnet\dotnet.exe process.

devenv /debugexe C:\git2\runtime\.dotnet\dotnet.exe "c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\Tests\Core_Root\crossgen2\crossgen2.dll" @"c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\\Complex1.dll.rsp"   -r:c:\git2\runtime\artifacts\tests\coreclr\windows.x64.Debug\jit\Directed\Arrays\Complex1\IL-CG2\*.dll --print-repro-instructions --singlemethodtypename "Complex_Array_Test,Complex1" --singlemethodname Main --singlemethodindex 1 --codegenopt NgenOrder=1 --targetarch arm64

This will launch the Visual Studio debugger, with a solution set up for debugging the dotnet.exe process. By default this solution will debug the native code of the process only. To debug the managed components, edit the properties on the solution and set the Debugger Type to Managed (.NET Core, .NET 5+) or Mixed (.NET Core, .NET 5+).