Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPMI should avoid DASM roundtrip for analysis #76178

Closed
jakobbotsch opened this issue Sep 26, 2022 · 2 comments · Fixed by #76238
Closed

SPMI should avoid DASM roundtrip for analysis #76178

jakobbotsch opened this issue Sep 26, 2022 · 2 comments · Fixed by #76238
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@jakobbotsch
Copy link
Member

For large diffs it is prohibitively slow to produce disassembly for all method contexts with diffs, and most of us are probably familiar with superpmi-diffs timing out for a jit change with a large amount of diffs.

We are already returning a list of contexts with diffs when using -diffMCList. This is how we figure out which contexts to produce .dasm files for. It should not be that much extra work to produce a comma-separated file instead that also includes metrics like diff size for each of the contexts. Then we can create jit-analyze-spmi that can analyze this, or enhance jit-analyze with support for this, to produce output similar to jit-analyze directly from the .csv file, without having to go through intermediate disassembly.

As a final step we'll probably want to produce .dasm files for the most important contexts identified by jit-analyze-spmi.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 26, 2022
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Sep 26, 2022
@ghost
Copy link

ghost commented Sep 26, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

For large diffs it is prohibitively slow to produce disassembly for all method contexts with diffs, and most of us are probably familiar with superpmi-diffs timing out for a jit change with a large amount of diffs.

We are already returning a list of contexts with diffs when using -diffMCList. This is how we figure out which contexts to produce .dasm files for. It should not be that much extra work to produce a comma-separated file instead that also includes metrics like diff size for each of the contexts. Then we can create jit-analyze-spmi that can analyze this, or enhance jit-analyze with support for this, to produce output similar to jit-analyze directly from the .csv file, without having to go through intermediate disassembly.

As a final step we'll probably want to produce .dasm files for the most important contexts identified by jit-analyze-spmi.

Author: jakobbotsch
Assignees: -
Labels:

area-CodeGen-coreclr, untriaged

Milestone: -

@jakobbotsch jakobbotsch removed the untriaged New issue has not been triaged by the area owner label Sep 26, 2022
@jakobbotsch jakobbotsch added this to the 8.0.0 milestone Sep 26, 2022
jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue Sep 27, 2022
This starts communicating more information about diffs back from
superpmi and starts using it in the driver. The current information is
the base size, diff size, base instructions, diff instructions and
context size.

The driver uses the base size/diff size to pick a small number of
interesting contexts and only creates disassembly for these contexts.
jit-analyze is then also invoked on only these contexts, but the
contexts are picked so that they overlap with what jit-analyze would
display. Additionally, we also pick the top 20 smallest contexts (in
terms of context size) -- I frequently use the smallest contexts with
diffs to investigate my change.

The new behavior is only enabled when no custom metrics are specified.
If custom metrics are specified we fall back to producing all
disassembly files and leave it up to jit-analyze to extract and analyze
the metrics specified.

The net result is that we can now get SPMI diffs for changes with even
hundreds of thousands of diffs in the same time as it takes to get diffs
for a small change.

Fix dotnet#76178
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Sep 27, 2022
jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue Sep 28, 2022
This starts communicating more information about diffs back from
superpmi and starts using it in the driver. The current information is
the base size, diff size, base instructions, diff instructions and
context size.

The driver uses the base size/diff size to pick a small number of
interesting contexts and only creates disassembly for these contexts.
jit-analyze is then also invoked on only these contexts, but the
contexts are picked so that they overlap with what jit-analyze would
display. Additionally, we also pick the top 20 smallest contexts (in
terms of context size) -- I frequently use the smallest contexts with
diffs to investigate my change.

The new behavior is only enabled when no custom metrics are specified.
If custom metrics are specified we fall back to producing all
disassembly files and leave it up to jit-analyze to extract and analyze
the metrics specified.

Also remove --retainOnlyTopFiles. This is no longer necessary since we
avoid creating these disassembly files in the first place. Another
benefit is that all contexts mentioned by jit-analyze output will now be
part of the artifacts.

The net result is that we can now get SPMI diffs for changes with even
hundreds of thousands of diffs in the same time as it takes to get diffs
for a small change.

Fix dotnet#76178
jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue Sep 29, 2022
This starts communicating more information about diffs back from
superpmi and starts using it in the driver. The current information is
the base size, diff size, base instructions, diff instructions and
context size.

The driver uses the base size/diff size to pick a small number of
interesting contexts and only creates disassembly for these contexts.
jit-analyze is then also invoked on only these contexts, but the
contexts are picked so that they overlap with what jit-analyze would
display. Additionally, we also pick the top 20 smallest contexts (in
terms of context size) -- I frequently use the smallest contexts with
diffs to investigate my change.

The new behavior is only enabled when no custom metrics are specified.
If custom metrics are specified we fall back to producing all
disassembly files and leave it up to jit-analyze to extract and analyze
the metrics specified.

Also remove --retainOnlyTopFiles. This is no longer necessary since we
avoid creating these disassembly files in the first place. Another
benefit is that all contexts mentioned by jit-analyze output will now be
part of the artifacts.

The net result is that we can now get SPMI diffs for changes with even
hundreds of thousands of diffs in the same time as it takes to get diffs
for a small change.

Fix dotnet#76178
@BruceForstall
Copy link
Member

Related: #52877

jakobbotsch added a commit that referenced this issue Sep 29, 2022
This starts communicating more information about diffs back from
superpmi and starts using it in the driver. The current information is
the base size, diff size, base instructions, diff instructions and
context size.

The driver uses the base size/diff size to pick a small number of
interesting contexts and only creates disassembly for these contexts.
jit-analyze is then also invoked on only these contexts, but the
contexts are picked so that they overlap with what jit-analyze would
display. Additionally, we also pick the top 20 smallest contexts (in
terms of context size) -- I frequently use the smallest contexts with
diffs to investigate my change.

The new behavior is only enabled when no custom metrics are specified.
If custom metrics are specified we fall back to producing all
disassembly files and leave it up to jit-analyze to extract and analyze
the metrics specified.

Also, the retainTopFilesOnly option is no longer necessary since our CI
pipeline will at most produce 100 .dasm files now. Another benefit is that
this means that all contexts mentioned in the jit-analyze output will now be
part of the artifacts.

The net result is that we can now get SPMI diffs for changes with even
hundreds of thousands of diffs in the same time as it takes to get diffs
for a small change.

Fix #76178
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Sep 29, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Oct 29, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants