Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Polyfill the incremental generator ForAttributeWithMetadataName from roslyn. #70911

Merged
merged 24 commits into from
Jul 5, 2022

Conversation

CyrusNajmabadi
Copy link
Member

This is necessary in the interim while we reference an older version of roslyn without this API. Once we can move forward to a version taht does expose this, we can delete the polyfill.

@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost assigned CyrusNajmabadi Jun 17, 2022
@@ -0,0 +1,108 @@
// Licensed to the .NET Foundation under one or more agreements.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the general approach i took was to copy the roslyn code whole-sale and make as few changes to it as possible. The only changes i made were:

  1. update the license.
  2. change the namespace.
  3. use ValueListBuilder instead of ArrayBuilder.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These files are all then source-linked into the project that needs them. So subsequent updates to other generators will be much easier as they can just leverage this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to have someone from the Roslyn team review all files added under Common/src/Roslyn? I understand that this is a direct port with just minimal changes, but it may be worth it to have that extra pair of eyes over those minimal changes from the domain experts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. @chsienki and @jaredpar both reviewed the roslyn side. They should likely review this port as well.


var comparison = isCaseSensitive ? StringComparison.Ordinal : StringComparison.OrdinalIgnoreCase;
return name.Length > AttributeSuffix.Length && name.EndsWith(AttributeSuffix, comparison);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

port of roslyn helper.

@@ -0,0 +1,74 @@
// Licensed to the .NET Foundation under one or more agreements.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

direct port with only necessary cleanup to fit into runtime.

@@ -0,0 +1,377 @@
// Licensed to the .NET Foundation under one or more agreements.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

direct port with only necessary cleanup to fit into runtime.

return unchecked((currentKey * (int)0xA5555529) + newKey);
}

#if false
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if i was only referencing a tiny part of the surface area of the copied code, i ifdefed out the rest to keep teh intrusion minimal. i kept the code in case we may need to access more of it. this allows the code to stay close in sync with teh roslyn version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please add a comment above explaining this?

/// </summary>
private bool _useCLSCompliantNameArityEncoding;

#if false
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly, this type is heavily if'defed out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub do you know if we are also running the linker on the source generators? If so, we probably can keep this code (without the ifdef) and then just have the linker strip it out, and that way we can still preserve the files in sync with Roslyn, while not having to carry the extra IL in our binaries.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you know if we are also running the linker on the source generators?

I don't think we are, but @eerhardt can confirm.

@@ -0,0 +1,42 @@
// Licensed to the .NET Foundation under one or more agreements.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copied wholesale with only the changes necessary for runtime.

@@ -0,0 +1,43 @@
// Licensed to the .NET Foundation under one or more agreements.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copied wholesale with only the changes necessary for runtime.

@@ -0,0 +1,188 @@
// Licensed to the .NET Foundation under one or more agreements.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of two files containing the meat of the change. the entire incremental algorithm, with tweaks to fit into the runtime.

var collectedNodes = nodesWithAttributesMatchingSimpleName
.Collect()
.WithComparer(ImmutableArrayValueComparer<SyntaxNode>.Instance)
/*.WithTrackingName("collectedNodes_ForAttributeWithMetadataName")*/;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WithTrackingName doesn't seem to be available yet here. so i commented them all out.

string fullyQualifiedMetadataName)
{
var targetSyntaxTree = attributeTarget.SyntaxTree;
var result = new ValueListBuilder<AttributeData>(Span<AttributeData>.Empty);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using ValueListBUilder everywhere. this also necessitates passing explicitly by ref and not capturing (can't capture ref-structs).

.Select(static (arrays, _) => GlobalAliases.Create(arrays.SelectMany(a => a.AliasAndSymbolNames).ToImmutableArray()))
/*.WithTrackingName("allUpGlobalAliases_ForAttribute")*/;

#if false
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commenting out VB specific portion.

@@ -1,58 +1,30 @@
// Licensed to the .NET Foundation under one or more agreements.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is hte change to the actual generator to use the new API.

foreach (AttributeSyntax attributeSyntax in attributeListSyntax.Attributes)
{
if (semanticModel.GetSymbolInfo(attributeSyntax, cancellationToken).Symbol is IMethodSymbol attributeSymbol &&
attributeSymbol.ContainingType.ToDisplayString() == RegexGeneratorAttributeName)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no more costly binding of every attribute in the world, as well as a costly ToDisplayString on all of them.

@CyrusNajmabadi CyrusNajmabadi marked this pull request as ready for review June 17, 2022 20:30
@CyrusNajmabadi
Copy link
Member Author

Oh, wait, this PR is only fixing up regex generator, which is .NET 7 only. I guess once this is in you're going to open separate PRs for the other generators, and then look to service the JSON/logging generators for .NET 6?

I'm looking for guidance on what i should be doing :) My goal here is to just get us to a good state perf-wise. My preference would be if i could get traction to get whatever fixes/rollbacks in that are necessary as fast as possible. I'm happy to contribute my own time towards that if we cannot get people assigned to this. However, this not being a project i have deep insight into, i need as much guidance as possible as to what needs to be done and in what fashion it must be done.

Put another way, i am not a good person to ask questions to except on how this this incremental-generator API works and how to get perf good. I need you guys to answer any questions about:

  1. where this code lives.
  2. what needs to be patched.
  3. how things get released/serviced.

etc. etc.

The major request i have for you is that this be treated as very high priority as the perf is hitting us very badly. If we only have a couple weeks left on this, that really worries me greatly.

Copy link
Member

@stephentoub stephentoub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subject to the existing comments, and in particular anything we can do to test this, LGTM. I skimmed the Roslyn portions, but under the assumption it's already been heavily reviewed in Roslyn and thus focused on the diffs.

src/libraries/Common/src/Roslyn/CSharpSyntaxHelper.cs Outdated Show resolved Hide resolved
{
break;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider using an API like LastIndexOf rather than open-coding this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this is just a straight port of Roslyn: https://github.com/dotnet/roslyn/blob/main/src/Compilers/Core/Portable/MetadataReader/MetadataHelpers.cs#L484

so i def didn't want to touch this and screw anything up here at all :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, wasn't suggesting it being changed here. I'm more wondering why Roslyn isn't relying on such helpers, if there's an issue with them or if it just didn't happen, etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a good question. my recollection is that some of this are fairly direct ports of hte old native code. So it may just have been super expedient to copy virtually unchanged and then have it not be touched. A case of "if it ain't broke". Given the nicer runtime APIs (and all the recent improvements around strings/spans as well), this likely could get an entire makeover. So it would more a question of if Compiler team would see value in that, our would prefer to not touch and put attention on other more valuable stuff :)

string stringRepresentingArity = emittedTypeName.Substring(indexOfManglingChar);

int arity;
bool nonNumericCharFound = !int.TryParse(stringRepresentingArity, NumberStyles.None, CultureInfo.InvariantCulture, out arity);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll be nice when code like this can use spans.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would def agree on that. i imagine this will get much simpler/saner if it moved to that. :)

for (int i = nestedNamespaces.Count - 1; i >= 0; i--)
{
names[nestedNamespaces[i].Key] = i;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why iterate in reverse to add everything to the dictionary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jaredpar for feedback on this existing roslyn code :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, i can remove all this code (At the cost of more deviations from roslyn). you guys don't ever look for nested-attribute types (e.g. the attribute is a nested type, not that hte attribute is on a nested type). So i can just remove this and hardcode that knowlege.

/// </summary>
private bool _useCLSCompliantNameArityEncoding;

#if false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you know if we are also running the linker on the source generators?

I don't think we are, but @eerhardt can confirm.

{
if (!EqualityComparer<T>.Default.Equals(x[i], y[i]))
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/// <summary>
/// Determines whether two sequences are equal according to an equality comparer.
/// </summary>
/// <typeparam name="TDerived">The type of element in the compared array.</typeparam>
/// <typeparam name="TBase">The type of element contained by the collection.</typeparam>
public static bool SequenceEqual<TDerived, TBase>(this ImmutableArray<TBase> immutableArray, ImmutableArray<TDerived> items, IEqualityComparer<TBase>? comparer = null) where TDerived : TBase
{
immutableArray.ThrowNullRefIfNotInitialized();
items.ThrowNullRefIfNotInitialized();
if (object.ReferenceEquals(immutableArray.array, items.array))
{
return true;
}
if (immutableArray.Length != items.Length)
{
return false;
}
if (comparer == null)
{
comparer = EqualityComparer<TBase>.Default;
}
for (int i = 0; i < immutableArray.Length; i++)
{
if (!comparer.Equals(immutableArray.array![i], items.array![i]))
{
return false;
}
}
return true;
}
/// <summary>
/// Determines whether two sequences are equal according to an equality comparer.
/// </summary>
/// <typeparam name="TDerived">The type of element in the compared array.</typeparam>
/// <typeparam name="TBase">The type of element contained by the collection.</typeparam>
public static bool SequenceEqual<TDerived, TBase>(this ImmutableArray<TBase> immutableArray, IEnumerable<TDerived> items, IEqualityComparer<TBase>? comparer = null) where TDerived : TBase
{
Requires.NotNull(items, nameof(items));
if (comparer == null)
{
comparer = EqualityComparer<TBase>.Default;
}
int i = 0;
int n = immutableArray.Length;
foreach (var item in items)
{
if (i == n)
{
return false;
}
if (!comparer.Equals(immutableArray[i], item))
{
return false;
}
i++;
}
return i == n;
}
/// <summary>
/// Determines whether two sequences are equal according to an equality comparer.
/// </summary>
/// <typeparam name="TDerived">The type of element in the compared array.</typeparam>
/// <typeparam name="TBase">The type of element contained by the collection.</typeparam>
public static bool SequenceEqual<TDerived, TBase>(this ImmutableArray<TBase> immutableArray, ImmutableArray<TDerived> items, Func<TBase, TBase, bool> predicate) where TDerived : TBase
{
Requires.NotNull(predicate, nameof(predicate));
immutableArray.ThrowNullRefIfNotInitialized();
items.ThrowNullRefIfNotInitialized();
if (object.ReferenceEquals(immutableArray.array, items.array))
{
return true;
}
if (immutableArray.Length != items.Length)
{
return false;
}
for (int i = 0, n = immutableArray.Length; i < n; i++)
{
if (!predicate(immutableArray[i], items[i]))
{
return false;
}
}
return true;
}

?

if (results.Length == 0)
return ImmutableArray<SyntaxNode>.Empty;

return results.AsSpan().ToArray().Distinct().ToImmutableArray();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question about disposal for all the ValueListBuilders throughout.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed for all.

@stephentoub
Copy link
Member

The major request i have for you is that this be treated as very high priority as the perf is hitting us very badly. If we only have a couple weeks left on this, that really worries me greatly.

@danmoseley, is there someone who can pair with Cyrus to get the is dotted and ts crossed as necessary?

@CyrusNajmabadi, your concern is this taking weeks or your concern that weeks isn't sufficient to get done what needs to be done? If the latter, I don't think that needs to be a critical concern... for .NET 6 it's already shipped and so anything we do there will be servicing, and for .NET 7 we have more runway that Preview 7 for perf and bug fixes and the like.

@CyrusNajmabadi
Copy link
Member Author

SequenceEqual

Excellent. Switched to that. Not sure why intellisense didn't find that.

@CyrusNajmabadi
Copy link
Member Author

@CyrusNajmabadi, your concern is this taking weeks or your concern that weeks isn't sufficient to get done what needs to be done? If the latter, I don't think that needs to be a critical concern... for .NET 6 it's already shipped and so anything we do there will be servicing, and for .NET 7 we have more runway that Preview 7 for perf and bug fixes and the like.

Ok, that's great to know. Thanks for looking for someone to pair on this with me. That will be super helpful given all the things to juggle here :)

@chsienki
Copy link
Contributor

Do we plan on porting some tests over from the Roslyn repo for the shared code as well

Good question. @chsienki the runtime is referencing a version of roslyn that does not have WithTrackingName. So it seems like it would be very difficult to validate this here as we don't have the APIs that expose the data we need to ensure that incremental is working properly. Do you have any thoughts on how to validate things here?

I'll defer to you on how confident you are on the port, but it is tested in Roslyn at least, so we're not completely blind. You may want to just do a 'validation' pass with a debugger to check its being called incrementally in the IDE.

Without the tracking API you'd need to add 'test only' side effects to the generator that you can read back from the tests (or you can do what razor does and have an ETW logger that emits at each stage, and run it in the unit test to collect the 'side effect' data https://github.com/dotnet/sdk/blob/main/src/Tests/Microsoft.NET.Sdk.Razor.SourceGenerators.Tests/RazorSourceGeneratorTests.cs#L91 https://github.com/dotnet/sdk/blob/main/src/Tests/Microsoft.NET.Sdk.Razor.SourceGenerators.Tests/RazorSourceGeneratorTests.cs#L152)

@CyrusNajmabadi
Copy link
Member Author

@stephentoub pulled out a couple thousands of lines. that should make this more palatable i hope :)

@danmoseley
Copy link
Member

@danmoseley, is there someone who can pair with Cyrus to get the is dotted and ts crossed as necessary?

@joperezr are you a good person to help here? Asking directly as @ericstj is out.

@joperezr
Copy link
Member

@danmoseley, is there someone who can pair with Cyrus to get the is dotted and ts crossed as necessary?

@joperezr are you a good person to help here? Asking directly as @ericstj is out.

For sure, happy to help here.

Copy link
Member

@joperezr joperezr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to Steve, This LGTM barring the existing comments.

I chatted with @CyrusNajmabadi offline, this work is targeting .NET 7 Preview 7, and we will also try to get the other generators in for that same preview. After that, we will work on getting this into .NET 6, which currently is looking that this would target 6.0.8 release.

@joperezr
Copy link
Member

joperezr commented Jul 1, 2022

I've pulled this change down and have been doing some manual smoke testing, particularly on the incremental editing side, and things seem to be working as expected. I tested with a generated large project with around 200 files, all with types that had attributes on them, and the editing experience didn't seem to feel slow nor caused major memory allocation changes. I also tried deleting files that were consuming the generator and the generated code quickly updated as expected.

I chatted with @CyrusNajmabadi offline and I think this is good to go. In order to add incremental editing tests to have some coverage there we would need to consume a newer version of Roslyn. Given that we can't do that yet and the fact that this code is almost unchaged from the original code on the Roslyn repo (which of course has all of this test coverage) I'm comfortable with merging this now after doing the smoke testing to validate that things work as expected.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Text.RegularExpressions source-generator Indicates an issue with a source generator feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants