NativeAOT size conscious authoring observations #83902
Comments
The details below dive a bit deeper into some of the async overhead workarounds; this may be useful for other NativeAOT code authors. I've folded it to keep the issue focussed.
Workarounds
One of the hacks I've used to reduce the size of async methods in generic classes (when they have a single await and no generic return type) can be seen being implemented in this commit: NinoFloris/Slon@bcd8572. The basic steps come down to:
If you already inherit from a class you can't modify, or when you are implementing a generic struct, an alternative is to move the async method to a static class which then takes a callback delegate instead of the virtual method. This is worse in certain ways, better in others; see https://github.com/NinoFloris/Slon/blob/c7f2b856e35e91d1f025640785fae4172d960f00/Slon/Pg/Converters/CollectionConverter.cs#L242-L255 for an example.
When working with ValueTasks, the generic class method should first check whether the ValueTask is completed and skip the base method call if so, saving the alloc and virtual call.
That's basically it. When dealing with ValueTasks that wrap an IValueTaskSource this technique won't work without allocating: AsTask() will have to allocate a Task to proxy the IValueTaskSource. This is where a public version of …
That being said, if that method existed it wouldn't be an easy pattern to get right. As you can only use an IValueTaskSource once, you can't use normal async/await code: it would call GetResult on the non-generic interface, and by doing so it would free the source without you getting the result. It would instead require you to write your own state machine to call into the virtual/callback after awaiting (some help from the language or some API to simplify doing this could be useful again). Once there you can proceed as usual, by converting the ValueTask back to its generic form.
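As a rough illustration of the pattern described above (not the code from the linked commit), the sketch below keeps the only async state machine in a non-generic base class and calls back into the generic subclass through a virtual method. The names CollectorBase, Collector<T>, AwaitThenStore and Store are hypothetical and exist only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var collector = new Collector<int>();
await collector.AddAsync(new ValueTask<int>(1));                 // fast path
await collector.AddAsync(new ValueTask<int>(Task.Run(() => 2))); // fast or slow path, same result
Console.WriteLine(string.Join(",", collector.Items));            // prints 1,2

// Hypothetical non-generic base: the only async state machine lives here,
// so it is generated once instead of once per instantiation of Collector<T>.
abstract class CollectorBase
{
    protected async Task AwaitThenStore(Task pending)
    {
        await pending.ConfigureAwait(false);
        Store(pending); // virtual call back into the generic subclass
    }

    protected abstract void Store(Task completed);
}

sealed class Collector<T> : CollectorBase
{
    public List<T> Items { get; } = new();

    // Deliberately not an async method: no state machine is generated per T.
    public Task AddAsync(ValueTask<T> pending)
    {
        // Fast path: already completed, skip the base call entirely.
        if (pending.IsCompletedSuccessfully)
        {
            Items.Add(pending.Result);
            return Task.CompletedTask;
        }

        // Slow path: Task<T> upcasts to Task. AsTask() has to allocate
        // when the ValueTask<T> wraps an IValueTaskSource<T>.
        return AwaitThenStore(pending.AsTask());
    }

    // Downcast back to the generic form; the task has already completed.
    protected override void Store(Task completed)
        => Items.Add(((Task<T>)completed).Result);
}
```

The trade-off in this sketch is one virtual call plus a possible AsTask() allocation on the slow path, in exchange for not emitting an async state machine per instantiation.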
Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
|
260 bytes is the size after sharing via __Canon. In general, smaller size implies worse speed. Would you be willing to pay for smaller size by worse startup time or worse throughput?
This undocumented mode can be enabled by |
There's no sign of a __Canon entry, just a type definition (non-virtual method tables can get trimmed, afaik?). You can view the report here, it's the last folded panel: https://github.com/NinoFloris/Slon/actions/runs/4507808228
If the BOTR broadly applies to NativeAOT, I'm aware the entire virtual method table is not shared, but I don't quite understand why. Is it really just to have concrete method table pointers on object instance headers? Would the throughput loss mean adding an indirection here, while the startup time increase would be for synthesizing these copies? Could that be done ad hoc on first allocation?
The problem I'm dealing with is keeping code size small while having a reasonable supported set of types baked in. The chance that they'll all be used by the app is small, and the chance that they're needed immediately is even smaller. User-directed rooting, which would certainly be preferable, is not a great story in ADO.NET, but that's a different matter.
To answer your question directly: it might be worth some form of perf loss, yes, but it'd have to be quantified first. The bad alternative is working purely with System.Object results everywhere, downcasting to the known type at the edges instead. It gets rid of all the ValueTask types, the CollectionConverter types and all the other type bloat. I have a commit that went down this road, but I backed out of it as it just adds more implementation complexity; I hope to avoid it.
Is this issue in any way related? #83438
Yeah that's the one! Thanks, saves me a search :) I'll see what it does. |
Yes, it would mean adding indirection in the method table and an extra instruction in all virtual callsites. It is not clear whether it would be a size win at the end.
Correct. |
Tagging subscribers to 'linkable-framework': @eerhardt, @vitek-karas, @LakshanF, @sbomer, @joperezr, @marek-safar
|
@eerhardt are there tasks where the warnings for a library (stage 2 or not) seem relatively well understood and community folks could work through fixing them? Or perhaps that would be too much parallelism right now. |
I'm not sure that is what @NinoFloris is asking for. But I will tag the issues for each area in dotnet/aspnetcore#45910 stage 2.a.
@NinoFloris - do you have a specific proposal? What other medium besides github issues would you use? |
Personally I feel like games are an important use case for AOT, and there the code size mostly doesn't matter: even if you have 100MB of binaries, you'll likely have multiple gigabytes of assets with it. As such, I believe that all size optimizations that negatively impact performance should be optional (or possible to disable via |
The size of async is tracked in #79204 and similar. The valuetype sharing would come at a throughput cost - consider code like this: runtime/src/libraries/System.Private.CoreLib/src/System/Collections/Generic/ArraySortHelper.cs Lines 565 to 582 in 8ffad52
If we do valuetype sharing here, what was originally a single instruction for |
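To make the throughput concern concrete, here is a hedged sketch of the kind of per-type specialization such code relies on. It is an illustrative stand-in written for this discussion, not the referenced ArraySortHelper lines; Cmp and LessThan are hypothetical names.

```csharp
using System;
using System.Runtime.CompilerServices;

Console.WriteLine(Cmp.LessThan(-1, 1));             // True  (signed compare)
Console.WriteLine(Cmp.LessThan(uint.MaxValue, 1u)); // False (unsigned compare, same bits as -1)

static class Cmp
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool LessThan<T>(T left, T right) where T : IComparable<T>
    {
        // With one method body per value type, each typeof(T) check is a
        // constant for that instantiation, so the compiler can typically keep
        // only the matching branch and drop the box/unbox casts.
        if (typeof(T) == typeof(int)) return (int)(object)left < (int)(object)right;
        if (typeof(T) == typeof(uint)) return (uint)(object)left < (uint)(object)right;

        // Fallback for every other T.
        return left.CompareTo(right) < 0;
    }
}
```

If int and uint shared a single body, the choice between signed and unsigned comparison could no longer be made at compile time; it would presumably need a runtime check or an indirect call instead.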
I do not see anything actionable in this issue. The actionable suggestions are tracked in linked issues. |
NativeAOT authoring observations
As I'm continuing to iterate to shave off overhead from the new serializer for Npgsql, and slowly getting more familiar with the code patterns that work well with NativeAOT I wanted to report on my findings so far. I will be using this issue to centralize future observations of a similar nature as well.
Async
One class of issues in particular I want to highlight is the cost of everything relating to async code. Having any of these methods in generic types is one major source of bloat due to their inherent IL type + codegen explosion, multiplied by the number of canonical instantiations.
It might be worthwhile for the C# and runtime teams to take a look at what can be done to reduce its footprint, as this interaction is supremely problematic for more casual authors. If there is no significant improvement to be made here it could help to expose relevant tools for authors to reduce it on a case by case basis.
Some of those tools could be for instance having public methods on ValueTask and ValueTask<T> to move from the generic form to the non-generic one and back, allowing us external authors to support pooled IValueTaskSource(T) while doing so.
Tasks have inherent support for up/downcasting to do this. For ValueTask these APIs could be used in the same way as up/downcasting Tasks, for the purpose of pushing this bulky async codegen to non-generic types. There already is the internal method ValueTask.DangerousCreateFromTypedValueTask as a validation of this being useful internally.
Async Types
Next up I started looking at the general cost of even having APIs returning ValueTask types; it's not cheap... Not only do you pay for the Task types but also for ValueTask<T>, IValueTaskSource<T>, ValueTask<T>.ValueTaskSourceAsTask, ValueTaskAwaiter<T> and others.
I have two links here, one report before dropping about 20(?) mostly reference type instantiations of ValueTask<T> and one report after doing so. The difference is a hefty 80kb, with methods taking up just 24kb of that difference. The remainder is mostly types and type dictionary metadata.
Before: https://github.com/NinoFloris/Slon/actions/runs/4507705639
After: https://github.com/NinoFloris/Slon/actions/runs/4507808228
The mstats are attached if you want to explore the System.Private.CoreLib types in more detail.
Reference Types
Onto the next papercut, reference type instantiations in particular. Now their methods are shared via __Canon, however all their concrete types are still required to exist in full, take a look:
These all add to the binary size, but I'm not sure what they're exactly uniquely adding. Could these EETypes/method tables (what is the correct name of this?) be shared in any way via __Canon? Only keeping a concrete generic context around per reference type instantiation? I understand the latter would always be needed to have correct type testing etc. inside methods.
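A small sketch of why each reference type instantiation still needs its own identity even when method bodies are shared: type tests and typeof have to tell the instantiations apart. Box<T> is a hypothetical type used only for illustration.

```csharp
using System;

object boxed = new Box<string>("hi");
Console.WriteLine(boxed is Box<string>);                    // True
Console.WriteLine(boxed is Box<object>);                    // False
Console.WriteLine(typeof(Box<string>) == typeof(Box<Uri>)); // False
Console.WriteLine(new Box<string>("hi").Describe());        // Box<String>

// Hypothetical generic class. For reference-type T the compiled method
// bodies can be shared, but each Box<T> is still a distinct runtime type.
sealed class Box<T> where T : class
{
    public T Value { get; }
    public Box(T value) => Value = value;

    // Needs the generic context at run time to know what T is.
    public string Describe() => $"Box<{typeof(T).Name}>";
}
```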
Improving this situation seems like it may also reduce the previously discussed ValueTask<T> type bloat?
Value Type Code Sharing
Finally I would urge runtime devs to consider what it would take to share code across same size/layout value types, a la gc shapes in golang. For a type like int that could allow eligible code for say List<T> to be reused across int, uint, enums, ValueTuple<int>, DateOnly and other types wrapping a single int. The same would go for other primitives like long that have a lot of representationally isomorphic types to share code with.
I understand this cuts into the ability to do runtime intrinsic optimizations (like uints never being negative values etc). I also see how this may complicate codegen, as this sharing is only possible when the different instantiations don't actually produce different bodies (e.g. int.Equals will produce different code than DateOnly.Equals); however, there will be many methods that don't depend on the generic type's methods at all, just their data representation. For an initial good enough experiment it might just be sufficient to add a stage that aliases same-type method instantiations by their body being byte for byte identical?
Such a stage may also help with sharing code across all instantiations when it doesn't make use of the generic context at all.
IIRC there is already some global deduplication mode (which impacts stack traces) but only sharing per generic type seems to be more suitable to be enabled by default?
I can see the theoretical version of these things working but I'm obviously not sure what this would mean more concretely. (and the practical problems flowing from this, which I'm surely glossing over here)
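As a hedged illustration of the distinction drawn above, the sketch below first shows that several of the mentioned types have the same size, then contrasts a generic method that only moves T around with one that calls into T's own members. Layout, Swap and Same are hypothetical names; whether a given compiler actually emits identical machine code for the Swap instantiations is an assumption, not something this example verifies.

```csharp
using System;
using System.Runtime.CompilerServices;

// All of these wrap a single 32-bit integer.
Console.WriteLine(Unsafe.SizeOf<int>());             // 4
Console.WriteLine(Unsafe.SizeOf<uint>());            // 4
Console.WriteLine(Unsafe.SizeOf<DayOfWeek>());       // 4
Console.WriteLine(Unsafe.SizeOf<ValueTuple<int>>()); // 4
Console.WriteLine(Unsafe.SizeOf<DateOnly>());        // 4

DayOfWeek first = DayOfWeek.Monday, second = DayOfWeek.Friday;
Layout.Swap(ref first, ref second);
Console.WriteLine(first);                            // Friday

Console.WriteLine(Layout.Same(new DateOnly(2023, 3, 24), new DateOnly(2023, 3, 24))); // True

static class Layout
{
    // Only copies T around and never calls a member of T, so the work done
    // depends on T's size and layout rather than on its identity.
    public static void Swap<T>(ref T a, ref T b) where T : struct
    {
        T tmp = a;
        a = b;
        b = tmp;
    }

    // Calls into T's own Equals, so each instantiation can end up with a
    // different body (int.Equals versus DateOnly.Equals, for instance).
    public static bool Same<T>(T a, T b) where T : struct, IEquatable<T>
        => a.Equals(b);
}
```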
Conclusion
If we're really serious about NativeAOT being 'effortlessly' competitive (so no crazy authoring) these issues must be explored. If only just to understand the problematic elements better.
All in all it's been challenging to keep size down to acceptable levels in this particular area of generics, async and serializer-like code.
@DamianEdwards Is there a world in which we drive dotnet/aspnetcore#45910 stage 2 efforts across internal and external collaborators more effectively than just github issues? Is that something you're open to?