Champion: Module Initializers (VS 16.8, .NET 5) #2608

Open
gafter opened this issue Jun 18, 2019 · 30 comments

@gafter
Member

gafter commented Jun 18, 2019

See also #2486

Although the .NET platform has a feature that directly supports writing initialization code for the assembly (technically, the module), it is not exposed in C#. This is a rather niche scenario, but once you run into it the solutions appear to be pretty painful. I have seen reports of a number of customers (inside and outside Microsoft) struggling with the problem, and there are no doubt more undocumented cases.

I suggest that we add a tiny feature to support this without any explicit syntax, by having the C# compiler recognize a module attribute with a well-known name, like the following:

namespace System.Runtime.CompilerServices
{
    [Obsolete("This attribute is only to be used in C# language version 9.0 or later", true)]
    [AttributeUsage(AttributeTargets.Module, AllowMultiple = false)]
    public class ModuleInitializerAttribute : Attribute
    {
        public ModuleInitializerAttribute(Type type) { }
    }
}

You would use it like this:

[module: System.Runtime.CompilerServices.ModuleInitializerAttribute(typeof(MyModuleInitializer))]

internal static class MyModuleInitializer
{
    static MyModuleInitializer()
    {
        // put your module initializer here
    }
}

and the C# compiler would then emit a module constructor that causes the static constructor of the identified type to be triggered:

void .cctor()
{
    // synthesize and call a dummy method with an unspeakable name,
    // which will cause the runtime to call the static constructor
    MyModuleInitializer.<TriggerClassConstructor>();
}

Open issues

  • Should we permit multiple types to be decorated with ModuleInitializerAttribute in a compilation? If so, in what order should the static constructors be invoked?

Alternative Approaches

There are a number of possible ways of exposing this feature in the language:

1. Special global method declaration

A module initializer would be provided by writing a special kind of method in the global scope:

internal void operator init() ...

This gives the new language construct its own syntax. However, given how rare and niche the scenario is, this is probably far too heavyweight an approach.

2. Attribute on the type to be initialized

Instead of a module-level attribute, perhaps the attribute would be placed on the type to be initialized:

[ModuleInitializer]
class ToInitialize
{
    static ToInitialize() ...
}

With this approach, we would either need to reject a program that contains more than one application of this attribute, or provide some policy to define the ordering in case it is used multiple times. Either way, it is more complex than the original proposal above.

3. Attribute on a static method to be called

Instead of a module-level attribute, perhaps the attribute would be placed on the method to be called to perform the initialization:

class Any
{
    [ModuleInitializer]
    static void Initializer() ...
}

As in the previous approach, we would either need to reject a program that contains more than one application of this attribute, or provide some policy to define the ordering in case it is used multiple times. Either way, it is more complex than the original proposal.
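A minimal sketch of this third approach, for comparison with the original proposal. As far as I know, this attribute-on-a-static-method shape is the one that later shipped in C# 9 with .NET 5, where the method must be static, parameterless, return void, and be accessible from the module. The `Log` list and the `Program` class are illustrative additions, not part of the proposal.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

internal static class Any
{
    // Illustrative log so the execution order can be observed.
    internal static readonly List<string> Log = new List<string>();

    // Assumes C# 9 or later targeting .NET 5 or later.
    [ModuleInitializer]
    internal static void Initializer() => Log.Add("module initializer");
}

internal static class Program
{
    private static void Main()
    {
        Any.Log.Add("Main");
        // The module initializer entry is expected to appear before "Main".
        Console.WriteLine(string.Join(" -> ", Any.Log));
    }
}
```

Because the method is called directly, no static constructor or reflection call is involved.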

4. Original Proposal

The original proposal naturally prevents the user from declaring more than one initializer class without adding any special language rules to accomplish that. It is very lightweight in that it uses existing syntax and semantic rules for attributes.

LDM notes:

@jkotas
Member

jkotas commented Jun 18, 2019

Would it be better to just stick the attribute on a method that is meant to be used as module initializer? It would allow the method to be called directly, without going through the overhead of RuntimeHelpers.RunClassConstructor.

The attribute can be potentially allowed on multiple methods in the module and then all of them would be called in some order from the module initializer.

@theunrepentantgeek

I'd guess that one factor is the compiler overhead of the attribute search.

If the only place to look is the metadata for the module itself, the check will be very quick (avoiding any overhead for the vast majority of folks who never use this feature). If the search has to check every method of every class in the module, it will take a lot longer, slowing down compilation for everyone.

That said, what about targeting the class itself - like this:

[System.Runtime.CompilerServices.ModuleInitializerAttribute]
internal static class MyModuleInitializer
{
    static MyModuleInitializer()
    {
        // put your module initializer here
    }
}

It's still a far larger search volume than the module metadata (so performance may be a concern), but a far smaller search volume than "every method".

@gafter
Member Author

gafter commented Jun 18, 2019

The feature as proposed is intended to be the simplest possible thing that exposes (and maps almost directly to) the underlying .NET feature. No more is needed for the use cases I've seen. If you did need to have multiple bodies of code run initializers, for example, you could implement that in the language-supported one (use reflection to search for types with your own special attribute and initialize them).

I'm not worried about the "overhead" of calling RuntimeHelpers.RunClassConstructor once, as I expect that to be trivial in the overall execution of the program.
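A rough sketch of the reflection-based fan-out described above, assuming a user-defined marker attribute. `InitializeOnLoadAttribute`, `InitLog`, and `Cache` are hypothetical names invented for this example; only `RuntimeHelpers.RunClassConstructor` is the API mentioned in the thread. In the proposal, `InitializeAll` would be the body of the static constructor named by the module attribute; here it is called from `Main` so the sketch runs on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.CompilerServices;

// Hypothetical marker attribute; not part of the proposal or the BCL.
[AttributeUsage(AttributeTargets.Class)]
internal sealed class InitializeOnLoadAttribute : Attribute { }

internal static class InitLog
{
    internal static readonly List<string> Entries = new List<string>();
}

[InitializeOnLoad]
internal static class Cache
{
    static Cache() { InitLog.Entries.Add("Cache"); }
}

internal static class MyModuleInitializer
{
    internal static void InitializeAll()
    {
        var marked = typeof(MyModuleInitializer).Assembly
            .GetTypes()
            .Where(t => t.IsDefined(typeof(InitializeOnLoadAttribute), false))
            .OrderBy(t => t.FullName, StringComparer.Ordinal); // explicit, stable order

        foreach (var type in marked)
        {
            // Runs the type's static constructor if it has not already run.
            RuntimeHelpers.RunClassConstructor(type.TypeHandle);
        }
    }
}

internal static class Program
{
    private static void Main()
    {
        MyModuleInitializer.InitializeAll();
        Console.WriteLine(string.Join(", ", InitLog.Entries));
    }
}
```

Sorting by full type name is one arbitrary way to make the order deterministic; any other explicit rule would do.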

@ericstj
Member

ericstj commented Jun 19, 2019

all of them would be called in some order

Can that be done in a predictable, stable way? Is there any precedent for the compiler deciding the order of execution?
Maybe it is similar to static field initialization, but in that case you have a semi-predictable order: it is defined as textual order and, barring partial classes, the developer defines the order in a single file. In this case it spans files, which will have variable ordering depending on file-system sort order due to the globbing that happens in .NET SDK projects. What about allowing the attribute on a method and making it an error if it appears on more than one method? If folks need more than one, they can explicitly call them in a defined order. That said, I think @gafter's suggestion works just as well if the overhead isn't too high.

Are there any rules about what you can do inside a module constructor? Are you allowed to load other assemblies, make P/Invoke calls, call async code, etc.? If there are lots of rules, I can imagine it being hard to enforce them in the compiler, making this a somewhat dangerous feature warranting terms like "unsafe" or "dangerous".

@dsaf

dsaf commented Jun 19, 2019

So, what is the terminology: module or assembly? Can I have many modules per assembly? Is a C# module the same as an F# module, and the same as a .NET module?

[Obsolete("This attribute is only to be used in C# language version 9.0 or later", true)]

This sounds like the opposite of Obsolete :).

@yaakov-h
Member

This sounds like the opposite of Obsolete :).

@dsaf it's a trick used to stop older versions of the compiler from using something they don't understand. Newer versions ignore that exact obsoletion string.

ref struct uses the same technique.

As for what a module means, that's already defined as an attribute target. I assume this feature won't change the meaning.

@pinkfloydx33

When using Fody, you need to have a method with a special name and signature. Obviously this can't be relied on here and I like the attribute specifying the type with the static constructor (short of special syntax). In that method you can just call whatever other initialization you need. So if for some reason you needed more than one module initializer, you could just refactor that into a single method that explicitly invokes the rest (no need for reflection).

The only thing I dislike about the assembly-level attribute specifying the type is that it's hidden. A developer looking at a static constructor later may not realize that this is supposed to be a module initializer. This is the same problem you end up having with Fody. Not that it's a bad thing, but you need to make sure you document that method with warnings about removing code that is being relied on as a module initializer. If we could specify the attribute on the class/initializer directly, then it becomes a bit more obvious. The only problem is that now you could potentially have more than one such method and the compiler would have to search for it. In that case perhaps more than one detected attribute could issue a compiler error, though I'm not sure how that would impact the build process (i.e. slow it down).
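The "single method that explicitly invokes the rest" idea can be sketched as follows. `Startup`, `InitLogging`, and `InitPlugins` are hypothetical names; how `Run` ends up being invoked at module load depends on which variant of the proposal is adopted, so this sketch simply calls it from `Main`.

```csharp
using System;
using System.Collections.Generic;

internal static class Startup
{
    internal static readonly List<string> Steps = new List<string>();

    // Hypothetical per-feature initialization routines.
    internal static void InitLogging() => Steps.Add("logging");
    internal static void InitPlugins() => Steps.Add("plugins");

    // The single entry point: the order is exactly the order written here.
    internal static void Run()
    {
        InitLogging();
        InitPlugins();
    }
}

internal static class Program
{
    private static void Main()
    {
        Startup.Run();
        Console.WriteLine(string.Join(", ", Startup.Steps));
    }
}
```

The ordering lives in one place in source, so it does not depend on file order or on reflection.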

@Joe4evr
Contributor

Joe4evr commented Jun 19, 2019

The only thing I dislike about the assembly level attribute specifying the type is that it's hidden. A developer looking at a static constructor later may not realize that this is supposed to be a module initializer. This is the same problem you end up having with Fody. Not that it's a bad thing, but you need to make sure you document that method with warnings about removing code that is being relied on as a module initializer.

I see two (complementing) solutions for this:

  1. While really only a style guide, the new attribute can be placed in the same file as the Module Initializer method you want to call
  2. More thoroughly, the compiler can gain some extra knowledge about the attribute and verify that a cctor in the specified type exists, and otherwise report a compilation error:
[module: System.Runtime.CompilerServices.ModuleInitializerAttribute(typeof(MyModuleInitializer))]
// Error CS####: No static constructor specified in type 'MyModuleInitializer' to be called as module initializer

internal static class MyModuleInitializer
{
    //oops, someone removed this code, but now it can't build
}

@HaloFour
Contributor

I'm with @Joe4evr , I think that if this is the way in which module initializers are implemented that the compiler should check and enforce that a static constructor exists on the class.

However, if the compiler is making such a check I think it would be just as easy for the compiler to attempt to emit a static call to a well known static method rather than using a static class constructor:

[module: System.Runtime.CompilerServices.ModuleInitializerAttribute(typeof(MyModuleInitializer))]

internal static class MyModuleInitializer
{
    internal static void Initialize() {
        // do stuff here
    }
}

It would be a compiler error if that method is not resolved by the compiler at compile time.

In this case the compiler would only have to emit a static call to that method:

void .cctor()
{
    MyModuleInitializer.Initialize();
}

In my naive opinion this seems about as difficult as going the static constructor route, assuming that the compiler would check that such a constructor exists.

@masonwheeler

This is good to see. One thing I'd really like to see changed, though: make module constructors run eagerly.

Right now, module initializers, like class static constructors, run lazily; at some point after a module has loaded but before any code from that module runs, the module initializer will run. Unfortunately, this adds unnecessary coupling and complication to one of the best scenarios for module initializers: plugins. Ideally, you could load a plugin assembly, the module initializer would run, and it would register the plugins with the plugin system (through a known method in a dependent assembly.) But with lazy initialization, you can't do that; you need the plugin system to "reach into" the module somehow in order to activate it, probably with Reflection, and by that point there's no point in having a module initializer at all; you just use Reflection to search for plugins to register.

With static constructors, lazy initialization is needed because there's no good way to resolve dependency order eagerly. But with CLR assemblies, we have a well-established dependency order already built into the fundamental concept of assemblies, so that limitation doesn't apply. So it seems to me there's no good reason not to make it eager.

@HaloFour
Contributor

@masonwheeler

Seems like something you'd need to take to CoreCLR as the language currently can't influence how the initializers would behave.

@jkotas
Member

jkotas commented Jun 19, 2019

"overhead" of calling RuntimeHelpers.RunClassConstructor once

The overhead is that RuntimeHelpers.RunClassConstructor introduces an unnecessary dependency on the reflection stack. I agree that it is not a big deal for a lot of programs out there, but not all of them. The dependency on the reflection stack means that this feature won't be usable by folks who want to write lean-and-mean code without reflection dependencies, or where RuntimeHelpers.RunClassConstructor is not available, such as Unity3D mini-profiles.

Can that be done in a predictable, stable, way?

Partial classes solved this problem.

what you can do inside a module constructor? Are you allowed to load other assemblies, make pInvoke calls, call async code, etc?

There are no special rules. You can do anything in a module static constructor that you would do in a regular static constructor. DllMain != module constructor.

@mjsabby

mjsabby commented Aug 1, 2019

Is it possible to prioritize this in the 8.1 release?

@gafter
Member Author

gafter commented Aug 2, 2019

@mjsabby There are no current plans for an 8.1 release.

@gafter gafter added this to the X.X candidate milestone Aug 26, 2019
@gafter gafter modified the milestones: X.X candidate, 9.0 candidate Aug 28, 2019
@gafter
Member Author

gafter commented Aug 28, 2019

@jkotas Rather than using reflection, the compiler could implement this by injecting a static internal method into the type that does nothing, and then calling that method in the generated module initializer.
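A small sketch of why calling an empty static method is enough. For a type with an explicit static constructor, the runtime runs that constructor before the first access to any static member, so the call below triggers it without reflection. `TriggerClassConstructor` stands in for the compiler-synthesized method with an unspeakable name, and `Recorder` is a hypothetical helper added to make the order visible.

```csharp
using System;
using System.Collections.Generic;

internal static class Recorder
{
    internal static readonly List<string> Events = new List<string>();
}

internal static class MyModuleInitializer
{
    static MyModuleInitializer()
    {
        Recorder.Events.Add("static constructor");
    }

    // Stand-in for the synthesized do-nothing method.
    internal static void TriggerClassConstructor() { }
}

internal static class Program
{
    private static void Main()
    {
        Recorder.Events.Add("before call");
        MyModuleInitializer.TriggerClassConstructor();
        Recorder.Events.Add("after call");
        Console.WriteLine(string.Join(", ", Recorder.Events));
    }
}
```

In the real feature the call would sit in the emitted module `.cctor` rather than in `Main`.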

@jkotas
Member

jkotas commented Aug 28, 2019

compiler could implement this by injecting a static internal method into the type that does nothing, and then calling that method in the generated module initializer.

Yes, that would work great and address all my concerns.

@Grauenwolf

Grauenwolf commented Dec 30, 2019

The attribute can be potentially allowed on multiple methods in the module and then all of them would be called in some order from the module initializer.

I would vote against that part.

Just like we only get one Main method, you should only get one assembly initializer. It can call out to other functions if you need more organization, but anything more opens a nasty can of worms.

@chsienki
Contributor

chsienki commented Jan 10, 2020

One possible use for the multi-module initializers approach is for generated code.

Generated code often needs a way to 'register' itself at startup. Having multiple module initializers would allow the generated code to add its own initializer that would perform any specific initialization logic it needs, without the need for the user to manually add an explicit call to it.
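A sketch of the generated-code scenario, assuming the attribute-on-method design with more than one initializer allowed per module (which, to my knowledge, is what C# 9 ended up permitting). `HandlerRegistry` and the two "generated" classes are hypothetical; imagine each class being emitted by a different source generator. The sketch deliberately does not rely on the relative order of the two initializers.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

internal static class HandlerRegistry
{
    private static readonly HashSet<string> Names = new HashSet<string>();

    internal static void Register(string name) => Names.Add(name);
    internal static bool IsRegistered(string name) => Names.Contains(name);
}

// Hypothetical output of one generator.
internal static class GeneratedJsonSupport
{
    [ModuleInitializer]
    internal static void Init() => HandlerRegistry.Register("json");
}

// Hypothetical output of another generator.
internal static class GeneratedXmlSupport
{
    [ModuleInitializer]
    internal static void Init() => HandlerRegistry.Register("xml");
}

internal static class Program
{
    private static void Main()
    {
        // No explicit registration call is needed in user code.
        Console.WriteLine(HandlerRegistry.IsRegistered("json"));
        Console.WriteLine(HandlerRegistry.IsRegistered("xml"));
    }
}
```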

@masonwheeler

@Grauenwolf If we're going to add this, it makes sense to look elsewhere for similar features and see what works and what doesn't.

Probably the best analogue comes from .NET architect Anders Hejlsberg's previous project, Delphi. It allows you to put an initialization section in any code file, and at compile-time the compiler sets everything up so they will all get executed one after the other and automagically gets the order of execution right so you don't end up with dependency problems.

That last bit (getting dependency order right) is based on a specific, restrictive property of Delphi's compilation that doesn't apply to C#, so we can't expect to be able to copy that successfully. But the basic principle of putting your initialization code together with the code it's initializing, and then having the compiler gather them into a single overall initializer, is a sound one. What are the alternatives? I can only think of two, and both are bad:

  1. You write a bunch of local initialization routines, then you have to keep track of all of them manually and call them all in the module initializer
  2. You don't write local initialization routines at all, and you have to write a module initializer that directly reaches into every piece of your assembly that needs to be initialized.

Both of these are significantly worse than having the compiler set it up for you. As for ordering, most of the time it won't be necessary, especially since static constructors will take care of most of the cases of initialization-time dependencies between classes. But for cases where it is, there should be some way to specify an explicit ordering, and any initializer routines that don't have an ordering set will run (in an undefined order) after all the ones that do.
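The ordering rule suggested here (explicitly ordered initializers first, everything else afterwards) could be sketched like this. `InitializerPlan` and the tuple shape are assumptions made for illustration; nothing like this exists in the compiler. The sketch keeps the unordered entries in registration order, which is just one valid choice for an "undefined" order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

internal static class InitializerPlan
{
    // Returns initializer names in execution order: entries with an explicit
    // order first (ascending), then the entries without one.
    internal static List<string> Sort(IEnumerable<(string Name, int? Order)> entries)
    {
        var list = entries.ToList();
        var ordered = list.Where(e => e.Order.HasValue).OrderBy(e => e.Order.Value);
        var unordered = list.Where(e => !e.Order.HasValue);
        return ordered.Concat(unordered).Select(e => e.Name).ToList();
    }
}

internal static class Program
{
    private static void Main()
    {
        var plan = InitializerPlan.Sort(new (string, int?)[]
        {
            ("metrics", null),
            ("plugins", 2),
            ("logging", 1),
        });
        Console.WriteLine(string.Join(", ", plan)); // logging, plugins, metrics
    }
}
```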

@jnm2
Contributor

jnm2 commented Apr 8, 2020

Re: https://github.com/RikkiGibson/csharplang/blob/module-initializers/proposals/module-initializers.md#motivation

  • Enable source generators to run some global initialization logic without the user needing to explicitly call anything
    • (in which scenarios is a static constructor insufficient/undesirable for this?)

One example is NUnit's VSTest adapter. We would like it to contain more dependencies as resources which requires us to hook up AssemblyResolve/Resolving event handlers before any of the types in the NUnit VSTest adapter assembly are loaded. However, VSTest loads the assembly and scans all types looking for implementations of VSTest interfaces. Without a module initializer, there's no predictable place where we can hook up AssemblyResolve/Resolving in time to prevent a type load exception. Giving every type in the assembly a static constructor seems problematic. Prototyping shows that module initializers solve this neatly.

@TFTomSun

TFTomSun commented Apr 9, 2020

@jnm2 Unfortunately a module initializer is also not a safe solution for that specific case. We tried that already. The module initializer is invoked when the first type of the assembly is constructed. But NUnit analyzes the assembly at the reflection level, which causes the dependencies to be resolved, but no type is constructed.
A safe approach is using a custom app domain manager. It is invoked early enough to set up the assembly resolution logic.

@jnm2
Contributor

jnm2 commented Apr 9, 2020

@TFTomSun Thanks, that was helpful. You are correct. I was able to build examples on .NET Core and .NET Framework which show that Assembly.GetTypes() does not trigger the module initializer.

In this case the problem I'm trying to solve is that VSTest analyzes the NUnit adapter assembly on a reflection level. (VSTest loops through Assembly.GetTypes() looking for VSTest interface implementations.)

Since appdomains are not a thing in .NET Core, I think this means we should just ask VSTest to run this code prior to examining assemblies using reflection:

RuntimeHelpers.RunModuleConstructor(adapterAssembly.ManifestModule.ModuleHandle);

This is still a neat solution compared to anything else I'm aware of that we could do.
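A sketch of what that host-side call might look like in context. `AdapterLoader` and `LoadTypes` are hypothetical names, and the sketch inspects its own assembly only so that it is self-contained; a real host would pass the adapter assembly it has just loaded.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

internal static class AdapterLoader
{
    // Forces the module initializer (if any) to run before the
    // reflection-only scan, which would not trigger it by itself.
    internal static Type[] LoadTypes(Assembly adapterAssembly)
    {
        RuntimeHelpers.RunModuleConstructor(adapterAssembly.ManifestModule.ModuleHandle);
        return adapterAssembly.GetTypes();
    }
}

internal static class Program
{
    private static void Main()
    {
        Type[] types = AdapterLoader.LoadTypes(typeof(Program).Assembly);
        Console.WriteLine(types.Length);
    }
}
```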

@jnm2
Contributor

jnm2 commented Apr 16, 2020

@AlekseyTs brought up a good question. The concept of a module initializer does not appear in ECMA-335. Does anyone know what happened to it?

https://blogs.msdn.microsoft.com/junfeng/2005/11/19/module-initializer-a-k-a-module-constructor/ lists what appears to be a section that was intended to follow the Type Initializer section:

CLR v2.0 introduces Module Initializer. It is very similar to type initializer, with the difference that the module initializer is associated with a module, instead of a type. Since the module initializer is not associated with any type, it is a global function.

The following describes module initializer.

1.1.2 Module Initializer

Modules may contain special methods called module initializers to initialize the module itself.

All modules may have a module initializer. This method shall be static, a member of the module, take no parameters, return no value, be marked with rtspecialname and specialname, and be named .cctor.

There are no limitations on what code is permitted in a module initializer. Module initializers are permitted to run and call both managed and unmanaged code.

1.1.2.1 Module Initialization Guarantees

The CLI shall provide the following guarantees regarding module initialization:

  1. The semantics of when, and what triggers execution of module initialization methods, is as follows:

    A. A module may have a module initializer method, or not.

    B. The module’s initializer method is executed at, or sometime before, first access to any types, methods, or data defined in the module

  2. A module initializer shall run exactly once for any given module, unless explicitly called by user code

  3. No method other than those called directly or indirectly from the module initializer will be able to access the types, methods, or data in a module before its initializer completes execution.

Since C# does not support global functions, C# does not support module initializer.

C++/CLI internally uses module initializer, as described in the MSDN VC++ chat script. (http://msdn.microsoft.com/chats/transcripts/vstudio/vstudio_091604.aspx)
[Use https://web.archive.org/web/20080309053655/http://msdn.microsoft.com/chats/transcripts/vstudio/vstudio_091604.aspx]

The following is the ildasm output of the module initializer of msvcm80.dll.

[...]

@jnm2
Contributor

jnm2 commented Apr 16, 2020

This probably explains why it's not in ECMA-335.

Host: MartynL (Microsoft)
Q: Is the "module constructor" a CLR feature or a CLI feature?
A: Right now the module constructor has not been added to the CLI as I understand it. Generally our features for mixed mode are not part of the CLI.

https://web.archive.org/web/20080309053655/http://msdn.microsoft.com/chats/transcripts/vstudio/vstudio_091604.aspx

But are any guarantees actually documented anywhere?

@jnm2
Contributor

jnm2 commented Apr 16, 2020

This is the only mention of module initializers that I can find in CLR docs:

@jnm2
Contributor

jnm2 commented Apr 16, 2020

Given that a lot of us have been dependent on module initializers working for a number of years, and the pain that @gafter wants to address is real, is the next step to submit a PR to add documentation of it to https://github.com/dotnet/runtime/tree/master/docs/design/features?

@jkotas
Member

jkotas commented Apr 16, 2020

Yes, I think it would be fine to start an ECMA-335 Augments doc where we will collect additions to ECMA-335 we would like to make, but do not currently have a good way to make. We have a number of those.

@jnm2
Contributor

jnm2 commented Apr 16, 2020

@jkotas Awesome! Should I start with opening an issue so that someone else can start the document, or should I start with a PR to add e.g. runtime/docs/design/features/ECMA-335-Augments.md with a minimal title and an edited version of the "Module Initializer" section quoted above?

@jkotas
Member

jkotas commented Apr 16, 2020

You are welcome to submit a PR to get it started.

@jkotas
Member

jkotas commented Apr 16, 2020

Actually, we have a doc like that already: https://github.com/dotnet/runtime/blob/master/src/libraries/System.Reflection.Metadata/specs/Ecma-335-Issues.md . We should build on top of it. Rename it ...Augments, and maybe move this whole directory to a more discoverable place, e.g. docs\design\specs .

@jcouv jcouv changed the title Champion: Module Initializers Champion: Module Initializers (VS 16.8, .NET 5) Sep 1, 2020
@MadsTorgersen MadsTorgersen modified the milestones: 9.0 candidate, 9.0 Sep 9, 2020
@333fred 333fred added the Implemented Needs ECMA Spec This feature has been implemented in C#, but still needs to be merged into the ECMA specification label Oct 16, 2020