Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[EnC] CustomAttribute rows in delta seem out of order #60125

Closed
lambdageek opened this issue Mar 11, 2022 · 6 comments
Closed

[EnC] CustomAttribute rows in delta seem out of order #60125

lambdageek opened this issue Mar 11, 2022 · 6 comments
Labels
untriaged Issues and PRs which have not yet been triaged by a lead

Comments

@lambdageek
Copy link
Member

Version Used: using https://github.com/dotnet/hotreload-utils 1.0.2-alpha.0.22157.2 which uses roslyn 4.2.0-3.22156.4

Steps to Reproduce:

Compile the following baseline and apply changes below, then look at the metadata delta with mdv

Baseline:

// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
using System;


namespace System.Reflection.Metadata.ApplyUpdate.Test
{
    public class AddNestedClass
    {
        public AddNestedClass()
        {
        }

        public string TestMethod()
        {
            return "123";
        }

    }
}

Modification:

// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
using System;


namespace System.Reflection.Metadata.ApplyUpdate.Test
{
    public class AddNestedClass
    {
        public AddNestedClass()
        {
        }

        public string TestMethod()
        {
            var n = new Nested();
	    n.Evt += new Action<string> (n.DefaultHandler);
            return n.buf;
        }

        private class Nested {
            public Nested() { }
	    public event Action<string> Evt;

	    public void RaiseEvt () {
		Evt ("789");
	    }

	    public string buf;

	    public void DefaultHandler (string s) {
		this.buf = s;
	    }
        }
    }
}

Compile with nullability enabled.

The mdv output is attached:
add-event.mdv.txt

Expected Behavior:

the custom attribute with parent 0x06000007 in both the baseline and Generation 1 will be at row 0x1a

Actual Behavior:

The EnCMap says:

1c: 0x0c00001a (CustomAttribute)  0    0x00001a  update  

So row 0x1a of the Custom Attributes table is updated by row 1 from Generation 1:

1: 0x04000004 (Field)      0x0a000012 (MemberRef)  01-00-00-00 (#356/7e)              

which has a completely different parent and .ctor

The actual update to Parent 0x06000007 is in row 3 of the custom attribute table:

3: 0x06000007 (MethodDef)  0x06000004 (MethodDef)  01-00-01-00-00 (#350/78)           

Note that the EnCLog is in a kind of funny order (field attributes before method attributes?)

2f: 0x0c00001b (CustomAttribute)  0                          
30: 0x0c00001c (CustomAttribute)  0                          
31: 0x0c00001a (CustomAttribute)  0                          
32: 0x0c00001d (CustomAttribute)  0                          
33: 0x0c00001e (CustomAttribute)  0                          

Why it's a problem:

I think CoreCLR will accept this as is, but for Mono we expect row updates (like Gen 1 row 1) to have the same Parent as the previous generation, or to set the Parent to 0 (for a deletion). Updates (as opposed to row additions) shouldn't invalidate our assumption that all custom attributes for a given Parent are contiguous.

@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged Issues and PRs which have not yet been triaged by a lead label Mar 11, 2022
@lambdageek
Copy link
Member Author

/cc @tmat @davidwengier

@lambdageek
Copy link
Member Author

In case it's not clear, I think the custom attribute rows in Gen 1 should be like this:

1: 0x06000007 (MethodDef)  0x06000004 (MethodDef)  01-00-01-00-00 (#350/78)           
2: 0x04000004 (Field)      0x0a000012 (MemberRef)  01-00-00-00 (#356/7e)              
3: 0x04000004 (Field)      0x0a000013 (MemberRef)  01-00-00-00-00-00-00-00 (#35b/83)  
4: 0x06000009 (MethodDef)  0x0a000012 (MemberRef)  01-00-00-00 (#356/7e)              
5: 0x0600000a (MethodDef)  0x0a000012 (MemberRef)  01-00-00-00 (#356/7e)              

with EnCLog and EnCMap in the current order.

So row 1 is an update to row 0x1a of Gen0 and rows 2-5 define the added rows 0x1b-0x1e.

@davidwengier
Copy link
Contributor

davidwengier commented Mar 15, 2022

our assumption that all custom attributes for a given Parent are contiguous.

I can see how this could be thought to be wrong, but even if we changed the order of the CA table in Gen 1 to what you suggested, what would happen if you add another attribute to TestMethod (0x06000007) in Gen 2? Or if there where two methods in the original, each with one attribute? Or an attribute on the class, and then you added another attribute in Gen 1? I don't think its possible to emit deltas that can be applied without re-sorting the table, and keep all parents in the CA table contiguous. We would have to leave huge gaps in the CA table that we fill in later, or something.

Additionally the custom attribute table in Gen 1 still has to adhere to the rules of the custom attribute table, namely that it's sorted by parent, so your suggested layout violates the spec AFAIK.

@davidwengier
Copy link
Contributor

TL;DR isn't this an accidentally contrived example, because you only have one method and it just happens to be at the end of the custom attribute table in gen 0?

@lambdageek
Copy link
Member Author

it's sorted by parent, so your suggested layout violates the spec AFAIK.

Thinking about this a little more, I don't think there's any layout that won't violate some aspect of the spec - as long as CA additions are allowed, we either lose contiguity or the sorting property or both. So I think once there are any CA table edits (either additions or modifications) the runtime has to fall back to a linear scan of the whole table in order to accurately discover all the attributes.

So I think roslyn behavior is probably okay here, however I was surprised that CA table modifications can change the parent of a row from one generation to the next. But it's probably just something I'll have to deal with in the runtime.

lambdageek added a commit to lambdageek/runtime that referenced this issue Mar 16, 2022
allow CustomAttribute row modifications to change Parent, while
dotnet/roslyn#60125 is being sorted out
lambdageek added a commit to lambdageek/runtime that referenced this issue Mar 17, 2022
allow CustomAttribute row modifications to change Parent, while
dotnet/roslyn#60125 is being sorted out
lambdageek added a commit to dotnet/runtime that referenced this issue Mar 18, 2022
Fully implement support for the `NewTypeDefinition` EnC capability: a hot reload edit can add a new type definition (class, struct, interface, enum) that contains new fields, methods, properties, events, and nested classes, and can have generic params.  Also add reflection support for all of the above.

This is sufficient to support hot reload of types tagged with [`CreateNewOnMetadataUpdateAttribute`](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.compilerservices.createnewonmetadataupdateattribute?view=net-6.0).

Contributes to #63643 and #57365

---

The implementation introduces `MonoAddedDefSkeleton` which is a bookkeeping structure that records the metadata indexes of where to find methods, fields, properties and events for a newly-added class.  The issue is that normally `mono_class_create_from_typedef` expects to find all that stuff in a contiguous block starting at the row pointed by the `MONO_TYPEDEF_FIELD_LIST` or `MONO_TYPEDEF_METHOD_LIST` columns of the new typedef.  But in EnC deltas, those columns are zeroed out, and instead the EnCLog functions like `ENC_FUNC_ADD_METHOD` and `ENC_FUNC_ADD_FIELD` explicitly record when a new row is added that is relevant to a particular type definition.

So during the pass over the EnCLog, we start creating skeletons when we see a new row added to the typedef table.  Then when we see the add functions, we record the row indices of the field, method, etc tables.  As far as I can tell, the rows for a particular type definition are contiguous, so we just record the first index and the count.

At the end of the pass, we save the skeletons on the metadata delta, and when code finally creates the `MonoClass` for a particular freshly-added type definition, we check the skeleton and use the recorded rows to populate the fields and methods.  Event and property data is created on-demand (the demand being reflection) in basically the same way.

One important note: we try very hard not to accidentally materialize a newly-added class as a `MonoClass` during the update itself - we need to at least get through the entire EnCLog - otherwise we might not see every field/method/etc and set up the class incorrectly.

The upshot is that the newly-added `MonoClass` behaves like any other "normal" class - all its fields are laid out normally (they're stored in the object, not in a separate hash table).  It can be generic.  It can have base types, interfaces, etc.

This is different from how added fields, props and events will be implemented on existing classes - we won't modify `MonoClass:fields` or the `MonoClassPropertyInfo:properties` arrays - once they have been allocated, we don't want to mess with them.  Instead additions to existing classes with create `MonoClassMetadataUpdateField` (`MonoClassMetadataUpdateProperty` and `MonoClassMetadataUpdateEvent`) structs that are going to be stored on the `MonoClassMetadataUpdateInfo` that's associated with each `MonoClass`.  The various iterators like `mono_class_get_fields` and `mono_class_get_props` will be modified to return the added fields/props/etc after the existing ones.

This is already implemented for fields.  Props and Events will be part of a separate effort to implement reflection support for them.

---

* Checkpoint. Infrastructure for added types

* register member->parent lookups for newly-added classes and members

* fix off by one error creating class from skeleton

* AddNestedClass test works with mono now; also make it generic

* checkpoint: adding properties

   it "works" but only because we don't look very hard for the new properties anywhere.  reflection is probably pretty broken.

* Add a property to AddNestedClass test

* remove allow class/field/method/property ifdefs

   keep the instance field ifdef for now

* add event to AddNestedClass

* add DISALLOW_BROKEN_PARENT ifdef

   allow CustomAttribute row modifications to change Parent, while dotnet/roslyn#60125 is being sorted out

* checkpoint: adding events

* Add new test ReflectionAddNewType

* Add new types to the image name cache

* Assembly.GetTypes and additional tests

* Make nested types work

* GetInterfaces on new types; also Activator.CreateInstance

* Mark mono_class_get_nested_classes_property as a component API

   also mono_class_set_nested_classes_property

* Add GetMethods, GetFields, GetMethod, GetField test

* [class] Implement added method iteration

   Change the iterator from storing MonoMethod** values to storing an iteration count. For added methods, when the iteration count is more than mono_class_get_method_count, run through the hot reload added_members list and
iterate over any relevant methods

* Implement added field iteration

* Add a GetField(BindingFlags.Static) test case

* Add Enum.GetNames and GetProperties tests - not working yet

* Mark mono_class_get_method_count as a component API

   also mono_class_get_field_count

* Enum values work

* Use GSList for added_fields (and props, events); add from_update bit

   Use GSList to simplify the concurrency story for accessing added_fields (and added_props and added_events, eventually): unlike a GPtrArray it won't resize, so we don't need to lock readers.

   Add a from_update bit to MonoProperty and MonoEvent - when props or events are added to existing classes, they will have the bit set.  That means that pointer arithmetic to figure out the prop (or event) tokens won't be usable (since
they're not allocated in one big block).

* Reflection on props in new classes works

* events on new classes work

* Add CustomAttribute reflection test

* remove test for props on existing type - it's not part of this PR

* instance field additions to existing types aren't supported yet

* rm TODO and FIXME

* store prop and event from_update bit in attrs

   The upper 16 bits aren't used for ECMA flags, so grab a bit for metadata-update

* remove instance fields from reflection test

   since Mono doesn't support them yet

* make the Browser EAT lanes happy

   Add some fields to the baseline versions of some test assemblies to keep some types that are used in the deltas from getting trimmed away
radekdoulik pushed a commit to radekdoulik/runtime that referenced this issue Mar 30, 2022
Fully implement support for the `NewTypeDefinition` EnC capability: a hot reload edit can add a new type definition (class, struct, interface, enum) that contains new fields, methods, properties, events, and nested classes, and can have generic params.  Also add reflection support for all of the above.

This is sufficient to support hot reload of types tagged with [`CreateNewOnMetadataUpdateAttribute`](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.compilerservices.createnewonmetadataupdateattribute?view=net-6.0).

Contributes to dotnet#63643 and dotnet#57365

---

The implementation introduces `MonoAddedDefSkeleton` which is a bookkeeping structure that records the metadata indexes of where to find methods, fields, properties and events for a newly-added class.  The issue is that normally `mono_class_create_from_typedef` expects to find all that stuff in a contiguous block starting at the row pointed by the `MONO_TYPEDEF_FIELD_LIST` or `MONO_TYPEDEF_METHOD_LIST` columns of the new typedef.  But in EnC deltas, those columns are zeroed out, and instead the EnCLog functions like `ENC_FUNC_ADD_METHOD` and `ENC_FUNC_ADD_FIELD` explicitly record when a new row is added that is relevant to a particular type definition.

So during the pass over the EnCLog, we start creating skeletons when we see a new row added to the typedef table.  Then when we see the add functions, we record the row indices of the field, method, etc tables.  As far as I can tell, the rows for a particular type definition are contiguous, so we just record the first index and the count.

At the end of the pass, we save the skeletons on the metadata delta, and when code finally creates the `MonoClass` for a particular freshly-added type definition, we check the skeleton and use the recorded rows to populate the fields and methods.  Event and property data is created on-demand (the demand being reflection) in basically the same way.

One important note: we try very hard not to accidentally materialize a newly-added class as a `MonoClass` during the update itself - we need to at least get through the entire EnCLog - otherwise we might not see every field/method/etc and set up the class incorrectly.

The upshot is that the newly-added `MonoClass` behaves like any other "normal" class - all its fields are laid out normally (they're stored in the object, not in a separate hash table).  It can be generic.  It can have base types, interfaces, etc.

This is different from how added fields, props and events will be implemented on existing classes - we won't modify `MonoClass:fields` or the `MonoClassPropertyInfo:properties` arrays - once they have been allocated, we don't want to mess with them.  Instead additions to existing classes with create `MonoClassMetadataUpdateField` (`MonoClassMetadataUpdateProperty` and `MonoClassMetadataUpdateEvent`) structs that are going to be stored on the `MonoClassMetadataUpdateInfo` that's associated with each `MonoClass`.  The various iterators like `mono_class_get_fields` and `mono_class_get_props` will be modified to return the added fields/props/etc after the existing ones.

This is already implemented for fields.  Props and Events will be part of a separate effort to implement reflection support for them.

---

* Checkpoint. Infrastructure for added types

* register member->parent lookups for newly-added classes and members

* fix off by one error creating class from skeleton

* AddNestedClass test works with mono now; also make it generic

* checkpoint: adding properties

   it "works" but only because we don't look very hard for the new properties anywhere.  reflection is probably pretty broken.

* Add a property to AddNestedClass test

* remove allow class/field/method/property ifdefs

   keep the instance field ifdef for now

* add event to AddNestedClass

* add DISALLOW_BROKEN_PARENT ifdef

   allow CustomAttribute row modifications to change Parent, while dotnet/roslyn#60125 is being sorted out

* checkpoint: adding events

* Add new test ReflectionAddNewType

* Add new types to the image name cache

* Assembly.GetTypes and additional tests

* Make nested types work

* GetInterfaces on new types; also Activator.CreateInstance

* Mark mono_class_get_nested_classes_property as a component API

   also mono_class_set_nested_classes_property

* Add GetMethods, GetFields, GetMethod, GetField test

* [class] Implement added method iteration

   Change the iterator from storing MonoMethod** values to storing an iteration count. For added methods, when the iteration count is more than mono_class_get_method_count, run through the hot reload added_members list and
iterate over any relevant methods

* Implement added field iteration

* Add a GetField(BindingFlags.Static) test case

* Add Enum.GetNames and GetProperties tests - not working yet

* Mark mono_class_get_method_count as a component API

   also mono_class_get_field_count

* Enum values work

* Use GSList for added_fields (and props, events); add from_update bit

   Use GSList to simplify the concurrency story for accessing added_fields (and added_props and added_events, eventually): unlike a GPtrArray it won't resize, so we don't need to lock readers.

   Add a from_update bit to MonoProperty and MonoEvent - when props or events are added to existing classes, they will have the bit set.  That means that pointer arithmetic to figure out the prop (or event) tokens won't be usable (since
they're not allocated in one big block).

* Reflection on props in new classes works

* events on new classes work

* Add CustomAttribute reflection test

* remove test for props on existing type - it's not part of this PR

* instance field additions to existing types aren't supported yet

* rm TODO and FIXME

* store prop and event from_update bit in attrs

   The upper 16 bits aren't used for ECMA flags, so grab a bit for metadata-update

* remove instance fields from reflection test

   since Mono doesn't support them yet

* make the Browser EAT lanes happy

   Add some fields to the baseline versions of some test assemblies to keep some types that are used in the deltas from getting trimmed away
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

No branches or pull requests

2 participants