Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add JSON type hierarchies #67961

Merged
merged 8 commits into from
Apr 19, 2022

Conversation

eiriktsarpalis
Copy link
Member

@eiriktsarpalis eiriktsarpalis commented Apr 13, 2022

This PR implements polymorphism for System.Text.Json as a preview feature.

Walkthrough of Changes

Contributes to #63747.

@dotnet-issue-labeler
Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@ghost
Copy link

ghost commented Apr 13, 2022

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis
See info in area-owners.md if you want to be subscribed.

Issue Details

This PR implements polymorphism for System.Text.Json as a preview feature.

Walkthrough of Changes

  • Adds the JsonDerivedTypeAttribute and JsonPolymorphicAttribute types for configuring polymorphism using attributes.
  • Add the JsonPolymorphicTypeConfiguration class for configuring polymorphism without attributes.
  • Adds a provisional JsonSerializerOptions.PolymorphicTypeConfiguration property. Used as a temporary configuration entry point until Developers can customize the JSON serialization contracts of their types #63686 has been implemented.
  • Extends the existing polymorphic serialization infrastructure from System.Object types to arbitrary classes and interfaces (cf. JsonConverter.MetadataHandling.cs, WriteStack.cs).
  • Adds infrastructure for polymorphic deserialization. Unlike serialization, dispatch to the derived converter happens in the converter layer (since it can only be determined mid-deserialization). Polymorphic deserialization is implemented in four converters: ObjectDefaultConverter, ObjectWithParameterizedConstructorConverter, JsonCollectionConverter and JsonDictionaryConverter.
  • PolymorphicTypeResolver.cs validates polymorphic type configuration and provides an entrypoint for mapping runtime types/type discriminators to derived JsonTypeInfo metadata.
  • Updates the source generator to support polymorphic attributes: derived types are added to the implicitly registered types of a JsonSerializerContext and a diagnostic warning is emitted for JsonSourceGenerationMode.Serialization.

Contributes to #63747.

Author: eiriktsarpalis
Assignees: eiriktsarpalis
Labels:

area-System.Text.Json, new-api-needs-documentation

Milestone: -

@eiriktsarpalis eiriktsarpalis added this to the 7.0.0 milestone Apr 13, 2022
@steveharter
Copy link
Member

Please share the benchmarks and analysis of that.

@steveharter
Copy link
Member

Is it feasible in 7.0 to add a new converter model that supports this polymorphic feature (and the host of all the other requests?).

@steveharter
Copy link
Member

Is it feasible in 7.0 to support this polymorphic feature with source-gen?

@eiriktsarpalis
Copy link
Member Author

Is it feasible in 7.0 to add a new converter model that supports this polymorphic feature (and the host of all the other requests?).

Yes, this is included in the acceptance criteria for the contract mode (#63686)

Is it feasible in 7.0 to support this polymorphic feature with source-gen?

This PR incorporates support for metadata-based source-gen.

Current.PolymorphicJsonTypeInfo = Current.JsonTypeInfo;
Current.JsonTypeInfo = derivedJsonTypeInfo.PropertyInfoForTypeInfo.JsonTypeInfo;
Current.JsonPropertyInfo = Current.JsonTypeInfo.PropertyInfoForTypeInfo;
Current.NumberHandling ??= Current.JsonPropertyInfo.NumberHandling;
Copy link
Member

@krwq krwq Apr 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't this mean the base type will take precedence over derived type? should this be something like:

Current.NumberHandling = Current.JsonPropertyInfo.NumberHandling ?? Current.NumberHandling;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or just simply Current.NumberHandling = Current.JsonPropertyInfo.EffectiveNumberHandling

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, but that probably needs to consult the parent number handling as happens in the Push operation:

Current.NumberHandling = Parent.NumberHandling ?? Current.JsonPropertyInfo.EffectiveNumberHandling;

Strange, I thought I had changed this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Current.JsonPropertyInfo.NumberHandling ?? parent.EffectiveNumberHandling - note that EffectiveNumberHandling already takes options, declaring type etc into consideration

Debug.Assert(Current.PolymorphicSerializationState == PolymorphicSerializationState.PolymorphicReEntrySuspended);

// Swap out the two values as we resume the polymorphic converter
(Current.JsonTypeInfo, Current.PolymorphicJsonTypeInfo) = (Current.PolymorphicJsonTypeInfo, Current.JsonTypeInfo);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does that mean the PolymorphicJsonTypeInfo changes semantic meaning to OriginalJsonTypeInfo here? can you add a comment why we do this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Motivation is minimizing the amount of additional reference type fields we're adding to the stackframe struct. Doing this without swapping would require two additional pointers, which would contribute to the further bloating of the struct. The concern is that this would make the Push() and Pop() operations more expensive in a non pay-for-play way. I'll see if I can add a comment explaining this.

[InlineData(typeof(Interface))]
[InlineData(typeof(Class))]
[InlineData(typeof(GenericClass<int>))]
public static void SupportedBaseTypeArgument_ShouldSucced(Type baseType)
Copy link
Member

@krwq krwq Apr 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Succeed

Copy link
Member

@krwq krwq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm conditionally approving this:

  • mostly looks good, I've put feedback on whatever I feel might cause any issues but that won't affect functionality and can be addressed later
  • couple of nits
  • please file issue for anything remaining and make sure to address it after snap

@eiriktsarpalis eiriktsarpalis changed the title Add JSON type hierarchies as a preview feature Add JSON type hierarchies Apr 19, 2022
@eiriktsarpalis eiriktsarpalis merged commit bbb389d into dotnet:main Apr 19, 2022
@eiriktsarpalis eiriktsarpalis deleted the json-type-hierarchies branch April 19, 2022 13:49
{
JsonEncodedText jsonEncodedName = JsonEncodedText.Encode(customPropertyName, options.Encoder);

// Check if the property name conflicts with other metadata property names
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we opening ourselves up to breaking user apps in the future if they use a custom metadata tag that's unused in STJ now but then used in future releases?

Copy link
Member Author

@eiriktsarpalis eiriktsarpalis Apr 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't think so, this particular code path is only exercised for types that have explicitly opted into polymorphism which is new functionality.

Understood now, we might need to be careful about how to deal with that should new metadata names be added in the future. We might want to relax that check so that it only throws for conflicts in metadata properties that are opted in for the particular type.

directhex pushed a commit to directhex/runtime that referenced this pull request Apr 21, 2022
* Add JSON type hierarchies as a preview feature.

* fix incorrect preview features configuration in ref project

* Add check for conflicting custom type discriminator property names.

* Add null checks & tests to public APIs

* fix typo in constant name

* fix file ordering

* address feedback

* Remove preview features annotation from new APIs
@AndyAyersMS
Copy link
Member

AndyAyersMS commented Apr 27, 2022

@eiriktsarpalis do you expect any perf impact from this? It's possible the regression we see here is related: dotnet/perf-autofiling-issues#4894.

Also regression seen in

Note we may be duplicate filing some of these.

@eiriktsarpalis
Copy link
Member Author

Performance regressions as a result of this change are likely (and potentially unavoidable), particularly on the deserialization side of things. When running benchmarks locally I wasn't able to isolate any clear-cut regressions, but I'll investigate this and any other reports coming from the autofiler.

@dakersnar
Copy link
Contributor

This might be the cause of some more regressions on all configs (see details).
Reported here: dotnet/perf-autofiling-issues#4786

image

System.Text.Json.Tests.Perf_Get.GetBoolean

Result Ratio Operating System Bit Processor Name
Slower 0.71 ubuntu 20.04 Arm64 Unknown processor
Slower 0.70 Windows 10 Arm64 Microsoft SQ1 3.0 GHz
Slower 0.69 Windows 11 Arm64 Microsoft SQ1 3.0 GHz
Slower 0.71 Windows 11 Arm64 Unknown processor
Slower 0.61 macOS Monterey 12.3 Arm64 Apple M1 Max
Slower 0.53 Windows 10 X64 Intel Xeon Platinum 8272CL CPU 2.60GHz
Slower 0.72 Windows 10 X64 Intel Xeon CPU E5-1650 v4 3.60GHz
Slower 0.62 Windows 10 X64 Intel Core i7-6700 CPU 3.40GHz (Skylake)
Slower 0.86 Windows 11 X64 AMD Ryzen Threadripper PRO 3945WX 12-Cores
Slower 0.80 Windows 11 X64 AMD Ryzen 9 5950X
Slower 0.60 Windows 11 X64 Intel Core i7-8700 CPU 3.20GHz (Coffee Lake)
Slower 0.61 Windows 11 X64 Intel Core i9-9900T CPU 2.10GHz
Slower 0.74 alpine 3.13 X64 Intel Core i7-7700 CPU 3.60GHz (Kaby Lake)
Slower 0.66 pop 22.04 X64 Intel Core i7-6600U CPU 2.60GHz (Skylake)
Slower 0.69 ubuntu 18.04 X64 Intel Xeon CPU E5-1650 v4 3.60GHz
Slower 0.84 ubuntu 18.04 X64 Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge)
Slower 0.69 ubuntu 20.04 X64 AMD Ryzen 9 5900X
Slower 0.75 ubuntu 20.04 X64 Intel Core i7-8700 CPU 3.20GHz (Coffee Lake)
Slower 0.71 Windows 10 X86 Intel Xeon CPU E5-1650 v4 3.60GHz
Slower 0.42 Windows 10 X86 Intel Core i7-6700 CPU 3.40GHz (Skylake)
Slower 0.69 Windows 11 X86 AMD Ryzen Threadripper PRO 3945WX 12-Cores
Slower 0.68 macOS Monterey 12.2.1 X64 Intel Core i7-5557U CPU 3.10GHz (Broadwell)
Slower 0.68 macOS Monterey 12.3.1 X64 Intel Core i7-4870HQ CPU 2.50GHz (Haswell)

@eiriktsarpalis
Copy link
Member Author

It's unlikely that these changes would impact the specific microbenchmarks, since they're measuring the Utf8JsonReader layer that has not been touched by this PR. Could it be related to a codegen change?

@AndyAyersMS
Copy link
Member

Could it be related to a codegen change?

That's always a possibility, so let's dig in a bit.

Recent history for DeserializeFromUtf8Bytes shows what looks like two regressions for windows x64

newplot - 2022-05-13T093847 624

The first of these on 4/19 corresponds to 544f720...4881a63

Looking over that list again there seem to be two JSON related changes; none of the others seem relevant.

The second regression on 4/26-27 is harder to pin down, it looks like we didn't have great perf coverage and so there is a wide commit range: a077f27...9342649

If we look at other platforms to see if they also regressed, we see

Unix x64 -- similar pattern but missing data around the point of the first regression

newplot - 2022-05-13T094919 864

Ubuntu arm64 probably too noisy to tell, also an outage
newplot - 2022-05-13T095205 016

Windows arm64 sees the first regression, commit range f1e10a5...6c7ce85 overlaps with windows x64, and contains both those JSON changes
newplot - 2022-05-13T095332 298

So it sure seems like it's one of those two changes causing that first regression. The second one I'm not sure about.

@eiriktsarpalis
Copy link
Member Author

I very much agree that this PR is most probably responsible for some of the regressions in the Deserialize methods. My remark concerned the regressions noted in #67961 (comment), which test components lower in the stack.

@AndyAyersMS
Copy link
Member

AndyAyersMS commented May 13, 2022

Ah, ok. Let's look at that case.

GetUInt16 is fine, no regression:
newplot - 2022-05-13T111942 474

GetBoolean likewise does not show regression

newplot - 2022-05-13T112127 291

Looking at the auto-filing report, I see that it's actually measuring Mono perf and not CoreCLR. So nothing to worry about from this second issue.

Sorry for the false alarm.

@dakersnar
Copy link
Contributor

Yes, it looks like this specific PR was not the cause of this.

The data I was analyzing was comparing preview 3 (mid march) to preview 4 (mid april), but looking at the larger test history for CoreCLR it looks like the regression was either already solved or this was just noise. The fact that I was seeing regressions across all configs makes me assume it was a real regression, but regardless it looks to be solved now.
image

@ghost ghost locked as resolved and limited conversation to collaborators Jun 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants