
Thread starvation on high load with System.Text.Json #40072

Closed
berndku opened this issue Jul 29, 2020 · 23 comments
Assignees
Labels
area-System.Text.Json code-analyzer Marks an issue that suggests a Roslyn analyzer documentation Documentation bug or enhancement, does not impact product or test code in-pr There is an active PR which will close this issue when it is merged tenet-performance Performance related issue
Milestone

Comments

@berndku

berndku commented Jul 29, 2020

Description

After changing the serializer from Newtonsoft.Json to System.Text.Json in a streaming application, where a high number of objects are serialized concurrently, we observe a thread starvation issue.

After a few hours under heavy load, the number of native threads in the process slowly increases to ~500, which results in severe throughput degradation of the application. As a consequence we had to switch back to Newtonsoft.Json.

A memory dump revealed that most of the threads wait on a critical section, see attached data.

Configuration

  • .NET Core 3.1, x64
  • Windows Server 2016
  • Azure D4 VM (8 Cores, 32 GB RAM)

Data

[Screenshot from the memory dump: most threads waiting on a critical section]

Callstack:
ntdll!NtWaitForAlertByThreadId+14
ntdll!RtlpWaitOnAddressWithTimeout+43
ntdll!RtlpWaitOnAddress+b2
ntdll!RtlpWaitOnCriticalSection+db
ntdll!RtlpEnterCriticalSectionContended+cc
ntdll!RtlEnterCriticalSection+40
coreclr!CEEInfo::reportInliningDecision+51a [f:\workspace_work\1\s\src\vm\jitinterface.cpp @ 8292 + 87] f:\workspace_work\1\s\src\vm\jitinterface.cpp @ 8292 + 87
clrjit!InlineResult::Report+de [f:\workspace_work\1\s\src\jit\inline.cpp @ 749] f:\workspace_work\1\s\src\jit\inline.cpp @ 749
clrjit!Compiler::fgInline+3b0 [f:\workspace_work\1\s\src\jit\flowgraph.cpp @ 22013 + a] f:\workspace_work\1\s\src\jit\flowgraph.cpp @ 22013 + a
clrjit!Compiler::fgMorph+131 [f:\workspace_work\1\s\src\jit\morph.cpp @ 17025] f:\workspace_work\1\s\src\jit\morph.cpp @ 17025
clrjit!Compiler::compCompile+4f2 [f:\workspace_work\1\s\src\jit\compiler.cpp @ 4483] f:\workspace_work\1\s\src\jit\compiler.cpp @ 4483
clrjit!Compiler::compCompileHelper+296 [f:\workspace_work\1\s\src\jit\compiler.cpp @ 6035] f:\workspace_work\1\s\src\jit\compiler.cpp @ 6035
clrjit!Compiler::compCompile+21d [f:\workspace_work\1\s\src\jit\compiler.cpp @ 5367 + 20] f:\workspace_work\1\s\src\jit\compiler.cpp @ 5367 + 20
clrjit!jitNativeCode+273 [f:\workspace_work\1\s\src\jit\compiler.cpp @ 6663 + 53] f:\workspace_work\1\s\src\jit\compiler.cpp @ 6663 + 53
clrjit!CILJit::compileMethod+92 [f:\workspace_work\1\s\src\jit\ee_il_dll.cpp @ 319] f:\workspace_work\1\s\src\jit\ee_il_dll.cpp @ 319
coreclr!invokeCompileMethod+dd [f:\workspace_work\1\s\src\vm\jitinterface.cpp @ 12496 + 6e] f:\workspace_work\1\s\src\vm\jitinterface.cpp @ 12496 + 6e
coreclr!CallCompileMethodWithSEHWrapper+50 [f:\workspace_work\1\s\src\vm\jitinterface.cpp @ 12550 + 25] f:\workspace_work\1\s\src\vm\jitinterface.cpp @ 12550 + 25
coreclr!UnsafeJitFunction+96b [f:\workspace_work\1\s\src\vm\jitinterface.cpp @ 13039 + 6a] f:\workspace_work\1\s\src\vm\jitinterface.cpp @ 13039 + 6a
coreclr!MethodDesc::JitCompileCodeLocked+2f9 [f:\workspace_work\1\s\src\vm\prestub.cpp @ 965 + 5c] f:\workspace_work\1\s\src\vm\prestub.cpp @ 965 + 5c
coreclr!MethodDesc::JitCompileCodeLockedEventWrapper+32d [f:\workspace_work\1\s\src\vm\prestub.cpp @ 827 + 18] f:\workspace_work\1\s\src\vm\prestub.cpp @ 827 + 18
coreclr!MethodDesc::JitCompileCode+15f [f:\workspace_work\1\s\src\vm\prestub.cpp @ 773 + f] f:\workspace_work\1\s\src\vm\prestub.cpp @ 773 + f
coreclr!MethodDesc::PrepareILBasedCode+185 [f:\workspace_work\1\s\src\vm\prestub.cpp @ 432 + b] f:\workspace_work\1\s\src\vm\prestub.cpp @ 432 + b
coreclr!MethodDesc::PrepareInitialCode+40 [f:\workspace_work\1\s\src\vm\prestub.cpp @ 321] f:\workspace_work\1\s\src\vm\prestub.cpp @ 321
coreclr!MethodDesc::DoPrestub+44b [f:\workspace_work\1\s\src\vm\prestub.cpp @ 2025 + 8] f:\workspace_work\1\s\src\vm\prestub.cpp @ 2025 + 8
coreclr!PreStubWorker+24c [f:\workspace_work\1\s\src\vm\prestub.cpp @ 1781 + d] f:\workspace_work\1\s\src\vm\prestub.cpp @ 1781 + d
coreclr!ThePreStub+55 [F:\workspace_work\1\s\src\vm\amd64\ThePreStubAMD64.asm @ 22] F:\workspace_work\1\s\src\vm\amd64\ThePreStubAMD64.asm @ 22
System.Text.Json.JsonPropertyInfoCommon`4[[System.__Canon, System.Private.CoreLib],[System.__Canon, System.Private.CoreLib],[System.__Canon, System.Private.CoreLib],[System.__Canon, System.Private.CoreLib]].GetValueAsObject(System.Object)+23 0x000001a7bb9197a8
0x000001a7bb9197a8 0x000001a77e92afc8
0x0000006e0a6bd070 0x000001a7820ab460
System.Text.Json.JsonSerializer.HandleEnumerable(System.Text.Json.JsonClassInfo, System.Text.Json.JsonSerializerOptions, System.Text.Json.Utf8JsonWriter, System.Text.Json.WriteStack ByRef)+33
0x0000006e0a6bd030 0x000001a77fa9e6f8
System.Text.Json.JsonSerializer.Write(System.Text.Json.Utf8JsonWriter, Int32, Int32, System.Text.Json.JsonSerializerOptions, System.Text.Json.WriteStack ByRef)+33c
System.Text.Json.JsonSerializer.WriteCore(System.Text.Json.Utf8JsonWriter, System.Object, System.Type, System.Text.Json.JsonSerializerOptions)+b9
System.Text.Json.JsonSerializer.WriteCore(System.Text.Json.PooledByteBufferWriter, System.Object, System.Type, System.Text.Json.JsonSerializerOptions)+86
System.Text.Json.JsonSerializer.WriteCoreBytes(System.Object, System.Type, System.Text.Json.JsonSerializerOptions)+7a

Analysis

Serialization code in application looks like this:

var options = new JsonSerializerOptions
{
    Converters = { new CustomJsonStringEnumConverter() },
    IgnoreNullValues = true
};
return System.Text.Json.JsonSerializer.SerializeToUtf8Bytes<Message>(toSerialize, options);

The code for CustomJsonStringEnumConverter can be found here.

@berndku berndku added the tenet-performance Performance related issue label Jul 29, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.Text.Json untriaged New issue has not been triaged by the area owner labels Jul 29, 2020
@stephentoub
Member

Your repro is creating a new options instance each time. Does the problem go away if you instead cache and reuse a single instance of it?

@berndku
Author

berndku commented Jul 30, 2020

Thanks, after changing the options to a single cached instance, the issue could no longer be observed during a 2-hour load test.
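For reference, the change amounts to hoisting the options out of the serialization method into a single shared instance. A minimal sketch of this, reusing the types from the repro above (CustomJsonStringEnumConverter and Message are the application's own types):

```csharp
// Sketch: create the options once and reuse them for every serialization call.
// JsonSerializerOptions is thread-safe for concurrent reads after first use.
private static readonly JsonSerializerOptions s_options = new JsonSerializerOptions
{
    Converters = { new CustomJsonStringEnumConverter() },
    IgnoreNullValues = true
};

public static byte[] Serialize(Message toSerialize) =>
    System.Text.Json.JsonSerializer.SerializeToUtf8Bytes(toSerialize, s_options);
```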

@berndku berndku closed this as completed Jul 30, 2020
@stephentoub stephentoub reopened this Jul 30, 2020
@stephentoub
Member

@steveharter, can we handle this better? While we encourage folks to cache, not doing so leads to really bad perf with all the code generation and jit'ing.

@jkotas
Member

jkotas commented Jul 30, 2020

Add analyzer for this?

@layomia layomia removed the untriaged New issue has not been triaged by the area owner label Jul 31, 2020
@layomia layomia added this to the 5.0.0 milestone Jul 31, 2020
@scalablecory
Contributor

This is HttpClient all over again; I would find a better solution than an analyzer.

@steveharter steveharter added the documentation Documentation bug or enhancement, does not impact product or test code label Aug 4, 2020
@steveharter
Member

steveharter commented Aug 4, 2020

Add analyzer for this?

That would be good.

@steveharter, can we handle this better? While we encourage folks to cache, not doing so leads to really bad perf with all the code generation and jit'ing.

We should doc this better. Changing to doc issue.

If just the default options are used (no explicit options instance passed in), then the default options instance is automatically cached. It is also cached under ASP.NET. In 6.0 we also plan to make it easier to specify the default options, so one can use the serialization APIs that don't take an options instance while still specifying a custom options instance and its location.

Also, the stack appears to enter the critical section when calling the "getter" delegate. This is likely trying to JIT the IL that the serializer emitted to call the getter. With the code-gen effort that will release in 6.0, there will be no IL generated for the getter, since the code for the POCOs is generated ahead of time; that should avoid the critical section, and startup time will also improve significantly. However, even with that, there is metadata that needs to be built up and cached, so caching the options is still recommended even with code-gen.
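As context for the code-gen approach mentioned above, the .NET 6 source generator is driven by a partial JsonSerializerContext declaration; a minimal sketch, using the Message type from the repro (the context name is hypothetical):

```csharp
// Sketch: the source generator emits the (de)serialization metadata and logic
// at compile time for the listed types, so no IL is emitted or JIT'ed at run time.
using System.Text.Json;
using System.Text.Json.Serialization;

[JsonSerializable(typeof(Message))]
internal partial class MessageSerializerContext : JsonSerializerContext
{
}

// Usage: pass the generated type info instead of a JsonSerializerOptions instance.
byte[] bytes = JsonSerializer.SerializeToUtf8Bytes(message, MessageSerializerContext.Default.Message);
```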

@steveharter
Member

Also dup of #38982

@layomia
Contributor

layomia commented Aug 7, 2020

Triage: moving this issue to 6.0

For 5.0 we should document best practices for using JsonSerializerOptions and potential effects of not doing so: dotnet/docs#19878.

Mitigations to consider in 6.0/future:

@layomia layomia modified the milestones: 5.0.0, 6.0.0 Aug 7, 2020
@layomia layomia added the code-analyzer Marks an issue that suggests a Roslyn analyzer label Aug 7, 2020
@stephentoub
Member

In 6.0/future we should consider adding a code analyzer for this and see how far allowing users to change the default options (#31094) and the code gen effort (#1568) take us in mitigating this issue.

We might want to consider additional mitigations, such as a global MRU cache. The issue as it stands today is pretty bad.

@layomia
Contributor

layomia commented Sep 13, 2020

We might want to consider additional mitigations, such as a global MRU cache. The issue as it stands today is pretty bad.

Updated triage above with this comment. Tracking issue - #35084.

@GSPP

GSPP commented Oct 4, 2020

Just to add some customer feedback:

I only learned that I needed to cache JsonSerializerOptions when profiling showed really bad performance. Normally, I'd have expected an options object to be a value object (a DTO). I did not expect it to hold very heavy data structures. It seems like a performance trap that is easy to fall into.

@layomia
Contributor

layomia commented Jul 23, 2021

Triage - for .NET 6.0, we've provided documentation about perf caveats with using JsonSerializerOptions. We can consider additional mitigations for this in 7.

@layomia layomia removed this from the 6.0.0 milestone Jul 23, 2021
@layomia layomia added this to the 7.0.0 milestone Jul 23, 2021
@eiriktsarpalis
Member

In 6.0/future we should consider adding a code analyzer for this and see how far allowing users to change the default options (#31094) and the code gen effort (#1568) take us in mitigating this issue.

We might want to consider additional mitigations, such as a global MRU cache. The issue as it stands today is pretty bad.

Wouldn't this require defining a notion of structural equality for JsonSerializerOptions instances? I'm not sure there's a bullet-proof way to do this, consider the following common anti-pattern:

for (int i = 0; i < 1_000_000; i++)
{
    var options = new JsonSerializerOptions { Converters = { new MyCoolCustomConverter() } };
    JsonSerializer.Serialize(values[i], options);
}

How can we determine if two lists of custom converters are equivalent? A conservative approach might be to check for reference equality but in this case it would clearly result in false negatives. We certainly want to avoid false positives as we'd be exposing ourselves to concurrency bugs impacting serialization settings.

I'm also not a huge fan of authoring mitigating APIs a la HttpClientFactory, since it doesn't address what I think is the core problem: making developers aware of the flaw. Once the developer becomes aware of the issue with JsonSerializerOptions, applying their own mitigation becomes a trivial code change: store as a static, use DI, lift the options instantiation outside of a hot loop.
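The mitigations listed above are each a one-line change; hedged sketches (MyCoolCustomConverter is the hypothetical converter from the loop above, and the DI example assumes Microsoft.Extensions.DependencyInjection):

```csharp
// 1. Store as a static:
private static readonly JsonSerializerOptions s_options =
    new JsonSerializerOptions { Converters = { new MyCoolCustomConverter() } };

// 2. Register as a singleton with DI, then inject JsonSerializerOptions where needed:
services.AddSingleton(new JsonSerializerOptions { Converters = { new MyCoolCustomConverter() } });

// 3. Lift the instantiation outside the hot loop from the anti-pattern above:
var options = new JsonSerializerOptions { Converters = { new MyCoolCustomConverter() } };
for (int i = 0; i < 1_000_000; i++)
{
    JsonSerializer.Serialize(values[i], options);
}
```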

@stephentoub
Member

Wouldn't this require defining a notion of structural equality for JsonSerializerOptions instances?

Why? We generate IL for some Type. A cache could be at the type level, keyed off Type identity plus any other values we use as part of the IL generation. It also needn't be perfect, just good enough to mitigate the pit of failure it currently is.
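The shape of such a type-level cache might look like the following sketch; every name here (CacheKey, GeneratedAccessors, EmitAccessors) is hypothetical and stands in for whatever settings actually feed into IL generation:

```csharp
// Sketch: a global cache keyed off Type identity plus the settings that
// influence IL generation, so equivalent options instances share the
// generated accessors instead of re-emitting and re-JIT'ing them.
private static readonly ConcurrentDictionary<(Type Type, CacheKey Settings), GeneratedAccessors> s_cache
    = new();

internal static GeneratedAccessors GetOrGenerate(Type type, CacheKey settings) =>
    s_cache.GetOrAdd((type, settings), key => EmitAccessors(key.Type, key.Settings));
```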

Solution needn't be this if there's something better. But a solution is needed.

@eiriktsarpalis
Member

eiriktsarpalis commented Jan 27, 2022

We could definitely try caching the generated IL globally, but I'm assuming that a substantial part of the cost stems from all the reflection code run at converter construction time, which we can't avoid unless we can determine that the converter caches scoped to a particular options instance are functionally identical.

@stephentoub
Member

stephentoub commented Jan 27, 2022

I'm assuming that substantial part of the cost stems from all the reflection code run at converter construction time

That may contribute as well, but all the stacks from all the repros I've seen are due to JIT'ing IL over and over and over.

which we can't avoid unless we can determine that the converter caches scoped to a particular options instance are functionally identical

Like I said, it doesn't need to be perfect. I bet you'd get 90% of the way there if you just handled default cases where there's nothing custom in the mix, e.g. only using the default set of converters we supply.

@eiriktsarpalis eiriktsarpalis self-assigned this Jan 27, 2022
@eiriktsarpalis
Member

Like I said, it doesn't need to be perfect. I bet you'd get 90% of the way there if you just handled default cases where there's nothing custom in the mix, e.g. only using the default set of converters we supply.

My only concern is that converters are one of the most common customizations applied to serialization. For the X% of users that modify anything but converters, we could be setting them up for major performance regressions just because they happen to apply a custom converter in later iterations of their app. I'd err towards predictability here.

@stephentoub
Member

I'd err towards predictability here.

Not to be snarky, but the current situation is "predictably" bad. The cliff devs fall off just by doing new JsonSerializerOptions() is abysmal, and there's nothing to suggest that's not a totally fine thing to do. A quick grep.app search shows tons of places where new instances are created and converters aren't customized. There are 68 pages of search results for just new JsonSerializerOptions(); here are some problematic uses from the first couple of pages, skipping over tests and the like, that don't involve custom converters:
https://github.com/pnp/powershell/blob/ea27996fad86082cfb0fe4a38a91c8b95166d2d6/src/Commands/Utilities/REST/RestHelper.cs#L21
https://github.com/pnp/powershell/blob/ea27996fad86082cfb0fe4a38a91c8b95166d2d6/src/Commands/Utilities/REST/RestHelper.cs#L85
https://github.com/pnp/powershell/blob/ea27996fad86082cfb0fe4a38a91c8b95166d2d6/src/Commands/Utilities/REST/GraphHelper.cs#L33
https://github.com/pnp/powershell/blob/ea27996fad86082cfb0fe4a38a91c8b95166d2d6/src/Commands/Utilities/REST/GraphHelper.cs#L128
https://github.com/pnp/powershell/blob/ea27996fad86082cfb0fe4a38a91c8b95166d2d6/src/Commands/Utilities/REST/GraphHelper.cs#L179
https://github.com/pnp/powershell/blob/ea27996fad86082cfb0fe4a38a91c8b95166d2d6/src/Commands/Utilities/REST/GraphHelper.cs#L201
https://github.com/pnp/powershell/blob/ea27996fad86082cfb0fe4a38a91c8b95166d2d6/src/Commands/Utilities/REST/GraphHelper.cs#L297
https://github.com/pnp/powershell/blob/ea27996fad86082cfb0fe4a38a91c8b95166d2d6/src/Commands/Utilities/REST/GraphHelper.cs#L316
https://github.com/pnp/powershell/blob/ea27996fad86082cfb0fe4a38a91c8b95166d2d6/src/Commands/Utilities/REST/GraphHelper.cs#L328
https://github.com/pnp/powershell/blob/ea27996fad86082cfb0fe4a38a91c8b95166d2d6/src/Commands/Utilities/REST/GraphHelper.cs#L334
https://github.com/pnp/powershell/blob/ea27996fad86082cfb0fe4a38a91c8b95166d2d6/src/Commands/Utilities/REST/GraphHelper.cs#L366
https://github.com/Pixeval/Pixeval/blob/3c13928070242a71ee266ecfe3c2da881ab5d78b/src/Pixeval.Utilities/Objects.cs#L124
https://github.com/Pixeval/Pixeval/blob/3c13928070242a71ee266ecfe3c2da881ab5d78b/src/Pixeval.Utilities/Objects.cs#L132
https://github.com/Pixeval/Pixeval/blob/3c13928070242a71ee266ecfe3c2da881ab5d78b/src/Pixeval.Utilities/Objects.cs#L140
https://github.com/Pixeval/Pixeval/blob/3c13928070242a71ee266ecfe3c2da881ab5d78b/src/Pixeval.Utilities/Objects.cs#L147
Heck, even we do it wrong in our sample code:
https://github.com/dotnet/tye/blob/cc61cc866bb97e8c70d1810be8c98f6d2b30b528/samples/dapr/pub-sub/store/Startup.cs#L65-L72

Predictability is important. But the current cliff is so large that we need to do something.

@eiriktsarpalis
Member

we need to do something.

Agreed. If it does turn out that caching dynamic methods eliminates 80% of the performance issues with 20% of the effort we should just do that. But there are diminishing returns in reusing equivalent JsonSerializerOptions instances.

@stephentoub
Member

@eiriktsarpalis and I chatted offline. We agreed at a minimum we need an analyzer, e.g. that warns about any JsonSerializerOptions construction that flows only to a call to a JsonSerializer method argument. Beyond that, he's going to profile to get a strong sense for where the actual costs are, and based on that we can decide how much further to go. If the primary costs are in JIT'ing, for example, we could look to employ some kind of tiering strategy, where we start with a purely reflection-based approach and only upgrade to reflection emit after N uses. We could look at having a copy-on-write solution for the existing caching, so that at least new JsonSerializerOptions() would all share the same existing cache. We could look at having a set of caches based on the most common properties set, though we'd also need to figure out if/what eviction policy we might need (we probably already need to think about that with the built-in cached instances, in support of things like collectable assemblies). Etc.

@eiriktsarpalis
Member

FWIW we recently added a JsonSerializerOptions.Default property to partially mitigate this very problem (the default singleton not being accessible to users until recently).

@eiriktsarpalis
Member

I knocked together a prototype implementing memoization for dynamic methods and compared it with main using a couple of new benchmarks. The performance improvements seem impressive:

| Method | Branch | Mean | Error | StdDev | Median | Min | Max | Ratio | RatioSD | Gen 0 | Gen 1 | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DeserializeFromString | main | 357.5 ns | 15.88 ns | 16.30 ns | 358.4 ns | 323.1 ns | 391.4 ns | 1.00 | 0.00 | 0.0059 | - | 64 B | 1.00 |
| DeserializeFromStringColdStart | main | 265,351.4 ns | 9,785.90 ns | 10,877.00 ns | 262,529.4 ns | 250,632.3 ns | 284,204.2 ns | 1.00 | 0.00 | 0.9606 | - | 13082 B | 1.00 |
| DeserializeFromStringColdStart | memoize | 14,908.1 ns | 975.41 ns | 1,123.28 ns | 14,752.9 ns | 12,844.3 ns | 16,596.0 ns | 0.06 | 0.00 | 0.6662 | 0.0555 | 6750 B | 0.52 |

| Method | Branch | Mean | Error | StdDev | Median | Min | Max | Ratio | RatioSD | Gen 0 | Gen 1 | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SerializeToString | main | 233.2 ns | 10.79 ns | 11.54 ns | 235.6 ns | 210.4 ns | 254.3 ns | 1.00 | 0.00 | 0.0252 | - | 256 B | 1.00 |
| SerializeToStringColdStart | main | 174,392.6 ns | 14,656.61 ns | 16,878.57 ns | 171,285.1 ns | 151,667.6 ns | 211,067.8 ns | 1.00 | 0.00 | 0.7911 | - | 12959 B | 1.00 |
| SerializeToStringColdStart | memoize | 13,609.5 ns | 529.22 ns | 543.47 ns | 13,778.9 ns | 12,490.0 ns | 14,116.5 ns | 0.08 | 0.01 | 0.6095 | 0.0554 | 6653 B | 0.51 |

@eiriktsarpalis
Member

The performance issues have been partially mitigated by #64646. Closing in favor of #65396 which tracks pending analyzer work.

@ghost ghost locked as resolved and limited conversation to collaborators Mar 18, 2022
9 participants