Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release/6.0] Fix AwaitableSocketAsyncEventArgs reorderings on weaker memory models #84432

Merged

Conversation

stephentoub
Copy link
Member

@stephentoub stephentoub commented Apr 6, 2023

There are a couple of places where we read the _continuation field and then read some other state which we assume to be consistent with the value we read in _continuation. But without a fence, those secondary reads could be reordered with respect to the first.

Fixes #84407
Fixes #70486
Fixes #72365

I don't have a repro, but we've received several related reports and this is the most likely cause based on code inspection.

Customer Impact

Rare crashes on arm64 due to an exception occurring in networking code the application doesn't control.

Testing

Only CI. The failure is so sporadic, we don't have a good repro or even ability to exercise it with stress.

Risk

Low. Other than changes to some Debug.Asserts, this is a one-word change to add volatile to a field. The change is effectively a nop on x86/64. It'll result in a few additional memory barriers being output on arm64, which are both necessary for correctness (whether or not it fixes this issue) and could have a small impact on performance.

(Note that this code no longer exists in .NET 8, having been replaced by use of the shared, public ManualResetValueTaskSourceCore implementation. We will want to port the fix to .NET 7 as well, though.)

There are a couple of places where we read the _continuation field and then read some other state which we assume to be consistent with the value we read in _continuation.  But without a fence, those secondary reads could be reordered with respect to the first.
@ghost
Copy link

ghost commented Apr 6, 2023

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

There are a couple of places where we read the _continuation field and then read some other state which we assume to be consistent with the value we read in _continuation. But without a fence, those secondary reads could be reordered with respect to the first.

Fixes #84407
Fixes #70486
Fixes #72365

I don't have a repro, but we've received several related reports and this is the most likely cause based on code inspection.

Customer Impact

Rare crashes on arm64 due to an exception occurring in networking code the application doesn't control.

Testing

Only CI. The failure is so sporadic, we don't have a good repro or even ability to exercise it with stress.

Risk

Low. Other than changes to some Debug.Asserts, this is a one-word change to add volatile to a field. The change is effectively a nop on x86/64. It'll result in a few additional memory barriers being output on arm64, which are both necessary for correctness (whether or not it fixes this issue) and could have a small impact on performance.

Author: stephentoub
Assignees: -
Labels:

area-System.Net.Sockets

Milestone: -

Copy link
Member

@wfurt wfurt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@carlossanlop
Copy link
Member

carlossanlop commented Apr 10, 2023

Today is code complete for the May Release. If we want this fix to get included in that release, please get the PR ready to merge before 4pm PT today:

  • Get a Tactics approval.
  • Confirm the CI failures are unrelated to this change.
  • Get a sign-off from an area owner.
  • If the change affects an OOB package, make sure the required OOB package authoring changes are included in the PR. Edit: Sockets.csproj is not marked IsPackable.

@leecow
Copy link
Member

leecow commented Apr 11, 2023

Approved for June release. Please open a 7.0 pr to cover that port.

@leecow leecow modified the milestones: 6.0.x, 6.0.18 Apr 11, 2023
@leecow leecow added Servicing-approved Approved for servicing release and removed Servicing-consider Issue for next servicing release review labels Apr 11, 2023
@stephentoub
Copy link
Member Author

/backport to release/7.0-staging

@github-actions
Copy link
Contributor

Started backporting to release/7.0-staging: https://github.com/dotnet/runtime/actions/runs/4670447938

@stephentoub
Copy link
Member Author

@carlossanlop, with this targeting staging, I can merge this now, right?

@stephentoub stephentoub merged commit 789d7ab into dotnet:release/6.0-staging Apr 14, 2023
@stephentoub stephentoub deleted the fixsocketreordering branch April 14, 2023 23:35
@ghost ghost locked as resolved and limited conversation to collaborators May 15, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Net.Sockets Servicing-approved Approved for servicing release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants