Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Fix threading issue when reading and writing from SslStream at the same time #37736

Merged
merged 1 commit into from
May 22, 2019

Conversation

krwq
Copy link
Member

@krwq krwq commented May 16, 2019

Fixes #36426 - note, this is not scoped to Alpine - issue repros on any Linux with latest (1.1.1) OpenSSL installed.
There were also occasional process crashes I've observed disappearing after fixing this (didn't get a chance to capture stack trace as it happened fairly rarely and I initially thought it was due to bad setup/unrelated)

Details:
SSL_read, SSL_write, SSL_get_error are advertised as they are not designed to work in multithreaded scenarios when working on the same SSL connection: https://www.openssl.org/docs/faq.html#PROG1

for HTTP 1 scenarios this is rather unusual to read and write at the same time but for HTTP 2 this is common.

Note: multiple reads OR writes (but not reads and writes) at the same time can still cause some issues around BIO_read/BIO_write (BIO_* use separate queues for reads and writes) - I have not addressed them because multiple reads at the same time do not seem to be useful in any scenarios (user will essentially get random bytes since you can't reason about the order).

This is marked as draft because of the pending investigation:

  • measure perf difference with and without the fix (mostly interested in typical HTTP scenario)
  • (not required taking below perf results into consideration) investigate why issue does not repro for TLS 1.2 and lower (possibly it does but is not observable or just mere luck)

cc: @wfurt @stephentoub

Before

Method Mean Error StdDev Median Min Max Gen 0/1k Op Gen 1/1k Op Gen 2/1k Op Allocated Memory/Op
HandshakeAsync 64,404.47 us 1,773.9318 us 1,659.3369 us 64,036.75 us 62,803.79 us 68,142.63 us - - - 8340 B
WriteReadAsync 17.02 us 0.1446 us 0.1208 us 17.07 us 16.71 us 17.16 us - - - 1 B
ReadWriteAsync 30.01 us 1.3402 us 1.5434 us 29.57 us 27.97 us 32.85 us 0.0600 - - 1 B
ConcurrentReadWrite 24.05 us 1.0155 us 1.1694 us 24.18 us 21.70 us 25.55 us - - - 1 B
ConcurrentReadWriteLargeBuffer 34.29 us 2.7674 us 3.1869 us 34.32 us 28.45 us 38.20 us - - - 4 B

After

Method Mean Error StdDev Median Min Max Gen 0/1k Op Gen 1/1k Op Gen 2/1k Op Allocated Memory/Op
HandshakeAsync 63,702.33 us 1,242.5028 us 1,329.4641 us 63,213.67 us 61,775.39 us 66,090.92 us - - - 8340 B
WriteReadAsync 18.15 us 0.1839 us 0.1720 us 18.16 us 17.87 us 18.49 us - - - 1 B
ReadWriteAsync 30.62 us 1.1702 us 1.3476 us 30.22 us 28.92 us 33.53 us 0.0600 - - -
ConcurrentReadWrite 23.59 us 1.6974 us 1.9547 us 23.17 us 20.72 us 27.02 us - - - 1 B
ConcurrentReadWriteLargeBuffer 32.33 us 1.8414 us 2.1206 us 31.38 us 29.77 us 36.13 us - - - 4 B

End to end tests (aspnet/Benchmarks: PlaintextNonPipelined)

Description RPS CPU (%) Memory (MB) Avg. Latency (ms) Startup (ms) First Request (ms) Latency (ms) Ratio
baseline 102,052 62 183 69.1 271 227.05 0.7 1.00
after-changes 101,276 60 188 71.55 295 286.54 0.48 0.99

Numbers seem to have difference within the error margin which seems like it might not be worth to investigate further and just take a fix (we might want to check also more end to end scenarios)

cc: @sebastienros

@krwq krwq marked this pull request as ready for review May 19, 2019 01:11
@krwq krwq requested a review from stephentoub May 20, 2019 22:40
@stephentoub
Copy link
Member

multiple reads OR writes (but not reads and writes) at the same time can still cause some issues around BIO_read/BIO_write (BIO_* use separate queues for reads and writes) - I have not addressed them because multiple reads at the same time do not seem to be useful in any scenarios (user will essentially get random bytes since you can't reason about the order).

This is already prevented at the SslStream layer, ensuring that there's at most only one read and one write at a time. Did you actually see problems here, or is this speculation?

@krwq
Copy link
Member Author

krwq commented May 22, 2019

BIO_read/BIO_write was speculation based on their description in the docs (also it would be hard to write reliable test for that) - didn't know we had prevention against that

@krwq
Copy link
Member Author

krwq commented May 22, 2019

/azp run corefx-outerloop-linux

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@krwq
Copy link
Member Author

krwq commented May 22, 2019

There seem to be some issue with Azure DevOps - it reports build as red but nothing actually failed on the latest retry: https://dev.azure.com/dnceng/public/_build/results?buildId=191964
Already talked about that with @safern offline and he confirmed the issue

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants