
[HTTP/3] Fix content stream read hang #58296

Merged: 2 commits from CarnaViire:h3-fix-read-hang into dotnet:main on Aug 31, 2021

Conversation

@CarnaViire (Member) commented Aug 27, 2021

The fix is similar to PR #56892 but for another branch (the one taken when data is buffered). The root cause is the same: if there is still space in the user's buffer, we would start waiting for the next DATA frame to arrive (which will never happen in an echo scenario).

The current code (including the #56892 change) may be inefficient when small DATA frames are already available in the transport, but refactoring that is too large a change if we consider backporting. I thought #56891 would cover that, but it is QUIC-specific, while we also have an HTTP/3-specific problem here. I will create another issue later.

Fixes #57888
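
For context, here is a rough sketch of the kind of echo exchange that hangs. This is hypothetical pseudo-usage, not a test from this PR; requestStream and responseStream simply stand in for the two directions of a duplex HTTP/3 exchange.

using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

// Hypothetical echo-style round trip (not a test from this PR).
static class EchoHangSketch
{
    public static async Task RoundTripAsync(Stream requestStream, Stream responseStream)
    {
        await requestStream.WriteAsync(Encoding.ASCII.GetBytes("hello"));

        byte[] buffer = new byte[1024];
        // Expected: returns 5 as soon as the echoed bytes arrive.
        // Pre-fix behavior: the buffer still has room, so the handler keeps waiting for the
        // next DATA frame, which the echo peer will not send until we write again => hang.
        int read = await responseStream.ReadAsync(buffer);
        Console.WriteLine($"read {read} bytes");
    }
}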

@ghost commented Aug 27, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.


@geoffkizer (Contributor)

We really should refactor ReadResponseContent and ReadResponseContentAsync so they do not have so much duplicated logic. Doesn't have to happen for 6.0 though.

@geoffkizer (Contributor)

I think we still have a potential hang here.

Just because an HTTP3 DATA frame says it is X bytes long does not guarantee that the peer will send all X bytes promptly. If the peer declares a DATA frame of length 1000 but only sends 500 bytes of it, this will hang. That's bad. We should give the user whatever data we have available at the time.

The logic here should be similar to how we handle chunked encoding for HTTP/1.1. See here: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/ChunkedEncodingReadStream.cs#L38

I'm not sure why it was implemented so differently than chunked encoding in the first place.

In other words, we should do something like this:
(1) Read as much data out of our read buffer as we can. If we can read at least one byte, or we hit EOF, we are done and return to the caller. (This is the ReadChunksFromConnectionBuffer logic in the chunked encoding code.)
(2) At this point, we need to read more data. In certain cases it makes sense to avoid buffering the data ourselves and just read directly into the user's buffer. If so, do the read and (assuming it doesn't fail or return unexpected EOF) return.
(3) Otherwise, we need to read into the buffer. Do the read into the buffer and try to get data out of the buffer again (as in step 1, i.e. call the equivalent of ReadChunksFromConnectionBuffer). If we can read at least one byte, or we hit EOF, we are done.
(4) Otherwise, we must not have read enough data (e.g. we just got a partial frame header or something like that). Loop back to step 3 and repeat until we either get some data or hit EOF.

(Note that the chunked encoding logic actually loops back to step 2 instead of step 3. In theory that allows us to do the optimization in step 2 in a very narrow case, where we read exactly the "chunk header" in step 3 but don't read any chunk data. This seems unlikely in practice, but there's probably no harm in having the extra optimization here.)

HTTP3 DATA frames should actually be a bit simpler to handle than HTTP/1.1 chunked encoding, because their layout is simpler -- the frame header is a fixed size, the length is binary encoded instead of ASCII, and there's no trailing frame delimiter as there is in chunked encoding (i.e. the trailing "\r\n" after every chunk). Note however that for HTTP3, I think we do need to accept 0-length DATA frames -- whereas in chunked encoding, we don't (because a 0-length chunk indicates EOF).
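
To make that loop concrete, here is a rough, self-contained sketch of steps (1)-(4). It is not the dotnet/runtime code: all names are invented, and the real HTTP/3 DATA frame envelope (frame type plus payload length) is replaced by a plain 4-byte big-endian length prefix just to keep the example short. It uses a 4K internal buffer, reads directly into the caller's buffer when it is mid-frame with nothing buffered, and skips 0-length frames.

using System;
using System.Buffers.Binary;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Toy reader following steps (1)-(4) above. Hypothetical sketch, not the actual handler code.
public sealed class FramedReadStreamSketch
{
    private readonly Stream _inner;
    private readonly byte[] _buffer = new byte[4096];
    private int _start, _end;          // filled region of _buffer
    private long _payloadRemaining;    // unread payload bytes of the current frame
    private bool _eof;

    public FramedReadStreamSketch(Stream inner) => _inner = inner;

    public async ValueTask<int> ReadAsync(Memory<byte> userBuffer, CancellationToken ct = default)
    {
        if (userBuffer.Length == 0) return 0;

        while (true)
        {
            // (1) Drain whatever payload is already buffered; done if we got >= 1 byte or hit EOF.
            // (A real implementation would treat EOF inside a frame or header as an error.)
            int copied = CopyBufferedPayload(userBuffer.Span);
            if (copied > 0 || _eof) return copied;

            // (2) Mid-frame with an empty internal buffer: read straight into the caller's buffer.
            if (_payloadRemaining > 0)
            {
                int limit = (int)Math.Min(userBuffer.Length, _payloadRemaining);
                int n = await _inner.ReadAsync(userBuffer.Slice(0, limit), ct).ConfigureAwait(false);
                if (n == 0) throw new IOException("Unexpected end of stream inside a frame.");
                _payloadRemaining -= n;
                return n;
            }

            // (3)/(4) Need more bytes (e.g. a partial frame header): refill our buffer and loop.
            await FillBufferAsync(ct).ConfigureAwait(false);
        }
    }

    private int CopyBufferedPayload(Span<byte> destination)
    {
        int total = 0;
        while (destination.Length > 0)
        {
            if (_payloadRemaining == 0)
            {
                if (_end - _start < 4) break;                    // partial header, need a refill
                _payloadRemaining = BinaryPrimitives.ReadUInt32BigEndian(_buffer.AsSpan(_start, 4));
                _start += 4;                                     // 0-length frames just loop again
                continue;
            }
            int available = _end - _start;
            if (available == 0) break;                           // frame payload not buffered yet
            int n = (int)Math.Min(Math.Min(available, destination.Length), _payloadRemaining);
            _buffer.AsSpan(_start, n).CopyTo(destination);
            destination = destination.Slice(n);
            _start += n; _payloadRemaining -= n; total += n;
        }
        return total;
    }

    private async ValueTask FillBufferAsync(CancellationToken ct)
    {
        if (_start > 0)                                          // compact leftover bytes to the front
        {
            _buffer.AsSpan(_start, _end - _start).CopyTo(_buffer);
            _end -= _start; _start = 0;
        }
        int n = await _inner.ReadAsync(_buffer.AsMemory(_end), ct).ConfigureAwait(false);
        if (n == 0) _eof = true; else _end += n;
    }
}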

@geoffkizer (Contributor)

Also note that the description above is similar to how we process incoming SSL frames. See

private async ValueTask<int> ReadAsyncInternal<TIOAdapter>(TIOAdapter adapter, Memory<byte> buffer)

@CarnaViire (Member, Author)

@geoffkizer but we still believe that the frame envelope and at least the start of the payload should arrive promptly, right? The way ReadFrameEnvelopeAsync is implemented, it will block until both the frame type and the payload length are read. And it doesn't make sense to return until at least 1 byte of the payload has been read.
The hang you are worried about can be mitigated easily by always returning from the "read into the user's buffer" branch, not only on a frame border, here:

if (_responseDataPayloadRemaining == 0)
{
break;
}

Other than that, the buffer we have now is small (64 bytes) and we don't have any logic for growing it (so it's only useful for envelopes). But we could use a larger initial size, e.g. at least 4KB as in HTTP/1.1, so that branching on whether to read into the user's buffer or into our own buffer would make sense (right now I believe it doesn't).
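
For illustration, the mitigation could look roughly like this inside the read loop (a simplified sketch, not the actual code; apart from _responseDataPayloadRemaining, the names are invented):

int bytesToRead = (int)Math.Min(buffer.Length, _responseDataPayloadRemaining);
int bytesRead = await _stream.ReadAsync(buffer.Slice(0, bytesToRead), cancellationToken).ConfigureAwait(false);
if (bytesRead == 0)
{
    // The peer ended the stream in the middle of a frame; treat it as an error.
    throw new IOException("Unexpected end of stream.");
}
_responseDataPayloadRemaining -= bytesRead;
totalBytesRead += bytesRead;

// Mitigation: break (i.e. return to the caller) after every successful read into the user's
// buffer, not only when the frame border was reached via the _responseDataPayloadRemaining check.
break;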

@geoffkizer (Contributor)

@geoffkizer but we still believe that the frame envelope and at least the start of the payload should arrive promptly, right? The way ReadFrameEnvelopeAsync is implemented, it will block until both the frame type and the payload length are read. And it doesn't make sense to return until at least 1 byte of the payload has been read.

We can't do anything other than wait for frame header + 1 byte. Until then we have no data to return to the user.

The hang you are worried about can be mitigated easily by always returning from the "read into the user's buffer" branch, not only on a frame border, here:

if (_responseDataPayloadRemaining == 0)
{
break;
}

That sounds good.

Other than that, the buffer we have now is small (64 bytes) and we don't have any logic for growing it (so it's only useful for envelopes). But we could use a larger initial size, e.g. at least 4KB as in HTTP/1.1, so that branching on whether to read into the user's buffer or into our own buffer would make sense (right now I believe it doesn't).

We should probably use a size like 4K here, yeah.

QUIC is rather different from HTTP/1.1 and HTTP/2 because it runs in user mode and actively pushes data to us. We should think through the implications of this and try to make sure we are optimizing for how QUIC works in practice. But at least for now it seems reasonable to make this work fairly similarly to HTTP/1.1, since we know that works correctly and gives pretty good performance.

@CarnaViire (Member, Author)

Implementing the buffering logic that @geoffkizer described ended up being a larger change than I'd like to see here for the hang mitigation (and it didn't work on the first try 😁).
Even if I made it work, I'd still rather open a separate PR.

@geoffkizer (Contributor)

Even if I made it work, I'd still rather open a separate PR.

Yeah, we should do some perf analysis as part of that as well. I agree we should defer it to a separate PR.

But I do think we should fix this issue in the current PR:

Just because an HTTP3 DATA frame says it is X bytes long does not guarantee that the peer will send all X bytes promptly. If the peer declares a DATA frame of length 1000 but only sends 500 bytes of it, this will hang. That's bad. We should give the user whatever data we have available at the time.

@CarnaViire merged commit e882020 into dotnet:main on Aug 31, 2021
@CarnaViire deleted the h3-fix-read-hang branch on Aug 31, 2021
@karelz added this to the 7.0.0 milestone on Aug 31, 2021
@CarnaViire (Member, Author)

/backport to release/6.0

@github-actions (Contributor)

Started backporting to release/6.0: https://github.com/dotnet/runtime/actions/runs/1187474059

@ghost locked the conversation as resolved and limited it to collaborators on Oct 1, 2021
Linked issue closed by this PR: [HTTP/3] Hang when reading response data