-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
internal: fix client send preface problems #2380
Conversation
This CL fixes two problems: - In clientconn_state_transitions.go, each time we set up a server to send SETTINGS, we should also set up the server to read. This allows the client to successfully send its SETTINGS. - In clientconn.go, we incorrectly transitioned into TRANSIENT FAILURE when the http2client returned an error. This should be handled in the outer resetTransport main reset loop. The reason this became a problem is that the outer resetTransport has very specific conditions around when to transition into TRANSIENT FAILURE that the egregious transition did not have. So, it could transition into TRANSIENT FAILURE after failing to dial, even if it were trying to connect to a non-final address in the list of addresses.
Tested successfully on travis with |
This condition may happen in practice, though. Should we have a test case for each server behavior? |
I thought about that, but here's what that test looks like:
Is that what you'd expect, given [1]? Or is there some behavior that's defined for this? 1: Apparently, you can do
(despite the fact that lis.Close has not been called yet) I naively would have expected a synchronous pipe on conn.Write until the server does conn.Read, or minimally an async-with-timeout API. |
f55673c
to
28a4678
Compare
@dfawley PTAL. Added PipeListener, a new test to make sure client does not enter READY until both prefaces are sent/received, and code to do the needful. Tests on travis with |
28a4678
to
9c8ae94
Compare
internal/testutils/pipe_listener.go
Outdated
|
||
// Close closes the listener. | ||
func (p *pipeListener) Close() error { | ||
// We don't close p.C, because it races with the client reconnect. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about: lock, close p.c, unlock. In Dial: lock, check that p.c is open, make a new chan and push onto p.c, unlock? I think that avoids any potential races.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.. I don't really like adding complexity to something that's already too complex, but I'm OK with this for now since we're aiming to change it soon.
aaaa0e4
to
11e80cf
Compare
11e80cf
to
fd566b1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple minor things I noticed and one optional thing. Approving. Thanks!
When security is disabled, not waiting for the HTTP/2 handshake can lead to DoS-style behavior. For details, see: grpc/grpc-go#954. This requirement will incur an extra half-RTT latency before the first RPC can be sent under plaintext, but this is negligible and unencrypted connections are rarer than secure ones. Under TLS, the server will effectively send its part of the HTTP/2 handshake along with its final TLS "server finished" message, which the client must wait for before transmitting any data securely. This means virtually no extra latency is incurred by this requirement. Go had attempted to separate "connection ready" with "connection successful" (Issue: grpc/grpc-go#1444 PR: grpc/grpc-go#1648). However, this is confusing to users and introduces an arbitrary distinction between these two events. It has led to several bugs in our reconnection logic (e.g.s grpc/grpc-go#2380, grpc/grpc-go#2391, grpc/grpc-go#2392), due to the complexity, and it makes custom transports (grpc/proposal#103) more difficult for users to implement. We are aware of some use cases (in particular, https://github.com/soheilhy/cmux) expecting the behavior of transmitting an RPC before the HTTP/2 handshake is completed. Before making behavior changes to implement this, we will reach out to our users to the best of our abilities.
This CL fixes three problems:
In clientconn_state_transitions_test.go, each time we set up a server to send
SETTINGS, we should also set up the server to read. This allows the client
to successfully send its SETTINGS.
In clientconn.go, we incorrectly transitioned into TRANSIENT FAILURE when
creating an http2client returned an error. This should be handled in the outer
resetTransport main reset loop. The reason this became a problem is that the
outer resetTransport has very specific conditions around when to transition
into TRANSIENT FAILURE that the egregious transition did not have. So, it
could transition into TRANSIENT FAILURE after failing to dial, even if it
was trying to connect to a non-final address in the list of addresses.
In clientconn.go, we incorrectly stay in CONNECTING after createTransport when a server sends
its connection preface but the client is not able to send its connection preface. This
CL causes the addrconn to correctly enter TRANSIENT FAILURE when createTransport
fails, even if a server preface was received. It does so by making
ac.successfulHandshake to consider both server preface received as well as
client preface sent.