fix tight loop on aborted connection #285

aep · 2018-06-14T16:57:43Z

When the underlying IO returns 0 on read, it is closed and we must stop
polling it. Otherwise we'll be stuck in a tight loop.

This fixes sfackler/rust-openssl#949.
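
As an illustration of the rule described above, here is a minimal sketch using only `std::io`. It is not the h2 patch itself: the helper name `fill_or_eof` and the choice of `UnexpectedEof` are assumptions made for the example.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads until `buf` is full, treating a zero-length
/// read as end of stream instead of retrying.
fn fill_or_eof<R: Read>(io: &mut R, buf: &mut [u8]) -> io::Result<()> {
    let mut filled = 0;
    while filled < buf.len() {
        match io.read(&mut buf[filled..]) {
            // 0 bytes means the peer closed the connection. Reading again
            // would return 0 again, so looping here would never terminate.
            Ok(0) => {
                return Err(io::Error::new(
                    io::ErrorKind::UnexpectedEof,
                    "connection closed before the message completed",
                ));
            }
            Ok(n) => filled += n,
            // A signal interrupted the read; retrying is safe.
            Err(ref e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(())
}

fn main() {
    // A byte slice stands in for a connection that closes after 5 bytes.
    let mut closed_early: &[u8] = b"PRI *";
    let mut preface = [0u8; 24];
    let err = fill_or_eof(&mut closed_early, &mut preface).unwrap_err();
    assert_eq!(err.kind(), io::ErrorKind::UnexpectedEof);
}
```

Without the `Ok(0)` arm, the loop would keep calling `read` on a closed stream and never make progress, which is the tight loop this PR is about.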

carllerche

Good catch, thanks.

I left a note inline. Also, any way to add a test that covers this case?

carllerche · 2018-06-14T18:11:34Z

src/frame/reason.rs

@@ -59,6 +59,8 @@ impl Reason {
    pub const INADEQUATE_SECURITY: Reason = Reason(12);
    /// The endpoint requires that HTTP/1.1 be used instead of HTTP/2.
    pub const HTTP_1_1_REQUIRED: Reason = Reason(13);
+    /// The connection was closed
+    pub const CONNECTION_CLOSED : Reason = Reason(14);


These reasons map to the HTTP/2 error codes defined in the spec. In this case, it would be either a protocol error or an I/O error (connection reset by peer).
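
One way to picture the distinction drawn in this comment is the sketch below. It is an assumption-laden illustration, not h2's actual error type: `ConnError`, `classify_read`, and `wire_code` are hypothetical names, and `ConnectionReset` is chosen only because the comment mentions "connection reset by peer". The two constants use the code values defined in RFC 7540, section 7.

```rust
use std::io;

/// An HTTP/2 error code. RFC 7540, section 7 defines 0x0 through 0xd.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Reason(u32);

impl Reason {
    const PROTOCOL_ERROR: Reason = Reason(1);
    const HTTP_1_1_REQUIRED: Reason = Reason(13);
}

/// Hypothetical error type: either a spec-defined reason that can be sent
/// to the peer, or a failure of the transport itself.
#[derive(Debug)]
enum ConnError {
    Proto(Reason),
    Io(io::Error),
}

/// Classifies the byte count returned by a read on the transport.
fn classify_read(n: usize) -> Result<usize, ConnError> {
    if n == 0 {
        // The spec has no code for a dropped transport, so report it as an
        // I/O error rather than inventing a new reason.
        Err(ConnError::Io(io::Error::from(io::ErrorKind::ConnectionReset)))
    } else {
        Ok(n)
    }
}

/// Returns the code that would go on the wire, if the error has one.
fn wire_code(err: &ConnError) -> Option<u32> {
    match err {
        ConnError::Proto(reason) => Some(reason.0),
        ConnError::Io(_) => None,
    }
}

fn main() {
    let closed = classify_read(0).unwrap_err();
    assert_eq!(wire_code(&closed), None);
    assert_eq!(wire_code(&ConnError::Proto(Reason::PROTOCOL_ERROR)), Some(1));
    assert_eq!(wire_code(&ConnError::Proto(Reason::HTTP_1_1_REQUIRED)), Some(13));
}
```

Keeping transport failures out of `Reason` means every `Reason` value stays a code that the spec defines.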

aep · 2018-06-14T18:21:32Z

> or an I/O error

Sounds more appropriate than a protocol error, but I couldn't find it in the spec.

> Also, any way to add a test that covers this case?

Sounds like a lot of effort, since we'd have to make an IO that deliberately returns 0 at a specific point.

sfackler · 2018-06-14T18:27:17Z

Seems like it wouldn't be that bad to make a test. Start a server, open up a connection, write PRI * and close the connection, and make sure the server handshake returns an error.
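
The scenario outlined here can be sketched with plain loopback sockets from the standard library. This is a stand-in, not a test against h2: `accept_preface` and `handshake_with` are hypothetical helpers, and the "server" only checks the 24-byte HTTP/2 client connection preface.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// The HTTP/2 client connection preface (24 bytes).
const PREFACE: &[u8] = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n";

/// Hypothetical server side: accepts one connection and reads the preface.
fn accept_preface(listener: &TcpListener) -> io::Result<()> {
    let (mut conn, _) = listener.accept()?;
    let mut buf = [0u8; 24];
    // `read_exact` fails with `UnexpectedEof` if the peer closes early.
    conn.read_exact(&mut buf)?;
    if &buf[..] == PREFACE {
        Ok(())
    } else {
        Err(io::Error::new(io::ErrorKind::InvalidData, "bad preface"))
    }
}

/// Starts a listener, has a client send `client_bytes` and close, and
/// returns what the server side observed.
fn handshake_with(client_bytes: &'static [u8]) -> io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let client = thread::spawn(move || -> io::Result<()> {
        let mut stream = TcpStream::connect(addr)?;
        stream.write_all(client_bytes)?;
        Ok(()) // Dropping `stream` here closes the connection.
    });
    let result = accept_preface(&listener);
    client.join().expect("client thread panicked")?;
    result
}

fn main() {
    // A truncated preface followed by a close must surface as an error.
    let err = handshake_with(b"PRI *").unwrap_err();
    assert_eq!(err.kind(), io::ErrorKind::UnexpectedEof);
    // The complete preface is accepted.
    assert!(handshake_with(PREFACE).is_ok());
}
```

The point of the test shape is that the server side returns promptly with an error instead of hanging.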

aep · 2018-06-14T19:11:34Z

Not sure how to do this using the existing h2 test framework.

The mock doesn't have a concept of half-closed.

I could open a pipe, but that's Unix-only. Let me know if you want that.

#[test]
fn server_error_on_unclean_shutdown() {
    use std::io::Write;

    let _ = ::env_logger::try_init();
    let (io, mut client) = mock::new();

    let srv = server::Builder::new()
        .handshake::<_, Bytes>(io)
        .map_err(|e| println!("{}", e))
        .and_then(|srv| {
            println!("{:?}", srv);
            Ok(())
        });

    // Send a truncated preface, then close the client side of the mock.
    client.write_all(b"PRI *").expect("write");
    drop(client);

    srv.wait().expect("wait");
}

This fails with "mock closed" in write.

aep · 2018-06-14T19:37:33Z

OK, I changed the mock to ignore writes once it's closed, and this works. The test now passes with this fix and never returns without it.
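
The mock change described here might look roughly like the sketch below. It does not reproduce h2's test-support mock: `Shared`, `MockIo`, and `new_mock` are hypothetical names, and the only behavior shown is that a write after the peer has gone away is silently discarded instead of failing.

```rust
use std::cell::RefCell;
use std::io::{self, Write};
use std::rc::Rc;

/// State shared between the two ends of a hypothetical in-memory mock.
#[derive(Default)]
struct Shared {
    received: Vec<u8>,
    peer_closed: bool,
}

/// The end of the mock that is handed to the code under test.
struct MockIo {
    shared: Rc<RefCell<Shared>>,
}

impl Write for MockIo {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let mut shared = self.shared.borrow_mut();
        if shared.peer_closed {
            // The peer is gone: report success but discard the bytes, so
            // the writer carries on and notices the close on its next read.
            return Ok(buf.len());
        }
        shared.received.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Creates a mock together with a handle to its shared state.
fn new_mock() -> (MockIo, Rc<RefCell<Shared>>) {
    let shared = Rc::new(RefCell::new(Shared::default()));
    (MockIo { shared: Rc::clone(&shared) }, shared)
}

fn main() {
    let (mut io, shared) = new_mock();
    io.write_all(b"abc").unwrap();
    shared.borrow_mut().peer_closed = true;
    // This write succeeds but is dropped.
    io.write_all(b"def").unwrap();
    assert_eq!(shared.borrow().received, b"abc".to_vec());
}
```

With writes tolerated after close, the failure under test is the zero-length read rather than an unrelated error from the mock's write path.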

carllerche

Thanks! Looks good to me.

carllerche requested changes Jun 14, 2018

aep force-pushed the master branch from bfce34e to 18beb87 Compare June 14, 2018 18:16

aep force-pushed the master branch from 18beb87 to 5c71484 Compare June 14, 2018 19:36

fix tight loop on aborted connection

ec22e03

aep force-pushed the master branch from 5c71484 to ec22e03 Compare June 14, 2018 19:44

Import std::io

82f84e5

carllerche approved these changes Jun 15, 2018

carllerche merged commit 74a5e07 into hyperium:master Jun 15, 2018

olix0r added a commit to linkerd/linkerd2 that referenced this pull request Jun 18, 2018

proxy: Upgrade h2 to 0.1.10

7bc29bd

This picks up a fix for hyperium/h2#285

olix0r mentioned this pull request Jun 18, 2018

proxy: Upgrade h2 to 0.1.10 linkerd/linkerd2#1149

Merged

olix0r added a commit to linkerd/linkerd2 that referenced this pull request Jun 18, 2018

proxy: Upgrade h2 to 0.1.10 (#1149)

13716cd

This picks up a fix for hyperium/h2#285

fanzeyi mentioned this pull request Jun 29, 2018

http2 is broken because of h2 upgrade actix/actix-web#352

Closed

olix0r added a commit to linkerd/linkerd2-proxy that referenced this pull request Jul 7, 2018

proxy: Upgrade h2 to 0.1.10 (#1149)

9c55d40

This picks up a fix for hyperium/h2#285
