QUIC support #1334

Demi-Marie · 2019-12-06T15:28:30Z

I am using the latest quinn git master. While I could backport to the latest release, I would prefer to get the existing code working first.

Current approach:

Use quinn-proto (the bare state machine)
Each connection has a pair of HashMaps to store wakers, one for readers and one for writers. When I/O on a stream becomes possible, the corresponding waker is awoken.
The mutable state is protected with mutexes, so libp2p-quic is thread-safe.
The state machine is only advanced on a background task.
Channels are used for incoming connections (to an endpoint) and incoming streams (to a connection). They are also used to send packets generated by a connection to the background task for transmission.
Strict mutex acquisition ordering is upheld to prevent deadlocks. Specifically, if both the mutex protecting the endpoint and the mutex protecting a connection need to be taken, the mutex protecting the endpoint must be taken first.
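For readers unfamiliar with the lock-ordering rule described above, here is a minimal sketch of it using only the standard library. The types `Endpoint`, `Connection`, `EndpointState` and `ConnectionState` and their fields are hypothetical and are not taken from the PR; only the ordering (endpoint first, then connection) comes from the description.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical endpoint state; the field is illustrative only.
#[derive(Default)]
struct EndpointState {
    open_connections: usize,
}

/// Hypothetical per-connection state.
#[derive(Default)]
struct ConnectionState {
    closed: bool,
}

struct Endpoint {
    state: Mutex<EndpointState>,
}

struct Connection {
    endpoint: Arc<Endpoint>,
    state: Mutex<ConnectionState>,
}

impl Connection {
    /// Closes the connection and updates the endpoint's counter.
    /// Both locks are needed, so the endpoint lock is always taken first.
    fn close(&self) {
        let mut endpoint = self.endpoint.state.lock().unwrap();
        let mut connection = self.state.lock().unwrap();
        if !connection.closed {
            connection.closed = true;
            endpoint.open_connections -= 1;
        }
    }
}
```

Because every code path that needs both locks takes them in the same order, two threads can never each hold one lock while waiting for the other.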

What I would like feedback on:

Should I be using channels in places where I am currently using mutexes?
To avoid potential memory exhaustion denial of service attacks, I am using a bounded channel for outgoing packets. If it fills up, packets are dropped. Since QUIC is a reliable protocol, I expect that these will be retransmitted, which is much better than having to buffer an unlimited amount of data.
How should I handle the case where a connection arrives, but we are not ready for it? We can’t buffer an unlimited number of them, so at some point we will need to impose backpressure. The current plan is to rely on quinn-proto limiting the number of connections that it returns but have not yet been accepted.
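The drop-on-full policy for outgoing packets can be sketched as follows. This uses `std::sync::mpsc::sync_channel` as a stand-in for whatever bounded channel the PR actually uses; the `Packet` alias, the `Queued` enum and `queue_packet` are hypothetical names introduced for illustration.

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

/// Hypothetical outgoing datagram: just the raw bytes.
type Packet = Vec<u8>;

/// Outcome of trying to hand a packet to the background task.
#[derive(Debug, PartialEq)]
enum Queued {
    Sent,
    /// The channel was full; the packet is discarded and QUIC loss
    /// recovery is expected to retransmit the data later.
    Dropped,
    /// The background task has gone away.
    Closed,
}

/// Never blocks and never buffers beyond the channel's fixed capacity.
fn queue_packet(tx: &SyncSender<Packet>, packet: Packet) -> Queued {
    match tx.try_send(packet) {
        Ok(()) => Queued::Sent,
        Err(TrySendError::Full(_)) => Queued::Dropped,
        Err(TrySendError::Disconnected(_)) => Queued::Closed,
    }
}
```

With a channel created by `sync_channel::<Packet>(2)`, the third packet queued before the receiver drains anything is reported as `Queued::Dropped`, so memory use stays bounded by the channel capacity.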

Edit

The code is now essentially complete and is ready for review.


tomaka · 2019-12-06T15:44:54Z

How should I handle the case where a connection arrives, but we are not ready for it? We can’t buffer an unlimited number of them, so at some point we will need to impose backpressure. The current plan is to rely on quinn-proto limiting the number of connections that it returns but have not yet been accepted.

When you receive a connection or a substream, you wake up whatever was calling poll_inbound and/or whatever was polling the stream of incoming connections. If these things are not fast enough, the right solution is indeed freezing the QUIC background task entirely until the rest of the program is ready to accept more.

To avoid potential memory exhaustion denial of service attacks, I am using a bounded channel for outgoing packets. If it fills up, packets are dropped. Since QUIC is a reliable protocol, I expect that these will be retransmitted, which is much better than having to buffer an unlimited amount of data.

I don't really understand what you mean here. The data has to exist somewhere in memory for it to be retransmitted.
To me the idea here should be the same, in that we can prevent people from writing data by making write_substream return Poll::Pending.
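A rough sketch of what returning `Poll::Pending` from a write could look like, assuming a capped per-substream buffer. `SendBuffer`, `poll_write` and `drain` are hypothetical names; they do not reproduce the `StreamMuxer` trait or the PR's code, only the general pattern of storing the writer's waker and waking it once space frees up.

```rust
use std::collections::VecDeque;
use std::task::{Context, Poll, Waker};

/// Hypothetical per-substream send buffer with a hard cap.
struct SendBuffer {
    data: VecDeque<u8>,
    capacity: usize,
    /// Waker of the task that last found the buffer full.
    blocked_writer: Option<Waker>,
}

impl SendBuffer {
    fn new(capacity: usize) -> Self {
        SendBuffer { data: VecDeque::new(), capacity, blocked_writer: None }
    }

    /// Accepts as many bytes as fit. When the buffer is full, stores the
    /// caller's waker and returns `Poll::Pending`.
    fn poll_write(&mut self, cx: &mut Context<'_>, buf: &[u8]) -> Poll<usize> {
        let free = self.capacity - self.data.len();
        if free == 0 {
            self.blocked_writer = Some(cx.waker().clone());
            return Poll::Pending;
        }
        let n = free.min(buf.len());
        self.data.extend(buf[..n].iter().copied());
        Poll::Ready(n)
    }

    /// Called by the background task after it has transmitted up to `n`
    /// bytes; wakes the blocked writer, if any.
    fn drain(&mut self, n: usize) {
        let n = n.min(self.data.len());
        self.data.drain(..n);
        if let Some(waker) = self.blocked_writer.take() {
            waker.wake();
        }
    }
}
```

The writer is throttled by the buffer cap rather than by dropping data, which is the behaviour suggested in the comment above.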

tomaka · 2019-12-06T15:47:38Z

I am delaying TLS certificate handling because I am quite confident in being able to do it reasonably well, whereas this is my first time writing code that interacts with futures at a low level.

The low-level futures code is indeed far from being simple, but the TLS certificate is what stopped me in my latest attempt at implementing QUIC, and I don't think that it's just a detail that can be dismissed as "we'll do it later as it's so easy".

Demi-Marie · 2019-12-06T15:49:14Z

How should I handle the case where a connection arrives, but we are not ready for it? We can’t buffer an unlimited number of them, so at some point we will need to impose backpressure. The current plan is to rely on quinn-proto limiting the number of connections that it returns but have not yet been accepted.

When you receive a connection or a substream, you wake up whatever was calling poll_inbound and/or whatever was polling the stream of incoming connections. If these things are not fast enough, the right solution is indeed freezing the QUIC background task entirely until the rest of the program is ready to accept more.

The problem is that the QUIC background task handles all I/O. If I freeze it, existing connections will stop working.

To avoid potential memory exhaustion denial of service attacks, I am using a bounded channel for outgoing packets. If it fills up, packets are dropped. Since QUIC is a reliable protocol, I expect that these will be retransmitted, which is much better than having to buffer an unlimited amount of data.

I don't really understand what you mean here. The data has to exist somewhere in memory for it to be retransmitted.
To me the idea here should be the same, in that we can prevent people from writing data by making write_substream return Poll::Pending.

I do indeed do that when possible. I only drop a packet if a channel reports that it has capacity, but no longer has space when I try to write to it. I could buffer it myself, but that seems ugly.

burdges · 2019-12-09T23:58:12Z

Is freezing the lower level layer really how QUIC specifies the application of back pressure? I thought we could do much more fine grained back pressure?

Demi-Marie · 2019-12-10T00:22:20Z

Is freezing the lower level layer really how QUIC specifies the application of back pressure? I thought we could do much more fine grained back pressure?

Backpressure should be as fine-grained as possible. If it isn’t, that is a bug.

Much of the I/O code simply has not been written yet. Is that what you are referring to?

infinity0 · 2019-12-10T09:16:35Z

the right solution is indeed freezing the QUIC background task entirely until the rest of the program is ready to accept more.

The problem is that the QUIC background task handles all I/O. If I freeze it, existing connections will stop working.

To apply backpressure, simply don't send acks. In this case it means the QUIC layer should not send an ack to the incoming connection request until the application responds by accepting the poll_inbound.
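One way to picture this without freezing the background task is a capped queue of not-yet-accepted connections: while the queue is full, new handshake attempts are simply left unanswered, and existing connections keep being serviced. The sketch below is an assumption-laden illustration; `Incoming`, `PendingConnection` and `on_initial_packet` are hypothetical and do not correspond to quinn-proto's API.

```rust
use std::collections::VecDeque;

/// Hypothetical handle for a connection the application has not accepted yet.
#[derive(Debug, PartialEq)]
struct PendingConnection {
    id: u64,
}

/// Hypothetical queue of unaccepted connections with a fixed limit.
struct Incoming {
    queue: VecDeque<PendingConnection>,
    limit: usize,
}

impl Incoming {
    fn new(limit: usize) -> Self {
        Incoming { queue: VecDeque::new(), limit }
    }

    /// Called when a packet that starts a new connection arrives.
    /// Returns `false` if the attempt is ignored because the application
    /// has fallen behind; the remote is expected to retry.
    fn on_initial_packet(&mut self, id: u64) -> bool {
        if self.queue.len() >= self.limit {
            return false;
        }
        self.queue.push_back(PendingConnection { id });
        true
    }

    /// Called when the application is ready for another connection.
    fn accept(&mut self) -> Option<PendingConnection> {
        self.queue.pop_front()
    }
}
```

Only new connection attempts are affected by the limit, so I/O for established connections is not stalled.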

tomaka · 2020-10-01T13:44:31Z

requires either synchronization or explicit message-passing

And that should be done by the higher-level code, such as rust-libp2p, not the QUIC library itself.
What I prefer to have is a library that doesn't block us in any way, rather than a library that handles everything for us in an opinionated way and ends up conflicting with our code.

DemiMarie · 2020-10-01T14:06:01Z

I agree. That’s why I prefer quinn-proto.

dvc94ch · 2021-04-06T09:43:05Z

I can make a PR for https://github.com/ipfs-rust/libp2p-quic, but need to finalize the noise protocol it uses and document it. Also there is more perf tuning to do.

DemiMarie · 2021-07-15T08:38:51Z

I can make a PR for https://github.com/ipfs-rust/libp2p-quic, but need to finalize the noise protocol it uses and document it. Also there is more perf tuning to do.

Thank you! (I’m the author of this PR).

dvc94ch · 2021-07-17T11:08:45Z

So I opened #2144 . It is inspired by the work in this PR but it doesn't share any code. It works with the latest released libp2p/quinn and uses noise instead of tls. I think it would be possible to get the tls implementation from this PR and integrate it in the future, however I'm not particularly incentivized to do it, so contributions welcome. But I don't think it needs to be blocked on this feature. Also there isn't any tokio support, which someone may want to add in the future.

One other difference is the use of unbounded channels. This is for two reasons. 1. it simplifies the implementation, and 2. I haven't found a benchmark that performs better by requiring task switching than just buffering the data. This also matches my experience with benchmarks in other projects, like netsim-embed for example.

burdges · 2021-07-17T11:52:58Z

Which Noise? What curve?

dvc94ch · 2021-07-17T12:11:10Z

The cryptographic details are documented and implemented here [0]. It uses an IKpsk1 handshake with ed25519 keys and a chacha8poly1305 cipher. The psk1 is optional and is used instead of the pnet protocol that is used with tcp for private swarms.

[0] https://github.com/ipfs-rust/quinn-noise

DemiMarie · 2021-07-17T17:53:46Z

So I opened #2144 . It is inspired by the work in this PR but it doesn't share any code. It works with the latest released libp2p/quinn and uses noise instead of tls. I think it would be possible to get the tls implementation from this PR and integrate it in the future, however I'm not particularly incentivized to do it, so contributions welcome. But I don't think it needs to be blocked on this feature. Also there isn't any tokio support, which someone may want to add in the future.

As the person who wrote the relevant TLS code: I no longer work for Parity, but I am willing to answer (in my spare time) questions about the relevant code that I wrote. That said, my understanding is that Noise is a cleaner protocol anyway and so should be preferred. Additionally, the TLS specification for libp2p has some significant design weaknesses, namely signing the raw public key and not the SubjectPublicKeyInfo.

One other difference is the use of unbounded channels. This is for two reasons. 1. it simplifies the implementation, and 2. I haven't found a benchmark that performs better by requiring task switching than just buffering the data. This also matches my experience with benchmarks in other projects, like netsim-embed for example.

Do you have proofs that the channels will not grow without limit?

The cryptographic details are documented and implemented here [0]. It uses an IKpsk1 handshake with ed25519 keys and a chacha8poly1305 cipher. The psk1 is optional and is used instead of the pnet protocol that is used with tcp for private swarms.

Why ChaCha8 instead of ChaCha12? My understanding is that using ChaCha12 instead of ChaCha20 is considered quite safe, but using ChaCha8 only provides a 1-round safety margin.

dvc94ch · 2021-07-17T18:10:51Z

Do you have proofs that the channels will not grow without limit?

not really, but I like code that looks good in benchmarks and is readable. devops may disagree.

Why ChaCha8 instead of ChaCha12? My understanding is that using ChaCha12 instead of ChaCha20 is considered quite safe, but using ChaCha8 only provides a 1-round safety margin.

in the too much crypto paper chacha8 was deemed sufficient. while rand uses chacha12 for paranoia reasons, they did discuss using chacha8 but deemed it not critical enough for performance compared to the risk. however for quic it has a large effect on performance. according to my early benchmarks using aes-gcm, encryption/decryption was 50% of the workload.

dvc94ch · 2021-07-18T14:38:47Z

Section 3.3 covers ChaCha:

The best result on ChaCha is a key recovery attack on its 7-round version, with 2^237.7 time complexity (the exact unit is unclear) using output data from 2^96 instances of ChaCha, that is, 2^105 bytes of data. On 6 rounds, a similar attack can run in time & data 2^116 & 2^116, or 2^127.5 & 2^37.5. On 5 rounds, the attack becomes practical due to the lower diffusion, and runs in 2^16 time and data.

Note the 7-round attack is a security reduction from the claimed 256-bits of security, to "237.7" bits, and therefore is not a catastrophic attack.

[0] rand_chacha: consider ChaCha12 (or possibly ChaCha8) over ChaCha20 rust-random/rand#932

dvc94ch · 2021-07-18T14:42:34Z

so if I understand correctly an attack is only practical on 5 rounds, giving it a 3 round safety margin.

dvc94ch · 2021-07-19T14:57:45Z

Please show your support for libp2p/specs#351 by liking it. Thanks!

tomaka · 2021-07-21T10:38:48Z

One other difference is the use of unbounded channels. This is for two reasons. 1. it simplifies the implementation, and 2. I haven't found a benchmark that performs better by requiring task switching than just buffering the data. This also matches my experience with benchmarks in other projects, like netsim-embed for example.

The reason for bounded channels has nothing to do with performance; it is DoS resistance. When using unbounded channels, the queue of messages can grow forever if someone sends messages faster than the rest of the software can process them. Since you have no control over the rate at which the socket receives messages and the rate at which the messages will be processed, you must have some sort of "stopping mechanism" that kills connections if they send more data than the node can handle.

I invite you to take a look at this comment.
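A minimal sketch of such a "stopping mechanism", assuming one bounded queue per connection and a policy of closing the connection as soon as its queue overflows. `ConnectionInbox` and `on_message` are hypothetical names, and `std::sync::mpsc::sync_channel` stands in for whichever bounded channel a real implementation would use.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};

/// Hypothetical per-connection queue between the socket reader and the
/// rest of the node.
struct ConnectionInbox {
    tx: SyncSender<Vec<u8>>,
    closed: bool,
}

impl ConnectionInbox {
    fn new(limit: usize) -> (Self, Receiver<Vec<u8>>) {
        let (tx, rx) = sync_channel(limit);
        (ConnectionInbox { tx, closed: false }, rx)
    }

    /// Called for each message read from the socket. Returns `false` once
    /// the connection has been closed, either because the peer outpaced
    /// the consumer (queue full) or because the consumer is gone.
    fn on_message(&mut self, message: Vec<u8>) -> bool {
        if self.closed {
            return false;
        }
        if self.tx.try_send(message).is_err() {
            self.closed = true;
        }
        !self.closed
    }
}
```

The memory held per connection is then capped at `limit` queued messages, regardless of how fast the remote sends.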

dvc94ch · 2021-07-21T11:02:39Z

so the reasons for my effort to get this upstreamed are the following:

The transport listen_on api in the swarm doesn't work that well for quic, or at least I haven't found a nice way to implement it. In ipfs-embed we call listen_on on the transport and then again on the swarm to make it work.

I think the relay makes some assumptions that are tcp specific and having a slightly different transport in the codebase might lead to a better design.

Some points for improvement did come out of this although mostly not that surprising:

add an alpn string to the handshake initialization string [0]
use of bounded channels [1]
support tls in addition to noise [2]
tokio support [3]

However I'm not sure there is a path forward currently to upstreaming it due to a lack of a libp2p spec, so I'll continue maintaining it out-of-tree.

DemiMarie · 2021-07-21T17:35:18Z

However I'm not sure there is a path forward currently to upstreaming it due to a lack of a libp2p spec, so I'll continue maintaining it out-of-tree.

There already is a (flawed) spec for using TLS in QUIC. That can be improved with a single change.

dvc94ch · 2021-07-21T18:08:41Z

I guess supporting tls wouldn't be too much work (since you already wrote the code). Supporting both via feature flags is probably a bit more involved. Although to be honest I have no clue how tls actually works.

DemiMarie · 2021-07-21T18:50:29Z

I guess supporting tls wouldn't be too much work (since you already wrote the code). Supporting both via feature flags is probably a bit more involved. Although to be honest I have no clue how tls actually works.

rustls provides a high-quality implementation of the TLS protocol, and I got the necessary changes made upstream to allow it to be used in libp2p. I also wrote a library for low-level X.509 certificate parsing, and another one for basic ASN.1 serialization.

elenaf9 · 2022-09-22T09:41:01Z

Thank you for the huge work that has been done here @DemiMarie @tomaka!
I am closing this PR in favor of #2289, which is based on this PR / the tomaka/quiccc-again branch. Unfortunately it's not visible in its git history, however you will definitely be mentioned as co-authors on merge.

Demi-Marie added 15 commits November 29, 2019 17:04

Add libp2p-transport-quic crate

7492883

Add copyright notice and beginnings of a transport

1f98ed9

Bogus implementation that at least compiles

52b8614

Remove an unimplemented!

b544f67

Report peer addresses

432c1b6

This requires a branch of `quinn` that has not yet been merged.

Remove code duplication between QUIC and TCP transports

606ede2

Switch back to quinn master

10f16ba

It now includes the changes needed by libp2p.

Implement dial

bf184c5

Add use lines for std future related types

2157f52

Also update to stable futures.

Initial StreamMuxer impl

b4fd310

There are a LOT of `unimplemented!()` calls here!

Last version using ‘quinn’

752c844

All future versions will use ‘quinn-proto’ directly.

Compiling (but not working) quinn-proto based libp2p-quic

202ccda

There are plenty of `unimplemented!()` calls, and TLS certificate processing has not even been started, but this should still be a good start.

More progress on libp2p-quic

3a0ba66

The code compiles and is approaching a point at which it could be useful.

The doc test passes!

3c27505

Merge branch 'stable-futures' into demi-quic-stable-futures

24728a6

Demi-Marie added 2 commits December 9, 2019 11:19

Preserve order of outgoing connections

49ebd43

Test suite compiles!

974d117

Demi-Marie added 7 commits December 11, 2019 18:53

Handle making new connections

c8ae241

Merge branch 'stable-futures' into demi-quic-stable-futures

b35d42a

All tests compile

e47e5de

Remove remaining unimplemented!()

f19dd46

Merge branch 'stable-futures' into demi-quic-stable-futures

67307a3

Simple fixes

f26323c

Implement sending messages

5ca62ae

This implements message transmission. The code relies on UDP datagram sending not blocking indefinitely. Since UDP does not retransmit packets, this should be a decent assumption in practice.

cryptoquick mentioned this pull request Dec 10, 2020

Fantastic work so far! ipfs-rust/ipfs-embed#28

Closed

mxinden mentioned this pull request Apr 8, 2021

NAT traversal tracking issue libp2p/specs#312

Open

mxinden mentioned this pull request Apr 20, 2021

NAT traversal #2052

Open

14 tasks

mxinden mentioned this pull request May 19, 2021

dial limiting is racy libp2p/go-libp2p#1105

Open

mxinden mentioned this pull request Jul 5, 2021

protocols: Implement Direct Connection Upgrade through Relay (dcutr) #2076

Closed

dvc94ch mentioned this pull request Jul 16, 2021

Add libp2p-quic. #2144

Closed

mxinden mentioned this pull request Aug 9, 2021

Libp2p quic second attempt #2159

Closed

kpp mentioned this pull request Oct 14, 2021

transports/quic: Add implementation based on quinn-proto #2289

Merged

7 tasks

elenaf9 mentioned this pull request Sep 9, 2022

[Tracking Issue] transports/quic: Add QUIC Transport #2883

Closed

15 tasks

elenaf9 closed this Sep 22, 2022

elenaf9 added a commit to kpp/rust-libp2p that referenced this pull request Oct 4, 2022

transports/quic/Cargo.toml fix authors

71595d0

After 4a317d the code is now completely based on libp2p#1334, thus the authorship should be set accordingly.

kpp mentioned this pull request Oct 7, 2022

transports/tls: Add libp2p-tls as per spec #2945

Merged

6 tasks
