-
Notifications
You must be signed in to change notification settings - Fork 28
Refactor messages and add support for multipart messages #513
Conversation
This commit contains several major changes. # Re-work the message header ``` 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + + | | + + | | + + | | + + | | + + | | + + | | + + | | + signature + | | + + | | + + | | + + | | + + | | + + | | + + | | + + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | tag | flags | reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + + | | + + | | + + | | + participant_pk + | | + + | | + + | | + + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ``` The problem is that: - When reading from a byte stream, we don't know where the message finishes because there is no length field. So far this has not been a problem because we would only receive messages via HTTP. However, if we decide to do encryption in the TLS layer, we might support reading messages from a TLS connection directly. Then, we'll have an encrypted byte stream in input and un-encrypted byte stream as output, and we'll need to be able to find the message boundaries to parse the messages. - The participant public key is used to verify the signature, which is the first thing we do. It's not convenient to have to read this far in the header to get it. Instead it should follow the signature. - The flags field is currently un-used (we removed the certificate field in #508 Therefore we changed the header of the messages as follow: ``` 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + + | | + + | | + + | | + + | | + + | | + + | | + + | | + signature + | | + + | | + + | | + + | | + + | | + + | | + + | | + + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + + | | + + | | + + | | + participant_pk + | | + + | | + + | | + + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | tag | reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ``` # Remove the `Header` type To simplify the code, we also removed the `Header` type. Instead, the `Header` is not part of the `Message` itself: ```rust // The length field is omitted because it's compupted // automatically when serializing the message. // // The tag field is omitted because it's encoded in the `Payload` // enum type pub struct Message { pub signature: Option<Signature>, pub participant_pk: PublicSigningKey, pub payload: Payload, } ``` Note that the `Message` type cannot implement the `ToBytes` trait, because the insertion of the signature requires passing a secret signing key as an argument. Instead, `Message` has a `to_bytes` method. We could implement `FromBytes` but to make things consistent, we also implemented it directly as a `from_bytes` methods. # Remove the `Message::new_{sum,update,sum2}` constructors The `Message` type has been greatly simplified. There is no need for these constructors. # Introduce a new `Chunk` message type We introduced a new message type that represents a part, or chunk, of a bigger message: `Chunk`. The message structure is: ``` 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | chunk_id | message_id | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | flags | reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | payload | ``` We currently only support one flag that indicates whether this is the last chunk of the message or not. The `chunk_id` identifies a chunk, and the `message_id` identifies the biggier message it is part of.
In particular, we removed the `Message::new_{sum,update,sum2}` constructors
d6b0d12
to
9287168
Compare
As part of updating the code to match the new message API, I realized the `PreProcessor` was un-necessarily complex and contained a lot of duplicated code (once for each message type). The service was initially intended to perform a bunch of checks on the incoming messages before passing them to the state machine, but it turns out the only this it actually does is checking the eligibility of the participant for the task it wants to take part to. As such we: - removed the duplicated code - update the service from `PreProcessor` to `TaskValidator`
9287168
to
ba6a16e
Compare
Codecov Report
@@ Coverage Diff @@
## master #513 +/- ##
==========================================
+ Coverage 56.89% 58.02% +1.12%
==========================================
Files 67 63 -4
Lines 3051 3021 -30
==========================================
+ Hits 1736 1753 +17
+ Misses 1315 1268 -47
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I doesn't feel so qualified to review this part of the codebase but for me it looks good.
I really like the simplification in the tower service!
/// ID of the message this chunk belongs to | ||
message_id: u16, | ||
/// `true` if this is the last chunk of the message, `false` otherwise | ||
last: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to limit the amount of chunks a participant can send?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. With this the server doesn't know the full message size. However we still have 24 "reserved" bytes, so we could maybe have a 16 bits long field for the total number of chunks. We'd still have 8 bits left.
sender_sk: &self.state.keys.secret, | ||
}; | ||
message_seal.seal(message) | ||
let mut buf = vec![0; message.buffer_length()]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it makes sense to add the seal_message
method to Message
? For me it feels a bit low level to work with bytes of the message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered it but I had to make a choice between two APIs, but felt that the trade-offs should be made by the caller instead:
impl Message {
// This allocates two vec: one for serializing the message, one for the encrypted data. It's user friendly, and can't panic
fn seal_1(&self, sk: SecretSigningKey, pk: PublicEncryptKey) -> Vec<u8> {
let mut buf = vec![0; self.buffer_length()];
self.to_bytes(&mut buf, sk);
pk.encrypt(&buf[..])
}
// This allocates just one vec for the encrypted data. It's more consistent with the existing to_bytes method but it can panic.
fn seal_2(&self, buf: &mut [u8], sk: SecretSigningKey, pk: PublicEncryptKey) -> Vec<u8> {
self.to_bytes(&mut buf, sk);
pk.encrypt(&buf[..])
}
}
Apart from that, the caller may decide to do encryption differently (using the new streaming API for instance). So in the end, implementing that method here means we'd have to take some decisions I'd prefer the caller to take.
The three commits can/should be reviewed independently.