Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Identify message too big #954

Closed
Stebalien opened this issue May 28, 2020 · 6 comments
Closed

Identify message too big #954

Stebalien opened this issue May 28, 2020 · 6 comments
Assignees
Labels
kind/bug A bug in existing code (including security flaws) P0 Critical: Tackled by core team ASAP status/in-progress In progress

Comments

@Stebalien
Copy link
Member

When users use 4096 bit RSA keys (like we do on one of our bootstrappers), the identify message exceeds 2048 bytes (by about 20 bytes) when the signed envelop is added into the mix. That is:

  • 555 bytes for the key.
  • ~275 bytes for the protocol list.
  • ~12 bytes for the addresses
  • 1024+ bytes for the signed record
    • ~555 bytes for the key again
    • ~512 bytes for the signature
    • ~12 bytes for the actual addresses.

We're missing a few bytes in this math but this paints the picture.


We should fix this in all of the following ways:

  1. Quadruple the maximum message for all versions of identify to 8192 to be safe.
  2. Switch to ed25519 by default ASAP because RSA keys are just way too big.
  3. Split the identify protocol back into a version 1 and a version 2. Version 2 will:
    1. Support up to 8192 byte messages (guaranteed).
    2. Not duplicate the addresses.
    3. Critically, not duplicate the key. This last part will save us 555 bytes in this case.

Additionally, we should not use version 2 until we need it as negotiating version 2, then falling back on version 1 adds a round-trip to new connections. We could also revisit #903 but that's really hacky.

@aarshkshah1992
Copy link
Contributor

PR at #955.

@vyzo
Copy link
Contributor

vyzo commented May 29, 2020

This was discussed today. I think a reasonable solution that doesn't require a protocol upgrade would be to only send the signed peer record if the key is not too big in the current protocol version. This will protect lotus/drand who depend on signed peer records for Peer Exchange through gossipsub.

Note that I am not arguing against protocol upgrade; that's probably desirable for other reasons. But the existing protocol can be mostly fixed and still provide signed peer records, except for large RSA keys.

@Stebalien
Copy link
Member Author

Stebalien commented May 29, 2020

TL;DR:

Identify 1.0: Send signed record if small enough.
Identify 1.1: Always send signed record.

Peers would use identify 1 unless they absolutely need the signed record. In that case, they'd use identify 1.1.


Alternatively, we could just keep the identify 1.0/1.1 split and add a libp2p option for lotus nodes to try 1.1 before 1.0. This would fix your concern @vyzo, right?

@vyzo
Copy link
Contributor

vyzo commented May 30, 2020

Yes, it would fix the concern.
But really, we should try to send the signed peer record if it fits in identify/1.0 so that users can get the benefit of signed peer records without having to do anything or end up not sending sprs to older peers, since the testnet is live and it will be a mixed deployment.

@raulk
Copy link
Member

raulk commented Jun 1, 2020

Brainstorming other options as I read this thread:

  1. Elide the pubkey from the signed peer record in identify, when the pubkey is being sent in the identify message, using a magic replace sequence inside the record (a backreference).
    • The receiver would pre-process the protobuf, and replace the magic sequence with the out pubkey before processing the record.
    • This requires no protocol changes.
    • A peer that doesn't support this would simply reject the record, which has the same effect as not sending it.
    • It's hacky, but everything else here is.
  2. Compress the message using gzip or else, which could identify the duplication and optimise it. This implies a breaking change, so we'd have to deal with 1.0 / 1.1 complexity again (unless we add a request flag compression=gzip, but it gets too convoluted).
  3. Chunk the identify message.
    • Introduce a more boolean field in the response protobuf.
    • If the full message exceeds the limit, chunk it in two parts.
      • Part 1: everything except the signed record + more=true.
      • Part 2: only the signed record + more=false.
    • An old peer would only process part 1, ignore the more flag, and reset the stream when it sees more data arriving. This is equivalent to the behaviour proposed for 1.0 (would maybe result in spurious log statements though? but probably a fair tradeoff).
    • A new peer would process both parts. Equivalent to the behaviour proposed for 1.1.
    • This would be my chosen solution, as it requires no breaking changes, it's backwards compatible, and it requires no protocol versioning which is not an option with multistream-select 1.0 IMO.

@Stebalien

Peers would use identify 1 unless they absolutely need the signed record. In that case, they'd use identify 1.1.

How would identify know if the application layer needs signed records or not?


We need a test suite that tests identify with all possible key types/lengths, with and without peer records.

@jacobheun jacobheun added status/in-progress In progress P0 Critical: Tackled by core team ASAP labels Jun 2, 2020
@Stebalien
Copy link
Member Author

This has been fixed by sending multiple messages and merging them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug A bug in existing code (including security flaws) P0 Critical: Tackled by core team ASAP status/in-progress In progress
Projects
None yet
Development

No branches or pull requests

5 participants