
[WIP] Parallel IBD #52

Closed · wants to merge 3 commits

Conversation

@jaspervdm (Contributor) commented Apr 21, 2020

Still WIP. I'm working on the implementation side of the p2p messages in parallel, so details might change.

Link to rendered text

@jaspervdm changed the title from "Parallel IBD" to "[WIP] Parallel IBD" on Apr 21, 2020
@lehnberg (Contributor) left a comment

Nice work @jaspervdm! Took a very high-level pass at it with some questions.


A node joining the Grin network does not need the complete block history in order to fully verify the chain state. The block headers, the unspent output set and the complete kernel set are sufficient. The output and kernel sets are stored as leaves in a Merkle Mountain Range (MMR), and the block headers commit to the roots of these trees. Prior to HF2, the block headers only committed to the output commitments and not their unspent/spent status. This meant that output and kernel data could only be verified after it had been completely downloaded, which forced nodes to download the full data from a single peer. The downsides of this are apparent: the download speed is bottlenecked by the bandwidth of the other peer, and if that peer goes offline during the download or provides malicious data (which can only be detected after the fact), the process has to be restarted from scratch with another peer.

However, due to a consensus change in HF2, the headers now also commit to the unspent/spent status. The output spent status is stored in a bitmap, which is split up into chunks of 1024 bits and stored in a separate MMR. Block headers commit to the root of this MMR. This means that chunks of outputs can be downloaded and verified independently, by providing along with the data two merkle proofs that prove the inclusion of the unspent outputs and of the output bitmap chunk in their respective roots. It allows the data to be downloaded in parallel and verified as it comes in, which greatly improves the bootstrap time of a fresh node.
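To make the "download and verify independently" idea concrete, here is a minimal sketch of how a single downloaded chunk could be checked against a header-committed root via a merkle proof. The types, the non-cryptographic hash and the proof layout are illustrative placeholders, not Grin's actual data structures.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Placeholder hash type; the real node uses Blake2b-based MMR hashes.
type NodeHash = u64;

fn hash_pair(left: NodeHash, right: NodeHash) -> NodeHash {
    let mut h = DefaultHasher::new();
    left.hash(&mut h);
    right.hash(&mut h);
    h.finish()
}

/// A downloaded chunk's locally computed subtree root plus the sibling hashes
/// needed to climb from that subtree up to the root committed in the header.
struct ChunkProof {
    subtree_root: NodeHash,
    /// (sibling hash, true if the sibling sits on the left)
    siblings: Vec<(NodeHash, bool)>,
}

/// A chunk is accepted only if recombining it with its proof reproduces the
/// root the block header commits to.
fn verify_chunk(proof: &ChunkProof, header_root: NodeHash) -> bool {
    let mut acc = proof.subtree_root;
    for &(sibling, sibling_is_left) in &proof.siblings {
        acc = if sibling_is_left {
            hash_pair(sibling, acc)
        } else {
            hash_pair(acc, sibling)
        };
    }
    acc == header_root
}
```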
Contributor:

I would try to make this less technical, more accessible to someone who has no idea what "commit", "bitmap", "MMR", "roots", etc. are.

Before, we were downloading a huge zip file from one peer. Now, we can download multiple data streams from multiple peers in parallel. Something like that.

Member:

(github falling over, apologies if I'm commenting multiple times...)

Agreed. This section can probably be moved as-is into the reference explanation.

Comment on lines +20 to +21
# Community-level explanation
[community-level-explanation]: #community-level-explanation
Contributor:

Something I feel is missing from this section is a small paragraph of what exactly is being outlined in this RFC, i.e. what are we describing here, what is the change that will be triggered, and how will it work?

Here `TOTAL_SUPPLY` is the total supply of Grins at the point of sync, which is equal to `(HEIGHT + 1)*COINBASE_REWARD`.

If all of this is successful, the node has now downloaded and verified the state up to the sync horizon point. Fully updating the chain state to the latest head involves downloading the full blocks after the sync horizon, verifying them and applying their contents consecutively, which is possible because all nodes are expected to store the full blocks past the compaction horizon.
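As a small illustration of the supply formula quoted above, assuming Grin's 60-coin block subsidy expressed in nanogrin; a sketch, not consensus code:

```rust
/// Block subsidy in nanogrin (60 Grin per block); illustrative constant.
const COINBASE_REWARD: u64 = 60 * 1_000_000_000;

/// Total supply at `height`, counting the genesis block (hence the `+ 1`),
/// matching TOTAL_SUPPLY = (HEIGHT + 1) * COINBASE_REWARD above.
fn total_supply(height: u64) -> u64 {
    (height + 1) * COINBASE_REWARD
}

fn main() {
    // Supply after roughly one week of one-minute blocks.
    println!("{}", total_supply(10_080));
}
```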

Contributor:

What happens when the checks fail? Process is restarted? Different peers? What's the peer selection process?

Contributor (author):

There are a bunch of implementation details that I have intentionally left open, because I haven't fully decided on the scope of the RFC yet. Up to now I have limited it to mostly the p2p messages and how they should be verified, which is something we can comfortably get into the next hard fork release. But I could extend it to fully describe the new sync process, including how intermediary state should be tracked, how we can deal with malicious peers and what the plan would be around running the parallel sync side by side with the old sync at first, before we completely remove the old sync.

@antiochp (Member):

We should think through how rangeproofs will be handled here.

We maintain separate MMRs for outputs and rangeproofs.
The output MMR actually stores OutputIdentifier entries (which are outputs without their rangeproofs).
I think we should consider requesting chunks of rangeproofs separately from chunks of outputs.

So three different requests -

  • chunked outputs
  • chunked rangeproofs
  • chunked kernels

Each is used to incrementally build a full MMR.
These can all be requested in parallel.
Each MMR root can be verified separately.

Additional UTXO verification to ensure the output MMR and rangeproof MMR can be used to reconstruct the full valid UTXO set.

For every unspent output in the utxo bitmap there must exist -

  • output in the output MMR
  • corresponding rangeproof in the rangeproof MMR

Then rangeproofs are batch verified, given the associated output commitments.

Chunk size of 1024 may potentially be too large for rangeproofs?
So maybe these should be requested with a smaller chunk size?
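For illustration, a rough sketch of the cross-check described above, once the three chunk streams have arrived; all types and field names are placeholders, not the node's actual API:

```rust
use std::collections::HashSet;

/// Output identifier (commitment + features) at a given MMR leaf position.
struct OutputIdentifier { pos: u64 /* , commitment, features, ... */ }
/// Rangeproof stored at the same leaf position in the rangeproof MMR.
struct RangeProof { pos: u64 /* , proof bytes, ... */ }

/// For every position marked unspent in the UTXO bitmap there must be both an
/// output in the output MMR and a matching rangeproof in the rangeproof MMR.
fn cross_check_utxo_set(
    unspent_positions: &HashSet<u64>,
    outputs: &[OutputIdentifier],
    rangeproofs: &[RangeProof],
) -> bool {
    let out_pos: HashSet<u64> = outputs.iter().map(|o| o.pos).collect();
    let rp_pos: HashSet<u64> = rangeproofs.iter().map(|p| p.pos).collect();
    unspent_positions
        .iter()
        .all(|p| out_pos.contains(p) && rp_pos.contains(p))
    // After this check, the (output, rangeproof) pairs can be batch verified.
}
```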

@antiochp (Member):

Also an additional edge case that we need to consider.

A chunk with a full 1024 leaves may be entirely pruned. In this situation the root itself will also be pruned.

We would need to use a parent higher in the subtree beneath the peak (potentially ending up at the peak itself).
It's not clear to me whether this is sufficient to prove the existence of the pruned subtree within this larger subtree.
But there must be some way of proving that a full 1024-leaf subtree was in there but fully pruned.

Maybe in this case we cannot prove anything about the 1024 subtree and we need to aggregate it with the next one somehow?
Two adjacent 1024-leaf subtrees can be aggregated into a single 2048-leaf subtree at height+1, so maybe chunks can be variable-sized based on how pruned they are?

Eventually we aggregate up to a subtree with either -

  • at least a single unpruned leaf, or
  • a fully pruned subtree resulting in a single MMR peak

This is all just thinking out loud, but this is kind of a nasty edge case.
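One possible shape of the variable-sized chunk idea, purely as a thinking-out-loud sketch; `is_fully_pruned` and the starting height are hypothetical:

```rust
/// Grow a chunk from the default 1024-leaf subtree (height 10), doubling it
/// while the whole subtree is fully pruned, until it either contains an
/// unpruned leaf or reaches the MMR peak.
fn chunk_height(
    first_leaf: u64,
    peak_height: u8,
    is_fully_pruned: impl Fn(u64, u8) -> bool,
) -> u8 {
    let mut height = 10u8;
    while height < peak_height && is_fully_pruned(first_leaf, height) {
        height += 1; // 1024 -> 2048 -> 4096 ... leaves
    }
    height
}
```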

@jaspervdm (author):

> We should think through how rangeproofs will be handled here. [..]
> Chunk size of 1024 may potentially be too large for rangeproofs?
> So maybe these should be requested with a smaller chunk size?

Yes, this is something I've been thinking about as well. Since we're keeping an MMR for them, at the very least we need to add a merkle proof for them as well. Adding a separate p2p message might make sense, but this does complicate the verification process of these messages a bit, since we can only verify them after receiving the output commitments and bitmap from the other messages. Unless we make sure to only request the rangeproof chunks after we have received the corresponding output chunk.
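A minimal sketch of that ordering, with hypothetical message types: output chunks are requested up front in parallel, and a rangeproof chunk is only requested once the matching output chunk has arrived and been verified.

```rust
#[derive(Clone, Copy)]
struct ChunkId(u64);

/// Hypothetical outgoing p2p requests.
enum Request {
    OutputChunk(ChunkId),
    RangeProofChunk(ChunkId),
}

/// Kick off sync: output chunks can be fetched from many peers at once.
fn start_sync(chunk_ids: &[ChunkId], outbox: &mut Vec<Request>) {
    for &id in chunk_ids {
        outbox.push(Request::OutputChunk(id));
    }
}

/// Once an output chunk verifies against the header roots, ask for the
/// rangeproofs covering the same leaf positions.
fn on_output_chunk_verified(id: ChunkId, outbox: &mut Vec<Request>) {
    outbox.push(Request::RangeProofChunk(id));
}
```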

> A chunk with a full 1024 leaves may be entirely pruned. In this situation the root itself will also be pruned.
> We would need to use a parent higher in the subtree beneath the peak (potentially ending up at the peak itself).
> It's not clear to me if this is sufficient to prove existence of subtree in this larger subtree?

Good catch, I hadn't thought about this edge case yet. I have to consider it a bit longer, but if we have multiple fully pruned chunks next to each other, we could have pruned the hashes up to an arbitrary height above 10 (the height of the root of a full chunk subtree). A "proof" for such a chunk would give the hashes from the first unpruned nodes up to the peak of the tree. A node could lie about the number of levels that are pruned and give a false proof, which would prevent us from reconstructing the PMMR with the necessary intermediary hashes. But this is something we can only verify after receiving all the relevant chunks. An easy way out would be to not prune above height 10, but that feels like a hack (and it may already be impossible if we have fully pruned chunks, I didn't check). The proper way of treating this is probably to have an additional verification step at the end that walks up from the fully pruned chunks and compares the proofs, filling in any MMR entries that are missing.

@antiochp (Member) commented Apr 27, 2020

Related to IBD is the case where a node has been offline for more than 7 days and needs to sync to catch up.
Might be worth considering this, at least as a high-level overview, in the RFC?

We currently do this in a very sub-optimal way, by just downloading a new full txhashset.

With this PIBD work we can be significantly smarter than this -

  • download missing recent kernels (we already have most of them)
  • download the new utxo set
    • recent chunks of MMR
    • maybe the full utxo bitmap? (so we know what previous outputs have since been spent?)

The node would not need to re-verify existing kernels and would not need to re-verify any pre-existing unspent utxo rangeproofs.

Catching up with a couple of weeks of missing data like this will be relatively compact and likely pretty fast.
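Sketched as a plan, with all step names hypothetical, the catch-up flow above might look roughly like this:

```rust
/// Minimal view of a previously synced node that went offline.
struct NodeState {
    synced_height: u64,
}

/// Hypothetical catch-up steps; compact compared to a full txhashset download.
enum CatchUpStep {
    /// Kernels appended since the height we last synced to.
    FetchKernelsSince(u64),
    /// Fresh unspent-output bitmap, so we learn which outputs we already
    /// have were spent while we were offline.
    FetchUtxoBitmap,
    /// Only the output/rangeproof MMR chunks added since we went offline.
    FetchRecentChunksSince(u64),
}

fn plan_catch_up(state: &NodeState) -> Vec<CatchUpStep> {
    vec![
        CatchUpStep::FetchKernelsSince(state.synced_height),
        CatchUpStep::FetchUtxoBitmap,
        CatchUpStep::FetchRecentChunksSince(state.synced_height),
    ]
}
```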

@lehnberg added the "node dev" label on Apr 27, 2020
@jaspervdm (author):

I had only briefly considered the case where the node has synced in the past but is offline for longer than the horizon, and assumed a bit naively that it could simply download the missing chunks. But you are right, it needs the updated bitmap as well. This actually makes me reconsider the design of the output chunk message: should we perhaps split off the bitmap into a separate message? It is needed in all sync situations, and we would save the effort/bandwidth of including a merkle proof for it in every chunk message.
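For what splitting the bitmap off might look like on the wire; message names and fields below are hypothetical:

```rust
/// Placeholder merkle proof against a root committed to in a block header.
struct MerkleProof {
    hashes: Vec<[u8; 32]>,
}

/// One 1024-output segment of the unspent bitmap, proven against the bitmap
/// MMR root. Needed in every sync scenario, so it carries its own proof once.
struct BitmapSegmentMsg {
    segment_index: u64,
    bitmap: [u8; 128], // 1024 bits
    proof: MerkleProof,
}

/// The output chunk message no longer needs to embed bitmap data or its proof.
struct OutputChunkMsg {
    chunk_index: u64,
    outputs: Vec<Vec<u8>>, // serialized output identifiers (placeholder)
    proof: MerkleProof,
}
```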

@phyro (Member) commented May 1, 2020

This could be a non-issue, but I have a question about how this mixes with the #47 proposal. Since NRD kernels are a kernel type that is not completely isolated, they need to be validated relative to some moving window of maximum relative lock distance. If I'm understanding this correctly, it means that a kernel chunk that contains an NRD kernel can't be fully validated on its own. Could this be an issue, or is it easily solvable?

@lehnberg assigned jaspervdm and j01tz and unassigned jaspervdm on May 5, 2020
@tromp (Contributor) commented Jul 6, 2020

We should make sure that the IBD size remains linear in the UTXO set size rather than in the TXO set size.
That will require switching the representation of (segments of) the spent bitmap from a bitmap to a list of unspent indices when the bitmap becomes sufficiently sparse.
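A sketch of that switch for a single 1024-output segment; the break-even point and the encoding are illustrative only:

```rust
/// Two ways to encode which of a segment's 1024 outputs are unspent.
enum SegmentEncoding {
    /// Raw bitmap: always 128 bytes.
    Bitmap([u8; 128]),
    /// Positions of unspent outputs within the segment: 2 bytes each.
    UnspentIndices(Vec<u16>),
}

/// Pick whichever representation is smaller; with fewer than 64 unspent
/// outputs (64 * 2 = 128 bytes) the index list wins.
fn encode_segment(unspent: &[u16]) -> SegmentEncoding {
    if unspent.len() < 64 {
        SegmentEncoding::UnspentIndices(unspent.to_vec())
    } else {
        let mut bits = [0u8; 128];
        for &pos in unspent {
            bits[(pos / 8) as usize] |= 1u8 << (pos % 8);
        }
        SegmentEncoding::Bitmap(bits)
    }
}
```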

@lehnberg (Contributor):

Closed in favour of a smaller RFC: #68

@lehnberg closed this on Oct 14, 2020