
feat:networking: (Synchronous) Consistent Broadcast for Filecoin EC #9858

Merged: 29 commits into filecoin-project:master on Mar 30, 2023

Conversation

@adlrocha (Contributor) commented on Dec 13, 2022

Related Issues

Implementation of filecoin-project/FIPs#527

Proposed Changes

This PR implements a synchronous consistent broadcast to detect potential consensus equivocations.

  • It implements a consistent broadcast that delays the delivery of blocks to the syncer in order to detect when a set of equivocated blocks is being pushed for a certain height.
  • The cache and this mechanism are triggered when a new block is received through the network (pubsub subscription). A rough sketch of the idea follows.
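
For illustration, here is a minimal sketch of the idea (this is not the actual chain/sub/bcast code: the key derivation, field set, and error handling are simplified assumptions; only the RcvBlock / WaitForDelivery names match the implementation discussed in the review below):

package bcast

import (
	"context"
	"fmt"
	"sync"
	"time"

	"github.com/ipfs/go-cid"
)

// blksInfo groups the block CIDs seen for the same equivocation key
// (conceptually: the same height and VRFProof).
type blksInfo struct {
	blks []cid.Cid
}

// ConsistentBCast holds blocks back for a configurable delay so that
// equivocated blocks for the same height can be detected before the
// syncer ever sees them.
type ConsistentBCast struct {
	lk    sync.Mutex
	delay time.Duration
	m     map[string]*blksInfo
}

func NewConsistentBCast(delay time.Duration) *ConsistentBCast {
	return &ConsistentBCast{delay: delay, m: make(map[string]*blksInfo)}
}

// RcvBlock records a block received from the network under its equivocation key.
func (cb *ConsistentBCast) RcvBlock(key string, c cid.Cid) {
	cb.lk.Lock()
	defer cb.lk.Unlock()
	bi, ok := cb.m[key]
	if !ok {
		bi = &blksInfo{}
		cb.m[key] = bi
	}
	bi.blks = append(bi.blks, c)
}

// WaitForDelivery waits for the configured delay and then fails if any
// conflicting block was observed for the same key in the meantime.
func (cb *ConsistentBCast) WaitForDelivery(ctx context.Context, key string, c cid.Cid) error {
	select {
	case <-time.After(cb.delay):
	case <-ctx.Done():
		return ctx.Err()
	}
	cb.lk.Lock()
	defer cb.lk.Unlock()
	bi, ok := cb.m[key]
	if !ok {
		return fmt.Errorf("no broadcast state recorded for key %s", key)
	}
	for _, other := range bi.blks {
		if other != c {
			return fmt.Errorf("equivocation detected for key %s", key)
		}
	}
	return nil
}

A block received from pubsub is first recorded with RcvBlock and is only handed to the syncer if WaitForDelivery returns without error after the delay.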

Checklist

Before you mark the PR ready for review, please make sure that:

  • Commits have a clear commit message.
  • PR title is in the form of <PR type>: <area>: <change being made>
    • example: fix: mempool: Introduce a cache for valid signatures
    • PR type: fix, feat, build, chore, ci, docs, perf, refactor, revert, style, test
    • area, e.g. api, chain, state, market, mempool, multisig, networking, paych, proving, sealing, wallet, deps
  • New features have usage guidelines and / or documentation updates.
  • Tests exist for new functionality or change in behavior
  • CI is green

@jennijuju (Member) commented:

Oh whoa I didn’t even notice this until now! Thank you Alfonso for getting this in! @filecoin-project/lotus-tse can one of you throw this branch on your miner and confirm win posts stability? (AFTER WINTER BREAK)

chain/sub/incoming.go (resolved review thread, outdated)
Comment on lines 44 to 48
// NOTE: If we want the payment channel itests to pass that use a
// block time of 5*millisecond, we need to set the consistent broadcast
// delay to something lower than that block time.
// todo: configure such a low delay only for paych tests.
build.CBDeliveryDelay = 2 * time.Millisecond
Contributor:

Can we fully disable this for most tests? wdpost tests for example use 2ms block time, and it would be good to not slow them down even more.

Contributor Author (@adlrocha):

I completely disabled consistent broadcast for itests. That being said, it would be great if we could at least enable it in a smoke, consensus, or deals itest with a decent block time (if there is any), just as a sanity check that this doesn't break anything in the future.

@magik6k (Contributor) left a comment:

(meant to comment, not approve)

@magik6k magik6k self-requested a review January 4, 2023 14:09
@jennijuju jennijuju added this to the v1.21.0 milestone Feb 23, 2023
@arajasek (Contributor) left a comment:

Broadly LGTM, a few questions / suggestions.

Resolved review threads: chain/sync_test.go, chain/sub/bcast/consistent.go (×3, outdated), chain/sub/incoming.go (outdated)
// blksInfo groups the block CIDs received for a given equivocation key.
type blksInfo struct {
	ctx    context.Context
	cancel context.CancelFunc
	blks   []cid.Cid
}
Contributor:

I'm unsure about the thread-safety of this blks slice. We seem to be reading & modifying it somewhat unsafely, and while it's probably the case that the system won't ever race here, I'm...not sure.

Can we reason through this a bit?

Contributor Author (@adlrocha):

I am using a sync.Map to prevent parallel reads/writes and to avoid requiring a dedicated lock for every blksInfo, which would be harder to manage. I treat blksInfo as a single unit of data and never read the inner blocks slice directly; hence the Load/Store methods.

That being said, I may have missed some subtle detail. Let me know if this makes sense.
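
Roughly the pattern being described, as a sketch (reusing the blksInfo type and cid import from the sketch in the PR description; names are illustrative, not the actual code):

var dict sync.Map // key (string) -> *blksInfo

// rcvBlockFineGrained treats each blksInfo as an immutable unit: it copies the
// existing value, appends to the copy, and replaces it wholesale, so the only
// structure touched concurrently is the sync.Map itself.
func rcvBlockFineGrained(key string, c cid.Cid) {
	v, found := dict.Load(key)
	bi := &blksInfo{}
	if found {
		bi.blks = append(bi.blks, v.(*blksInfo).blks...)
	}
	bi.blks = append(bi.blks, c)
	dict.Store(key, bi) // note: this Load/Store pair is not atomic
}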

Contributor:

So my concern is that we load the blksInfo out of the sync.Map, but then read and modify the blks field. I think if we have 2 calls to RcvBlock, they both load the same bInfo object (safely). After that, however, they're gonna be concurrently accessing the blks slice, which I don't think is safe (eg. iterating over the blks field, while it's also being modified).

I'm not 100% sure that I'm right, and even if so, I'm not sure whether simultaneous calls to RcvBlock are possible (need to 👀 more), but does that make sense?

Contributor:

Actually, looking at the code closer, we don't actually ever modify it. So we're probably okay here.

Contributor Author (@adlrocha):

I think we are OK on that, but you are right that there may be a race here: one goroutine may load the dict while another loads it in parallel, before the first has been able to store any of its changes.

I think this is fixed by moving this lock to after the load statement in line 125. WDYT?

Contributor:

@adlrocha I don't think that's sufficient. I think the lock needs to be held until no more writing happens, which is the end of the method (as currently written). Consider the following situation (sketched in code after the list):

  • 2 blocks A and B come in with the same VRFProof.
  • A gets the lock first, creates an empty BCastDict (line 119), doesn't find a match for the common key on line 125, drops the lock
  • B gets the lock, loads the empty BCastDict created by A, also doesn't find a match for the common key on line 125, drops the lock
  • A stores blkACid as the only entry corresponding to key on line 141, returns
  • B overwrites that, storing blkBCid as the only entry corresponding to key on line 141, returns
  • Both A and B succeed their respective calls to WaitForDelivery
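
To make the interleaving concrete, a hedged sketch of the racy shape being described (illustrative names, reusing the sync.Map dict and blksInfo from the sketches above; not the actual code):

var lk sync.Mutex

func rcvBlockRacy(key string, c cid.Cid) {
	lk.Lock()
	v, found := dict.Load(key) // A and B can both take the "not found" path here
	lk.Unlock()                // <-- lock released before the Store below

	bi := &blksInfo{}
	if found {
		bi.blks = append(bi.blks, v.(*blksInfo).blks...)
	}
	bi.blks = append(bi.blks, c)
	dict.Store(key, bi) // A stores {blkACid}; B then stores {blkBCid}, clobbering A's entry
}

Holding the lock from before the Load until after the Store (i.e. to the end of the method) removes the window in which both writers observe an empty entry, so the second block gets appended next to the first and the later WaitForDelivery check can see the conflict.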

Resolved review threads: chain/sub/bcast/consistent.go (×2, outdated)
Comment on lines 131 to 133
// CBDeliveryDelay is the delay before delivery in the synchronous consistent broadcast.
// This determines the wait time for the detection of potential equivocations.
var CBDeliveryDelay = 6 * time.Second
Contributor:

  • Does this impact block propagation? I guess no since pubsub checks don't touch this logic.
  • Can we set this to a lower value while still getting the expected security benefits?
    • My concern is that if we merge this now, without fixes for market cron, we'll push quite deep into danger territory on how long it takes to get through an epoch (propagation delay + CB delay + state compute, which can be 14s today, + winning PoSt).

Contributor Author (@adlrocha):

Does this impact block propagation? I guess no since pubsub checks don't touch this logic.

It shouldn't, right? The only thing we are delaying is the delivery of the block to the syncer (i.e. the call to InformNewBlock), but this is done after the Gossipsub validation so the propagation of the block should follow its course seamlessly.

Can we set this to a lower value while still getting the expected security benefits?

This is a great point, and this could definitely be a problem. The 6 seconds is the theoretical time it takes for a block to be propagated to the whole network by Gossipsub. The assumption is that after this time there is a high probability that any equivocation has propagated enough to be caught by a majority of miners. Lowering this value will reduce the effectiveness of the scheme, but I don't know to what extent or what the minimum threshold is. I know @jsoares and @guy.goren were working on getting some real data from mainnet with ProbeLab to see empirically what the optimal delay is. I'll let them chime in in case they have more info.
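
A sketch of the flow being described (assumed structure, reusing the ConsistentBCast sketch from the PR description; not the actual chain/sub/incoming.go code):

// handleIncomingBlock runs after the pubsub validator has already returned
// Accept, so gossipsub keeps forwarding the block to peers regardless of what
// happens below; only the hand-off to the syncer is delayed.
func handleIncomingBlock(ctx context.Context, cb *ConsistentBCast, key string, blkCid cid.Cid, informSyncer func(cid.Cid)) {
	cb.RcvBlock(key, blkCid)
	if err := cb.WaitForDelivery(ctx, key, blkCid); err != nil {
		// An equivocated block was seen for this key (or the context was
		// cancelled): do not deliver the block to the syncer.
		return
	}
	informSyncer(blkCid) // i.e. the InformNewBlock call, delayed by CBDeliveryDelay
}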

Member:

Does this impact block propagation? I guess no since pubsub checks don't touch this logic.

As @adlrocha said, propagation shouldn't be affected. Anything else would negate the benefits of the patch: we want the guard time to be larger than the propagation time, which is easier the faster propagation is.

Can we set this to a lower value while still getting the expected security benefits?

The value can be set lower. In fact, there is no "cliff" behaviour, and the security gains can be approximated (if not formally, at least intuitively) as a logistic function centred on the average propagation delay: while you don't necessarily need to cover the 99th percentile, thresholds that are too low would catch few attacks, and you'd always want to at least cover the average case.
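
(For intuition only, in my own notation rather than anything from the FIP: the chance of catching an equivocation with guard time d can be pictured as roughly 1 / (1 + exp(-k·(d - d_avg))), where d_avg is the average propagation delay and k reflects how spread out propagation times are. The exact shape and constants are assumptions, not something fitted to data.)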

Alas, our only real-world data was collected from the bootstrapper nodes (via Sentinel). Our analysis of that data shows that the vast majority of blocks reach the bootstrappers in < 1s. However, there are a few weaknesses here, primarily because (i) we're trusting source timestamps, and (ii) bootstrapper nodes are not necessarily representative of the average node in terms of connectivity (both link & centrality, even mesh configuration).

In the meantime, ProbeLab is working on a network-wide GossipSub study that will yield a higher-confidence distribution for block propagation. Alas, that work is taking longer than we expected and there is still no time frame for the results. The plan was, therefore, to start with the 6s used as the assumption in the GossipSub security analysis, keep it user-configurable, and push better defaults once we have the data. I expect we'll land in the 1 to 2 second range, but don't hold me to that.

Contributor:

Note that here I'm more concerned about a block arriving late making SPs be late with their own block.

Propagation delay is 10s, and state compute today takes up to 14s due to cron issues. With this patch, in the worst case where some set of SPs choose to send blocks late, we'd have essentially no spare time for WindowPoSt and would essentially be cutting into the propagation delay for the next epoch - which, afaict, would be comparable to or worse than the epoch boundary case.
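
(For illustration, using the numbers above and Filecoin's 30-second epochs: a 10s propagation cutoff + 6s CBDeliveryDelay + up to 14s of state compute already adds up to 30s, i.e. essentially zero headroom left in the epoch for proving and block creation in that worst case.)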

Member:

Right, that's a valid concern given we're right at the threshold. How high would you be comfortable going? Given the context above, I think we could conceivably adjust down a second or two to leave some buffer time -- and still reassess when the study results are available.

@arajasek (Contributor) commented:

So I think I share @magik6k's concern here -- this is unfortunate, but the good news is that we're expecting things to get better fairly soon.

What would we think about shipping this with an artificially low value (2 seconds?) and running with that until nv19? That gives time for any CRITICAL issues with the code to emerge (not that I expect any). Upon nv19, we can set this to the desired 6 seconds.

We also probably want to lower the default propagation delay when this is set to 6 seconds, I think?

@jsoares (Member) commented on Mar 17, 2023:

So I think I share @magik6k's concern here -- this is unfortunate, but the good news is that we're expecting things to get better fairly soon.

What would we think about shipping this with an artificially low value (2 seconds?) and running with that until nv19? That gives time for any CRITICAL issues with the code to emerge (not that I expect any). Upon nv19, we can set this to the desired 6 seconds.

We also probably want to lower the default propagation delay when this is set to 6 seconds, I think?

(note: I'm responding to this but I'm not the author of the FIP...)

That seems reasonable to me. I'm not sure it will be that far off the final numbers.

As for adjusting the cutoff, I think the propagation study will definitely provide good data to inform this. That said:

  1. The two things are somewhat independent, as long as they fit within the time available. PropagationDelaySecs sets the cutoff for accepting blocks, whereas CBDeliveryDelay defines how long we'll wait to potentially reject a block received by the cutoff. It does not extend the acceptance window (see the sketch after this list). (Also, both of these things have pretty bad names, as they describe bounds rather than delays.)
  2. Setting the cutoff too low can have a noticeable negative impact, whereas setting the guard time too low will just decrease its effectiveness. So we probably want to be more conservative with respect to the former.
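
For illustration only (this is not the lotus config surface; the two parameter names are just the ones used in this thread), the distinction in point 1 in code:

package timing

import "time"

// acceptedByCutoff: a block must arrive within the PropagationDelaySecs cutoff,
// measured from the epoch start, to be considered at all.
func acceptedByCutoff(received, epochStart time.Time, propagationDelay time.Duration) bool {
	return received.Sub(epochStart) <= propagationDelay
}

// deliveredToSyncerAt: an already-accepted block is held back for CBDeliveryDelay
// before the syncer sees it; this postpones delivery, not acceptance.
func deliveredToSyncerAt(received time.Time, cbDeliveryDelay time.Duration) time.Time {
	return received.Add(cbDeliveryDelay)
}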

@magik6k (Contributor) commented on Mar 20, 2023:

It does not extend the acceptance window

That's correct, but in the current implementation it extends the latest time at which block validation begins. Ideally we'd start validating those blocks "optimistically" and somehow retro-reject them if we see another within CBDeliveryDelay.

@jsoares (Member) commented on Mar 20, 2023:

Ideally we'd start validating those blocks "optimistically", and somehow retro-reject them if we see another within CBDeliveryDelay

Indeed, and that's suggested in the FIP as a potential optimisation. It's unclear to me how much of a gain that represents in practice, and it may add a bit of complexity, but it's a possible direction if we are indeed time-constrained. My expectation is still that the final parameters will be very comfortable with respect to the available window.

@adlrocha (Contributor Author) commented:
@arajasek, all comments addressed. I changed the locking strategy to make it less fine-grained but easier to reason about (as discussed above). We could recover the fine-grained locking and make it more understandable by using https://github.com/gammazero/keymutex, but I didn't want to introduce any additional dependency.

(For context, we used this dependency extensively for storetheindex, and it is written and maintained by Andrew.)

chain/sub/bcast/consistent.go (resolved review thread, outdated)
@arajasek arajasek merged commit 8b2208f into filecoin-project:master Mar 30, 2023
Stebalien added a commit that referenced this pull request Apr 27, 2023
…ast"

This reverts commit 8b2208f, reversing
changes made to 2db6b12.

Unfortunately, we've found multiple issues in this code that could lead
to unbounded resource usage, block censorship, crashes, etc. Luckily,
this code isn't actually needed by the main Filecoin chain, which relies
on reporting consensus faults. So we can just try again later.
Stebalien added a commit that referenced this pull request Apr 27, 2023
…ast"

This reverts commit 8b2208f, reversing
changes made to 2db6b12.

Unfortunately, this is rather tricky code. We've found several issues so
far and, while we've fixed a few, there are outstanding issues that
would require complex fixes we don't have time to tackle right now.

Luckily, this code isn't actually needed by the main Filecoin chain
which relies on consensus fault reporting to handle equivocation. So we
can just try again later.
arajasek added a commit that referenced this pull request May 2, 2023