
Don't let state reconstruction starve pruning #3026

Open
michaelsproul opened this issue Feb 15, 2022 · 3 comments
Labels
database, enhancement (New feature or request)

Comments

@michaelsproul
Member

Description

Our Prater nodes running with `--reconstruct-historic-states` sometimes run out of disk space and die because of this caveat of state reconstruction:

> While reconstruction is running the node will temporarily pause migrating new data to the freezer database. This will lead to the database increasing in size temporarily (by a few GB per day) until state reconstruction completes.

Rather than requiring state reconstruction to run in one go, I think we should allow the background migrator to alternate between state reconstruction and pruning tasks. This will require a bit of a refactor of the `reconstruct_historic_states` function, perhaps passing in a maximum number of slots to reconstruct in one batch before returning. We might also have to track the reconstruction status in the `BackgroundMigrator`.
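For concreteness, here is a minimal sketch of what a batch-limited interface could look like. Every name and type below is invented for illustration and heavily simplified relative to the real store code; it only demonstrates the idea of reconstructing a bounded number of slots per call and reporting whether work remains.

    // Hypothetical sketch only: simplified stand-ins, not Lighthouse's real types.

    /// Outcome of a single bounded reconstruction pass.
    #[derive(Debug, PartialEq, Eq)]
    pub enum ReconstructionStatus {
        /// All historic states have been reconstructed.
        Complete,
        /// More restore points remain; the caller should schedule another batch.
        Incomplete,
    }

    /// Placeholder store holding only the fields this sketch needs.
    pub struct StoreSketch {
        /// Highest slot whose state has been reconstructed so far.
        pub state_lower_limit: u64,
        /// Slot at which reconstruction meets states stored since checkpoint sync.
        pub state_upper_limit: u64,
    }

    impl StoreSketch {
        /// Reconstruct at most `max_slots` slots (e.g. 8192, a multiple of every
        /// slots-per-restore-point), then return so pruning can run in between.
        pub fn reconstruct_historic_states(&mut self, max_slots: u64) -> ReconstructionStatus {
            let target = (self.state_lower_limit + max_slots).min(self.state_upper_limit);
            // The real implementation would replay blocks and write states here.
            self.state_lower_limit = target;
            if self.state_lower_limit >= self.state_upper_limit {
                ReconstructionStatus::Complete
            } else {
                ReconstructionStatus::Incomplete
            }
        }
    }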

@michaelsproul added the enhancement and database labels on Feb 15, 2022
@michaelsproul
Member Author

Thanks @tthebst for starting to look into this.

Here are a few pointers that might be useful:

  • The code for the state reconstruction lives in this function:
    pub fn reconstruct_historic_states(self: &Arc<Self>) -> Result<(), Error> {

    It will probably need to be modified to take an argument like the "maximum number of slots to reconstruct in one pass". For efficiency it should probably be a multiple of the slots-per-restore-point (maybe 8192, which is a multiple of every SPRP).
  • The code for driving the state reconstruction and the finalization migration lives here:
    pub fn process_reconstruction(&self) {
        if let Some(Notification::Reconstruction) =
            self.send_background_notification(Notification::Reconstruction)
        {
            Self::run_reconstruction(self.db.clone(), &self.log);
        }
    }

    One way to implement the batching for the async migrator would be to re-send the `Notification::Reconstruction` message at the end of `run_reconstruction` whenever the reconstruction function (`reconstruct_historic_states`) indicates that it has more work remaining (a rough sketch of this follows the list). This would allow the background thread to interleave finalization processing and reconstruction.

    For example, if the message queue starts with `[Reconstruction, Finalization]`, we would process one reconstruction batch, push `Reconstruction` to the back of the queue, then process the finalization. The queue would then be `[Reconstruction]` and we would keep running reconstruction batches until a new `Finalization` message arrived.

    To make this work we'd have to remove the preferential treatment for `Reconstruction` in the message de-queue logic here, and maybe reconsider the behaviour where we drain the whole channel on every iteration. That behaviour was added as an optimisation for the case where `Finalization` notifications are backed up and we only want to process the most recent one. I think it would be possible to keep that optimisation, but I haven't thought through the mechanics in depth.
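As a rough sketch of the re-send idea, using simplified stand-ins for the channel and message types rather than the real migrator code:

    use std::sync::mpsc;

    // Simplified stand-ins for the migrator's messages; not the real enum.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum Notification {
        Finalization,
        Reconstruction,
    }

    #[derive(Debug, PartialEq, Eq)]
    enum BatchResult {
        Complete,
        Incomplete,
    }

    /// Stand-in for running one bounded reconstruction batch.
    fn run_reconstruction_batch() -> BatchResult {
        BatchResult::Incomplete
    }

    /// Background thread body: process messages in arrival order, and re-queue
    /// `Reconstruction` whenever a batch reports more work remaining, so any
    /// pending `Finalization` (migration + pruning) gets processed in between.
    fn background_thread(tx: mpsc::Sender<Notification>, rx: mpsc::Receiver<Notification>) {
        while let Ok(notification) = rx.recv() {
            match notification {
                Notification::Finalization => {
                    // Run the finalization migration / pruning here.
                }
                Notification::Reconstruction => {
                    if run_reconstruction_batch() == BatchResult::Incomplete {
                        // Push reconstruction to the back of the queue instead of
                        // looping, so a queued Finalization message runs first.
                        let _ = tx.send(Notification::Reconstruction);
                    }
                }
            }
        }
    }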

There are likely many approaches that would work here, and this is just an idea. You're welcome to implement any design that you think would be appropriate 😊

@int88
Contributor

int88 commented May 12, 2023

@michaelsproul Has this issue been fixed? If not, I'd like to try 😃

@michaelsproul
Member Author

@int88 It was implemented by @tthebst, but he abandoned the impl because he was running into strange database corruption issues similar to #3433 and #3455. So far I've never managed to reproduce those issues.

The commit is here if you're interested: 481e792

That commit is adapted to work with tree-states (my mega-optimisation project that is currently broken), but we could backport it to unstable. I suspect the change doesn't meaningfully increase the chance of database corruption, as it was running fine on several tree-states nodes for several weeks.

bors bot pushed a commit that referenced this issue Oct 27, 2023
## Issue Addressed

Closes #4817.

## Proposed Changes

- Fill in the linear block roots array between 0 and the slot of the first block (e.g. slots 0 and 1 on Holesky).
- Backport the `--freezer`, `--skip` and `--limit` options for `lighthouse db inspect` from tree-states. This allows us to easily view the database corruption of 4817 using `lighthouse db inspect --network holesky --freezer --column bbr --output values --limit 2`.
- Backport the `iter_column_from` change and `MemoryStore` overhaul from tree-states. These are required to enable `lighthouse db inspect`.
- Rework `freezer_upper_limit` to allow state lookups for slots below the `state_lower_limit` (a rough sketch of the intended check follows this list). Currently, state lookups fail until state reconstruction completes entirely.
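As a rough illustration of the intended behaviour (invented names, not the actual store code): states reconstructed so far occupy slots up to `state_lower_limit`, states stored since checkpoint sync occupy slots from `state_upper_limit` upwards, and a lookup in either range should succeed rather than waiting for reconstruction to finish.

    // Hypothetical sketch only; the field and function names are invented.
    fn historic_state_available(slot: u64, state_lower_limit: u64, state_upper_limit: u64) -> bool {
        // Available if already reconstructed (below the lower limit) or stored
        // since checkpoint sync (at or above the upper limit).
        slot <= state_lower_limit || slot >= state_upper_limit
    }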

There is a new regression test for the main bug, but no test for the `freezer_upper_limit` fix because we don't currently support running state reconstruction partially (see #3026). This will be fixed once we merge `tree-states`! In lieu of an automated test, I've tested manually on a Holesky node while it was reconstructing.

## Additional Info

Users who backfilled Holesky to slot 0 (e.g. using `--reconstruct-historic-states`) need to either:

- Re-sync from genesis.
- Re-sync using checkpoint sync and the changes from this PR.

Due to the recency of the Holesky genesis, writing a custom pass to fix up broken databases (which would require its own thorough testing) was deemed unnecessary. The resulting need to re-sync is the primary reason this PR is marked `backwards-incompat`.

This will create a few conflicts with Deneb, which I've already resolved on `tree-states-deneb` and will be happy to backport to Deneb once this PR is merged to unstable.