Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix corrupted DB on networks where the first slot is skipped (Holesky) #4985

Merged
merged 3 commits into from
Dec 7, 2023

Conversation

jimmygchen
Copy link
Member

Issue Addressed

#4943 (co-authored with @michaelsproul)

Proposed Changes

There was a bug in earlier versions of Lighthouse where zero block roots are incorrectly stored before the first block slot. This happens on nodes that were checkpoint sync'd on a network that has skipped slot(s) before the first block. The additional consequence is that Lighthouse BN could return duplicate genesis blocks in BlocksByRange responses.

This bug was fixed in #4820, so it should not be an issue for nodes that are sync'd using v4.6.0 (next release) on Holesky. This PR contains migration script to heal the database on nodes that were checkpoint sync'd using 4.5.0, so they would no longer serve duplicate genesis blocks.

See @michaelsproul's comment here for more details.

@jimmygchen jimmygchen added bug Something isn't working ready-for-review The code is ready for review database labels Dec 6, 2023
@michaelsproul
Copy link
Member

Reproduced the bug by backfilling a Holesky node with v4.5.0:

$ lighthouse db inspect --network holesky --freezer --column bbr --skip 1 --limit 1 --output values
Dec 07 03:31:43.183 INFO Running database manager for holesky network
Dec 07 03:31:43.266 INFO Hot-Cold DB initialized                 split_state: 0xb1aee59b844da3b1b8e71556bc10782345250db35f436b7dcb6db118dc661218, split_slot: 501376
Dec 07 03:31:43.266 INFO Blob DB initialized                     oldest_blob_slot: None, path: "/home/michael/.lighthouse/holesky/beacon/blobs_db"
Successfully saved values to file: "bbr_0000000000000001.ssz"
Num keys: 1
Total: 4096 bytes
$ xxd -len 96 bbr_0000000000000001.ssz
00000000: ab09 edd9 380f 8451 c3ff 5c80 9821 174a  ....8..Q..\..!.J
00000010: 36dc e606 fea8 b5ea 35ea 9369 15db f889  6.......5..i....
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 1298 bea7 6ef3 7014 9e65 35e7 3617 fcfb  ....n.p..e5.6...
00000050: 2dcb bb29 601c f951 5ad9 2e55 7e52 acaa  -..)`..QZ..U~R..

There's that 0x0 block root!

After running this PR and healing block roots we get:

$ xxd -len 96 bbr_0000000000000001.ssz
00000000: ab09 edd9 380f 8451 c3ff 5c80 9821 174a  ....8..Q..\..!.J
00000010: 36dc e606 fea8 b5ea 35ea 9369 15db f889  6.......5..i....
00000020: ab09 edd9 380f 8451 c3ff 5c80 9821 174a  ....8..Q..\..!.J
00000030: 36dc e606 fea8 b5ea 35ea 9369 15db f889  6.......5..i....
00000040: 1298 bea7 6ef3 7014 9e65 35e7 3617 fcfb  ....n.p..e5.6...
00000050: 2dcb bb29 601c f951 5ad9 2e55 7e52 acaa  -..)`..QZ..U~R..

Fixed!

Copy link
Member

@michaelsproul michaelsproul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!!

)
.unwrap()
.map(|r| {
println!("{r:?}");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets delete this println before merging

@michaelsproul michaelsproul added waiting-on-author The reviewer has suggested changes and awaits thier implementation. v4.6.0 ETA Q1 2024 and removed ready-for-review The code is ready for review labels Dec 7, 2023
@michaelsproul michaelsproul merged commit 67e0569 into sigp:unstable Dec 7, 2023
18 of 25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working database v4.6.0 ETA Q1 2024 waiting-on-author The reviewer has suggested changes and awaits thier implementation.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants