During the recent rollbacks that prompted the release of version 1.5.4, we noticed that some nodes were trying to count votes for superblocks with an odd index, while other nodes were trying to count votes for superblocks with an even index. This shouldn't be possible with the current protocol, because if there is no superblock consensus for superblock N, the next votes will be for superblock N+2.
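As a rough illustration of why the parity should not change while consensus is missing, here is a minimal sketch. It is not the actual node code: `next_vote_index` and `index_after_failures` are hypothetical names, and the `N + 1` step after a successful consensus is an assumption that this issue does not state.

```rust
/// Hypothetical helper: index of the superblock that nodes vote on next.
/// `consensus_reached` says whether superblock `current` reached consensus.
pub fn next_vote_index(current: u32, consensus_reached: bool) -> u32 {
    if consensus_reached {
        // Assumption (not stated in this issue): after consensus, the
        // next vote targets the immediately following superblock.
        current + 1
    } else {
        // Stated in this issue: without consensus for N, vote for N + 2.
        current + 2
    }
}

/// Follows `failures` consecutive rounds without consensus, starting
/// from superblock index `start`, and returns the resulting index.
pub fn index_after_failures(start: u32, failures: u32) -> u32 {
    (0..failures).fold(start, |index, _| next_vote_index(index, false))
}
```

Under these assumptions, any number of consecutive rounds without consensus keeps `index % 2` unchanged, so two nodes that start from the same index should never disagree on parity.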
In PR #2295 I introduced a workaround that works by checking that the superblock index has the expected parity when counting the votes. While this fix may be good enough because in the worst case nodes will only stay forked for one superepoch, we should try to find the root cause of the issue.
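The idea behind the workaround can be sketched roughly as follows. This is only an illustration, not the code from PR #2295: `count_votes`, `VoteCountError`, the `expected_index` parameter, and the representation of votes as a slice of superblock indices are all hypothetical.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum VoteCountError {
    /// The superblock index does not have the parity the node expects.
    UnexpectedParity { index: u32, expected_parity: u32 },
}

/// Counts the votes cast for superblock `index`, but only if `index`
/// has the same parity as `expected_index`. Each entry in `votes` is
/// the superblock index that one vote refers to.
pub fn count_votes(
    index: u32,
    expected_index: u32,
    votes: &[u32],
) -> Result<usize, VoteCountError> {
    let expected_parity = expected_index % 2;
    if index % 2 != expected_parity {
        // Refuse to count votes for a superblock with the wrong parity.
        return Err(VoteCountError::UnexpectedParity {
            index,
            expected_parity,
        });
    }
    // Only votes that refer to this exact superblock index are counted.
    Ok(votes.iter().filter(|&&voted| voted == index).count())
}
```

For example, `count_votes(12, 10, &[12, 12, 11])` counts two votes, while `count_votes(11, 10, &[11])` returns an `UnexpectedParity` error instead of counting anything.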
I was able to accidentally reproduce this issue when running a testnet with only one node. Occasionally when starting the node, I would see the error log from PR #2295. This makes me think that the issue can be triggered by a lack of new blocks, perhaps because the node then goes to the synced state without going through the synchronization process.
I believe this is related to the synchronization code, which in theory already checks for this case, as seen by all the % 2 present in this function: witnet-rust/node/src/actors/chain_manager/handlers.rs, line 1827 in 2fa9a2f
But it was impossible for me to reproduce the bug in a local testnet, so I am not sure what the actual cause is.