server: Save consistency index and term to backend even when they decrease #13903

Merged: 1 commit merged into etcd-io:main on Apr 7, 2022

Conversation

serathius (Member)

The reason to store the CI and term in the backend was to make the db a fully independent snapshot; it was never meant to interfere with the apply logic. Skipping CI updates was introduced for the v2->v3 migration, where we wanted to prevent the CI from decreasing when replaying the WAL, in #5391. By mistake the skip was added to the apply flow during a refactor in #12855 (comment).

The consistency index and term should only be negotiated and used by raft to make decisions. Their values should only be driven by the raft state machine, and the backend should only be responsible for storing them.

Fixes #13514
cc @ptabor @ahrtr @spzala
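
For illustration, a minimal self-contained sketch of the behaviour this PR moves to (metaStore and saveConsistentIndex are hypothetical stand-ins, not the actual etcd backend API): the storage layer persists whatever index and term raft hands it, even if they are smaller than the previously stored values.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// metaStore stands in for the backend's Meta bucket in this sketch.
type metaStore map[string][]byte

// saveConsistentIndex mirrors the intended behaviour after this PR: whatever
// index and term raft applied are persisted, with no "only grow" guard in the
// storage layer.
func saveConsistentIndex(m metaStore, index, term uint64) {
	// Before this change, guards along the lines of
	//   if term < oldTerm { return }
	//   if index <= oldIndex { return }
	// silently skipped the write; that decision belongs to raft, not storage.
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, index)
	m["consistent_index"] = append([]byte(nil), buf...)

	binary.BigEndian.PutUint64(buf, term)
	m["term"] = append([]byte(nil), buf...)
}

func main() {
	m := metaStore{}
	saveConsistentIndex(m, 100, 7)
	saveConsistentIndex(m, 90, 7) // a decreasing index is still written
	fmt.Println(binary.BigEndian.Uint64(m["consistent_index"])) // prints 90
}
```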

@codecov-commenter

Codecov Report

Merging #13903 (12504df) into main (a5b9f72) will decrease coverage by 1.89%.
The diff coverage is 83.33%.

❗ Current head 12504df differs from pull request most recent head 1ea53d5. Consider uploading reports for the commit 1ea53d5 to get more accurate results

@@            Coverage Diff             @@
##             main   #13903      +/-   ##
==========================================
- Coverage   72.39%   70.49%   -1.90%     
==========================================
  Files         469      469              
  Lines       38402    38389      -13     
==========================================
- Hits        27802    27064     -738     
- Misses       8813     9492     +679     
- Partials     1787     1833      +46     
| Flag | Coverage Δ |
| --- | --- |
| all | 70.49% <83.33%> (-1.90%) ⬇️ |


| Impacted Files | Coverage Δ |
| --- | --- |
| etcdutl/etcdutl/backup_command.go | 9.94% <0.00%> (ø) |
| etcdutl/snapshot/v3_snapshot.go | 66.10% <100.00%> (ø) |
| server/etcdserver/cindex/cindex.go | 92.45% <100.00%> (ø) |
| server/storage/schema/cindex.go | 100.00% <100.00%> (+4.65%) ⬆️ |
| server/proxy/grpcproxy/election.go | 8.69% <0.00%> (-73.92%) ⬇️ |
| server/proxy/grpcproxy/lock.go | 33.33% <0.00%> (-66.67%) ⬇️ |
| ...ver/proxy/grpcproxy/adapter/lock_client_adapter.go | 33.33% <0.00%> (-66.67%) ⬇️ |
| ...proxy/grpcproxy/adapter/election_client_adapter.go | 6.89% <0.00%> (-62.07%) ⬇️ |
| server/etcdserver/api/v3lock/lock.go | 10.00% <0.00%> (-60.00%) ⬇️ |
| client/pkg/v3/testutil/pauseable_handler.go | 27.27% <0.00%> (-54.55%) ⬇️ |
| ... and 72 more | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a5b9f72...1ea53d5.

if term < oldTerm {
	return
}
if index > oldi {
Contributor:

Optional: I would consider being defensive. Something like this:

verify.Assert(func() {
	oldi, _ := UnsafeReadConsistentIndex(tx)
	if index <= oldi {
		lg.Fatalf("Attempt to apply transaction at <= index detected...")
	}
})
  • This shouldn't be implemented on Term (maybe in 3.6)
  • This should clearly 'fail'/'panic' instead of failing silently.
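
For illustration, a self-contained sketch of the kind of loud failure suggested above (assertIndexGrows and readConsistentIndex are hypothetical stand-ins, not the etcd verify API):

```go
package main

import "fmt"

// storedIndex stands in for the consistent index persisted in the backend.
var storedIndex uint64 = 100

func readConsistentIndex() uint64 { return storedIndex }

// assertIndexGrows fails loudly (panics) when an apply arrives at an index
// that does not exceed the stored one, instead of silently skipping the write.
func assertIndexGrows(index uint64) {
	if old := readConsistentIndex(); index <= old {
		panic(fmt.Sprintf("attempt to apply transaction at index %d <= stored index %d", index, old))
	}
}

func main() {
	assertIndexGrows(101) // ok: the index grew
	assertIndexGrows(99)  // panics: a decreasing index indicates an apply-flow bug
}
```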

Member Author:

makes sense, will prepare a PR.

serathius merged commit 3dce380 into etcd-io:main on Apr 7, 2022
ahrtr (Member) commented Apr 8, 2022

I am thinking this PR might not fix the issue, because etcd always skips a smaller consistent_index; see server.go#L1816.

So this PR will change nothing.
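
For context, the guard being referred to can be summarised roughly as follows (a simplified sketch, not the exact server.go code):

```go
package main

import "fmt"

// shouldApplyV3 is a simplified sketch of the apply-path guard referenced
// above: an entry is only applied to the v3 store when its index is larger
// than the consistent index already persisted in the backend.
func shouldApplyV3(entryIndex, consistentIndex uint64) bool {
	return entryIndex > consistentIndex
}

func main() {
	fmt.Println(shouldApplyV3(1817, 1816)) // true: new entry, apply it
	fmt.Println(shouldApplyV3(1815, 1816)) // false: already applied, skip it
}
```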

serathius (Member Author) commented Apr 8, 2022

> I am thinking this PR might not fix the issue, because etcd always skips a smaller consistent_index; see server.go#L1816.

This issue was caused by the term field, not the consistent_index. The removal of the consistent_index check was done in addition, as both checks were incorrect from a Raft point of view.

ahrtr (Member) commented Apr 8, 2022

> This issue was caused by the term field, not the consistent_index. The removal of the consistent_index check was done in addition, as both checks were incorrect from a Raft point of view.

Do you mean it's possible that the term in an applied entry might be smaller than the existing term in the Meta bucket, while the consistent_index is bigger than the existing one? Could you provide a detailed workflow of how an etcd cluster runs into this situation?


Successfully merging this pull request may close these issues.

Data inconsistency in etcd version 3.5.0 (3.5.x rollback> 3.4 upgrade-> 3.5) story