Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update ValidateConfigurationChange to nil check for v2store #15905

Closed
wants to merge 0 commits into from

Conversation

geetasg
Copy link

@geetasg geetasg commented May 16, 2023

@geetasg geetasg marked this pull request as draft May 16, 2023 00:02
@geetasg geetasg force-pushed the main branch 4 times, most recently from 5953ee8 to f41ed2e Compare May 23, 2023 23:45
@geetasg geetasg force-pushed the main branch 3 times, most recently from c31ae34 to 338f747 Compare June 1, 2023 04:44
@geetasg geetasg mentioned this pull request Jun 1, 2023
21 tasks
V2_DEPR_2_GONE = V2DeprecationEnum("gone")

//TODO geetasg does thie default need to change
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By implementing v2snapshots from v3 membership data, you basically implement the write-only-drop-data.
I would be for totally implementing this control flags as it gives false information on what is really supported.

Maybe we should improve documentation and explain:

  • etcd v3.4.X - supports both v2, and v3 API.
  • etcd v3.5.X - supports v3 api by default, v2 API needs to be separably enabled.
  • etcd v3.6.X - supports only v3 api, v2 API is not enabled and any v2 data will be purged.

Note; upgrading from v3.5 to v3.6 and then downgrading back to v3.5 will be supported, but v2 data will be lost. cc @ptabor @ahrtr

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sg. I will change the default to write-only-drop-data and update the docs when ready.

var removedMap map[types.ID]bool
membersMap, removedMap = c.be.MustReadMembersFromBackend()
if !shouldApplyV3 {
//TODO geetasg - RaftCluster purely based on backend cannot validate an entry older than ci
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by validate?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a test failure on validation which looked like was due to trying to validate using 'future' info. Ref:https://github.com/etcd-io/etcd/pull/15905/files#diff-164b95d19232df0c51cd511fe4ca61de738660fc11db6ddcdafa0ec5f5c80a35R1978

I will revisit to add a test for that code path once it is better understood.


/*
covered by TestV2DeprecationSnapshotMatches
// TestMembershipStore tests code path used by snapshot
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the plan with this test? Having both e2e and unit test is ok.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had added this test to simulate 3.5 style recovery. But now it is getting in the way of removing SetStore (and v2store from RaftCluster). I am investigating modification to the test or will remove it.

zap.String("path", MemberStoreKey(id)),
zap.Error(err),
)
if !strings.Contains(err.Error(), "Key not found") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What other errors we expect now?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so far only found this one ("key not found") that needs to be absorbed - because we start with empty store now. I will keep an eye out for any other failures that can be expected.

@@ -1104,6 +1097,14 @@ func verifySnapshotIndex(snapshot raftpb.Snapshot, cindex uint64) {
})
}

func verifyConsistentIndexIsLatest(snapshot raftpb.Snapshot, cindex uint64) {
verify.Verify(func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

@ahrtr
Copy link
Member

ahrtr commented Jun 2, 2023

Link to #12913

@geetasg
Copy link
Author

geetasg commented Jun 5, 2023

Notes - removing v2store from etcdserver is causing following test to fail. Debugging in progress.

Test from -
etcd/tests/common

To repro failure -
go test -v --tags e2e -run TestMemberList --race --cpu=4

Currently backed out the checkin to not use v2store and v2applier. It will be reintroduced once above failure is fixed. If anyone has any insight on why this test would fail due to v2store removal, please drop a note here.

This is manually reproducible. If we run 3.6 (with dropped v2 applier) along with one or more 3.5 members, the 3.6 members will not show the clienturl for 3.5 members. This seems to be because the 3.5 members are using v2 applier to get the client urls updated.

Code link -https://github.com/etcd-io/etcd/blob/release-3.5/server/etcdserver/server.go#L2041

Stack trace of the affected path from experiment -

d":"ca50e9357181d758","updated-remote-peer-id":"72e86eb5e356a737","updated-remote-client-urls":["http://localhost:20010"]}
goroutine 96 [running]:
runtime/debug.Stack()
        runtime/debug/stack.go:24 +0x65
runtime/debug.PrintStack()
        runtime/debug/stack.go:16 +0x19
go.etcd.io/etcd/server/v3/etcdserver/api/membership.(*RaftCluster).UpdateAttributes(0xc0006eeae0, 0x5b?, {{0xc00024e630, 0x28}, {0xc0005c2400, 0x1, 0x4}}, 0x1)
        go.etcd.io/etcd/server/v3/etcdserver/api/membership/cluster.go:455 +0x485
go.etcd.io/etcd/server/v3/etcdserver.(*applierV2store).Put(0xc0002431e0, 0xc000314240, 0x0?)
        go.etcd.io/etcd/server/v3/etcdserver/apply_v2.go:95 +0x6fb
go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).applyV2Request(0xc00038b800, 0xc000314240, 0x60?)
        go.etcd.io/etcd/server/v3/etcdserver/apply_v2.go:142 +0x1c7
go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).applyEntryNormal(0xc00038b800, 0xc00080f840)
        go.etcd.io/etcd/server/v3/etcdserver/server.go:1896 +0x59c
go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).apply(0xc00038b800, {0xc0002203f0, 0x2, 0xc68226?}, 0xc00080fa70?)
        go.etcd.io/etcd/server/v3/etcdserver/server.go:1818 +0x79a
go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).applyEntries(0xc00038b800, 0xc0001ea180, 0x0?)
        go.etcd.io/etcd/server/v3/etcdserver/server.go:1131 +0x265
go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).applyAll(0xc00038b800, 0xc0001ea180, 0xc0005d8210)
        go.etcd.io/etcd/server/v3/etcdserver/server.go:926 +0x65
go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).run.func8({0xc00080ffa0, 0xc000521320})
        go.etcd.io/etcd/server/v3/etcdserver/server.go:847 +0x25
go.etcd.io/etcd/pkg/v3/schedule.job.Do(...)
        go.etcd.io/etcd/pkg/v3@v3.6.0-alpha.0/schedule/schedule.go:41
go.etcd.io/etcd/pkg/v3/schedule.(*fifo).executeJob(0xc00080ff90?, {0x1244108?, 0xc00051e468?}, 0x0?)
        go.etcd.io/etcd/pkg/v3@v3.6.0-alpha.0/schedule/schedule.go:206 +0x88
go.etcd.io/etcd/pkg/v3/schedule.(*fifo).run(0xc000536380)
        go.etcd.io/etcd/pkg/v3@v3.6.0-alpha.0/schedule/schedule.go:187 +0x11a
created by go.etcd.io/etcd/pkg/v3/schedule.NewFIFOScheduler
        go.etcd.io/etcd/pkg/v3@v3.6.0-alpha.0/schedule/schedule.go:101 +0x18b
{"level":"info","ts":"2023-06-06T01:18:14.596782Z","caller":"membership/cluster.go:447","msg":"updated member","cluster-id":"df14232385b4161a","local-member-id":"ca50e9357181d758","updated-remote-peer-id":"ca50e9357181d758","updated-remote-client-urls":["http://localhost:20000"]}

@serathius looks like some form of v2 applier will need to be there to handle the publish req from 3.5. can you please comment? thx

@geetasg geetasg closed this Jun 23, 2023
@geetasg geetasg force-pushed the main branch 2 times, most recently from 1a25de7 to 3e725ac Compare June 23, 2023 17:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants