Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: add robustness test for issue 17780 #18099

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions tests/robustness/failpoint/gofail.go
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,15 @@ var (
BeforeApplyOneConfChangeSleep Failpoint = killAndGofailSleep{"beforeApplyOneConfChange", time.Second}
RaftBeforeSaveSleep Failpoint = gofailSleepAndDeactivate{"raftBeforeSave", time.Second}
RaftAfterSaveSleep Failpoint = gofailSleepAndDeactivate{"raftAfterSave", time.Second}

// AllowBatchCompactBeforeSetFinishedCompactPanic is used to trigger
// that compactBeforeSetFinishedCompact failpoint only if the current
// revision number is higher than that batch limit.
AllowBatchCompactBeforeSetFinishedCompactPanic Failpoint = goPanicFailpoint{
failpoint: "compactBeforeSetFinishedCompact",
trigger: triggerCompact{multiBatchCompaction: true},
target: AnyMember,
}
Comment on lines +62 to +70
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// AllowBatchCompactBeforeSetFinishedCompactPanic is used to trigger
// that compactBeforeSetFinishedCompact failpoint only if the current
// revision number is higher than that batch limit.
AllowBatchCompactBeforeSetFinishedCompactPanic Failpoint = goPanicFailpoint{
failpoint: "compactBeforeSetFinishedCompact",
trigger: triggerCompact{multiBatchCompaction: true},
target: AnyMember,
}
CompactBeforeSetFinishedCompactPanic Failpoint = goPanicFailpoint{"compactBeforeSetFinishedCompact", triggerCompact{multiBatchCompaction: true}, AnyMember}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @serathius , CompactBeforeSetFinishedCompactPanic is already existing at

CompactBeforeSetFinishedCompactPanic Failpoint = goPanicFailpoint{"compactBeforeSetFinishedCompact", triggerCompact{}, AnyMember}
Do you mean that I should replace existing one?

I add new one here because this issue has a requirement that the compactor has to delete tombstone in one batch and update UnsafeSetFinishedCompact value in next round. IMO, it's better to have two go-failpoints here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can have both, best to even duplicate all failpoints with triggerCompact to have option with and without multiBatchCompaction.

)

type goPanicFailpoint struct {
Expand Down
18 changes: 18 additions & 0 deletions tests/robustness/makefile.mk
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,11 @@ test-robustness-issue15271: /tmp/etcd-v3.5.7-failpoints/bin
GO_TEST_FLAGS='-v --run=TestRobustnessRegression/Issue15271 --count 100 --failfast --bin-dir=/tmp/etcd-v3.5.7-failpoints/bin' make test-robustness && \
echo "Failed to reproduce" || echo "Successful reproduction"

.PHONY: test-robustness-issue17780
test-robustness-issue17780: /tmp/etcd-3.5.6b034466aa0ac2b46fe01fb5bd2233946f46d453-failpoints/bin
GO_TEST_FLAGS='-v --run=TestRobustnessRegression/Issue17780 --count 100 --failfast --bin-dir=/tmp/etcd-3.5.6b034466aa0ac2b46fe01fb5bd2233946f46d453-failpoints/bin' make test-robustness && \
echo "Failed to reproduce" || echo "Successful reproduction"

# Failpoints

GOPATH = $(shell go env GOPATH)
Expand Down Expand Up @@ -84,6 +89,19 @@ $(GOPATH)/bin/gofail: tools/mod/go.mod tools/mod/go.sum
make gofail-enable; \
make build;

# REF: https://github.com/etcd-io/etcd/commit/6b034466aa0ac2b46fe01fb5bd2233946f46d453
/tmp/etcd-3.5.6b034466aa0ac2b46fe01fb5bd2233946f46d453-failpoints/bin: $(GOPATH)/bin/gofail
rm -rf /tmp/etcd-3.5.6b034466aa0ac2b46fe01fb5bd2233946f46d453-failpoints/
mkdir -p /tmp/etcd-3.5.6b034466aa0ac2b46fe01fb5bd2233946f46d453-failpoints/
cd /tmp/etcd-3.5.6b034466aa0ac2b46fe01fb5bd2233946f46d453-failpoints/; \
git clone --branch release-3.5 https://github.com/etcd-io/etcd.git .; \
git checkout 6b034466aa0ac2b46fe01fb5bd2233946f46d453; \
go get go.etcd.io/gofail@${GOFAIL_VERSION}; \
(cd server; go get go.etcd.io/gofail@${GOFAIL_VERSION}); \
(cd etcdctl; go get go.etcd.io/gofail@${GOFAIL_VERSION}); \
(cd etcdutl; go get go.etcd.io/gofail@${GOFAIL_VERSION}); \
FAILPOINTS=true ./build;

/tmp/etcd-v3.5.2-failpoints/bin:
/tmp/etcd-v3.5.4-failpoints/bin:
/tmp/etcd-v3.5.5-failpoints/bin:
Expand Down
12 changes: 12 additions & 0 deletions tests/robustness/scenarios.go
Original file line number Diff line number Diff line change
Expand Up @@ -199,5 +199,17 @@ func regressionScenarios(t *testing.T) []testScenario {
cluster: *e2e.NewConfig(opts...),
})
}
scenarios = append(scenarios, testScenario{
name: "Issue17780",
failpoint: failpoint.AllowBatchCompactBeforeSetFinishedCompactPanic,
profile: traffic.LowTraffic,
traffic: traffic.Issue17780EtcdPutDelete,
cluster: *e2e.NewConfig(
e2e.WithClusterSize(1),
e2e.WithCompactionBatchLimit(100),
e2e.WithSnapshotCount(1000),
e2e.WithGoFailEnabled(true),
),
})
return scenarios
}
12 changes: 12 additions & 0 deletions tests/robustness/traffic/etcd.go
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,18 @@ var (
{choice: Compact, weight: 5},
},
}
// Issue17780EtcdPutDelete is to create high chance to have more delete
// requests so that it's likely to compact that revision which is tombstone.
Issue17780EtcdPutDelete = etcdTraffic{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be inline if it is not used by any other tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean inline is about anonymous value during assignment?
etcdTraffic is unexported type and choiceWeight is alsog unexported type...
I think it's hard to make it inline if I understand it correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. How about just name it to EtcdPutDeleteHeavy?

Copy link
Member

@serathius serathius May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 merging it with other traffic and incorporating it with exploratory testing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @siyuanfoundation @serathius , I want to keep it with special name here because it's hard to reproduce it without special input or setting. If we put it into exploratory testing, existing case might have chance to reproduce it. But it's unknown for us. I was trying many combinations of traffic but it runs one day without luck. And then we introduce compact traffic, it also makes it hard to trigger issue 17780.

If you have better idea, please let me know. Thanks

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to avoid having a issue specific traffic, to prevent regressions in robustness we don't need 100% reproducability, about 2-5% is ok.

keyCount: 20,
leaseTTL: DefaultLeaseTTL,
largePutSize: 32769,
requests: []choiceWeight[etcdRequestType]{
{choice: List, weight: 50},
{choice: Delete, weight: 20},
{choice: Put, weight: 10},
},
}
)

type etcdTraffic struct {
Expand Down
Loading