Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: add robustness test for issue 17780 #18099

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

fuweid
Copy link
Member

@fuweid fuweid commented May 30, 2024

@k8s-ci-robot
Copy link

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@fuweid
Copy link
Member Author

fuweid commented May 30, 2024

ping @ahrtr @serathius @siyuanfoundation

@@ -61,6 +61,18 @@ var (
{choice: LargePut, weight: 5},
},
}
// Issue17780EtcdPutDelete is to create high chance to have more delete
// requests so that it's likely to compact that revision which is tombstone.
Issue17780EtcdPutDelete = etcdTraffic{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be inline if it is not used by any other tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean inline is about anonymous value during assignment?
etcdTraffic is unexported type and choiceWeight is alsog unexported type...
I think it's hard to make it inline if I understand it correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. How about just name it to EtcdPutDeleteHeavy?

Copy link
Member

@serathius serathius May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 merging it with other traffic and incorporating it with exploratory testing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @siyuanfoundation @serathius , I want to keep it with special name here because it's hard to reproduce it without special input or setting. If we put it into exploratory testing, existing case might have chance to reproduce it. But it's unknown for us. I was trying many combinations of traffic but it runs one day without luck. And then we introduce compact traffic, it also makes it hard to trigger issue 17780.

If you have better idea, please let me know. Thanks

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to avoid having a issue specific traffic, to prevent regressions in robustness we don't need 100% reproducability, about 2-5% is ok.

@serathius
Copy link
Member

I would like to better understand why this is not aligned with exploratory tests. The goal is to incorporate regressions into exploratory testing, not just create a new scenario.

Comment on lines +62 to +70

// AllowBatchCompactBeforeSetFinishedCompactPanic is used to trigger
// that compactBeforeSetFinishedCompact failpoint only if the current
// revision number is higher than that batch limit.
AllowBatchCompactBeforeSetFinishedCompactPanic Failpoint = goPanicFailpoint{
failpoint: "compactBeforeSetFinishedCompact",
trigger: triggerCompact{multiBatchCompaction: true},
target: AnyMember,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// AllowBatchCompactBeforeSetFinishedCompactPanic is used to trigger
// that compactBeforeSetFinishedCompact failpoint only if the current
// revision number is higher than that batch limit.
AllowBatchCompactBeforeSetFinishedCompactPanic Failpoint = goPanicFailpoint{
failpoint: "compactBeforeSetFinishedCompact",
trigger: triggerCompact{multiBatchCompaction: true},
target: AnyMember,
}
CompactBeforeSetFinishedCompactPanic Failpoint = goPanicFailpoint{"compactBeforeSetFinishedCompact", triggerCompact{multiBatchCompaction: true}, AnyMember}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @serathius , CompactBeforeSetFinishedCompactPanic is already existing at

CompactBeforeSetFinishedCompactPanic Failpoint = goPanicFailpoint{"compactBeforeSetFinishedCompact", triggerCompact{}, AnyMember}
Do you mean that I should replace existing one?

I add new one here because this issue has a requirement that the compactor has to delete tombstone in one batch and update UnsafeSetFinishedCompact value in next round. IMO, it's better to have two go-failpoints here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can have both, best to even duplicate all failpoints with triggerCompact to have option with and without multiBatchCompaction.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
@serathius
Copy link
Member

serathius commented Jun 15, 2024

See #17680 for an example of how to add new repro. I adjusted couple of parameters.

@k8s-ci-robot
Copy link

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot
Copy link

@fuweid: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-etcd-integration-2-cpu-amd64 69152bf link false /test pull-etcd-integration-2-cpu-amd64
pull-etcd-unit-test-arm64 69152bf link true /test pull-etcd-unit-test-arm64

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Revision decreasing after panic during compaction
4 participants