Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

etcdserver: make livez return ok when defrag is active. #16858

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

siyuanfoundation
Copy link
Contributor

server/etcdserver/api/etcdhttp/health.go Outdated Show resolved Hide resolved
server/etcdserver/api/etcdhttp/health.go Outdated Show resolved Hide resolved
server/etcdserver/api/etcdhttp/health.go Outdated Show resolved Hide resolved
server/storage/backend/backend.go Outdated Show resolved Hide resolved
tests/e2e/http_health_check_test.go Show resolved Hide resolved
server/etcdserver/api/etcdhttp/health.go Outdated Show resolved Hide resolved
Copy link
Member

@chaochn47 chaochn47 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @siyuanfoundation!

if srv.lg == nil {
srv.lg = zap.NewNop()
}
s.Backend().SubscribeDefragNotifier(healthNotifier)
return &authMaintenanceServer{srv, &AuthAdmin{s}}
}

func (ms *maintenanceServer) Defragment(ctx context.Context, sr *pb.DefragmentRequest) (*pb.DefragmentResponse, error) {
ms.lg.Info("starting defragment")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why move notification from here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original notifier is called in grpc server.

  1. It is strange to have a http endpoint depend on grpc server calls. The cluster could be started with grpcEnabled=false.
  2. There could be 2 grpc servers, 1 for insecure and 1 for secure. Which one should be used to determine the http endpoint state?
    Based on these reasons, I think it is better to move the notifiers to the backend.

Copy link
Member

@serathius serathius Nov 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but this is unrelated an change. Still it doesn't solve the issue that there might be multiple calls to Defrag, that will be blocked on somewhere in internals. This still leaves us open to concurrency issue. Let's just use the notifier as it is and do a followup issue to fix notifier.

Copy link
Contributor Author

@siyuanfoundation siyuanfoundation Nov 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh, it is very gnarly to pass notifiers between http and grpc, something like https://github.com/etcd-io/etcd/compare/main...siyuanfoundation:etcd:defrag2?expand=1.

How about I start a PR #16959 to move the notifiers to the backend first?

@ahrtr
Copy link
Member

ahrtr commented Nov 15, 2023

Please rebase this PR, I will take a look later.

Signed-off-by: Siyuan Zhang <sizhang@google.com>
@k8s-ci-robot
Copy link

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot
Copy link

@siyuanfoundation: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-etcd-verify a652d02 link true /test pull-etcd-verify
pull-etcd-unit-test-amd64 a652d02 link true /test pull-etcd-unit-test-amd64
pull-etcd-unit-test-arm64 a652d02 link true /test pull-etcd-unit-test-arm64

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

5 participants