
Implement kubernetes probes #542

Closed
serathius opened this issue Jun 15, 2020 · 11 comments

serathius (Contributor) commented Jun 15, 2020

/kind feature
The goal of this issue is to create proper definitions of health and readiness for metrics server and to implement Kubernetes probes based on them.

Background

Code in metrics server runs the two following processes (a rough sketch of both follows below):

  • Periodically scrapes all the nodes in the cluster for metrics, which it uses to fill an internal cache
  • Serves the Metrics API by joining the list of nodes/pods fetched from the apiserver with the metrics stored in the cache
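For illustration only, a rough sketch of how these two processes interact; all names are hypothetical and synchronization of the shared cache is omitted:

```go
package sketch

import (
	"context"
	"time"
)

// nodeMetrics and cache are placeholders for the real internal types.
type nodeMetrics struct{ CPUNano, MemoryBytes int64 }
type cache map[string]nodeMetrics // keyed by node name

// scrapeLoop periodically collects metrics from every known node and stores
// them in the cache; failures of individual nodes are tolerated.
func scrapeLoop(ctx context.Context, period time.Duration, listNodes func() []string,
	scrapeNode func(string) (nodeMetrics, error), c cache) {
	ticker := time.NewTicker(period)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			for _, node := range listNodes() {
				if m, err := scrapeNode(node); err == nil {
					c[node] = m
				}
			}
		}
	}
}

// serveNodeMetrics answers a Metrics API request by joining the node list
// fetched from the apiserver with whatever the cache currently holds.
func serveNodeMetrics(nodesFromAPIServer []string, c cache) map[string]nodeMetrics {
	out := make(map[string]nodeMetrics, len(nodesFromAPIServer))
	for _, name := range nodesFromAPIServer {
		if m, ok := c[name]; ok {
			out[name] = m
		}
	}
	return out
}
```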

Some related problems:

  • To provide zero Metrics API downtime during a rolling update, a new instance of MS needs to have its cache filled.
  • On big clusters it's common for some requests to fail, meaning we cannot depend on all requests succeeding.
  • Before a change was introduced to force a sync on the node informer (a k8s client API compatible cache), the first scrape ran with an empty list of nodes, making MS ready before any scrape had been done.

Things to consider:

  • What probes to use? Readiness, Liveness, Startup
  • What endpoints to use? healthz, readyz, livez

Proposal

Definitions:

  • Metrics Server is alive if it is able to respond to HTTP requests and a scrape job finished (success not required) within the last period
  • Metrics Server is ready if it is able to respond to HTTP requests, was able to access the apiserver, and the last scrape job succeeded for at least one node

Reasoning:

  • alive should mean that all processes are working. For MS this is the HTTP server working + scraping executing once per period (protects against deadlock or panic)
  • ready should mean that MS is ready to serve traffic. This means that we have data to serve (the last job succeeded for at least one node) and that our dependencies are available (without the apiserver we cannot serve requests). We cannot require all scrapes to succeed, as in big clusters there is typically at least one failing node. (Both checks are sketched below.)
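A minimal sketch of the two definitions as code, assuming hypothetical state that the scrape loop records after every collection cycle:

```go
package sketch

import "time"

// scrapeState is hypothetical state the scrape loop would record after every
// collection cycle.
type scrapeState struct {
	lastCycleFinished  time.Time // when the last cycle completed, successful or not
	lastCycleAnyNodeOK bool      // whether the last cycle got metrics from at least one node
}

// alive: a scrape cycle finished within the last period (no success required);
// being able to answer the probe's HTTP request at all is implicit.
func alive(s scrapeState, period time.Duration, now time.Time) bool {
	return now.Sub(s.lastCycleFinished) <= period
}

// ready: the apiserver is reachable and the last cycle collected metrics for
// at least one node, so there is data to serve.
func ready(s scrapeState, apiserverReachable bool) bool {
	return apiserverReachable && s.lastCycleAnyNodeOK
}
```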

Things to consider:

  • We could also introduce some threshold for the number of succeeded scrapes (e.g. 50%), but this really depends on the types of workload in the cluster (we could consider adding a metric for it and allowing users to monitor and set this threshold themselves).
  • (not related to probes) The scraping job depends on the freshness of the list of nodes, so we could consider monitoring the freshness of node data. I read some PRs to kubernetes proposing to sync controllers before reconciling. There were some performance problems, but it would be interesting to explore.

TODO

We should update/add the implementation of livez and readyz based on the new definitions. Checking whether the apiserver is available should be as light as possible (e.g. check whether a TCP connection can be opened), as sketched below. In addition, we should add/update the probes; the default configuration (no delay, every 10s with a failure threshold of 3) looks ok for now.
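A sketch of such a lightweight check (names and values are hypothetical); it is written to match the Name()/Check(*http.Request) error method set of the generic apiserver's healthz.HealthChecker interface:

```go
package sketch

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// apiserverReachable only verifies that a TCP connection to the apiserver can
// be opened, keeping the readiness dependency check as light as possible.
type apiserverReachable struct {
	addr    string        // e.g. "kubernetes.default.svc:443" (illustrative)
	timeout time.Duration // e.g. 2 * time.Second (illustrative)
}

// Name and Check follow the healthz.HealthChecker method set from
// k8s.io/apiserver/pkg/server/healthz.
func (c apiserverReachable) Name() string { return "apiserver-reachable" }

func (c apiserverReachable) Check(_ *http.Request) error {
	conn, err := net.DialTimeout("tcp", c.addr, c.timeout)
	if err != nil {
		return fmt.Errorf("cannot open TCP connection to apiserver at %s: %v", c.addr, err)
	}
	return conn.Close()
}
```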

Alternatives

No need to implement a Startup probe, as we don't fall under https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#when-should-you-use-a-startup-probe

/cc @s-urbaniak

k8s-ci-robot added the kind/feature label (Categorizes issue or PR as related to a new feature.) Jun 15, 2020
serathius added this to the v0.4.0 milestone Jun 15, 2020
serathius changed the title from "Define metrics server health and ready condition" to "Implement kubernetes probes" Jun 21, 2020
serathius (Contributor, Author)

/help

k8s-ci-robot (Contributor)

@serathius:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot added the help wanted label (Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.) Jun 21, 2020
bytetwin (Contributor)

I am interested in picking this up.

@serathius Is my understanding correct?

We need to implement

  1. Liveness (/healthz) endpoint - a 200 response when there was a successful scrape in the last 30 sec
  2. Readiness (/readyz) endpoint - a 200 response when MS has started up, made a successful scrape, and has retrieved metrics for at least one node.

Questions

  1. For liveness, is a scrape in the last 30 sec considered a success? Is this time limit configurable?
  2. For readiness, is it a success even if only one node was scraped? Is this also configurable?
  3. API server availability - if the api-server becomes unavailable long after MS has started, the liveness check will fail, as there is no successful scrape in the last 30 sec. K8s restarting MS won't solve the problem. How should we handle this?

serathius (Contributor, Author) commented Jun 22, 2020

/cc @logicalhan
To get some feedback from someone who worked on apiserver health endpoints

logicalhan (Contributor)

/cc @logicalhan
To get some feedback from someone who worked on apiserver health endpoints

Oh thanks :)

I am interested in picking this up.

@serathius Is my understanding correct?

We need to implement

  1. Liveness (/healthz) endpoint - a 200 response when there was a successful scrape in the last 30 sec

"/healthz" is actually deprecated in favor of "/livez" now.

  1. For liveness, is a scrape in the last 30 sec considered a success? Is this time limit configurable?

Yes, there are standard kubernetes configuration options. You can set initialDelaySeconds to specify how long you expect the boot sequence to take (i.e. probing won't start until this time elapses). You can set PeriodSeconds to configure how often to poll and FailureThreshold for how many failures in a row you require before the kubelet will restart the container.
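As an illustration (not the actual manifest from this repo), a liveness probe with those knobs could look like the following, using the k8s.io/api/core/v1 Go types. The embedded handler field is named ProbeHandler in recent API versions, and the path and port values here are only placeholders:

```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// livenessProbe illustrates the knobs discussed above: no initial delay,
// polled every 10 seconds, and restarted by the kubelet after 3 consecutive
// failures.
var livenessProbe = corev1.Probe{
	ProbeHandler: corev1.ProbeHandler{
		HTTPGet: &corev1.HTTPGetAction{
			Path:   "/livez",             // placeholder endpoint
			Port:   intstr.FromInt(4443), // placeholder port
			Scheme: corev1.URISchemeHTTPS,
		},
	},
	InitialDelaySeconds: 0,  // start probing as soon as the container starts
	PeriodSeconds:       10, // poll every 10 seconds
	FailureThreshold:    3,  // restart after 3 consecutive failures
}
```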

  1. For readiness, is it a success even if only one node was scraped? Is this also configurable?

This is configurable as well, but generally you want readiness endpoints to fail fast, since it's usually used for load-balancing.

  1. API server availability - if the api-server becomes unavailable long after MS has started, the liveness check will fail, as there is no successful scrape in the last 30 sec. K8s restarting MS won't solve the problem. How should we handle this?

The apiserver doesn't scrape the metrics-server liveness/readiness probes; this is actually a function of the kubelet.

bytetwin (Contributor) commented Jun 23, 2020

The apiserver doesn't scrape the metrics-server liveness/readiness probes; this is actually a function of the kubelet.

The design mentions that /readyz should have a hard dependency on the api-server. If, after MS becomes ready (returns 200 for /readyz), the api server becomes unavailable at some point in the future, then livez will always return an error, as no scrape is successful. In this case the kubelet will restart MS, which won't become ready as the hard dependency is unavailable. This is my understanding; I want to clarify it.

Wondering if the above point is relevant at all

serathius (Contributor, Author)

The apiserver doesn't scrape the metrics-server liveness/readiness probes; this is actually a function of the kubelet.

The design mentions that /readyz should have a hard dependency on the api-server. If, after MS becomes ready (returns 200 for /readyz), the api server becomes unavailable at some point in the future, then livez will always return an error, as no scrape is successful. In this case the kubelet will restart MS, which won't become ready as the hard dependency is unavailable. This is my understanding; I want to clarify it.

Wondering if the above point is relevant at all

Liveness of a container should not depend on external services (assuming the container can recover from network problems). The apiserver being down should not cause an MS restart (it behaves like that currently, and this is not correct). The definition proposed here requires at least one successfully scraped node, meaning that liveness should not fail when the apiserver is down, as long as we have discovered at least one node.

Possible cases when MS doesn't have nodes to scrape, causing the liveness check to fail:

  • Long apiserver downtime leads to the node list cache being cleared in MS (need to check whether this is possible)
  • All nodes in the cluster becoming unavailable
  • An externally run Metrics Server configured for an empty cluster (not supported)

We could change this behavior, but I'm a little worried about it making MS ready too early during startup. I would prefer to check whether those problems happen in real clusters before trying to solve them.

  1. Liveness (/healthz) endpoint - a 200 response when there was a successful scrape in the last 30 sec

"/healthz" is actually deprecated in favor of "/livez" now.

Thanks for the info, we should change the design to use /livez.

  1. For liveness, is a scrape in the last 30 sec considered a success? Is this time limit configurable?

Yes, there are standard kubernetes configuration options. You can set initialDelaySeconds to specify how long you expect the boot sequence to take (i.e. probing won't start until this time elapses). You can set PeriodSeconds to configure how often to poll and FailureThreshold for how many failures in a row you require before the kubelet will restart the container.

I think that Hanu meant the period of scraping metrics set in Metrics Server (not the probe period). The answer is yes, the scrape period can be changed via a command line flag. The liveness definition proposed here checks whether this operation is executed on time (+/- 50%, to avoid restarting due to temporary overload), as sketched below. The value of liveness is independent of the period of the probes that check it.
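In code, this tolerance is just the following (hypothetical name, derived from whatever scrape period is configured):

```go
package sketch

import "time"

// livenessDeadline: the last scrape cycle must have finished within the
// configured scrape period plus 50% slack, so that one slow cycle under
// temporary overload does not get the container restarted.
func livenessDeadline(scrapePeriod time.Duration) time.Duration {
	return scrapePeriod + scrapePeriod/2 // e.g. a 60s period gives a 90s deadline
}
```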

  1. For readiness, is it a success even if only one node was scraped? Is this also configurable?

This is configurable as well, but generally you want readiness endpoints to fail fast, since it's usually used for load-balancing.

For readiness there is slightly different wording (the last scrape, not a scrape within 30 s). We should check for success only in the last scrape attempt and not care about the time.

serathius (Contributor, Author)

Hey @hanumanthan,
This issue is part of the v0.4.0 milestone that we will be preparing to release. I think your work on probes is really important for the next release.
Will you be able to work on your PR to finish it?

bytetwin (Contributor)

Hi @serathius, I would love to, but given my personal circumstances I won't be able to work on this for a while.

serathius (Contributor, Author)

np, thanks for confirmation.

serathius (Contributor, Author)

Both probes were implemented. The readyz implementation skipped checking the apiserver; if that's a problem, we can revisit it in the future.
