Maintenance precondition check failed because of failurePolicy "Fail" on web hooks #1862

Open
berendt opened this issue Aug 15, 2024 · 3 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

berendt commented Aug 15, 2024

We want to use knative on Kubernetes clusters managed by Gardener. There we have the following issue when working with the knative operator:

Maintenance precondition check failed. Gardener may be unable to perform required actions during maintenance: ValidatingWebhookConfiguration "config.webhook.istio.networking.internal.knative.dev" is problematic: webhook "config.webhook.istio.networking.internal.knative.dev" with failurePolicy "Fail" and 10s timeout might prevent worker nodes from properly joining the shoot cluster

As a result, we cannot maintain or hibernate the cluster because the precondition check fails. It is possible to work around this by manually changing the failure policy on the following webhooks:

Namespace	Configuration-Type	Configuration-Name
knative-serving	ValidatingWebhookConfiguration	config.webhook.istio.networking.internal.knative.dev
knative-eventing	ValidatingWebhookConfiguration	config.webhook.istio.networking.internal.knative.dev
knative-eventing	ValidatingWebhookConfiguration	config.webhook.serving.knative.dev
knative-eventing	MutatingWebhookConfiguration	webhook.istio.networking.internal.knative.dev

However, this is only temporary; the manual changes are of course overwritten again.

We have not yet found a way to customise this in the knative operator. We have no influence on the Gardener side as it is a managed service that we use for Kubernetes. Any ideas?

berendt added the kind/bug label Aug 15, 2024
@houshengbo
Contributor

@berendt Do you suggest any changes to the existing ValidatingWebhookConfiguration and MutatingWebhookConfiguration?

@houshengbo
Contributor

It is true that the knative operator cannot configure the failurePolicy of the existing ValidatingWebhookConfiguration and MutatingWebhookConfiguration resources in eventing and serving.

To overcome this, the only option I can think of with the operator is to use customized manifests, for example with the append mode (https://knative.dev/docs/install/operator/configuring-serving-cr/#append-mode). You can consolidate all the changes to the ValidatingWebhookConfiguration and MutatingWebhookConfiguration resources into two files, one for serving and one for eventing.

Either publish the file somewhere accessible to your Kubernetes cluster, so that the CR picks it up as additional resources, overriding existing ones if necessary,
or use a local volume to host the additional manifests, as described here: https://vincenthou.medium.com/how-to-customize-the-manifests-for-knative-operator-with-a-local-volume-c576b592d9d7
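
For illustration, here is a minimal sketch of how such an override file could be prepared. It assumes the existing webhook configuration has been exported as JSON (for example with `kubectl get ... -o json`) and loaded into a dict. The function name `with_failure_policy` and the sample data are hypothetical; only the `webhooks[].failurePolicy` field and the resource names come from Kubernetes and from this issue. It is not part of the operator.

```python
import copy
import json

# Metadata fields the API server fills in; removed so the result can serve
# as a plain manifest. (Assumption: these are not wanted in the override file.)
SERVER_FIELDS = ("resourceVersion", "uid", "creationTimestamp",
                 "generation", "managedFields")


def with_failure_policy(config: dict, policy: str = "Ignore") -> dict:
    """Return a copy of a webhook configuration with failurePolicy set on every webhook."""
    if policy not in ("Ignore", "Fail"):
        raise ValueError(f"unsupported failurePolicy: {policy!r}")
    patched = copy.deepcopy(config)  # leave the caller's object untouched
    metadata = patched.get("metadata", {})
    for field in SERVER_FIELDS:
        metadata.pop(field, None)
    for webhook in patched.get("webhooks", []):
        webhook["failurePolicy"] = policy
    return patched


# Hypothetical, heavily trimmed example of an exported configuration.
example = {
    "apiVersion": "admissionregistration.k8s.io/v1",
    "kind": "ValidatingWebhookConfiguration",
    "metadata": {
        "name": "config.webhook.istio.networking.internal.knative.dev",
        "resourceVersion": "12345",
    },
    "webhooks": [
        {
            "name": "config.webhook.istio.networking.internal.knative.dev",
            "failurePolicy": "Fail",
            "timeoutSeconds": 10,
        }
    ],
}

if __name__ == "__main__":
    print(json.dumps(with_failure_policy(example), indent=2))
```

The patched objects for serving and for eventing would then be collected into one file each and published or mounted as described above. Whether the operator accepts JSON documents in that file, or they need converting to YAML first, is not verified here.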

rhizoet commented Sep 4, 2024

Many thanks for the tip. It is a bit inconvenient that you have to upload a file somewhere in order to integrate it, but it works. We have adapted both webhooks accordingly and integrated them dynamically from the URL.

No more errors and the webhooks are created directly with failurePolicy: Ignore.
