From 51ffed70298c859129b1ab15e963c25f3080186f Mon Sep 17 00:00:00 2001
From: Marek Grabowski
Date: Fri, 8 Sep 2017 12:05:02 +0100
Subject: [PATCH] Add documentation for TaintNodesByCondition

---
 docs/concepts/architecture/nodes.md              |  5 +++++
 .../configuration/taint-and-toleration.md        | 13 +++++++++++--
 docs/concepts/workloads/controllers/daemonset.md | 16 +++++++++-------
 3 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/docs/concepts/architecture/nodes.md b/docs/concepts/architecture/nodes.md
index 1d361db045e69..cdfde06e23565 100644
--- a/docs/concepts/architecture/nodes.md
+++ b/docs/concepts/architecture/nodes.md
@@ -67,6 +67,8 @@ If the Status of the Ready condition is "Unknown" or "False" for longer than the
 In versions of Kubernetes prior to 1.5, the node controller would [force delete](/docs/concepts/workloads/pods/pod/#force-deletion-of-pods) these unreachable pods from the apiserver. However, in 1.5 and higher, the node controller does not force delete pods until it is confirmed that they have stopped running in the cluster. One can see these pods which may be running on an unreachable node as being in the "Terminating" or "Unknown" states. In cases where Kubernetes cannot deduce from the underlying infrastructure if a node has permanently left a cluster, the cluster administrator may need to delete the node object by hand. Deleting the node object from Kubernetes causes all the Pod objects running on it to be deleted from the apiserver, freeing up their names.
 
+In version 1.8, the ability to automatically create [taints](/docs/concepts/configuration/taint-and-toleration) representing Node Conditions was added as an alpha feature. When it is enabled, the scheduler ignores Conditions when considering a Node and instead looks at the Node's taints and the Pod's tolerations. This lets users choose between the old behavior, in which their Pods are not scheduled on Nodes with certain Conditions (now expressed as the corresponding taints), and the new behavior, in which adding a toleration to a Pod allows it to be scheduled on such Nodes. Note that because of the small delay (usually less than one second) between the time a Condition is observed and the time the corresponding taint is created, enabling this feature may slightly increase the number of Pods that are successfully scheduled but then rejected by the kubelet.
+
 ### Capacity
 
 Describes the resources available on the node: CPU, memory and the maximum
@@ -174,6 +176,9 @@
 NodeController is responsible for adding taints corresponding to node problems like
 node unreachable or not ready. See [this documentation](/docs/concepts/configuration/taint-and-toleration)
 for details about `NoExecute` taints and the alpha feature.
 
+Since Kubernetes 1.8, the NodeController may be made responsible for creating taints representing
+Node Conditions. This is an alpha feature as of 1.8.
+
 ### Self-Registration of Nodes
 
 When the kubelet flag `--register-node` is true (the default), the kubelet will attempt to
diff --git a/docs/concepts/configuration/taint-and-toleration.md b/docs/concepts/configuration/taint-and-toleration.md
index f16f69b88fca9..aacb3209c7ed1 100644
--- a/docs/concepts/configuration/taint-and-toleration.md
+++ b/docs/concepts/configuration/taint-and-toleration.md
@@ -249,9 +249,18 @@ admission controller](https://git.k8s.io/kubernetes/plugin/pkg/admission/default
 
 * `node.alpha.kubernetes.io/unreachable`
 * `node.alpha.kubernetes.io/notReady`
+
+This ensures that DaemonSet pods are never evicted due to these problems,
+which matches the behavior when this feature is disabled.
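+
+For illustration, here is a minimal sketch of a user Pod that tolerates one of
+these `NoExecute` taints for a bounded period; the Pod name and container image
+below are hypothetical placeholders:
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: network-tolerant-pod   # hypothetical name
+spec:
+  containers:
+  - name: app
+    image: nginx               # placeholder image
+  tolerations:
+  # Stay bound to a node that becomes unreachable for up to 6000 seconds
+  # before the `NoExecute` taint evicts this Pod.
+  - key: "node.alpha.kubernetes.io/unreachable"
+    operator: "Exists"
+    effect: "NoExecute"
+    tolerationSeconds: 6000
+```
+
+Omitting `tolerationSeconds` would keep the Pod bound indefinitely, which is
+the behavior the DaemonSet tolerations above rely on.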
+
+## Taint Nodes by Condition
+
+In Kubernetes 1.8, we added an alpha feature that makes the NodeController create taints matching Node Conditions and, at the same time, disables the Condition check in the scheduler. This ensures that Conditions do not influence what is scheduled onto the Node and that users
+can choose to ignore some of a Node's problems by adding the appropriate tolerations to their Pods.
+
+To make sure that turning on this feature doesn't break DaemonSets, starting in 1.8 the DaemonSet controller automatically adds the following `NoSchedule` tolerations to all daemons:
+
 * `node.kubernetes.io/memoryPressure`
 * `node.kubernetes.io/diskPressure`
 * `node.kubernetes.io/outOfDisk` (*only for critical pods*)
 
-This ensures that DaemonSet pods are never evicted due to these problems,
-which matches the behavior when this feature is disabled.
diff --git a/docs/concepts/workloads/controllers/daemonset.md b/docs/concepts/workloads/controllers/daemonset.md
index 26bc660eefa6f..1bdc4fef8519e 100644
--- a/docs/concepts/workloads/controllers/daemonset.md
+++ b/docs/concepts/workloads/controllers/daemonset.md
@@ -103,19 +103,21 @@ but they are created with `NoExecute` tolerations for the following taints with
 
 - `node.alpha.kubernetes.io/notReady`
 - `node.alpha.kubernetes.io/unreachable`
- - `node.alpha.kubernetes.io/memoryPressure`
- - `node.alpha.kubernetes.io/diskPressure`
-
-When the support to critical pods is enabled and the pods in a DaemonSet are
-labelled as critical, the Daemon pods are created with an additional
-`NoExecute` toleration for the `node.alpha.kubernetes.io/outOfDisk` taint with
-no `tolerationSeconds`. This ensures that when the `TaintBasedEvictions` alpha feature is enabled, they will not be evicted when there are node problems such as a network partition. (When the `TaintBasedEvictions` feature is not enabled, they are also not evicted in these scenarios, but due to hard-coded behavior of the NodeController rather than due to tolerations).
+
+They also tolerate the following `NoSchedule` taints:
+
+ - `node.kubernetes.io/memoryPressure`
+ - `node.kubernetes.io/diskPressure`
+
+When support for critical pods is enabled and the pods in a DaemonSet are
+labelled as critical, the Daemon pods are created with an additional
+`NoSchedule` toleration for the `node.kubernetes.io/outOfDisk` taint.
+
+Note that all of the above `NoSchedule` taints are created only in version 1.8 or later, and only if the alpha feature `TaintNodesByCondition` is enabled.
 
 ## Communicating with Daemon Pods
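
For illustration, here is a minimal sketch of what the `NoSchedule` tolerations
that the DaemonSet controller adds automatically (described in the daemonset.md
hunk above) would look like if written out by hand in a Pod spec. The Pod name
and container image are hypothetical placeholders; in practice the DaemonSet
controller injects these tolerations for you:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: daemon-like-pod        # hypothetical name
spec:
  containers:
  - name: daemon
    image: nginx               # placeholder image
  tolerations:
  # Allow scheduling onto nodes whose memory/disk pressure Conditions have
  # been turned into `NoSchedule` taints by the alpha TaintNodesByCondition
  # feature.
  - key: "node.kubernetes.io/memoryPressure"
    operator: "Exists"
    effect: "NoSchedule"
  - key: "node.kubernetes.io/diskPressure"
    operator: "Exists"
    effect: "NoSchedule"
```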