Commit

Add documentation for TaintNodesByCondition
gmarek committed Sep 8, 2017
1 parent a601ca7 commit 51ffed7
Showing 3 changed files with 25 additions and 9 deletions.
5 changes: 5 additions & 0 deletions docs/concepts/architecture/nodes.md
@@ -67,6 +67,8 @@ If the Status of the Ready condition is "Unknown" or "False" for longer than the

In versions of Kubernetes prior to 1.5, the node controller would [force delete](/docs/concepts/workloads/pods/pod/#force-deletion-of-pods) these unreachable pods from the apiserver. However, in 1.5 and higher, the node controller does not force delete pods until it is confirmed that they have stopped running in the cluster. One can see these pods which may be running on an unreachable node as being in the "Terminating" or "Unknown" states. In cases where Kubernetes cannot deduce from the underlying infrastructure if a node has permanently left a cluster, the cluster administrator may need to delete the node object by hand. Deleting the node object from Kubernetes causes all the Pod objects running on it to be deleted from the apiserver, freeing up their names.

In version 1.8, the ability to automatically create [taints](/docs/concepts/configuration/taint-and-toleration) that represent Conditions was added as an alpha feature. When it is enabled, the scheduler ignores Conditions when considering a Node and instead looks at the Node's taints and the Pod's tolerations. This lets users choose between keeping the old behavior, in which their Pods are not scheduled on Nodes with certain Conditions (now expressed as the corresponding taints), and adding a toleration so that scheduling on such Nodes is allowed. Note that because of the small delay (usually less than one second) between the time a Condition is observed and the time the taint is created, enabling this feature may slightly increase the number of Pods that are successfully scheduled but then rejected by the kubelet.
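
The taint-based check described above can be illustrated with a small model. This is a minimal sketch, not the scheduler's actual code: the `Taint` and `Toleration` classes and the `tolerates` and `can_schedule` helpers are hypothetical names introduced here, and the matching rules are a simplified reading of the taint and toleration semantics (matching key, matching effect, and either the `Exists` operator or an equal value).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Taint:
    """Simplified, hypothetical stand-in for a Node taint."""
    key: str
    effect: str  # e.g. "NoSchedule" or "NoExecute"
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    """Simplified, hypothetical stand-in for a Pod toleration."""
    key: str = ""            # an empty key (with "Exists") matches any key
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # an empty effect matches any effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True if this toleration matches the given taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    return tol.value == taint.value


def can_schedule(taints: List[Taint], tolerations: List[Toleration]) -> bool:
    """A Pod fits if every NoSchedule taint on the Node is tolerated."""
    return all(
        any(tolerates(tol, taint) for tol in tolerations)
        for taint in taints
        if taint.effect == "NoSchedule"
    )


# A Node whose memory-pressure Condition has been turned into a taint.
node_taints = [Taint(key="node.kubernetes.io/memoryPressure", effect="NoSchedule")]

# Old behavior: a Pod without a toleration is kept off the Node.
plain_pod: List[Toleration] = []

# Opt-in: a Pod that tolerates the taint may be scheduled there.
tolerant_pod = [
    Toleration(
        key="node.kubernetes.io/memoryPressure",
        operator="Exists",
        effect="NoSchedule",
    )
]
```

In this model, `can_schedule(node_taints, plain_pod)` is false and `can_schedule(node_taints, tolerant_pod)` is true, which mirrors the choice the paragraph describes: do nothing and stay off the Node, or add a toleration and allow scheduling.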

### Capacity

Describes the resources available on the node: CPU, memory and the maximum
@@ -174,6 +176,9 @@ NodeController is responsible for adding taints corresponding to node problems l
node unreachable or not ready. See [this documentation](/docs/concepts/configuration/taint-and-toleration)
for details about `NoExecute` taints and the alpha feature.

Since Kubernetes 1.8, NodeController can be made responsible for creating taints representing
Node Conditions. This is an alpha feature as of 1.8.

### Self-Registration of Nodes

When the kubelet flag `--register-node` is true (the default), the kubelet will attempt to
13 changes: 11 additions & 2 deletions docs/concepts/configuration/taint-and-toleration.md
@@ -249,9 +249,18 @@ admission controller](https://git.k8s.io/kubernetes/plugin/pkg/admission/default

* `node.alpha.kubernetes.io/unreachable`
* `node.alpha.kubernetes.io/notReady`

This ensures that DaemonSet pods are never evicted due to these problems,
which matches the behavior when this feature is disabled.

## Taint Nodes by Condition

In Kubernetes 1.8 we added an alpha feature that makes NodeController create taints matching Node Conditions and at the same time disables the Condition check in the scheduler. This ensures that Conditions do not directly influence what is scheduled on the Node, and that users
can choose to ignore some of a Node's problems by adding the appropriate tolerations to their Pods.

To make sure that turning on this feature doesn't break DaemonSets, starting in 1.8 the DaemonSet controller automatically adds the following `NoSchedule` tolerations to all daemons:

* `node.kubernetes.io/memoryPressure`
* `node.kubernetes.io/diskPressure`
* `node.kubernetes.io/outOfDisk` (*only for critical pods*)

This ensures that DaemonSet pods are never evicted due to these problems,
which matches the behavior when this feature is disabled.
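
As a rough illustration of the toleration list above, the following sketch builds the extra tolerations for a daemon pod. It is not the DaemonSet controller's code: the function name `with_condition_tolerations` and the constants are hypothetical, and the use of the `Exists` operator is an assumption. Only the taint keys, the `NoSchedule` effect, and the critical-pod restriction come from the text.

```python
from typing import Dict, List

# Taint keys listed in the documentation above.
CONDITION_TAINT_KEYS = [
    "node.kubernetes.io/memoryPressure",
    "node.kubernetes.io/diskPressure",
]
# Tolerated only by pods marked as critical.
CRITICAL_ONLY_TAINT_KEYS = ["node.kubernetes.io/outOfDisk"]


def with_condition_tolerations(
    tolerations: List[Dict[str, str]], critical: bool = False
) -> List[Dict[str, str]]:
    """Return a copy of `tolerations` with the NoSchedule tolerations added.

    Hypothetical helper: assumes each toleration uses operator "Exists".
    """
    keys = CONDITION_TAINT_KEYS + (CRITICAL_ONLY_TAINT_KEYS if critical else [])
    result = list(tolerations)
    for key in keys:
        toleration = {"key": key, "operator": "Exists", "effect": "NoSchedule"}
        if toleration not in result:  # avoid adding duplicates
            result.append(toleration)
    return result


regular = with_condition_tolerations([])
critical = with_condition_tolerations([], critical=True)
```

Here `regular` holds two tolerations (memory and disk pressure) and `critical` holds a third for `node.kubernetes.io/outOfDisk`. The helper returns a new list rather than mutating its input, so calling it twice on the same pod is harmless.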
16 changes: 9 additions & 7 deletions docs/concepts/workloads/controllers/daemonset.md
@@ -103,19 +103,21 @@ but they are created with `NoExecute` tolerations for the following taints with

- `node.alpha.kubernetes.io/notReady`
- `node.alpha.kubernetes.io/unreachable`
- `node.alpha.kubernetes.io/memoryPressure`
- `node.alpha.kubernetes.io/diskPressure`

When the support to critical pods is enabled and the pods in a DaemonSet are
labelled as critical, the Daemon pods are created with an additional
`NoExecute` toleration for the `node.alpha.kubernetes.io/outOfDisk` taint with
no `tolerationSeconds`.

This ensures that when the `TaintBasedEvictions` alpha feature is enabled,
they will not be evicted when there are node problems such as a network partition. (When the
`TaintBasedEvictions` feature is not enabled, they are also not evicted in these scenarios, but
due to hard-coded behavior of the NodeController rather than due to tolerations).

They also tolerate the following `NoSchedule` taints:
- `node.kubernetes.io/memoryPressure`
- `node.kubernetes.io/diskPressure`

When support for critical pods is enabled and the pods in a DaemonSet are
labelled as critical, the Daemon pods are created with an additional
`NoSchedule` toleration for the `node.kubernetes.io/outOfDisk` taint.

Note that all of the `NoSchedule` taints above are created only in version 1.8 or later, and only if the alpha feature `TaintNodesByCondition` is enabled.

## Communicating with Daemon Pods
