Prior Search
I have already searched this project's issues to determine if a bug report has already been made.
What happened?
Kube-fledged periodically runs pods on every node that attempt to pull images, ensuring each node's image cache stays up to date. In the current stack configuration, this runs every 3 minutes.
However, while these pods are running, Karpenter cannot disrupt the nodes: the kube-fledged pods are bound to their nodes and cannot be rescheduled elsewhere (a requirement for Karpenter scale-down). Because kube-fledged runs so frequently, this often leaves Karpenter perpetually unable to disrupt nodes.
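To illustrate the binding, here is a hypothetical sketch of the shape of an image-pull pod kube-fledged creates (field names and values are illustrative, not taken from the controller source):

```yaml
# Hypothetical sketch of a kube-fledged image-pull pod.
# The node affinity pins the pod to exactly one node, so Karpenter's
# scheduling simulation can never place it on a different node.
apiVersion: v1
kind: Pod
metadata:
  generateName: imagecache-imagepull-
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                  - ip-10-0-150-93.us-east-2.compute.internal  # the one target node
  containers:
    - name: imagepull
      image: example.com/image-being-cached:latest  # placeholder image
      command: ["sh", "-c", "exit 0"]
  restartPolicy: Never
```

Because the hostname requirement admits exactly one node, every candidate nodepool fails Karpenter's simulation with a `kubernetes.io/hostname In [...]` mismatch, which is what the log output in this report shows.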
The challenge is that the kube-fledged sync does not run automatically on new node creation, so unless the sync runs often, it's possible a node might not have images in its image cache when needed.
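Short of a fork, one lower-effort mitigation would be to lengthen the sync interval so Karpenter gets longer disruption windows. A sketch, assuming kube-fledged's `--image-cache-refresh-frequency` controller flag (verify the flag name against the kube-fledged version in use):

```yaml
# Sketch: lengthen the periodic sync in the kube-fledged controller Deployment.
# Flag name and image tag are assumptions; check the deployed chart/version.
spec:
  template:
    spec:
      containers:
        - name: controller
          image: senthilrch/kubefledged-controller:v0.10.0  # example tag
          args:
            - --image-cache-refresh-frequency=1h  # currently effectively 3m in the stack
```

This trades cache freshness on newly created nodes for disruptability, which is why a sync triggered on node creation remains the real fix.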
It seems like we might need to fork kube-fledged to add this capability, since the project appears relatively unmaintained.
Steps to Reproduce
Default behavior of the stack. Simply observe.
Relevant log output
not all pods would schedule, linkerd/linkerd-proxy-czh6d-7h5r4 =>
incompatible with nodepool "spot-arm", daemonset overhead={"cpu":"304m","memory":"815053350","pods":"8"}, incompatible requirements, key kubernetes.io/hostname, kubernetes.io/hostname In [ip-10-0-150-93.us-east-2.compute.internal] not in kubernetes.io/hostname In [hostname-placeholder-5098];
incompatible with nodepool "spot", daemonset overhead={"cpu":"304m","memory":"815053350","pods":"8"}, incompatible requirements, key kubernetes.io/hostname, kubernetes.io/hostname In [ip-10-0-150-93.us-east-2.compute.internal] not in kubernetes.io/hostname In [hostname-placeholder-5099];
incompatible with nodepool "burstable-arm", daemonset overhead={"cpu":"304m","memory":"815053350","pods":"8"}, incompatible requirements, key kubernetes.io/hostname, kubernetes.io/hostname In [ip-10-0-150-93.us-east-2.compute.internal] not in kubernetes.io/hostname In [hostname-placeholder-5100];
incompatible with nodepool "burstable", daemonset overhead={"cpu":"304m","memory":"815053350","pods":"8"}, incompatible requirements, key kubernetes.io/hostname, kubernetes.io/hostname In [ip-10-0-150-93.us-east-2.compute.internal] not in kubernetes.io/hostname In [hostname-placeholder-5101];
incompatible with nodepool "on-demand-arm", daemonset overhead={"cpu":"304m","memory":"815053350","pods":"8"}, incompatible requirements, key kubernetes.io/hostname, kubernetes.io/hostname In [ip-10-0-150-93.us-east-2.compute.internal] not in kubernetes.io/hostname In [hostname-placeholder-5102];
incompatible with nodepool "on-demand", daemonset overhead={"cpu":"304m","memory":"815053350","pods":"8"}, incompatible requirements, key kubernetes.io/hostname, kubernetes.io/hostname In [ip-10-0-150-93.us-east-2.compute.internal] not in kubernetes.io/hostname In [hostname-placeholder-5103]
@wesbragagt I am still collecting cost data to determine the impact, but I suspect there is one.
For this and a few other reasons, we are likely going to fork the kube-fledged project and maintain a custom version ourselves that plays nicer with modern cluster components (kube-fledged appears unmaintained). Our goal is to have that integrated by the next stable release.