
[Bug]: kube-fledged image cache sync interferes with Karpenter scale-down #127

Open

fullykubed opened this issue Sep 4, 2024 · 2 comments

Labels: bug (Something isn't working)

fullykubed commented Sep 4, 2024

Prior Search

  • I have already searched this project's issues to determine if a bug report has already been made.

What happened?

Kube-fledged periodically runs a pod on every node that pulls a configured set of images, ensuring that node's image cache stays up to date. In the current stack configuration, this sync runs every 3 minutes.
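For reference, the sync interval is set by the kube-fledged controller's `--image-cache-refresh-frequency` flag. A minimal sketch of where that lands in the controller Deployment (illustrative; the actual manifest in this stack may be laid out differently):

```yaml
# Sketch of the kube-fledged controller Deployment args (assumed layout,
# not copied from this stack's manifests).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-fledged-controller
  namespace: kube-fledged
spec:
  selector:
    matchLabels:
      app: kube-fledged-controller
  template:
    metadata:
      labels:
        app: kube-fledged-controller
    spec:
      containers:
        - name: controller
          image: senthilrch/kubefledged-controller:v0.10.0
          args:
            # Refresh every node's image cache every 3 minutes, matching
            # the interval described above.
            - --image-cache-refresh-frequency=3m
```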

However, while these pods are running, Karpenter cannot disrupt the nodes: the kube-fledged pods are bound to their nodes and cannot be rescheduled onto different nodes, which is a requirement for Karpenter scale-down. Since kube-fledged runs so often, this often leaves Karpenter perpetually unable to disrupt nodes.
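To illustrate why Karpenter's reschedule simulation fails: the image-puller pods kube-fledged creates are pinned to one specific node. A hypothetical sketch of such a pod, using a `nodeSelector` on `kubernetes.io/hostname` (the real pods may express the pin via node affinity instead), which matches the "incompatible requirements, key kubernetes.io/hostname" errors in the log output:

```yaml
# Illustrative sketch, not the exact spec kube-fledged generates.
# The hostname pin means no other node can ever satisfy this pod, so
# Karpenter's "could this pod reschedule elsewhere?" check always fails.
apiVersion: v1
kind: Pod
metadata:
  generateName: imagepuller-
  namespace: kube-fledged
spec:
  nodeSelector:
    # Bound to one specific node (hostname taken from the log output below).
    kubernetes.io/hostname: ip-10-0-150-93.us-east-2.compute.internal
  restartPolicy: Never
  containers:
    - name: imagepuller
      # Hypothetical image being cached; the container exists only to
      # trigger a pull and does no real work.
      image: linkerd/proxy:stable-2.14.10
      command: ["sh", "-c", "exit 0"]
```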

The challenge is that the kube-fledged sync does not run automatically on new node creation, so unless the sync runs often, it's possible a node might not have images in its cache when they are needed.
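One partial mitigation upstream kube-fledged documents is an on-demand refresh, triggered by annotating the ImageCache resource; in principle a hook that applies this annotation on node creation could replace the very frequent periodic sync. The annotation key and API version below are assumptions taken from the upstream docs and should be verified against the version deployed in this stack:

```yaml
# Hypothetical sketch: trigger an on-demand cache refresh by annotating
# the ImageCache. Annotation key and apiVersion are assumptions from the
# upstream kube-fledged README; verify before relying on them.
apiVersion: kubefledged.io/v1alpha2
kind: ImageCache
metadata:
  name: imagecache
  namespace: kube-fledged
  annotations:
    # Presence of this annotation asks the controller to refresh now.
    kubefledged.io/refresh-imagecache: ""
spec:
  cacheSpec:
    - images:
        # Hypothetical example image list.
        - linkerd/proxy:stable-2.14.10
```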

It seems like we may need to fork kube-fledged to add this capability, since the project appears to be relatively unmaintained.

Steps to Reproduce

Default behavior of the stack. Simply observe.

Relevant log output

not all pods would schedule, linkerd/linkerd-proxy-czh6d-7h5r4 =>
incompatible with nodepool "spot-arm", daemonset overhead={"cpu":"304m","memory":"815053350","pods":"8"}, incompatible requirements, key kubernetes.io/hostname, kubernetes.io/hostname In [ip-10-0-150-93.us-east-2.compute.internal] not in kubernetes.io/hostname In [hostname-placeholder-5098];
incompatible with nodepool "spot", daemonset overhead={"cpu":"304m","memory":"815053350","pods":"8"}, incompatible requirements, key kubernetes.io/hostname, kubernetes.io/hostname In [ip-10-0-150-93.us-east-2.compute.internal] not in kubernetes.io/hostname In [hostname-placeholder-5099];
incompatible with nodepool "burstable-arm", daemonset overhead={"cpu":"304m","memory":"815053350","pods":"8"}, incompatible requirements, key kubernetes.io/hostname, kubernetes.io/hostname In [ip-10-0-150-93.us-east-2.compute.internal] not in kubernetes.io/hostname In [hostname-placeholder-5100];
incompatible with nodepool "burstable", daemonset overhead={"cpu":"304m","memory":"815053350","pods":"8"}, incompatible requirements, key kubernetes.io/hostname, kubernetes.io/hostname In [ip-10-0-150-93.us-east-2.compute.internal] not in kubernetes.io/hostname In [hostname-placeholder-5101];
incompatible with nodepool "on-demand-arm", daemonset overhead={"cpu":"304m","memory":"815053350","pods":"8"}, incompatible requirements, key kubernetes.io/hostname, kubernetes.io/hostname In [ip-10-0-150-93.us-east-2.compute.internal] not in kubernetes.io/hostname In [hostname-placeholder-5102];
incompatible with nodepool "on-demand", daemonset overhead={"cpu":"304m","memory":"815053350","pods":"8"}, incompatible requirements, key kubernetes.io/hostname, kubernetes.io/hostname In [ip-10-0-150-93.us-east-2.compute.internal] not in kubernetes.io/hostname In [hostname-placeholder-5103]
@fullykubed fullykubed added the bug Something isn't working label Sep 4, 2024
@fullykubed fullykubed self-assigned this Sep 4, 2024
wesbragagt (Contributor) commented:

@fullykubed would this increase cost for users running kube_fledged?

fullykubed (Collaborator, Author) commented:

@wesbragagt I am still collecting cost data to quantify it, but I suspect it does have a cost impact.

For this and a few other reasons, we are likely going to fork the kube-fledged project and maintain a custom version ourselves that plays nicer with modern cluster components (kube-fledged appears to be unmaintained). Our goal is to have that integrated by the next stable release.
