[GLBC] LB garbage collection orphans named ports in instance groups #43

Closed

bowei opened this issue Oct 11, 2017 · 18 comments
Labels
backend/gce, kind/bug, lifecycle/frozen

Comments

@bowei (Member) commented Oct 11, 2017

From @nicksardo on May 8, 2017 18:54

The GLBC does not remove the named port from an instance group when a backend service is deleted.

If users frequently create and delete services and ingresses, instance groups become polluted with old node ports. Eventually, users hit the limit:

Exceeded limit 'MAX_DISTINCT_NAMED_PORTS' on resource 'k8s-ig--aaaaaaaaaaaa'
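
To see how close an instance group is to this limit, you can list the named ports it currently carries. A minimal sketch, assuming the default GLBC instance group name k8s-ig--<cluster_id> and zone us-central1-b (both placeholders to adjust for your cluster):

# List the named ports on the instance group; entries of the form portNNNNN with no
# matching backend service are the orphans described above.
gcloud compute instance-groups unmanaged get-named-ports k8s-ig--<cluster_id> --zone=us-central1-b
# Count them to see how close the group is to MAX_DISTINCT_NAMED_PORTS.
gcloud compute instance-groups unmanaged get-named-ports k8s-ig--<cluster_id> --zone=us-central1-b --format='value(name)' | wc -l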

Temporary workaround

# Rebuild each zonal instance group's named-port list from the backend services that
# still exist; set-named-ports replaces the whole list, so stale entries are dropped.
region=us-central1
cluster_id=$(kubectl get configmaps ingress-uid -o jsonpath='{.data.uid}' --namespace=kube-system)
# Each backend service port becomes a "portNNNNN:NNNNN" entry (the naming scheme the GLBC uses);
# ${ports%,} below strips the trailing comma left by printf.
ports=$(gcloud compute backend-services list --global --format='value(port,port)' | xargs printf 'port%s:%s,')
for zone in b c f; do
  gcloud compute instance-groups unmanaged set-named-ports k8s-ig--$cluster_id --zone=$region-$zone --named-ports=${ports%,}
done

Modify the region and the list of zone suffixes in the script to match your cluster.

Copied from original issue: kubernetes/ingress-nginx#695

@bowei (Member, Author) commented Oct 11, 2017

From @nicksardo on May 26, 2017 22:20

Requires kubernetes/kubernetes#46457

@bowei (Member, Author) commented Oct 11, 2017

From @porridge on June 5, 2017 12:44

I see that the dependency was merged a week ago. Is there a plan/ETA for this one?

@bowei (Member, Author) commented Oct 11, 2017

From @nicksardo on June 5, 2017 17:33

@porridge I'm waiting on one more PR to merge before we can update this repo's vendored dependencies.

I expect to have a fix in for the next release of GLBC, but there's no ETA for that just yet.

@bowei (Member, Author) commented Oct 11, 2017

From @pawloKoder on July 4, 2017 15:54

Any update on this issue?

@bowei (Member, Author) commented Oct 11, 2017

From @G-Harmon on September 27, 2017 23:53

Hi, I'm starting to work on this now, as my first Kubernetes fix.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Jan 9, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Feb 10, 2018
@nicksardo (Contributor)

/remove-lifecycle rotten
/lifecycle frozen

@k8s-ci-robot added the lifecycle/frozen label and removed the lifecycle/rotten label Feb 12, 2018
@nicksardo added the kind/bug label May 4, 2018
@icco commented Jun 10, 2018

I'm seeing this still, as the suggested fix doesn't work. Any other thoughts?

@kraklin commented Jul 27, 2018

Any new information about this issue? We had to create a new cluster and migrate everything to it because of this bug :/

@G-Harmon (Contributor)

Why didn't the suggested workaround work, @icco?

@rramkumar1 (Contributor)

/assign

@icco commented Aug 8, 2018

@G-Harmon not sure, but it comes back like once a week.

@EliezerIsrael

I just got bit by this one. Any news?

@bowei (Member, Author) commented May 10, 2021

/reopen

@k8s-ci-robot (Contributor)

@bowei: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this May 10, 2021
@DaveWelling commented Jun 10, 2021

It seems like the workaround for this is to move the orphaned ports to another region? Is that right? I'm guessing that means there is no way to directly remove the ports? I can't see any obvious way to do it via the gcloud compute instance-groups CLI. Did anybody find a better solution? I'm worried I will just have to jump from region to region. (Sorry, I misinterpreted how set-named-ports works.)
Obviously, I'm also going to avoid recreating ingresses unnecessarily, but that doesn't seem like a permanent solution when we are in heavy development and have frequent deployments.

It seems strange to me that more people are not hitting this issue. Is there some mitigating difference in how people are deploying services that I haven't thought of?
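
For what it's worth, orphaned entries can be pruned in place rather than by moving regions: set-named-ports replaces the instance group's entire named-port list, so passing a list with the stale entries filtered out drops them. A minimal sketch, reusing the cluster_id and region variables from the workaround above and assuming a hypothetical stale entry named port31234:

# Rebuild the list without the hypothetical stale entry port31234, then apply it.
current=$(gcloud compute instance-groups unmanaged get-named-ports k8s-ig--$cluster_id \
  --zone=$region-b --format='value(name,port)' | awk '$1 != "port31234" {printf "%s:%s,", $1, $2}')
gcloud compute instance-groups unmanaged set-named-ports k8s-ig--$cluster_id \
  --zone=$region-b --named-ports=${current%,}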

@swetharepakula (Member)

As the last report of this was almost a year ago, we are closing out this issue. We recommend that users move to NEGs (container-native load balancing) and migrate off of instance groups. If there are still issues, please open a new GitHub issue. Thanks!
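
For reference, container-native load balancing is enabled per Service with the cloud.google.com/neg annotation; with NEG backends the controller no longer manages instance-group named ports for that Service. A minimal sketch, using a hypothetical Service named my-service:

# Ask the Ingress controller to use NEGs for this (hypothetical) Service.
kubectl annotate service my-service cloud.google.com/neg='{"ingress": true}'
# The NEG controller records the NEGs it creates in a status annotation on the Service.
kubectl get service my-service -o jsonpath='{.metadata.annotations}'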
