
cert-manager cannot renew k8s-io-prod certificate due to second IPv6 ingress #1476

Closed
munnerz opened this issue Dec 10, 2020 · 23 comments
Labels
area/apps/cert-manager cert-manager, code in apps/cert-manager/ lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/k8s-infra Categorizes an issue or PR as relevant to SIG K8s Infra.

Comments

@munnerz
Member

munnerz commented Dec 10, 2020

As initially reported and discussed on Slack, the k8s-io-prod certificate (used for the redirector service) is failing to renew.

After some debugging, there are two issues at play here:

  • Hello IPv6 I'm kubernetes #1374 added a second, IPv6-only Ingress resource, with the AAAA record configured to point to it - but cert-manager only knows how to update a single Ingress (named via the edit-in-place annotation) to inject path entries for the HTTP01 challenge solvers. As per https://letsencrypt.org/docs/ipv6-support/, if an AAAA record is returned then Let's Encrypt will prefer it and utilise it first. If we change cert-manager to update only the IPv6 Ingress resource, Let's Encrypt will quite likely pass validation (as it won't check IPv4); however, because cert-manager performs a 'self check' to ensure all routes are serving traffic correctly, and because our Pods do not utilise IPv6, that self check will never pass either. Ultimately, we need to ensure both Ingress resources contain the challenge path entries (which is not something that cert-manager supports today).

  • When running kubectl describe on our Ingress resources, the following error is shown:

  Warning  Sync    6m38s (x29 over 85m)  loadbalancer-controller  Error during sync: error running backend syncing routine: googleapi: Error 403: Exceeded limit 'MAX_DISTINCT_NAMED_PORTS' on resource 'k8s-ig--ea949c440a044527'. Limit: 1000.0, limitExceeded

It appears that ingress-gce does not clean up 'unused named ports' - cleanup logic was originally added in kubernetes/ingress-gce#430, but was later reverted in kubernetes/ingress-gce#585.

We can see that there are a lot of named ports associated with the 3 'unmanaged instance groups' that ingress-gce creates:

gcloud compute instance-groups get-named-ports k8s-ig--ea949c440a044527 --project kubernetes-public --zone us-central1-c | wc -l
1001

As you can see here, there are far fewer than 1000 nodePorts in our aaa cluster:

kubectl get services -A -o yaml | grep nodePort
      nodePort: 30062
      nodePort: 32044
      nodePort: 30404
      nodePort: 31694
      nodePort: 32072
      nodePort: 32212
      nodePort: 32382
      nodePort: 30980
      nodePort: 30566
      nodePort: 30633
      nodePort: 32142
      nodePort: 31365
      nodePort: 31558
      nodePort: 31752
      nodePort: 32464
      nodePort: 30414
      nodePort: 32125
      nodePort: 32392
      nodePort: 32046
      nodePort: 31204
      nodePort: 32185
      nodePort: 30887
      nodePort: 30923
      nodePort: 32006
      nodePort: 30046
      nodePort: 32489
      nodePort: 31023
    - nodePort: 31015
    - nodePort: 31614
    - nodePort: 30938
    - nodePort: 30242
    - nodePort: 32282
    - nodePort: 30382

(note that some of these nodePort entries are for the 'challenge solvers' for the currently ongoing/blocked renewal, and so they are not present in the get-named-ports output).

Short term solutions/moving forward

The current certificate expires on December 19th (so in ~9 days). We need to resolve both of these issues to get a renewal now.

For (1), I propose we take the simplest approach: manually copy the path entries that cert-manager injects across to the second Ingress resource, then manually remove them again afterwards. This will allow both the v4 and v6 front-end IPs to respond to HTTP01 challenge requests.
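
To make that concrete, here is a rough sketch of the manual copy (the Ingress names and namespace below are assumptions rather than the real resource names, and the solver values are illustrative only):

# Open the cert-manager-managed Ingress and note the solver entry it injected,
# which looks roughly like:
#   - path: /.well-known/acme-challenge/<token>
#     backend:
#       serviceName: cm-acme-http-solver-xxxxx
#       servicePort: 8089
kubectl -n prod edit ingress k8s-io-prod
# Add the same path entry by hand to the IPv6-only Ingress, then remove it
# again once the renewal completes.
kubectl -n prod edit ingress k8s-io-prod-v6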

For (2), we need to manually clean up some of these named ports. We have a list of all the nodePort allocations from kubectl get svc -A above, so we can write a script to calculate which ports are not actually used and set the full list appropriately for each of the instance groups that ingress-gce manages. If we make a mistake here or miss a port, I am not sure whether GCP will reject the change because it would break a load balancer, or whether the associated service will simply be unavailable until that port is added back.
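
As a rough, untested sketch of how the unused ports could be identified (the --format expression and the portNNNNN naming are assumptions based on the output above):

gcloud compute instance-groups get-named-ports k8s-ig--ea949c440a044527 \
  --project kubernetes-public --zone us-central1-c --format='value(name)' | sort > named-ports.txt
kubectl get services -A -o yaml | grep nodePort | awk '{print $NF}' | sed 's/^/port/' | sort -u > in-use.txt
# Named ports that no current Service nodePort maps to:
comm -23 named-ports.txt in-use.txt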

Longer term solutions

For (1), I see a few avenues:

a) cert-manager is modified to be able to update multiple Ingress resources to inject routes for solving. This is a little out of the ordinary, but isn't the worst thing, especially given how many other awkward hoops we have to jump through to make HTTP01 solving work with the wide variety of ingress controller implementations.

b) write a controller that can be used by ingress-gce users to 'sync' the path entries on Ingress resources, treating one as authoritative. This would mean we could configure either the v4 or v6 Ingress to mirror the routes specified on the other one (that cert-manager updates). This feels cleaner than cert-manager updating multiple resources, but it'd be good to get feedback here.

c) improve cert-manager's extensibility story to allow for "out of tree" HTTP01 solvers, which would in turn mean we could have a standalone 'ingress-gce-solver' that understands ingress-gce's nuances (this would also allow for e.g. an out-of-tree IngressRoute/VirtualService/Ingress v2 solver too). This is certainly something that the cert-manager project should do anyway, though it may be a bit more involved as a resolution given the scope of the issue here.

d) not use IPv6, or use something like ingress-nginx running in-cluster (exposed with a TCP load balancer) to handle ingresses

For (2), we are unlikely to be affected again for a while, but:

a) patch ingress-gce to allow it to clean up unused ports (it may take a while for this change to take effect)

b) write automation to make it easy for us to clean up unused ports (after learning how to do this safely whilst resolving the issue we face today)


I'm going to mark this as priority/critical-urgent as we have a countdown timer on this before it becomes very visible/public 😅 - if people are available for a call today/over the coming days, we should try to move swiftly to get agreement on the short-term solution so we can all go home for the holidays 🎅 🎄

/priority critical-urgent
/area cert-manager
/cc @thockin @dims @aojea @BenTheElder @bartsmykla

@k8s-ci-robot k8s-ci-robot added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Dec 10, 2020
@k8s-ci-robot
Contributor

@munnerz: The label(s) area/cert-manager cannot be applied, because the repository doesn't have them

In response to this:

[the original issue text, quoted in full above]

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@munnerz
Member Author

munnerz commented Dec 10, 2020

/area infra/cert-manager

@k8s-ci-robot k8s-ci-robot added the area/apps/cert-manager cert-manager, code in apps/cert-manager/ label Dec 10, 2020
@ameukam
Member

ameukam commented Dec 10, 2020

@munnerz A possible short-term workaround for (2): kubernetes/ingress-gce#43 (comment)

@munnerz
Member Author

munnerz commented Dec 10, 2020

Nice find 😄 here is a list of in-use ports after running ports=$(gcloud compute backend-services list --global --format='value(port,port)' | xargs printf 'port%s:%s,'):

port30046:30046,port30062:30062,port30242:30242,port30382:30382,port30938:30938,port31015:31015,port31023:31023,port31614:31614,port32282:32282,port32489:32489,port32044:32044,port30923:30923,

I can go ahead and update the named ports for each instance group to this list. What is the best way to coordinate running a potentially risky command like this? :)
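
For reference, roughly what applying that list might look like - the zones other than us-central1-c are assumptions, so the instance groups should be confirmed with gcloud compute instance-groups list first:

# WARNING: set-named-ports replaces the full list, so review $ports carefully before running.
for zone in us-central1-a us-central1-b us-central1-c; do
  gcloud compute instance-groups set-named-ports k8s-ig--ea949c440a044527 \
    --project kubernetes-public --zone "$zone" --named-ports "${ports%,}"
done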

@bowei
Member

bowei commented Dec 10, 2020

/assign

@bowei
Member

bowei commented Dec 10, 2020

@spencerhance @rramkumar1

@cblecker
Member

@munnerz What services would be potentially impacted (documenting a list would be great)? That way we can determine risk and who we'd have to notify.

@munnerz
Member Author

munnerz commented Dec 11, 2020

An update - @thockin and I worked together to 1) clean up old named ports, and 2) deal with the "second Ingress" issue (caused by the addition of the IPv6 ingress) by copying rules across temporarily, for this renewal only, to get us over the 'hump'.

The certificate now has notAfter=Mar 11 18:26:22 2021 GMT.

In the meantime, there's a number of issues we should work to resolve (copied from Slack):

  1. I think the whole “cert-manager updating two ingresses” thing could be solved by one of:
    a) teaching cert-manager to update two ingresses
    b) teaching cert-manager to update all ingresses with path configurations for the domain being solved within a namespace (more complex, and doesn’t account for default backends)
    c) writing an ‘ingress rule sync controller’ - which would also be generally useful for other ingress-gce users wanting to run dual stack services (a rough sketch of this idea follows the list below)

  2. having two addresses associated with the one ingress-gce Ingress (one v4 and one v6) would be great, though I’m not sure how realistically we’ll be able to achieve this in the next 3 months?

  3. have ingress-gce clean up named ports
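
As a very rough illustration of the sync idea in 1(c) - a naive stand-in only, not the proposed controller; the Ingress names, namespace and use of jq are assumptions:

# Periodically mirror the rules of the authoritative (v4) Ingress onto the v6 one.
while true; do
  rules=$(kubectl -n prod get ingress k8s-io-prod -o json | jq -c '.spec.rules')
  kubectl -n prod patch ingress k8s-io-prod-v6 --type=merge -p "{\"spec\":{\"rules\":${rules}}}"
  sleep 60
done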

@cblecker you can see the list here, for future reference 😄:

dnsNames:
- k8s.io
- apt.k8s.io
- apt.kubernetes.io
- blog.k8s.io
- blog.kubernetes.io
- changelog.k8s.io
- changelog.kubernetes.io
- ci-test.k8s.io
- ci-test.kubernetes.io
- code.k8s.io
- code.kubernetes.io
- dl.k8s.io
- dl.kubernetes.io
- docs.k8s.io
- docs.kubernetes.io
- examples.k8s.io
- examples.kubernetes.io
- feature.k8s.io
- feature.kubernetes.io
- features.k8s.io
- features.kubernetes.io
- get.k8s.io
- get.kubernetes.io
- git.k8s.io
- git.kubernetes.io
- go.k8s.io
- go.kubernetes.io
- issue.k8s.io
- issue.kubernetes.io
- issues.k8s.io
- issues.kubernetes.io
- pr-test.k8s.io
- pr-test.kubernetes.io
- pr.k8s.io
- pr.kubernetes.io
- prs.k8s.io
- prs.kubernetes.io
- rel.k8s.io
- rel.kubernetes.io
- releases.k8s.io
- releases.kubernetes.io
- sigs.k8s.io
- sigs.kubernetes.io
- submit-queue.k8s.io
- submit-queue.kubernetes.io
- www.k8s.io
- youtube.k8s.io
- youtube.kubernetes.io
- yt.k8s.io
- yt.kubernetes.io
- yum.k8s.io
- yum.kubernetes.io

@munnerz
Member Author

munnerz commented Dec 11, 2020

/priority important-soon
/remove-priority critical-urgent

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels Dec 11, 2020
@munnerz munnerz changed the title cert-manager cannot renew k8s-io-prod certificate cert-manager cannot renew k8s-io-prod certificate due to second IPv6 ingress Dec 11, 2020
@cblecker
Member

awesome! thank you!

@dims
Member

dims commented Dec 11, 2020

Thanks a ton @munnerz @thockin

@thockin
Member

thockin commented Dec 11, 2020

GCP folks are looking at (3) and then (2) longer term

@thockin thockin self-assigned this Dec 11, 2020
@munnerz
Member Author

munnerz commented Mar 2, 2021

/assign

I'm going to work on 1(c) above today (a controller to sync the rules portion of Ingress resources). In the meantime, as we are nearing March 11th and this has not been resolved yet, I am applying the same workaround as in December to get us over the hump.

@munnerz
Member Author

munnerz commented Mar 2, 2021

Update - certificate has been renewed:

  status:
    conditions:
    - lastTransitionTime: "2020-03-06T11:15:52Z"
      message: Certificate is up to date and has not expired
      reason: Ready
      status: "True"
      type: Ready
    notAfter: "2021-05-31T10:22:27Z"

@DaveWelling

DaveWelling commented Jun 10, 2021

It seems like the workaround for this is to move the orphaned ports to another region? Is that right? I'm guessing that means there is no way to directly remove the ports? I cannot see any obvious method via the gcloud compute instance-groups CLI. Did anybody find a better solution? I'm worried I will just have to jump from region to region. (Sorry - I misinterpreted the way set-named-ports works.)
Obviously, I'm also going to avoid recreating ingresses unnecessarily, but that doesn't seem like a permanent solution when we are in heavy development and have frequent deployments.

It seems strange to me that more people are not hitting this issue. Is there some mitigating difference in how people are deploying services that I haven't thought of?

@k8s-ci-robot k8s-ci-robot added sig/k8s-infra Categorizes an issue or PR as relevant to SIG K8s Infra. and removed wg/k8s-infra labels Sep 29, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 28, 2021
@ameukam
Member

ameukam commented Jan 3, 2022

/remove-lifecycle stale
/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 3, 2022
@thockin
Member

thockin commented Jan 3, 2022 via email

@ameukam
Member

ameukam commented Jan 3, 2022

@thockin I don't think we need this again. I just don't want to close it until cert-manager is fully removed.

@riaankleinhans
Contributor

@ameukam should this be on the radar for 2023?

@BenTheElder
Member

Yes, we should be removing cert-manager: #4160

@ameukam
Member

ameukam commented Aug 23, 2024

Issue addressed - GKE Networking now offers HTTP Route and supports dual-stack.

/close

@k8s-ci-robot
Contributor

@ameukam: Closing this issue.

In response to this:

Issue addressed - GKE Networking now offers HTTP Route and supports dual-stack.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
