
Issue with Certificates HTTP-01 LE challenge with ingress-nginx #354

Closed
ashokspeelyaal opened this issue Oct 18, 2022 · 17 comments

@ashokspeelyaal

ashokspeelyaal commented Oct 18, 2022

I have the cluster set up and running, with the below config for the ingress controller and cert-manager:

enable_cert_manager = true
enable_nginx        = true
enable_traefix      = false

Then I created a ClusterIssuer as below:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    # You must replace this email address with your own.
    # Let's Encrypt will use this to contact you about expiring
    # certificates, and issues related to your account.
    email: abc@mydomain.com
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      # Secret resource that will be used to store the account's private key.
      name: staging-issuer-account-key
    # Add a single challenge solver, HTTP01 using nginx
    solvers:
      - http01:
          ingress:
            class: nginx

But the certificates are not being issued. I see the below status

  Conditions:
    Last Transition Time:        2022-10-18T07:50:22Z
    Message:                     Issuing certificate as Secret does not exist
    Observed Generation:         1
    Reason:                      DoesNotExist
    Status:                      False
    Type:                        Ready
    Last Transition Time:        2022-10-18T07:50:22Z
    Message:                     Issuing certificate as Secret does not exist
    Observed Generation:         1
    Reason:                      DoesNotExist
    Status:                      True
    Type:                        Issuing
  Next Private Key Secret Name:  tls-grafana-nonprod-2t6c7
Events:
  Type    Reason     Age   From                                       Message
  ----    ------     ----  ----                                       -------
  Normal  Issuing    60s   cert-manager-certificates-trigger          Issuing certificate as Secret does not exist
  Normal  Generated  59s   cert-manager-certificates-key-manager      Stored new private key in temporary Secret resource "tls-grafana-nonprod-2t6c7"
  Normal  Requested  59s   cert-manager-certificates-request-manager  Created new CertificateRequest resource "tls-grafana-nonprod-vmnbr"

When I check the logs of the cert-manager pod, I see this error:

E1018 07:56:31.619347 1 sync.go:190] cert-manager/challenges "msg"="propagation check failed" "error"="failed to perform self check GET request 'http://<mydomain>/.well-known/acme-challenge/TFjxhOtvqJusiURDD5OZHATbUuchvFdfmZNQ': Get \"http://<mydomain>/.well-known/acme-challenge/TFjxhOtvqJugBOEfl9siURDD5OZHATbUuchvFdfmZNQ\": EOF" "dnsName"="<mydomain>" "resource_kind"="Challenge" "resource_name"="tls-grafana-nonprod-vmnbr-468113688-1685101689" "resource_namespace"="grafana-stack" "resource_version"="v1" "type"="HTTP-01"

As per the cert-manager community, the URL http://<mydomain>/.well-known/acme-challenge/TFjxhOtvqJusiURDD5OZHATbUuchvFdfmZNQ must be accessible from within the cluster.

I am able to access this URL from a browser, but when I try to curl it from a pod I get curl error 52: empty reply from server.

If any of you have done this successfully, can you let me know what needs to be fixed?

Note: I have tried with the production ClusterIssuer as well, but the issue remains the same.

@mysticaltech
Collaborator

Weird - Please post your full kube.tf without the sensitive values.

@mysticaltech
Collaborator

And are you 100% positive that your DNS points to the generated ingress LB IPs? Both A and AAAA records?

If you just made the change, give the DNS some time to propagate, or set the same DNS servers you use on the cluster; there is a variable for that.

You can also use the dig command inside your pods/containers to see if the name resolution is correct.
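
For reference, a minimal sketch of such an in-cluster check (the pod name is a placeholder, and busybox ships nslookup rather than dig, which answers the same question):

apiVersion: v1
kind: Pod
metadata:
  name: dns-debug            # placeholder name
spec:
  restartPolicy: Never
  containers:
    - name: check
      image: busybox:1.36
      # Replace <mydomain> with the host from your Ingress/Certificate.
      command: ["nslookup", "<mydomain>"]

After the pod completes, kubectl logs dns-debug shows which IP the name resolves to from inside the cluster; it should match the ingress load balancer.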

@ashokspeelyaal
Author

ashokspeelyaal commented Oct 18, 2022

@mysticaltech Yes, I am sure about DNS, because I use external-dns and have verified that the A records are updated. As I said before, the challenge URL is accessible from a browser.

I created a module called kube-environment to add more modules (like external-dns and Grafana).

Here is the default variables list:

variable "hcloud_token" {
  default = ""
}

variable "agent_nodepools" {
  default = [
    {
      name        = "agent-small-fsn1",
      server_type = "cpx21",
      location    = "fsn1",
      labels      = [],
      taints      = [],
      count       = 2
    },
    {
      name        = "agent-large-nbg1",
      server_type = "cpx21",
      location    = "nbg1",
      labels      = [],
      taints      = [],
      count       = 2
    },
    {
      name        = "agent-large-hel1",
      server_type = "cpx21",
      labels      = [],
      taints      = [],
      location    = "hel1",
      count = 2
      # In the case of using Longhorn, you can use Hetzner volumes instead of using the node's own storage by specifying a value from 10 to 10000 (in GB)
      # It will create one volume per node in the nodepool, and configure Longhorn to use them.
      # longhorn_volume_size = 20
    },
    {
      name        = "storage",
      server_type = "cpx21",
      location    = "fsn1",
      # Fully optional, just a demo
      labels = [
        "node.kubernetes.io/server-usage=storage"
      ],
      taints = [
        "server-usage=storage:NoSchedule"
      ],
      count = 1
      # In the case of using Longhorn, you can use Hetzner volumes instead of using the node's own storage by specifying a value from 10 to 10000 (in GB)
      # It will create one volume per node in the nodepool, and configure Longhorn to use them.
      # longhorn_volume_size = 20
    }
  ]
}
variable "ssh_public_key_path" {
  default = ""
}
variable "ssh_private_key_path" {
  default = ""
}
variable "k8s_network_region" {
  default = "eu-central"
}
variable "load_balancer_type" {
  default = "lb11"
}
variable "load_balancer_location" {
  default = "nbg1"
}
variable "control_plane_nodepools" {
  default =  [

    {
      name        = "control-plane-nbg1",
      server_type = "cpx11",
      location    = "nbg1",
      labels      = [],
      taints      = [],
      count       = 1
    },
    {
      name        = "control-plane-fsn1",
      server_type = "cpx11",
      location    = "fsn1",
      labels      = [],
      taints      = [],
      count       = 1
    },
    {
      name        = "control-plane-hel1",
      server_type = "cpx11",
      location    = "hel1",
      labels      = [],
      taints      = [],
      count       = 1
    }
  ]
}
variable "enable_nginx" {
  default = false
}
variable "enable_traefix" {
  default = false
}
variable "cluster_name" {
  default = "app-factory-nonprod"
}
variable "extra_firewall_rules" {
  default = []
}
variable "enable_cert_manager" {
  default = false
}
variable "use_control_plane_lb" {
  default = true
}

And this is how I am using it:

module "kube_environment_nonprod" {
  source = "../../../modules/hetzner/kube-environment"


  cluster_name = var.cluster_name
  ssh_private_key_path = var.ssh_private_key_path
  ssh_public_key_path = var.ssh_public_key_path
  use_control_plane_lb = true
  enable_cert_manager = true
  enable_nginx = true
  enable_traefix = false
  hcloud_token = var.hcloud_token
  agent_nodepools = [

    {
      name        = "agent-medium-fsn1",
      server_type = "cpx21",
      location    = "fsn1",
      labels      = [],
      taints      = [],
      count       = 2
    },
    {
      name        = "agent-medium-nbg1",
      server_type = "cpx21",
      location    = "nbg1",
      labels      = [],
      taints      = [],
      count       = 2
    },
    {
      name        = "agent-medium-hel1",
      server_type = "cpx21",
      labels      = [],
      taints      = [],
      location    = "hel1",
      count = 2

    },
    {
      name        = "agent-small-nbg1",
      server_type = "cpx11",
      location    = "nbg1",
      labels      = [],
      taints      = [],
      count       = 2
    }
  ]

  providers = {
    hcloud = hcloud
  }


}


Here is the Ingress (I have used this Ingress before in other managed k8s clusters):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana-dashboard-ingress
  namespace: {{.Values.namespace}}
  annotations:
    kubernetes.io/ingress.class: nginx
    ingress.kubernetes.io/rewrite-target: /
    cert-manager.io/cluster-issuer: letsencrypt-staging
    nginx.ingress.kubernetes.io/enable-cors: "true"
    nginx.ingress.kubernetes.io/use-regex: "true"
    external-dns.alpha.kubernetes.io/access: "public"
    #nginx.ingress.kubernetes.io/ssl-redirect: "false"
spec:
  tls:
    - hosts:
        - {{.Values.dashboardHost}}
      secretName: tls-grafana-{{.Values.environment}}
  rules:
    - host: {{.Values.dashboardHost}}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana
                port:
                  number: {{ .Values.grafana.port }}




@mysticaltech
Collaborator

mysticaltech commented Oct 18, 2022

@ashokspeelyaal Thanks for sharing those details; it's the first time I've seen the module used that way, interesting!

Your physical location and the location of the cluster in the cloud are not the same, so the DNS servers the cluster uses may not have picked up your changes yet at the time you tried. Please retry and let me know!

@WolfspiritM
Contributor

I'm actually having exactly the same issue setting up a cluster right now. I have the feeling it has something to do with the load balancer not being able to route traffic from inside the cluster back to the cluster for some reason, or it's an issue with Hetzner in general right now.

@WolfspiritM
Contributor

WolfspiritM commented Oct 18, 2022

I found the reason why this happens.

It seems the Hetzner load balancer handles internal traffic differently from external traffic. In particular, it apparently does not send the PROXY protocol header for internal traffic, so ingress-nginx doesn't know what to do with it. ingress-nginx only accepts either PROXY-protocol traffic or plain traffic, depending on its configuration. Not sure how Traefik handles that.

I've now turned off the proxy protocol by setting this in the nginx config:

"use-proxy-protocol": "false"

and this in the service annotations:

"load-balancer.hetzner.cloud/uses-proxyprotocol": "false"

Now both external and internal traffic work as expected. However, that means we only see the IP of the load balancer inside the cluster, since ingress-nginx no longer knows the real client IP.
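
For reference, a minimal sketch of this workaround expressed as ingress-nginx Helm values; the controller.config and controller.service.annotations keys follow the upstream ingress-nginx chart layout, so adapt them to however your setup passes values to the chart:

controller:
  config:
    # Stop nginx from expecting the PROXY protocol header.
    use-proxy-protocol: "false"
  service:
    annotations:
      # Stop the Hetzner load balancer from sending it.
      load-balancer.hetzner.cloud/uses-proxyprotocol: "false"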

@mysticaltech
Collaborator

Interesting issue, thanks for contributing a solution @WolfspiritM!

@mysticaltech
Collaborator

@phaer Any ideas on how to fix this for good?

mysticaltech changed the title from "Issue with Certificates" to "Issue with Certificates HTTP-01 LE challenge with ingress-nginx" on Oct 19, 2022
@mysticaltech
Collaborator

mysticaltech commented Oct 19, 2022

@ashokspeelyaal @WolfspiritM I found the issue. It's related to cert-manager/cert-manager#466.

Could you folks please try setting this annotation on the nginx service? It should fix it, but I need confirmation:

load-balancer.hetzner.cloud/hostname

And it should be given the value of an FQDN that points to your LB. Please let me know how it goes, folks! 🙏


If this works, we could just explain the procedure in the docs.

Sources:
https://github.com/hetznercloud/hcloud-cloud-controller-manager#kube-proxy-mode-ipvs-and-hcloud-loadbalancer
https://github.com/hetznercloud/hcloud-cloud-controller-manager/blob/main/internal/annotation/load_balancer.go
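
For illustration, a minimal sketch of what the annotated controller Service would look like; the Service name, namespace, and lb.example.com are placeholders, and the FQDN needs an A record pointing at the load balancer:

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller     # placeholder: your nginx controller Service
  namespace: ingress-nginx           # placeholder namespace
  annotations:
    # Per the hcloud-cloud-controller-manager docs above, this hostname is then
    # published in the Service status instead of the LB IP, so in-cluster
    # requests are no longer short-circuited past the load balancer.
    load-balancer.hetzner.cloud/hostname: lb.example.com

The rest of the Service (type, ports, selector) stays as it already is; only the annotation is added.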

@WolfspiritM
Contributor

This seems to have fixed it!

I'm still not exactly sure what that changed, because even curl requests directly from the agent machines didn't work, but setting the hostname resolved it somehow.

So what I understand is that traffic to the external load balancer IP is somehow redirected directly to the nginx ingress service, which then doesn't have the PROXY-protocol client IP.

My guess is that setting the hostname somehow makes ingress-nginx aware that the request is coming from the load balancer instead of from inside the cluster?!

@mysticaltech
Collaborator

Thanks for trying, @WolfspiritM; yes, exactly... It takes the "external route" somehow. The problem with the "internal route," from my understanding, was that it lacks the PROXY header.

By setting the hostname, we bypass that issue!

@mysticaltech
Collaborator

README.md updated; you can find the new recommendation in Examples > Ingress with TLS.

@ashokspeelyaal
Author

Thanks a lot @mysticaltech @WolfspiritM

@RudlTier

Thanks @mysticaltech and @WolfspiritM!
How can multiple different subdomains be handled with this setup?
E.g. app1.demo.com + app2.demo.com
What value would need to be added to load-balancer.hetzner.cloud/hostname?

@WolfspiritM
Contributor

@RudlTier Just a public subdomain that maps to the load balancer. In our case we use lb.demo.com. It doesn't mean that TLS is limited to just that domain.

@RudlTier

@WolfspiritM thanks for the quick answer. That worked! :)

@aleksasiriski
Member

aleksasiriski commented Feb 13, 2023

If you use nginx_values, then lb_hostname is ignored. Don't use nginx_values; if you need the HTTP-01 challenge, leaving the default values and setting lb_hostname is the correct solution.

This is a known problem across EVERY cloud provider, here's the source:

Reason: I enable the proxy protocol on the load balancers so that my ingress controller and applications can "see" the real IP address of the client. However when this is enabled, there is a problem where cert-manager fails http01 challenges; you can find an explanation of why here but the easy fix provided by some providers - including Hetzner - is to configure the load balancer so that it uses a hostname instead of an IP. Again, read the explanation for the reason but if you care about seeing the actual IP of the client then I recommend you use these two annotations.
