TokenCredentialRequestAPI related errors when ImpersonationProxy is being used instead #1920

Closed
KihyeokK opened this issue Apr 19, 2024 · 7 comments

@KihyeokK

What happened?
I am running the pinniped-concierge v0.28.0 image, deployed using the chart at https://github.com/vmware-tanzu/pinniped/releases/download/v0.20.0/install-pinniped-concierge.yaml on GKE 1.28. If I understand things correctly, the impersonation proxy should be used instead of the TokenCredentialRequest API, since I can't run any custom pod on the control plane node that runs kube-controller-manager, and the chart sets spec.impersonationProxy.mode to auto by default, as mentioned in the docs here. However, I am getting error logs from the Concierge that seem related to the use of the TokenCredentialRequest API, like the following:

{"level":"error","timestamp":"2024-04-19T19:40:28.363250Z","caller":"k8s.io/apiserver@v0.28.4/pkg/server/dynamiccertificates/tlsconfig.go:275$dynamiccertificates.(*DynamicServingCertificateController).processNextWorkItem","message":"key failed with : not loading an empty serving certificate from \"concierge-serving-cert\""}

{"level":"error","timestamp":"2024-04-19T21:43:28.896540Z","caller":"go.pinniped.dev/internal/controllerlib/controller.go:222$controllerlib.(*controller).handleKey","message":"kube-cert-agent-controller: { } failed with: could not find a healthy kube-controller-manager pod (0 candidates)"}

There was also an error log containing: "tls: failed to verify certificate: x509: certificate signed by unknown authority"

Are these error logs expected when the impersonation proxy is being used instead of the TokenCredentialRequest API? Is there a way to disable these logs in that case?
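
For reference, the strategy that the Concierge actually selected can be inspected on the CredentialIssuer resource (a sketch; CredentialIssuer is cluster-scoped, and the resource name depends on how it was installed):

```sh
# List CredentialIssuers and inspect status.strategies to see which strategy
# (KubeClusterSigningCertificate vs. ImpersonationProxy) succeeded.
kubectl get credentialissuers.config.concierge.pinniped.dev -o yaml
```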

Thank you!

What did you expect to happen?

The errors mentioned above should not be shown when the impersonation proxy is being used instead of the TokenCredentialRequest API.

What is the simplest way to reproduce this behavior?

Run the chart at https://github.com/vmware-tanzu/pinniped/releases/download/v0.20.0/install-pinniped-concierge.yaml on GKE 1.28.

In what environment did you see this bug?

  • Pinniped server version:
  • Pinniped client version:
  • Pinniped container image (if using a public container image): projects.registry.vmware.com/pinniped/pinniped-server:v0.28.0
  • Pinniped configuration (what IDP(s) are you using? what downstream credential minting mechanisms are you using?):
  • Kubernetes version (use kubectl version): 1.28.3
  • Kubernetes installer & version (e.g., kubeadm version):
  • Cloud provider or hardware configuration: GKE
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Others:

What else is there to know about this bug?

@cfryanr (Member) commented Apr 19, 2024

Hi @KihyeokK, thanks for creating an issue.

When the impersonation proxy is enabled, clients still use the TokenCredentialRequest API during authentication. The TokenCredentialRequest returns an mTLS client certificate. When you are not using the impersonation proxy, that client cert is signed by the Kubernetes API server. When you are using the impersonation proxy, then that client cert is signed by the impersonation proxy itself. Either way, the client may then submit that mTLS client cert as proof of identity when making calls to Kubernetes APIs (either directly or through the impersonation proxy).
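
For illustration, the exchange looks roughly like this (a sketch against the login.concierge.pinniped.dev/v1alpha1 API; the authenticator name and token value below are hypothetical):

```yaml
# A client submits its external credential to the Concierge...
apiVersion: login.concierge.pinniped.dev/v1alpha1
kind: TokenCredentialRequest
spec:
  token: "eyJhbGciOi..."              # hypothetical token from the external IDP
  authenticator:
    apiGroup: authentication.concierge.pinniped.dev
    kind: JWTAuthenticator            # or WebhookAuthenticator
    name: my-jwt-authenticator        # hypothetical authenticator name
# ...and on success the response's status.credential carries a short-lived
# mTLS client certificate and key (clientCertificateData / clientKeyData),
# signed by whichever signer backs the active strategy.
```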

Aside from some potentially confusing log messages, are you having any trouble authenticating or making API calls?

@KihyeokK (Author)

Hi @cfryanr, thank you for the fast response! Aside from the logs, there seems to be no issue at all with interacting with the Kubernetes API server. Also, just a note: the kube-cert-agent-controller error log above (could not find a healthy kube-controller-manager pod (0 candidates)) was seen in Pinniped v0.20.0 too, before upgrading to the v0.28.0 image and chart.

@cfryanr (Member) commented Apr 19, 2024

That error is part of how auto mode decides that it should enable the impersonation proxy. It first tries the other strategy (which involves finding the kube-controller-manager pod and then starting a new kube cert agent pod), and only when it sees that the other strategy does not work does it start the impersonation proxy.
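
For example, on a cluster like GKE where the kube cert agent strategy cannot work, the CredentialIssuer status would look roughly like this (illustrative values; the exact reason and message strings vary by version):

```yaml
status:
  strategies:
  - type: KubeClusterSigningCertificate
    status: Error
    reason: CouldNotFetchKey     # illustrative reason string
    message: could not find a healthy kube-controller-manager pod (0 candidates)
  - type: ImpersonationProxy
    status: Success
    reason: Listening            # illustrative reason string
    message: the impersonation proxy is ready to accept client connections
```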

Sorry that the error log messages can be confusing. They are valuable for debugging when something goes wrong, but unfortunately they can also be confusing when everything is working exactly as expected.

Shall we close this issue, or did you have any other concerns here?

@KihyeokK (Author)

@cfryanr Thank you for the clarification! I would like to ask just two more questions:

  1. Could I assume that certificate-related error logs like the following are also normal for auto mode, and are just part of the steps of enabling the impersonation proxy?

{"level":"error","timestamp":"2024-04-19T19:40:28.363250Z","caller":"k8s.io/apiserver@v0.28.4/pkg/server/dynamiccertificates/tlsconfig.go:275$dynamiccertificates.(*DynamicServingCertificateController).processNextWorkItem","message":"key failed with : not loading an empty serving certificate from \"concierge-serving-cert\""}

"tls: failed to verify certificate: x509: certificate signed by unknown authority"

  2. Would it make more sense to change the log level of the above-mentioned "could not find a healthy kube-controller-manager pod (0 candidates)" log from "error" to "info" in a future release?

@cfryanr (Member) commented Apr 22, 2024

For question 1: Yes, this could also be normal if it only happens briefly after installation and then those errors stop happening. That certificate Secret initially does not exist, and very quickly after installation Pinniped should auto-create and auto-populate that certificate Secret.
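
A quick way to confirm that (a sketch; assumes the default pinniped-concierge namespace, and uses the Secret name from the log above):

```sh
# Shortly after installation this Secret should exist and its data map
# should be non-empty once Pinniped has populated the serving certificate.
kubectl get secret concierge-serving-cert -n pinniped-concierge -o yaml
```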

For question 2: That's a little complicated because of the way our controller library works (controllers need to return errors when they want to schedule a retry), but perhaps we could find a way to improve it. If we can't find a way to downgrade the error, then we could at least change the text of the error message to say that this is normal behavior on cloud provider clusters. I will take a look.

@cfryanr (Member) commented Apr 22, 2024

Closing this for now because the original purpose of this issue is resolved, but please keep asking questions and making suggestions. Thanks for the discussion!

@cfryanr closed this as completed Apr 22, 2024
@KihyeokK (Author)

@cfryanr Thank you for the help!
