fix(grafana): use loopback alias for Grafana health check #352

Merged Mar 31, 2022 (3 commits)

ebaron commented Mar 7, 2022

This PR adds a HostAlias that maps the hostname cryostat-health.local to 127.0.0.1, and adds that hostname as a SAN to our self-signed certificate issued by cert-manager. This is preferable to adding the loopback address itself as a SAN, since our custom hostname is unlikely to be resolvable outside of our pod.
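
For readers unfamiliar with the two pieces involved, here is a minimal sketch of how they fit together, written as Python dicts that mirror the manifest fields. Only the alias hostname and the loopback address come from this PR; the service hostname and the helper `alias_is_covered` are hypothetical, and the real operator builds these objects differently.

```python
# Values taken from the PR description.
HEALTH_ALIAS = "cryostat-health.local"
LOOPBACK = "127.0.0.1"

# Fragment of a Pod spec. Each hostAliases entry is written into the pod's
# own /etc/hosts, so the name only resolves inside that pod.
pod_spec_fragment = {
    "hostAliases": [
        {"ip": LOOPBACK, "hostnames": [HEALTH_ALIAS]},
    ],
}

# Fragment of a cert-manager Certificate spec. Entries in dnsNames become
# DNS-type Subject Alternative Names. There is deliberately no ipAddresses
# entry for the loopback address.
certificate_spec_fragment = {
    "dnsNames": [
        "cryostat-grafana.example.svc",  # assumed service hostname, illustrative only
        HEALTH_ALIAS,
    ],
}


def alias_is_covered(pod_spec, cert_spec):
    """Return True if every host alias name is also a DNS SAN on the certificate."""
    alias_names = [
        name
        for entry in pod_spec.get("hostAliases", [])
        for name in entry["hostnames"]
    ]
    return all(name in cert_spec.get("dnsNames", []) for name in alias_names)


print(alias_is_covered(pod_spec_fragment, certificate_spec_fragment))  # True
```

The point of the check is that the two fragments have to agree: the alias is only useful if the same name appears among the certificate's DNS SANs.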

Fixes: #351, #357

Previous approach using 127.0.0.1 directly:
This failed because the certificate does not list the loopback address as a Subject Alternative Name:

INFO: (10.217.0.1:44504): GET /health 200 16ms
Mar 07, 2022 10:25:15 PM io.cryostat.core.log.Logger warn
WARNING: Exception thrown
java.io.IOException: javax.net.ssl.SSLHandshakeException: Failed to create SSL connection
	at io.cryostat.net.web.http.generic.HealthGetHandler.lambda$checkUri$0(HealthGetHandler.java:164)
	at io.vertx.ext.web.client.impl.HttpContext.handleFailure(HttpContext.java:309)
	at io.vertx.ext.web.client.impl.HttpContext.execute(HttpContext.java:303)
	at io.vertx.ext.web.client.impl.HttpContext.next(HttpContext.java:275)
	at io.vertx.ext.web.client.impl.predicate.PredicateInterceptor.handle(PredicateInterceptor.java:70)
	at io.vertx.ext.web.client.impl.predicate.PredicateInterceptor.handle(PredicateInterceptor.java:32)
	at io.vertx.ext.web.client.impl.HttpContext.next(HttpContext.java:272)
	at io.vertx.ext.web.client.impl.HttpContext.fire(HttpContext.java:282)
	at io.vertx.ext.web.client.impl.HttpContext.fail(HttpContext.java:262)
	at io.vertx.ext.web.client.impl.HttpContext.lambda$handleSendRequest$7(HttpContext.java:422)
	at io.vertx.core.impl.FutureImpl.tryFail(FutureImpl.java:195)
	at io.vertx.ext.web.client.impl.HttpContext.lambda$handleSendRequest$15(HttpContext.java:518)
	at io.vertx.core.http.impl.HttpClientRequestBase.handleException(HttpClientRequestBase.java:133)
	at io.vertx.core.http.impl.HttpClientRequestImpl.handleException(HttpClientRequestImpl.java:371)
	at io.vertx.core.http.impl.HttpClientRequestImpl.lambda$null$6(HttpClientRequestImpl.java:473)
	at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:366)
	at io.vertx.core.impl.EventLoopContext.execute(EventLoopContext.java:43)
	at io.vertx.core.impl.ContextImpl.executeFromIO(ContextImpl.java:229)
	at io.vertx.core.impl.ContextImpl.executeFromIO(ContextImpl.java:221)
	at io.vertx.core.http.impl.HttpClientRequestImpl.lambda$connect$7(HttpClientRequestImpl.java:472)
	at io.vertx.core.http.impl.HttpClientImpl.lambda$getConnectionForRequest$4(HttpClientImpl.java:1048)
	at io.vertx.core.http.impl.ConnectionManager.lambda$getConnection$7(ConnectionManager.java:159)
	at io.vertx.core.http.impl.pool.Pool.connectFailed(Pool.java:397)
	at io.vertx.core.http.impl.pool.Pool.access$600(Pool.java:89)
	at io.vertx.core.http.impl.pool.Pool$Holder.lambda$connect$0(Pool.java:129)
	at io.vertx.core.impl.FutureImpl.tryFail(FutureImpl.java:195)
	at io.vertx.core.http.impl.HttpChannelConnector.connectFailed(HttpChannelConnector.java:255)
	at io.vertx.core.http.impl.HttpChannelConnector.lambda$doConnect$0(HttpChannelConnector.java:164)
	at io.vertx.core.net.impl.ChannelProvider.lambda$connect$1(ChannelProvider.java:78)
	at io.vertx.core.net.impl.ChannelProvider$1.userEventTriggered(ChannelProvider.java:117)
	at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:346)
	at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:332)
	at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:324)
	at io.netty.handler.ssl.SslHandler.handleUnwrapThrowable(SslHandler.java:1260)
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1241)
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1285)
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: javax.net.ssl.SSLHandshakeException: Failed to create SSL connection
	at io.vertx.core.net.impl.ChannelProvider$1.userEventTriggered(ChannelProvider.java:115)
	... 25 more
Caused by: javax.net.ssl.SSLHandshakeException: No subject alternative names matching IP address 127.0.0.1 found
	at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131)
	at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:349)
	at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:292)
	at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:287)
	at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:654)
	at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.onCertificate(CertificateMessage.java:473)
	at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.consume(CertificateMessage.java:369)
	at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:392)
	at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:443)
	at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1074)
	at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1061)
	at java.base/java.security.AccessController.doPrivileged(Native Method)
	at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1008)
	at io.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1549)
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1395)
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1236)
	... 20 more
Caused by: java.security.cert.CertificateException: No subject alternative names matching IP address 127.0.0.1 found
	at java.base/sun.security.util.HostnameChecker.matchIP(HostnameChecker.java:165)
	at java.base/sun.security.util.HostnameChecker.match(HostnameChecker.java:101)
	at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:455)
	at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:429)
	at java.base/sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:283)
	at java.base/sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:141)
	at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:632)
	... 31 more

While we could easily add the loopback address as a SAN to the certificate generated by cert-manager, this may not be ideal from a security perspective.
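
The last `Caused by` in the trace comes from the identity check: an IP literal is compared only against IP-type SANs, while a hostname is compared against DNS-type SANs. A simplified sketch of that rule follows, assuming exact matches only (no wildcards, no Common Name fallback); the function names are hypothetical and this is not the JDK's actual implementation.

```python
import ipaddress


def is_ip_literal(host):
    """Return True if host parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def san_matches(host, dns_names, ip_addresses):
    """Simplified identity check: exact matches only, no wildcard handling."""
    if is_ip_literal(host):
        # IP literals are only ever compared against IP-type SAN entries.
        target = ipaddress.ip_address(host)
        return any(ipaddress.ip_address(ip) == target for ip in ip_addresses)
    # Hostnames are compared case-insensitively against DNS-type SAN entries.
    return host.lower() in (name.lower() for name in dns_names)


# A certificate with the alias as a DNS SAN and no IP SANs.
dns_sans = ["cryostat-health.local"]
ip_sans = []

print(san_matches("127.0.0.1", dns_sans, ip_sans))              # False
print(san_matches("cryostat-health.local", dns_sans, ip_sans))  # True
```

Under this rule, connecting to https://127.0.0.1 fails verification even though the server is the same process, whereas connecting through a hostname listed as a DNS SAN passes.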


ebaron commented Mar 7, 2022

It would also be easy to use the service hostname instead, but on some setups a pod isn't able to reach itself through its own Service: https://kubernetes.io/docs/tasks/debug-application-cluster/debug-service/#a-pod-fails-to-reach-itself-via-the-service-ip

@andrewazores any ideas on how to proceed with this?

@andrewazores

Adding localhost/127.0.0.1 to the SAN does seem like bad practice for a certificate that we're installing into a production environment. The common workaround I've found while reading about this is to add a DNS entry such as localhost.example.com that resolves to 127.0.0.1, but I don't think that's very practical here, and it doesn't really buy us much beyond what we already had by simply using the service URL.

I don't suppose it's possible to just make an http:// request over loopback here and not deal with the cert at all? I'm not entirely sure how the cert is set up with cert-manager, tbh.


ebaron commented Mar 8, 2022

> I don't suppose it's possible to just make an http:// request over loopback here and not deal with the cert at all? I'm not entirely sure how the cert is set up with cert-manager, tbh.

Unfortunately not:

$ curl -i http://127.0.0.1:3000/api/health
HTTP/1.0 400 Bad Request

Client sent an HTTP request to an HTTPS server.
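
Since plain HTTP is rejected, the check has to go over HTTPS with hostname verification against a name that the certificate actually carries. Below is a minimal client-side sketch, assuming the alias hostname from this PR and the port and path from the curl command above. The function names and the `ca_file` parameter are hypothetical; the real health check lives in Cryostat's `HealthGetHandler`, not in Python.

```python
import ssl
import urllib.request


def health_url(host="cryostat-health.local", port=3000):
    """Build the HTTPS health endpoint URL for the given host and port."""
    return f"https://{host}:{port}/api/health"


def make_tls_context(ca_file=None):
    """Create a client context with certificate and hostname verification on.

    ca_file is an assumed path to the CA bundle that signed the server
    certificate; None falls back to the system trust store.
    """
    return ssl.create_default_context(cafile=ca_file)


def check_health(url, context, timeout=5.0):
    """Return True on HTTP 200. TLS or connection failures raise URLError,
    and non-2xx responses raise HTTPError."""
    with urllib.request.urlopen(url, context=context, timeout=timeout) as resp:
        return resp.status == 200
```

Inside the pod, a call such as `check_health(health_url(), make_tls_context("/path/to/ca.crt"))` would resolve the alias to 127.0.0.1 through the HostAlias entry and verify the certificate against the DNS SAN. The CA path here is a placeholder.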


ebaron commented Mar 29, 2022

This should also fix #357 by bypassing the route/ingress.

ebaron linked an issue Mar 29, 2022 that may be closed by this pull request
ebaron changed the title from "fix(grafana): use loopback interface for Grafana health check" to "fix(grafana): use loopback alias for Grafana health check" Mar 29, 2022
ebaron marked this pull request as ready for review March 29, 2022 20:48