
[Elastic Agent] Better error message for self-signed certificates #26365

Closed
mostlyjason opened this issue Jun 18, 2021 · 9 comments
Labels
Team:Elastic-Agent Label for the Agent team

Comments

@mostlyjason

mostlyjason commented Jun 18, 2021

We've gotten quite a few Slack and Discuss questions about certificate problems because the instructions are not clear. When users follow the in-product installation instructions for Fleet Server on a self-managed cluster, they see this error message when they attempt to install an Elastic Agent:

$ sudo ./elastic-agent install -f --url=https://fleetserver:8220 --enrollment-token=mytoken
The Elastic Agent is currently in BETA and should not be used in production

Error: fail to enroll: fail to execute request to fleet-server: x509: certificate signed by unknown authority
Error: enroll command failed with exit code: 1

This error message leaves the user stuck and without clear guidance on what to do next. The correct next step is to add --insecure to the command and run it again. Can we change the error to provide better instructions?

Error: fail to enroll: fail to execute request to fleet-server: x509: certificate signed by unknown authority. If you used the quick start method which generates a self-signed certificate for Fleet Server, add --insecure to the command. If you used a production certificate, check your parameters for --certificate-authorities for CA file paths.
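
To make the proposal concrete, here is a minimal sketch of appending guidance to the enrollment error. It is written in Python purely for illustration and is not the Elastic Agent's implementation; the function name `with_certificate_hint` and the constants are hypothetical, and the hint text is adapted from the wording proposed above.

```python
# Hypothetical sketch: append guidance to an "unknown authority" error.
# Not Elastic Agent code; names here exist only for this example.

UNKNOWN_AUTHORITY = "x509: certificate signed by unknown authority"

# Guidance text adapted from the proposal in this issue.
HINT = (
    "If you used the quick start method which generates a self-signed "
    "certificate for Fleet Server, add --insecure to the command. "
    "If you used a production certificate, check your parameters for "
    "--certificate-authorities for CA file paths."
)


def with_certificate_hint(message: str) -> str:
    """Return the message unchanged unless it reports an untrusted chain."""
    if UNKNOWN_AUTHORITY not in message:
        return message
    # Avoid a doubled period if the original message already ends with one.
    return f"{message.rstrip('.')}. {HINT}"


if __name__ == "__main__":
    raw = (
        "fail to enroll: fail to execute request to fleet-server: "
        "x509: certificate signed by unknown authority"
    )
    print("Error:", with_certificate_hint(raw))
```

Matching on the error text keeps the sketch short; a real implementation would more likely inspect the typed TLS error than the string.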
mostlyjason added the v7.14.0 and Team:Elastic-Agent labels on Jun 18, 2021
@elasticmachine
Collaborator

Pinging @elastic/agent (Team:Agent)

@ruflin
Member

ruflin commented Jun 18, 2021

We could also add a link to the docs. I would prefer not to directly recommend users to use --insecure, otherwise it is likely to stay in. Users that use this flag should really understand what it means.

@mostlyjason
Author

@ruflin the message says it's a self-signed certificate and that it's insecure. What else should they know?

I suggested adding a link to the docs in #26367.

I filed another issue to mention --insecure in our docs, since it won't work without it: elastic/observability-docs#802

I also filed an issue that will remove the need to add the insecure flag in the future https://github.com/elastic/beats/issues/25705

@ruflin
Member

ruflin commented Jun 22, 2021

I see your point, Jason, and don't have a great alternative suggestion. I'm mostly worried that at some point users will use it, later discover it, and say: but this is what was recommended to do ...

@ph
Contributor

ph commented Jun 22, 2021

As @ruflin pointed out, I am also worried about returning "use --insecure" in the error messages.
Do we agree to do #26367 instead? If so, we could close this one.

I do not think we can correctly detect the "self-signed" scenario. The "certificate signed by unknown authority" error is really common, and it clearly indicates a problem in the certificate's chain of trust.

@EricDavisX
Contributor

I'm doing follow-ups on open issues for 7.14. Is this still under review as a possible improvement for 7.14? If we aren't attempting it, then please update the label to 7.15 or beyond and finish any other processing. @nimarezainia @mostlyjason

@faec
Contributor

faec commented Aug 19, 2021

As of this PR, failed enrollment links to the fleet troubleshooting guide which includes this error message and its solution. I agree with earlier concerns: generally we have no way of being sure that the failure is happening because the user is intentionally installing with a correct self-managed certificate, so we shouldn't recommend skipping the certificate integrity checks without more context. I think with the addition of the troubleshooting link this can be safely closed now.

faec closed this as completed on Aug 19, 2021
@tnjman

tnjman commented Oct 24, 2022

I'm adding a follow-up comment to disagree with the name of the "--insecure" option. If, for example, my Fleet servers are internal-only and will never be exposed externally, then they're not insecure due to using a self-signed or an internal private CA SSL cert, unless I'm misunderstanding. A much better option name would be "--self-signed" for built-in certs, "--internal" for certs from my own internal CA, and "--external" for servers using publicly vetted certs.

If you've properly set the servers to trust your self-signed SSL certs or the certs generated by your own internal private CA, then it's not "insecure"; it's simply not vetted by a public SSL cert authority and, therefore, should not be used if those specific cert-protected servers would be exposed externally. I've managed very large infrastructure, with many servers protected only by internal private CA certs; those servers are not exposed to the outside, but they are perfectly secure internally. Really, it comes down to a matter of design and intent.

Additionally, if servers will never be exposed externally, then it is also wrong to say "Never use this (self-signed or internal certs) for Production mode." This statement would only be true if the user ever plans to have externally exposed hosts. I do realize self-signed/built-in certs are not as robust as those signed by a true internal CA, so maybe that's where some "less secure" area could be seen. Either way, "--insecure" is not, imo, an accurate name. Thanks

@immauss

immauss commented Nov 8, 2022

I've recently run into the same issue. Only we are on an isolated network (not connected to the internet) with a Full PKI infrastructure. I lost a day or so trying to figure out what I had done wrong. The "--insecure" option is very misleading in its name for all the reasons @tnjman has already stated.

I also feel like this is inconsistent with other Elastic applications where specifying the CA certificate or CA signature allows for the proper validation of the certificates.
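
The distinction raised in the last two comments, trusting a private CA versus skipping verification altogether, can be sketched with Python's standard `ssl` module. This is only an analogy and says nothing about how Elastic Agent implements `--insecure` or `--certificate-authorities`; the helper name `build_tls_context` and its parameters are hypothetical.

```python
import ssl
from typing import Optional


def build_tls_context(ca_file: Optional[str] = None,
                      insecure: bool = False) -> ssl.SSLContext:
    """Build a client TLS context for one of three trust setups.

    - ca_file set: trust only the CAs in that PEM file (private PKI),
      with full chain and hostname verification still enabled.
    - neither set: trust the system's default CA store.
    - insecure=True: skip certificate verification entirely.
    """
    # With cafile=None, create_default_context loads the system CA store.
    ctx = ssl.create_default_context(cafile=ca_file)
    if insecure:
        # check_hostname must be disabled before verify_mode is relaxed,
        # otherwise the ssl module raises ValueError.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


if __name__ == "__main__":
    verified = build_tls_context()
    unverified = build_tls_context(insecure=True)
    print(verified.verify_mode, verified.check_hostname)
    print(unverified.verify_mode, unverified.check_hostname)
```

In this sketch, supplying a CA file keeps verification on and only changes which authority is trusted, whereas the insecure path turns verification off regardless of who signed the certificate.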
