EC2 credentials discovery fails (instance-data host gone?) #271

Open
screamish opened this issue Feb 11, 2016 · 22 comments

@screamish

I'm trying to use Discover for credentials on an AWS EC2 instance and it's throwing this exception:

"MissingFileError \"/root/.aws/credentials\"

However, I've had a look at the code that's doing this, and it seems Amazon might have broken a dependency of this function:

https://github.com/brendanhay/amazonka/blob/develop/amazonka/src/Network/AWS/EC2/Metadata.hs#L282

When I try to run this via ssh on my EC2 instance I get the following:

$ curl http://instance-data/latest/meta-data/
curl: (6) Could not resolve host: instance-data

but if I try the following it works:

$ curl http://169.254.169.254/latest/meta-data/
ami-id
ami-launch-index
... snip ...

I couldn't find anything on AWS's docs about them removing that hostname. What do you think?

@brendanhay
Owner

That is concerning. The instance-data name resolution is used because it's convenient for the EC2 test; if it no longer exists, I'd need some alternative way of 'quickly' determining whether the host is in EC2 or not.

Which region and instance type are you encountering this issue on?

@screamish
Author

ap-southeast-2, t2.small

Is there a reason you can't just use this:

latest :: Text
latest = "http://169.254.169.254/latest/"

like you do in all the other calls?

@brendanhay
Owner

Because it relies on name resolution failure instead of a timeout. If 169.254.169.254 is used, a timeout needs to be set for http-client, and circumstances such as an overloaded host could cause the timeout to be reached, so the instance would not be detected as an EC2 host. I'm open to other suggestions, but I'd prefer to avoid the timeout if possible.

In the interim, if you're deploying to EC2 I'd suggest using newEnv Sydney (FromProfile "default"), or whatever the default IAM profile on that host happens to be. This way the isEC2 check will be skipped entirely.

@screamish
Author

Ah, thanks for the explanation, that makes good sense. And thanks for the workaround, it covers my use case since I don't need to deploy this multi-region. If I think of any ideas for a quick, robust way to implement isEC2 I'll let you know.

@gregwebs

I just ran into this same issue. I would suggest not worrying about the overloaded host issue so much unless you have good reason to. My reasoning is that generally finding credentials is done at startup. If the host is already overloaded and you start a new process you are probably in for trouble already.

@brendanhay
Owner

brendanhay commented Aug 17, 2016

Another issue with a timeout is that, after searching for relevant environment variables or ~/.aws/credentials, the isEC2 check is performed before credential discovery fails.

On a development machine that incorrectly sets environment or credential information, failure will always block for the timeout length.

@brendanhay
Owner

I'm going to share some information here from a colleague of mine who discovered what is likely the root cause and one possible solution:

It appears instance-data doesn't resolve inside VPCs that have not been configured using enableDnsSupport and enableDnsHostnames. If you can set both of these parameters for your specific VPC then the isEC2 check will work and instance-data will resolve, otherwise it's still recommended to use the workaround above or to preload the EC2 instance check using newEnvWith Oregon Discover (Just True) ....

@LoicGombeaud

@brendanhay Thanks for the explanation, I was wondering why instance-data wasn't resolving from within our new VPC!

@AlistairB

I'm having some trouble with this when running on the EC2 instance.

  • Discover - MissingFileError "/home/appuser/.aws/credentials"
  • FromProfile "default" - Essentially this won't work because the role name is autogenerated by our deployment tool
  • newEnvWith Discover (Just True) httpManager - MissingFileError "/home/appuser/.aws/credentials"

Is there some way to do this when the role name is not known?

http://169.254.169.254/latest/meta-data/iam/security-credentials/ does return only one role which is correct.

@kim
Contributor

kim commented Aug 29, 2017

@AlistairB did you check whether instance-data resolves on your instances (see #271 (comment))?

Note that, contrary to what the documentation suggests, newEnvWith ... (Just True) ... will execute isEC2 if the first argument is Discover (via getAuth). This may be considered a bug, but it's not obvious how to fix it properly.

However, you can find out the role at runtime (and before initializing your Env) via metadata.
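
A minimal sketch of that runtime lookup, assuming plain http-client rather than amazonka's own metadata helpers, and using the security-credentials URL quoted earlier in this thread. The names `roleListUrl`, `firstRole` and `discoverRole` are hypothetical, not part of amazonka:

```haskell
module Main where

import qualified Data.ByteString.Lazy.Char8 as LBS
import Network.HTTP.Client
  ( defaultManagerSettings, httpLbs, newManager, parseRequest
  , responseBody, responseStatus )
import Network.HTTP.Types.Status (statusCode)

-- | Metadata URL quoted earlier in this thread; on an instance with an
-- IAM role attached it returns the role names, one per line.
roleListUrl :: String
roleListUrl = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"

-- | Hypothetical helper: pick the first non-empty line of the response body.
firstRole :: LBS.ByteString -> Maybe String
firstRole body =
  case filter (not . null) (map LBS.unpack (LBS.lines body)) of
    (role : _) -> Just role
    []         -> Nothing

-- | Ask the metadata service which role is attached to this instance.
-- Off EC2 this throws an HttpException (or blocks until http-client's
-- default timeout), so it is only meant to run where EC2 is already assumed.
discoverRole :: IO (Maybe String)
discoverRole = do
  manager  <- newManager defaultManagerSettings
  request  <- parseRequest roleListUrl
  response <- httpLbs request manager
  pure $
    if statusCode (responseStatus response) == 200
      then firstRole (responseBody response)
      else Nothing

main :: IO ()
main = discoverRole >>= print
```

The resulting name could then be passed to FromProfile when creating the Env, which skips the isEC2 check as described above.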

@AlistairB

@kim Thanks for the comment. Didn't realise it will still run the isEC2 logic.

Yes instance-data does not resolve and I cannot change the VPC settings.

Good point, I will manually hit the metadata endpoint to retrieve the role and create env with FromProfile.

@screamish
Author

You can just create an entry in your /etc/hosts mapping instance-data to the metadata IP, i.e. 169.254.169.254, and it should just work. You don't have to go fiddling with the DNS for your whole VPC. That's how we're working around this issue for now.

@koterpillar
Contributor

Just a note that according to AWS, instance-data is an undocumented feature and shouldn't be relied upon.

@qrilka

qrilka commented Apr 3, 2019

@koterpillar did you perhaps mean to put a different link in your reply?

@koterpillar
Contributor

You are right, here is a reply from an AWS employee about this instead.

@realdimas

realdimas commented Jun 11, 2019

@brendanhay we are also impacted by this issue.
Please get rid of the instance-data domain name resolution in the code; it's known not to work even on EC2.

These are the only officially recognized ways to identify an EC2 instance:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/identify_ec2_instances.html

None of them is ideal; each is a compromise between accuracy (false-positive and false-negative scenarios), availability/dependencies, and forward compatibility.

CC @amarpotghan

@aissarmurad

aissarmurad commented Jun 9, 2020

I had the same issue.

I believe it is better to follow the AWS boto3 credentials chain, because boto3 did not fail in the same situation:

The mechanism in which boto3 looks for credentials is to search through a
list of possible locations and stop as soon as it finds credentials.
The order in which Boto3 searches for credentials is:
    * Passing credentials as parameters in the boto.client() method
    * Passing credentials as parameters when creating a Session object
    * Environment variables
    * Shared credential file (~/.aws/credentials)
    * AWS config file (~/.aws/config)
    * Assume Role provider
    * Boto2 config file (/etc/boto.cfg and ~/.boto)
    * Instance metadata service on an Amazon EC2 instance that has an IAM role configured.

REFERENCE
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#configuring-credentials
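
A rough sketch of that "search in order, stop at the first hit" idea, using only base. This is not how amazonka or boto3 is implemented; `Creds`, `Provider`, `firstAvailable` and the two example providers are hypothetical names for illustration, and the metadata provider is a stub:

```haskell
module Main where

import System.Environment (lookupEnv)

-- | Hypothetical credential record, for illustration only.
data Creds = Creds { accessKey :: String, secretKey :: String }
  deriving (Eq, Show)

-- | A provider either yields credentials or declines with Nothing.
type Provider = IO (Maybe Creds)

-- | Try each provider in order and stop at the first one that succeeds.
-- Providers later in the list are never run once credentials are found.
firstAvailable :: [Provider] -> IO (Maybe Creds)
firstAvailable [] = pure Nothing
firstAvailable (p : ps) = do
  result <- p
  case result of
    Just creds -> pure (Just creds)
    Nothing    -> firstAvailable ps

-- | Environment-variable provider, using the standard AWS variable names.
fromEnvironment :: Provider
fromEnvironment = do
  key    <- lookupEnv "AWS_ACCESS_KEY_ID"
  secret <- lookupEnv "AWS_SECRET_ACCESS_KEY"
  pure (Creds <$> key <*> secret)

-- | Stub standing in for an instance-metadata lookup; always declines here.
fromInstanceMetadata :: Provider
fromInstanceMetadata = pure Nothing

main :: IO ()
main = do
  found <- firstAvailable [fromEnvironment, fromInstanceMetadata]
  putStrLn (maybe "no credentials found" (const "credentials found") found)
```

With this shape the metadata service is simply the last provider tried, so no separate "are we on EC2?" test has to run before the cheaper sources are exhausted.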

@aissarmurad

aissarmurad commented Jun 9, 2020

@realdimas I've tested your solution; it only works on HVM instances.

If you are using Docker, like me, it will fail.

@aissarmurad

Possible workarounds:

  • Create a DNS zone instance-data
    or
  • create an entry in /etc/hosts file

@endgame
Collaborator

endgame commented Sep 29, 2021

Too gnarly for 2.0. Someone will need to poke around on a bunch of instances (PV, HVM, Amazon Linux, other distros) and see whether we can reliably get a system UUID without needing root. If we could determine that easily and quickly, then we could make a query against 169.254.169.254 with a timeout, and not impact non-EC2 users that badly?

@ketzacoatl

There is some prior work on the general question "how do we know if we're on an EC2 instance, and how can we be sure of that?"

[1] provides a pretty thorough review of options.

There is also AWS' doc at [2], which basically recommends hitting the instance metadata URL and inspecting the instance identity documents [3]. The document provides a few parameters that could be useful for this, and the document is signed, so it can be verified.

WRT timeouts for non-EC2, it might be useful to make that timeout configurable, or disable this check completely (empower users with a method for overrides).
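
One possible shape for such an override, sketched with http-client and base's `System.Timeout`. The `EC2Check` type and `runCheck` function are hypothetical and not part of amazonka; the probe URL is the metadata address used earlier in this thread:

```haskell
module Main where

import Control.Exception (try)
import Network.HTTP.Client
  ( HttpException, Response, defaultManagerSettings, httpNoBody
  , newManager, parseRequest )
import System.Timeout (timeout)

-- | Hypothetical setting: either the caller states the answer up front
-- (no network access at all), or the metadata address is probed with an
-- upper bound given in microseconds.
data EC2Check
  = AssumeEC2 Bool
  | ProbeWithTimeout Int

-- | Decide whether we are on EC2 according to the chosen setting.
-- Any HTTP response within the time limit counts as "on EC2"; a
-- connection error or an expired timeout counts as "not on EC2".
runCheck :: EC2Check -> IO Bool
runCheck (AssumeEC2 answer) = pure answer
runCheck (ProbeWithTimeout micros) = do
  manager <- newManager defaultManagerSettings
  request <- parseRequest "http://169.254.169.254/latest/meta-data/"
  -- Only HttpException is caught, so the asynchronous exception that
  -- 'timeout' uses internally still gets through.
  result <- timeout micros
              (try (httpNoBody request manager)
                 :: IO (Either HttpException (Response ())))
  pure $ case result of
    Just (Right _) -> True
    _              -> False

main :: IO ()
main = runCheck (ProbeWithTimeout 500000) >>= print
```

A development machine could use `AssumeEC2 False` to avoid blocking for the timeout length, while a deployment known to run on EC2 could use `AssumeEC2 True`; the 500000 microsecond value in `main` is an arbitrary example.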

@endgame endgame mentioned this issue Jan 10, 2022
5 tasks