Cannot connect to AWS resources when using pod security groups #2912

Closed
ahilmathew opened this issue May 13, 2024 · 4 comments

@ahilmathew

What happened:
Our EKS clusters use a public endpoint, and the nodes are in a public subnet. We've configured security groups for pods in the cluster and have added SecurityGroupPolicy custom resources for our pods.
The first issue was a connection timeout to the STS endpoint, https://sts.us-west-2.amazonaws.com:443. After looking at issue #1796, I tried adding a VPC endpoint for STS and it started working. The question here is why do I need a VPC endpoint when I am on public network and the security group currently allows all ingress and egress connections.
The cluster and node security groups allow ingress from the pod security groups.

The second part of this issue is that we need connectivity to DynamoDB in another region (us-east-1) from our cluster in us-west-2. These requests also started timing out. In this case we cannot add a VPC endpoint, as the resource is in another region. I would like to get some clarity on the root cause of this network issue.

send request failed\ncaused by: Post \"https://dynamodb.us-east-1.amazonaws.com/\": dial tcp 3.218.181.176:443: i/o timeout"}
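
The `dial tcp ... i/o timeout` part of the error above indicates that the TCP connection itself did not complete in time. A minimal sketch of a probe that could be run from a pod using the same SecurityGroupPolicy, to check reachability independently of the AWS SDK. The helper name `can_connect` and the 5-second default timeout are illustrative assumptions, not something from the original report.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout.

    Hypothetical helper for illustration only.
    """
    try:
        # create_connection resolves the name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connect timeouts, refused connections, and DNS failures.
        return False


# Example usage from inside the affected pod (requires network access):
#   can_connect("sts.us-west-2.amazonaws.com", 443)
#   can_connect("dynamodb.us-east-1.amazonaws.com", 443)
```

A `False` result here only says the connection failed; it does not by itself show whether a security group, a route, or missing address translation is responsible.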

Environment:

  • Kubernetes version (use kubectl version):
    Client Version: v1.29.1
    Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
    Server Version: v1.29.3-eks-adc7111
  • CNI Version - v1.18.0-eksbuild.1
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
@ahilmathew
Author

I ran nslookup from a debug container using the same SecurityGroupPolicy, and DNS resolution works fine:

~ $ nslookup sts.us-west-2.amazonaws.com
Server:		172.20.0.10
Address:	172.20.0.10:53

Non-authoritative answer:

Non-authoritative answer:
Name:	sts.us-west-2.amazonaws.com
Address: 10.80.98.26
Name:	sts.us-west-2.amazonaws.com
Address: 10.80.85.195
Name:	sts.us-west-2.amazonaws.com
Address: 10.80.118.106
Name:	sts.us-west-2.amazonaws.com
Address: 10.80.138.46

~ $ nslookup dynamodb.us-east-1.amazonaws.com
Server:		172.20.0.10
Address:	172.20.0.10:53

Non-authoritative answer:

Non-authoritative answer:
Name:	dynamodb.us-east-1.amazonaws.com
Address: 52.119.233.250
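
One detail in the output above is that the STS name resolves to 10.80.x.x addresses while the DynamoDB name resolves to a public address. A small sketch that makes this distinction explicit, using only addresses copied from this thread. The function name `classify` is hypothetical, and the labels rely on Python's `ipaddress.is_private`, which covers the RFC 1918 ranges along with some other reserved ranges.

```python
import ipaddress


def classify(addresses):
    """Map each IP address string to 'private' or 'public' (illustrative helper)."""
    return {
        addr: "private" if ipaddress.ip_address(addr).is_private else "public"
        for addr in addresses
    }


# Addresses copied from the nslookup output and the error message in this issue.
sts_addresses = ["10.80.98.26", "10.80.85.195", "10.80.118.106", "10.80.138.46"]
dynamodb_addresses = ["52.119.233.250", "3.218.181.176"]

print(classify(sts_addresses))
print(classify(dynamodb_addresses))
```

This only labels the address types. The private STS addresses are consistent with the VPC endpoint mentioned earlier, but that link is an assumption, and the sketch does not explain why connections to the public addresses time out.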

@orsenthil
Member

> The question here is why do I need a VPC endpoint when I am on public network and the security group currently allows all ingress and egress connections.

Does this behavior happen after you enabled security groups for pods (on existing pods)?
Does this happen with new pods or new nodes?

What are the SecurityGroupPolicy and security group rules that demonstrate this behavior?


This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days

@github-actions github-actions bot added the stale label Jul 15, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
