Drop cluster-internal endpoint filtering / pod monitoring #9982
Conversation
```go
@@ -52,12 +47,6 @@ func (proxy *ovsProxyPlugin) Start(baseHandler pconfig.EndpointsConfigHandler) e

	proxy.baseEndpointsHandler = baseHandler

	// Populate pod info map synchronously so that kube proxy can filter endpoints to support isolation
	pods, err := proxy.registry.GetAllPods()
```
Don't need GetAllPods() anymore now either.
Last commit removing the pod watch LGTM, though we can remove some additional registry code too.
Old clusters can still have old data with manually created endpoints with now-disallowed addresses. We need to at least document the need to check/delete such endpoints on upgrade, even though the admission plugin will keep new ones from being created.
Force-pushed from e65eecd to 4b52b98
Added a comment to the release notes bug.
"hostSubnets": hostSubnetStorage, | ||
"netNamespaces": netNamespaceStorage, | ||
"clusterNetworks": clusterNetworkStorage, | ||
"egressNetworkPolicies": egressNetworkPolicyStorage, |
Enable this resource only if the openshift-sdn plugin is used?
We no longer need to do any filtering of cluster network / service network IPs, because the endpoint admission controller does it for us.
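For illustration, a minimal sketch of the kind of check an endpoint admission controller performs, assuming hypothetical names and placeholder CIDRs (this is not the actual plugin code):

```go
// Hypothetical sketch: reject Endpoints whose addresses fall inside
// the cluster or service network. Illustrative names only.
package main

import (
	"fmt"
	"net"
)

type restrictedEndpointsChecker struct {
	restricted []*net.IPNet // e.g. ClusterNetworkCIDR and ServiceNetworkCIDR
}

// findRestrictedIP returns the first address inside a restricted
// network, or "" if every address is acceptable.
func (c *restrictedEndpointsChecker) findRestrictedIP(addrs []string) string {
	for _, s := range addrs {
		ip := net.ParseIP(s)
		if ip == nil {
			return s // reject unparseable addresses too
		}
		for _, cidr := range c.restricted {
			if cidr.Contains(ip) {
				return s
			}
		}
	}
	return ""
}

func main() {
	// Placeholder CIDRs; a real check would read them from the master config.
	_, cluster, _ := net.ParseCIDR("10.1.0.0/16")
	_, services, _ := net.ParseCIDR("172.30.0.0/16")
	c := &restrictedEndpointsChecker{restricted: []*net.IPNet{cluster, services}}
	if bad := c.findRestrictedIP([]string{"172.30.0.10"}); bad != "" {
		fmt.Printf("admission rejects endpoint: %s is in a restricted network\n", bad)
	}
}
```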
Force-pushed from 4b52b98 to aeee7c8
Any reason this can't be merged? Changes look ok to me.
Can we add information to the release note that would help an admin who wanted to find/remove old illegal endpoints?
```go
			continue EndpointLoop
		}
	} else {
		if !ni.ClusterNetwork.Contains(IP) && !ni.ServiceNetwork.Contains(IP) {
```
this is pre-existing, but if an endpoint contains any address outside the allowed range, we drop the entire endpoint object?
yeah... with the old (cluster-internal) filtering, you could really only hit this case if you were pretty explicitly trying to be evil, so there wasn't really any reason for us to be forgiving. I guess with the cluster-external filtering maybe you're more likely to do this accidentally so it might be better to delete just the bad addresses...
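For reference, a minimal sketch of what per-address filtering could look like, using stand-in types and placeholder CIDRs rather than the actual ovsProxyPlugin code:

```go
// Hypothetical per-address alternative: strip only the offending
// addresses instead of dropping the whole Endpoints object.
package main

import (
	"fmt"
	"net"
)

// Minimal stand-ins for the Kubernetes EndpointSubset/EndpointAddress
// types, just enough to show the filtering shape.
type endpointAddress struct{ IP string }
type endpointSubset struct{ Addresses []endpointAddress }

// filterSubsets keeps every address passing the allowed check and
// drops subsets that end up empty.
func filterSubsets(subsets []endpointSubset, allowed func(net.IP) bool) []endpointSubset {
	var out []endpointSubset
	for _, sub := range subsets {
		var kept []endpointAddress
		for _, addr := range sub.Addresses {
			if ip := net.ParseIP(addr.IP); ip != nil && allowed(ip) {
				kept = append(kept, addr)
			}
		}
		if len(kept) > 0 {
			out = append(out, endpointSubset{Addresses: kept})
		}
	}
	return out
}

func main() {
	_, cluster, _ := net.ParseCIDR("10.1.0.0/16") // placeholder CIDRs
	_, services, _ := net.ParseCIDR("172.30.0.0/16")
	allowed := func(ip net.IP) bool { return !cluster.Contains(ip) && !services.Contains(ip) }
	subs := []endpointSubset{{Addresses: []endpointAddress{{IP: "10.1.2.3"}, {IP: "192.0.2.5"}}}}
	fmt.Println(filterSubsets(subs, allowed)) // only 192.0.2.5 survives
}
```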
yeah, not sure what the right thing to do is, can you spawn a follow-up issue to track?
filed #10212
lgtm as well, doc request and question notwithstanding
It was blocking on the egress-firewall merge, but that has happened now
The suggestion that's already there ("look in your logs from before you upgraded") is really the only easy answer. There's no way to search for "all endpoints corresponding to services that don't have selectors and that have addresses that match a particular CIDR", so you'd basically have to just manually examine every endpoint. Or we could provide a script or something, but it would basically just be the code we're deleting here. We could move the illegal-endpoint-checking code so that it only gets run once, at startup, on the master, and have it output suggested "oc delete" commands along with the warnings. (And then drop it in 3.4?)
I would recommend coming up with a simple bash snippet that would find the endpoints, e.g. `oc get endpoints --all-namespaces --template '{{ range .items }}{{ .` or something.
That works for IPs in the service network, but for IPs in the cluster network, you only want to look at the endpoints that weren't generated by the endpoints controller, but you can only figure that out by comparing against the service table... so it would have to be something like
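A hypothetical sketch of that cross-referencing logic, with illustrative types and a placeholder CIDR rather than actual commands from this thread: flag manually created endpoints (their service has no selector) whose addresses fall in the cluster network, and suggest deletions.

```go
// Hypothetical audit sketch, not a real tool from this PR.
package main

import (
	"fmt"
	"net"
)

type service struct {
	namespace, name string
	selector        map[string]string // empty => endpoints were created by hand
}

type endpoints struct {
	namespace, name string
	addresses       []string
}

func suspectDeleteCommands(clusterCIDR *net.IPNet, svcs []service, eps []endpoints) []string {
	// Index selectorless services; only their endpoints are suspect.
	selectorless := map[string]bool{}
	for _, svc := range svcs {
		if len(svc.selector) == 0 {
			selectorless[svc.namespace+"/"+svc.name] = true
		}
	}
	var cmds []string
	for _, ep := range eps {
		if !selectorless[ep.namespace+"/"+ep.name] {
			continue // generated by the endpoints controller; leave it alone
		}
		for _, addr := range ep.addresses {
			if ip := net.ParseIP(addr); ip != nil && clusterCIDR.Contains(ip) {
				cmds = append(cmds, fmt.Sprintf("oc delete endpoints %s -n %s # contains %s", ep.name, ep.namespace, addr))
				break
			}
		}
	}
	return cmds
}

func main() {
	_, cidr, _ := net.ParseCIDR("10.1.0.0/16") // placeholder for the cluster's ClusterNetworkCIDR
	svcs := []service{{namespace: "proj1", name: "external-db"}}
	eps := []endpoints{{namespace: "proj1", name: "external-db", addresses: []string{"10.1.2.3"}}}
	for _, cmd := range suspectDeleteCommands(cidr, svcs, eps) {
		fmt.Println(cmd)
	}
}
```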
(assuming they're using the default 3.2 ClusterNetworkCIDR and not the 3.1 default or a custom value). This is already gross and it's not complete yet. And they've already got a list of all the bad endpoints repeated every 30 seconds in their logs...
noticing the "init containers" release note... maybe we need an upgrade migration/helper script?
LGTM [merge] - please remember to add the release note
[Test]ing while waiting on the merge queue
Evaluated for origin test up to aeee7c8
continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/7684/)
continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/7719/) (Image: devenv-rhel7_4797)
Looks busted
Evaluated for origin merge up to aeee7c8
@danwinship Does this mean that an endpoint pointing to a pod/svc in another project, created by a user with the system:endpoint-controller role, can be accessed from the user's project under the multitenant plugin?
Yes, but this is not intentional/supported, and it might not work that way in the future if we manage to implement the idea in #9255 (comment). (If that happened, then the privileged user would still be able to create the endpoint, but trying to connect to the endpoint would fail, just like trying to connect to the other user's pod directly would.)
With #9383 merged, we no longer need to sanity-check cluster-internal service endpoints, so we can drop the pod monitor.
Unfortunately, for the moment at least, we still need to filter out cluster-external endpoints that violate EgressFirewall rules (#9227). Eventually we will hopefully be able to get rid of that. For now, this branch is based on top of the EgressFirewall branch (since otherwise we'd be completely dropping the endpoint filter in this branch, and then bringing it back in the EgressFirewall branch). It can land after that does.
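For context, a minimal sketch of the shape of that remaining cluster-external check, assuming illustrative types (not the real EgressNetworkPolicy code) and first-match rule evaluation:

```go
// Hypothetical sketch: an external endpoint address is dropped if the
// namespace's egress rules deny it; first matching rule wins.
package main

import (
	"fmt"
	"net"
)

type egressRule struct {
	allow bool
	cidr  *net.IPNet
}

// egressAllowed returns the verdict of the first rule whose CIDR
// contains ip; unmatched traffic is allowed in this sketch.
func egressAllowed(rules []egressRule, ip net.IP) bool {
	for _, r := range rules {
		if r.cidr.Contains(ip) {
			return r.allow
		}
	}
	return true // no rule matched
}

func main() {
	_, blocked, _ := net.ParseCIDR("203.0.113.0/24") // placeholder deny rule
	rules := []egressRule{{allow: false, cidr: blocked}}
	fmt.Println(egressAllowed(rules, net.ParseIP("203.0.113.7")))  // false: endpoint would be filtered
	fmt.Println(egressAllowed(rules, net.ParseIP("198.51.100.1"))) // true: endpoint kept
}
```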
Closes #9255.
@openshift/networking PTAL. Only the last commit is new; the rest is from #9227.