Drop cluster-internal endpoint filtering / pod monitoring #9982

Merged 1 commit into openshift:master from danwinship:drop-endpoint-filtering on Aug 10, 2016

Conversation

danwinship
Contributor

With #9383 merged, we no longer need to sanity-check cluster-internal service endpoints, so we can drop the pod monitor.

Unfortunately, for the moment at least, we still need to filter out cluster-external endpoints that violate EgressFirewall rules (#9227). Eventually we will hopefully be able to get rid of that. For now, this branch is based on top of the EgressFirewall branch (since otherwise we'd be completely dropping the endpoint filter in this branch, and then bringing it back in the EgressFirewall branch). It can land after that does.

Closes #9255.

@openshift/networking PTAL. Only the last commit is new; the rest is from #9227.

@@ -52,12 +47,6 @@ func (proxy *ovsProxyPlugin) Start(baseHandler pconfig.EndpointsConfigHandler) e

proxy.baseEndpointsHandler = baseHandler

// Populate pod info map synchronously so that kube proxy can filter endpoints to support isolation
pods, err := proxy.registry.GetAllPods()
Contributor

Don't need GetAllPods() anymore now either.

@dcbw
Contributor

dcbw commented Jul 22, 2016

Last commit removing the pod watch LGTM, though we can remove some additional registry code too.

@liggitt
Contributor

liggitt commented Jul 23, 2016

Old clusters can still have old data with manually created endpoints with now-disallowed addresses. Need to at least document the need to check/delete such endpoints on upgrade, even though the admission plugin will keep new ones from being created

@danwinship
Contributor Author

> Old clusters can still have old data with manually created endpoints with now-disallowed addresses. Need to at least document the need to check/delete such endpoints on upgrade, even though the admission plugin will keep new ones from being created

Added a comment to the release notes bug.

"hostSubnets": hostSubnetStorage,
"netNamespaces": netNamespaceStorage,
"clusterNetworks": clusterNetworkStorage,
"egressNetworkPolicies": egressNetworkPolicyStorage,


Enable this resource only if openshift-sdn plugin is used?

We no longer need to do any filtering of cluster network / service network IPs, because the endpoint admission controller does it for us.
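For context, the filtering being removed boiled down to testing whether an endpoint IP falls inside the cluster network or service network CIDR. Below is a minimal standalone sketch (in Go) of that kind of containment test; the CIDR values are examples for illustration only, not values taken from this PR or any cluster configuration, and this is not the actual plugin code.

package main

import (
	"fmt"
	"net"
)

// isClusterLocal reports whether ip falls inside either the cluster network
// or the service network. This mirrors the shape of the containment check in
// the dropped proxy filter, but is only an illustrative sketch.
func isClusterLocal(ip net.IP, clusterNetwork, serviceNetwork *net.IPNet) bool {
	return clusterNetwork.Contains(ip) || serviceNetwork.Contains(ip)
}

func main() {
	// Example CIDRs only; a real deployment reads these from its cluster config.
	_, clusterNetwork, _ := net.ParseCIDR("10.128.0.0/14")
	_, serviceNetwork, _ := net.ParseCIDR("172.30.0.0/16")

	for _, s := range []string{"10.130.0.5", "172.30.1.1", "192.168.1.10"} {
		fmt.Printf("%s cluster-local: %v\n", s, isClusterLocal(net.ParseIP(s), clusterNetwork, serviceNetwork))
	}
}
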
@smarterclayton
Contributor

Any reason this can't be merged? Changes look ok to me.

@liggitt
Contributor

liggitt commented Aug 2, 2016

Can we add information to the release note that would help an admin who wanted to find/remove old illegal endpoints?

continue EndpointLoop
}
} else {
if !ni.ClusterNetwork.Contains(IP) && !ni.ServiceNetwork.Contains(IP) {
Contributor

this is pre-existing, but if an endpoint contains any address outside the allowed range, we drop the entire endpoint object?

Contributor Author

Yeah... with the old (cluster-internal) filtering, you could really only hit this case if you were pretty explicitly trying to be evil, so there wasn't really any reason for us to be forgiving. I guess with the cluster-external filtering maybe you're more likely to do this accidentally, so it might be better to delete just the bad addresses...

Contributor

Yeah, not sure what the right thing to do is; can you spawn a follow-up issue to track it?

Contributor Author

filed #10212
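To make the alternative discussed in this thread concrete, here is a minimal standalone sketch (in Go) of pruning only the offending addresses instead of skipping the whole Endpoints object. It uses simplified stand-in types rather than the real Kubernetes API types, and it leaves the policy that decides whether an address is allowed as a caller-supplied function.

package main

import (
	"fmt"
	"net"
)

// Simplified stand-ins for the Kubernetes Endpoints types; the real filter
// operates on the API objects instead.
type EndpointAddress struct{ IP string }
type EndpointSubset struct{ Addresses []EndpointAddress }
type Endpoints struct {
	Name    string
	Subsets []EndpointSubset
}

// pruneAddresses keeps only the addresses accepted by the supplied policy,
// rather than discarding the whole Endpoints object when one address fails.
func pruneAddresses(ep Endpoints, allowed func(net.IP) bool) Endpoints {
	out := Endpoints{Name: ep.Name}
	for _, subset := range ep.Subsets {
		var kept []EndpointAddress
		for _, addr := range subset.Addresses {
			if ip := net.ParseIP(addr.IP); ip != nil && allowed(ip) {
				kept = append(kept, addr)
			} else {
				fmt.Printf("dropping address %s from endpoints %q\n", addr.IP, ep.Name)
			}
		}
		if len(kept) > 0 {
			out.Subsets = append(out.Subsets, EndpointSubset{Addresses: kept})
		}
	}
	return out
}

func main() {
	// Example policy: allow only service-network IPs (the CIDR is an example value).
	_, serviceNetwork, _ := net.ParseCIDR("172.30.0.0/16")
	ep := Endpoints{
		Name:    "example",
		Subsets: []EndpointSubset{{Addresses: []EndpointAddress{{IP: "172.30.1.1"}, {IP: "192.168.1.10"}}}},
	}
	fmt.Printf("%+v\n", pruneAddresses(ep, func(ip net.IP) bool { return serviceNetwork.Contains(ip) }))
}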

@liggitt
Contributor

liggitt commented Aug 2, 2016

lgtm as well, doc request and question notwithstanding

@danwinship changed the title from "Drop cluster-internal endpoint filtering / pod monitoring" to "[DO NOT MERGE] Drop cluster-internal endpoint filtering / pod monitoring" on Aug 2, 2016
@danwinship
Contributor Author

> Any reason this can't be merged? Changes look ok to me.

It was blocking on the egress-firewall merge, but that has happened now.

@danwinship
Contributor Author

> Can we add information to the release note that would help an admin who wanted to find/remove old illegal endpoints?

The suggestion that's already there ("look in your logs from before you upgraded") is really the only easy answer. There's no way to search for "all endpoints corresponding to services that don't have selectors and that have addresses that match a particular CIDR" so you'd basically have to just manually examine every endpoint. Or we could provide a script or something, but it would basically just be the code we're deleting here.

We could move the illegal-endpoint-checking code so that it only gets run once, at startup, on the master, and have it output suggested "oc delete" commands along with the warnings. (And then drop it in 3.4?)

@smarterclayton
Contributor

I would recommend coming up with a simple bash snippet that would find the affected ones and document that:

oc get endpoints --all-namespaces --template '{{ range .items }}{{ .metadata.name }} {{ range .subsets }}{{ range .addresses }}{{ .ip }} {{ end }}{{ end }}{{ end }}' | grep -v ' 172.30.' | cut -f 1-1 -d ' '

or something


@danwinship
Contributor Author

That works for IPs in the service network, but for IPs in the cluster network you only want to look at the endpoints that weren't generated by the endpoints controller, and you can only figure that out by comparing against the service table... so it would have to be something like:

for endpoint in $(oc get services --template '[something that selects only selectorless services]'); do
    oc get endpoint $endpoint --template '[...]' | grep ' 10\.(12[8-9]|1[3-9][0-9]|2[0-5][0-9])\.' | ...

(assuming they're using the default 3.2 ClusterNetworkCIDR and not the 3.1 default or a custom value). This is already gross and it's not complete yet. And they've already got a list of all the bad endpoints repeated every 30 seconds in their logs...

@danwinship
Contributor Author

Noticing the "init containers" release note... maybe we need an upgrade migration/helper script?

@danwinship
Contributor Author

for ep in $(oc get services --all-namespaces --template '{{ range .items}}{{ range .spec.selector }}{{ else }}{{ .metadata.namespace}}:{{ .metadata.name }} {{ end }}{{ end }}'); do
    oc get endpoints --namespace $(echo $ep | sed -e 's/:.*//') $(echo $ep | sed -e 's/.*://') --template '{{ .metadata.namespace }}:{{ .metadata.name }} {{ range .subsets }}{{ range .addresses }}{{ .ip }} {{ end }}{{ end }}{{ "\n" }}' | awk '/ 10\.(12[8-9]|1[3-9][0-9]|2[0-5][0-9])\./ { print $1 }'
done
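If any endpoints turn up, each printed namespace:name pair could then be removed with oc delete endpoints <name> -n <namespace>, once the admin has confirmed the object really is a stale manually-created endpoint.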

@smarterclayton
Contributor

LGTM [merge] - please remember to add the release note

@openshift-bot
Contributor

[Test]ing while waiting on the merge queue

@openshift-bot
Contributor

Evaluated for origin test up to aeee7c8

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/7684/)

@openshift-bot
Contributor

openshift-bot commented Aug 9, 2016

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/7719/) (Image: devenv-rhel7_4797)

@smarterclayton
Contributor

smarterclayton commented Aug 9, 2016 via email

@danwinship
Contributor Author

@openshift-bot
Contributor

Evaluated for origin merge up to aeee7c8

@openshift-bot openshift-bot merged commit 1c1cb0b into openshift:master Aug 10, 2016
@danwinship danwinship deleted the drop-endpoint-filtering branch August 10, 2016 13:14
@bmeng
Contributor

bmeng commented Aug 11, 2016

@danwinship Does this mean that an endpoint pointing to a pod/svc in another project, created by a user with the system:endpoint-controller role, can be accessed from the user's own project under the multitenant plugin?

@danwinship
Contributor Author

> @danwinship Does this mean that an endpoint pointing to a pod/svc in another project, created by a user with the system:endpoint-controller role, can be accessed from the user's own project under the multitenant plugin?

Yes, but this is not intentional/supported, and it might not work that way in the future if we manage to implement the idea in #9255 (comment). (If that happened, then the privileged user would still be able to create the endpoint, but trying to connect to the endpoint would fail, just like trying to connect to the other user's pod directly would.)
