add autoscaler for kube-state-metrics #200
Conversation
I think since the nanny scales resources linearly with the number of nodes in the cluster, I can use small starting resources for the kube-state-metrics container. Here I use the request resources as the base resource requirement in the nanny.
Having never used the nanny pod, this might be incorrect, but I'd expect it to change the deployment manifest, meaning it needs the RBAC roles for that, no?
I looked at the addon-resizer code and from what I can tell it needs:
What that would basically mean is that we need to add the last two points to our RBAC roles. |
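If that's right, the additional rules might look roughly like this (a sketch only; the exact verbs are what we're trying to confirm in this thread):

```yaml
# Hypothetical additions to the kube-state-metrics RBAC rules,
# assuming the addon-resizer needs to read its own pod and
# read/update the kube-state-metrics deployment.
- apiGroups: [""]
  resources:
  - pods
  verbs: ["get"]
- apiGroups: ["extensions"]
  resources:
  - deployments
  verbs: ["get", "update"]
```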
Good point, but I didn't add the last two and it works... see also the example in https://github.com/kubernetes/autoscaler/tree/master/addon-resizer
@piosz @loburm are you aware of the RBAC roles the addon-resizer needs? From looking at the code it doesn't seem to do anything special, so I'd expect it to require them.
@brancz I started a new namespace and deployed the pod-nanny; it complains it can't get the pod and update the deployment due to limited permissions. So maybe it's because kube-system is a special namespace?
Yes that was my suspicion. Would you care to adapt? 🙂 |
You mean add it in this file https://github.com/kubernetes/kube-state-metrics/blob/master/kubernetes/kube-state-metrics-cluster-role.yaml?
Yes, but also get for deployments, as currently it's just list/watch.
Yeah, exactly what I wanted to say, due to this func
small nit, then lgtm
  - deployments
  verbs: ["list", "watch", "get", "update"]
- apiGroups: ["extensions"]
  resources:
  - daemonsets
  - deployments
this can be removed as it's redundant with the above rule
Sorry, will do.
Fixed, PTAL
I read the doc https://kubernetes.io/docs/admin/authorization/rbac/ and found that: "Default RBAC policies grant scoped permissions to control-plane components, nodes, and controllers, but grant no permissions to service accounts outside the “kube-system” namespace (beyond discovery permissions given to all authenticated users)."
We actually don't run kube-state-metrics in the kube-system namespace, so we do have this very problem. Yes
I think it should be in kube-system, no need to change IMHO
Agreed, but it's still up to the user, so the RBAC roles should be properly documented.
@kubernetes/sig-api-machinery-pr-reviews for the rbac stuff
Also, we already use the pod-nanny with Heapster. I'm not sure where we can find the rbac rules for it. FYI @kubernetes/sig-autoscaling-misc
  - deployments
  verbs: ["list", "watch", "get", "update"]
Letting this update any replicaset and daemonset in the system doesn't seem appropriate. I'd expect those permissions to be limited to the namespace and the specific replicaset/daemonset it is responsible for. Also, the update permission belongs in a separate role specific to the nanny, since someone who is not using the nanny should not need to give write permission to their metrics pod.
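Concretely, the separation might look like this (a sketch, under the assumption that cluster-wide access is only ever needed for reading):

```yaml
# Sketch: the cluster-wide role keeps only read verbs, which is all
# that metric collection needs; any update permission would live in
# a separate namespaced Role granted only when the nanny is deployed.
- apiGroups: ["extensions"]
  resources:
  - daemonsets
  - deployments
  - replicasets
  verbs: ["list", "watch"]
```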
The replicasets and daemonsets are just given list and watch permission, no update.
Ah, sorry, misread the diff. The same comment applies to deployments re the update permission: it should be just on this particular deployment in this namespace.
In that case we need to separate the list/watch from the get and update, as list/watch needs to be on all namespaces for the purpose of collecting the objects to produce their metrics. Thus get and update only need to be on that specific namespace.
> In that case we need to separate the list/watch from the get and update, as list/watch needs to be on all namespaces for the purpose of collecting the objects to produce their metrics. Thus get and update only need to be on that specific namespace.

Exactly, in a Role in the kube-system namespace, bound with a RoleBinding. The rule should also specify resourceNames: ["kube-state-metrics"] to only allow modifying the relevant deployment.
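Put together, that might look like the following (a sketch; the Role name and the ServiceAccount it binds to are assumptions based on this thread, not the final manifest):

```yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  namespace: kube-system
  name: pod-nanny
rules:
- apiGroups: ["extensions"]
  resources: ["deployments"]
  # restrict writes to the one deployment the nanny manages
  resourceNames: ["kube-state-metrics"]
  verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  namespace: kube-system
  name: pod-nanny
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-nanny
subjects:
- kind: ServiceAccount
  name: kube-state-metrics
  namespace: kube-system
```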
good point
But I think maybe it's a little too tight; from the code we can see that the pod-nanny only gets and updates the relevant deployment we specified. I think we can trust it.
That's less about not trusting the code/binary of the addon-resizer than about an attacker being able to modify arbitrary deployments (aka the principle of least privilege). Which, as a side note, makes a great example of why we should not run kube-state-metrics in the kube-system namespace.
So I will submit another commit to fix the rbac issue
Suggestion on the patch. LGTM otherwise
  resources:
  - deployments
  resourceNames: ["kube-state-metrics"]
  verbs: ["get", "update"]
Might want to allow patch as well… not sure how the nanny updates
The nanny updates using https://github.com/kubernetes/autoscaler/blob/master/addon-resizer/nanny/kubernetes_client.go#L69
just using get and update
@liggitt another question here: if I already have a role that gives "list pods" permission, is it right that I also have "get a specific pod" permission?
@brancz I found that we actually don't need to add "get pods" permission even if we deploy kube-state-metrics in another brand-new namespace. Like below, I deploy it in the monitor namespace
Tested on my gke (k8s version 1.7.2)
I would expect this to cause errors at runtime; did you have a look at the logs of the addon-resizer? The actions should be distinct. Also, we should trigger the case where autoscaling kicks in, to see that it actually works properly.
@ericchiang @liggitt is this intended, that get/list RBAC actions are both allowed when only one of them is specified?
That is not the case. The clusterrole already grants list pods to the service account, so anything running with that service account would have list permission. If the resizer is doing a get of a pod, either RBAC (or another configured authorizer) is allowing an explicit get. Did you set up your gke cluster with legacy abac permissions disabled?
@liggitt no extra config, I just used the default settings
Do you know how to get the apiserver start configuration on gke? I can't find it anywhere, maybe google hides it...
I don't think it's accessible. Did you start your cluster passing
@liggitt I checked the configuration, it is "Legacy authorization
Happy we could figure it out, lgtm now 👍. I'll hold off for a final lgtm from @liggitt though.
Seems fine as a starting point. As a test, I'd recommend deploying the addon-resizer in its own pod with its own service account, and seeing if the role is sufficient.
@liggitt If we deploy the addon-resizer in its own pod with its own sa, we also need a cluster role with list nodes permission, because it must get the cluster node count to determine whether to scale or not. Tested on my cluster with this. Right now it is using the kube-state-metrics clusterrole to work.
That's worth noting, but given the Role for the addon-resizer is pretty specific to kube-state-metrics, granting the kube-state-metrics clusterrole to the resizer wouldn't be terrible. The important thing is to keep the deployment mutation permissions separate, so that someone running without the addon-resizer wouldn't be required to give their metrics pod permission to modify the deployment. This PR does that, so I think it's fine as-is.
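For anyone trying that split, the extra read access would be something like this (a sketch; the nodes rule is the only addition being discussed here):

```yaml
# Sketch: extra ClusterRole rule a standalone resizer ServiceAccount
# would need, since the nanny lists nodes to size container resources.
- apiGroups: [""]
  resources:
  - nodes
  verbs: ["list"]
```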
small nits on the naming
@@ -30,4 +30,30 @@ spec:
        limits:
          memory: 50Mi
          cpu: 200m
+     - name: pod-nanny
as the container image is called addon-resizer, let's call the container the same
kind: Role
metadata:
  namespace: kube-system
  name: pod-nanny
same as above, I'd like to change this to something more expressive to what it is, maybe something like: s/pod-nanny/kube-state-metrics-resizer/
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-nanny
due to my suggestion to change the name of this, we need to adapt here as well
Fixed @brancz, ptal
Tested ok on my cluster
Great, thanks! I think we can move forward and get 1.0 out!
add autoscaler for kube-state-metrics
For #124
(provide deployment manifest that scales with cluster size using pod nanny)
This setup is tested on my gke cluster; now the kube-state-metrics pod has two containers, kube-state-metrics and pod-nanny.
The cpu and memory parameters can be adjusted too.
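For reference, the two-container pod spec this PR describes looks roughly like the following (a sketch: image tags and resource values are illustrative, and the flag names follow the addon-resizer README rather than this PR's exact diff):

```yaml
      containers:
      - name: kube-state-metrics
        image: gcr.io/google_containers/kube-state-metrics:v0.5.0
        ports:
        - name: metrics
          containerPort: 8080
      - name: pod-nanny
        image: gcr.io/google_containers/addon-resizer:1.7
        env:
        # the nanny reads its own pod via the downward API
        - name: MY_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: MY_POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        command:
        - /pod_nanny
        # base resources; the nanny scales them linearly with node count
        - --container=kube-state-metrics
        - --cpu=100m
        - --extra-cpu=1m
        - --memory=100Mi
        - --extra-memory=2Mi
        - --deployment=kube-state-metrics
```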