[RFC] Flux Multi-Tenancy Mode #2093

Closed
206 changes: 206 additions & 0 deletions rfcs/0003-multi-tenancy-mode/README.md
@@ -0,0 +1,206 @@
# RFC-0003 Flux Multi-Tenancy Mode

**Status:** provisional

**Creation date:** 2021-11-16

**Last update:** 2022-02-03

## Summary

For multi-tenant environments, we want to offer an easy way of configuring Flux to enforce tenant isolation
(as defined by the Soft Multi-Tenancy model from RFC-0001).

When running in the multi-tenant mode, Flux will lock down access to sources (as defined by RFC-0002),
and will use the tenant service account instead of defaulting to `cluster-admin`.

From an end-user perspective, the multi-tenancy mode means that:

- Platform admins have to create a Kubernetes service account and RBAC in each namespace where
Flux performs source-to-cluster reconciliation on behalf of tenants.
By default, Flux will have no permissions to reconcile the tenants' sources onto clusters.
- Source owners have to specify with which tenants they wish to share their sources.
By default, nothing is shared between tenants.

## Motivation

As of [version 0.26](https://github.com/fluxcd/flux2/releases/tag/v0.26.0) (Feb 2022),
configuring Flux for soft multi-tenancy requires platform admins to:
- Deny cross-namespace access to Flux custom resources by setting the `--no-cross-namespace-refs` flag.
- Enforce impersonation by setting a default service account with the `--default-service-account` flag.

Instead of using a Kustomize patch to lock down Flux as described in the
[multi-tenancy lockdown documentation](https://fluxcd.io/docs/installation/#multi-tenancy-lockdown),
we could extend `flux install` and `flux bootstrap` and offer a flag to configure Flux with multi-tenancy enforcements.
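
For context, the Kustomize-based lockdown that such a flag would replace can be sketched roughly as follows. This is an illustrative sketch modelled on the linked documentation, not a verbatim copy of it. It assumes the `gotk-components.yaml` and `gotk-sync.yaml` files that `flux bootstrap` generates, and it uses `default` as a placeholder service account name.

```yaml
# Sketch only: kustomization.yaml placed next to the bootstrap manifests.
# File names and the "default" service account are assumptions for illustration.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
  - gotk-sync.yaml
patches:
  # Deny cross-namespace references in every controller that supports the flag.
  - patch: |
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --no-cross-namespace-refs=true
    target:
      kind: Deployment
      name: "(kustomize-controller|helm-controller|notification-controller|image-reflector-controller|image-automation-controller)"
  # Enforce impersonation by setting a default service account.
  - patch: |
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --default-service-account=default
    target:
      kind: Deployment
      name: "(kustomize-controller|helm-controller)"
```

Maintaining a patch like this by hand is the friction that a single bootstrap flag is meant to remove.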

### Goals

- Enforce service account impersonation for source-to-cluster reconciliation.
- Enforce ACLs for cross-namespace access to sources.

### Non-Goals

- Enforce tenant's workload isolation with network policies and pod security standards as described
[here](https://kubernetes.io/blog/2021/04/15/three-tenancy-models-for-kubernetes/#security-considerations).

## Proposal

### User Stories

#### Story 1

> As a platform admin, I want to install Flux with the lowest privilege/permission level possible.

#### Story 2

> As a platform admin, I want to give tenants full control over their assigned namespaces,
> so that tenants can use their own repositories and manage the app delivery with Flux.

#### Story 3

> As a platform admin, I want to prevent tenants from changing the cluster-wide configuration.
> If a tenant adds to their repository a cluster-scoped resource such as a namespace or cluster role,
> Flux should reject the change and notify the tenant that this operation is not allowed.

### Multi-tenant Bootstrap

When bootstrapping Flux, platform admins should have the option to lock down Flux for multi-tenant environments e.g.:

```shell
flux bootstrap --security-profile=multi-tenant
```

The security profile flag accepts two values: `single-tenant` and `multi-tenant`.
Platform admins may switch between the two modes at any time, either by rerunning bootstrap
or by patching the Flux manifests in Git.

The `multi-tenant` profile is a shortcut for setting the following container args in the Flux deployment manifests:

```yaml
containers:
- name: manager
  args:
  - --default-service-account=flux
  - --enable-source-acl=true
```

Review comments on `--default-service-account=flux`:

> **Contributor:** +1 for this being configurable

> Likewise. This provides a nice way to adopt ServiceAccounts that are already set up.

And for disabling cross-namespace references when using the notification API:

```yaml
kind: Deployment
metadata:
  name: notification-controller
spec:
  template:
    spec:
      containers:
      - name: manager
        args:
        - --no-cross-namespace-refs=true
```

When running in the `multi-tenant` mode, Flux behaves differently:

- The source-to-cluster reconciliation no longer runs under the service account of
the Flux controllers. The controller service account is only used to impersonate
the service account specified in the Flux custom resources (`Kustomizations`, `HelmReleases`).
- When no service account name is specified in a Flux custom resource,
a default will be used e.g. `system:serviceaccount:<tenant-namespace>:flux`.
- When a Flux custom resource (`Kustomizations`, `HelmReleases`, `ImagePolicies`, `ImageUpdateAutomations`)
refers to a source in a different namespace, access is granted based on the source's access control list.
If no ACL is defined for a source, cross-namespace access is denied.
- When a Flux notification (`Alerts`, `Receivers`)
refers to a resource in a different namespace, access is denied.

Review comments on the cross-namespace source access item:

> **Contributor:** Can we clarify the exact permissions that we are giving to the Flux controllers in this mode? I'm assuming GET/LIST access to .fluxcd.io CRDS, secrets, etc. across all namespaces?

> **Member Author (@stefanprodan, Nov 22, 2021):** The controller cluster role is defined here: https://github.com/fluxcd/flux2/blob/main/manifests/rbac/controller.yaml
>
> Should I paste this in RFC?

> **Contributor:** Yes, this would be good to add into the RFC for everyone's visibility

> **Member Author:** I'm assuming that people read Flux docs not just this RFC. The RBAC and impersonation is covered here https://fluxcd.io/docs/security/#controller-permissions.

> **Contributor (@jonathan-innis, Nov 29, 2021):** With the implementation of this security proposal, we can drop the cluster-reconciler in this case for the kustomize-controller and helm-controller, though. I'm thinking this would be good to call out vs. the status-quo where they are given cluster-admin as default

> **Member Author (@stefanprodan, Nov 29, 2021):** We can’t drop the cluster role binding to cluster-admin, admins can switch between single-tenant and multi-tenant at any time. The controllers use the cluster-admin to impersonate the tenant account.
>
> From the docs I linked above:
>
> > However in a soft multi-tenancy setup, Flux does not reconcile a tenant’s repo under the cluster-admin role. Instead you specify a different service account in your manifest, and the Flux controllers will use the Kubernetes Impersonation API under cluster-admin to impersonate that service account.

> **Member:** In this mode, cluster-reconciler could bind to a ClusterRole that only provides the impersonation powers it requires. (i.e. allowing a Platform administrator to further limit the ClusterRoleBinding to specific accounts it can run as)
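
The source ACL and default service account behaviours described in this section can be sketched as follows. This is a hedged illustration, assuming the `accessFrom` field proposed in RFC-0002; the `platform` namespace, the `shared-base` name and the repository URL are hypothetical. The namespace label `kubernetes.io/metadata.name` is the one Kubernetes sets automatically on namespaces.

```yaml
# Sketch only: a source owned by the platform team and shared with tenant1.
# Namespace, name and URL are hypothetical.
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: shared-base
  namespace: platform
spec:
  interval: 5m0s
  url: https://github.com/org/shared-base
  ref:
    branch: main
  # ACL: only namespaces matching this selector may reference the source.
  accessFrom:
    namespaceSelectors:
      - matchLabels:
          kubernetes.io/metadata.name: tenant1
---
# A tenant Kustomization that refers to the shared source across namespaces.
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: shared-base
  namespace: tenant1
spec:
  interval: 10m0s
  path: ./
  prune: true
  # No serviceAccountName is set, so in multi-tenant mode the default
  # account (system:serviceaccount:tenant1:flux) would be impersonated.
  sourceRef:
    kind: GitRepository
    name: shared-base
    namespace: platform
```

Under this sketch, the same `Kustomization` created in a namespace other than `tenant1` would be denied access to the source, and removing `accessFrom` would deny access to every other namespace.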

### Tenants Onboarding

When onboarding tenants, platform admins should have the option to assign namespaces, set
permissions and register the tenants' repositories onto clusters in a declarative manner.

The Flux CLI offers an easy way of generating all the Kubernetes manifests needed to onboard tenants:

- `flux create tenant` command generates namespaces, service accounts and Kubernetes RBAC
with restricted access to the cluster resources, giving tenants access only to their namespaces.
- `flux create secret git` command generates SSH keys used by Flux to clone the tenants' repositories.
- `flux create source git` command generates the configuration that tells Flux which repositories belong to tenants.
- `flux create kustomization` command generates the configuration that tells Flux how to reconcile the manifests found in the tenants' repositories.

All the above commands have an `--export` flag for generating the Kubernetes resources in YAML format.
The platform admins should place the generated manifests in the repository that defines the desired state of the cluster(s).

Here is an example of the generated manifests:

```yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: tenant1
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flux
  namespace: tenant1
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: flux
  namespace: tenant1
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: flux
    namespace: tenant1
---
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: tenant1
  namespace: tenant1
spec:
  interval: 5m0s
  ref:
    branch: main
  secretRef:
    name: tenant1-git-auth
  url: ssh://git@github.com/org/tenant1
---
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: tenant1
  namespace: tenant1
spec:
  interval: 10m0s
  path: ./
  prune: true
  serviceAccountName: flux
  sourceRef:
    kind: GitRepository
    name: tenant1
```

Review comments on `serviceAccountName: flux`:

> **Contributor:** With this proposed method, the kustomize-controller or helm-controller can reconcile any service account within the namespace of the CR. With this in mind, a user needs to be hyper-aware that any service account in that namespace could be used for reconciliation and must not role-bind to give any service account any kind of elevated privilege that it did not intend for the flux reconcilers (such as giving access to create HelmReleases across the cluster, which essentially opens up Pandora's box). Will this be noted somewhere specifically/is there a way to ensure that flux can only reconcile with service accounts with .metadata.labels intended specifically for flux?

> **Member Author:** I think we shouldn't enforce any limitations for the service accounts that are provisioned by platform admins because they may choose to allow a tenant to own multiple namespaces, so they can bind a SA to all those namespaces. A tenant can't create cluster role bindings so it's not possible for them to evade their boundary.

> **Member:** Wouldn't this same concern apply to e.g. a Deployment which accepts any arbitrary service account as a configuration value? Which I assume, isn't a problem. Raising the question why it would be different for a Flux Custom Resource?

> **Member:**
>
> > Wouldn't this same concern apply to e.g. a Deployment which accepts any arbitrary service account as a configuration value?
>
> These are the same technically -- a Deployment can run a container that uses kubectl to cause mischief. But the opportunity for mistakes is higher when Flux is introduced. With only Deployments and other "normal" objects you would not usually consider creating a service account with wide-ranging permissions. When you're using Flux, you have much more (RBAC) to think about.

> **Member Author (@stefanprodan, Dec 9, 2021):**
>
> > With only Deployments and other "normal" objects you would not usually consider creating a service account with wide-ranging permissions.
>
> Depends on what you consider to be "normal" here, every single Kubernetes addon/controller needs wide-ranging permissions, are these abnormal?

> **Member (@hiddeco, Dec 9, 2021):** I think my point is more that if you make extensive use of RBAC, you should be aware of what this entails (for your tenants).
>
> The recommendation of not sharing a service account between multiple resources, or using any arbitrary account that is available to you, does not change for a Deployment or a Flux resource.

> **Member (@hiddeco, Dec 9, 2021):** Not saying here that this means we should not highlight or be upfront about it, but rather that I am not really enthusiastic about adding e.g. label selectors for service accounts.

> **Member:**
>
> > I am not really enthusiastic about adding e.g. label selectors for service accounts.
>
> Right -- requiring labels is not a protection, since any process that can run with the service account can do mischief, not just the Flux controllers.

> **Member:** The main challenge on the model in which the tenant's service account is created in the tenant's namespace is that anyone with pod create permissions in the tenant's namespace would be able to privesc to whatever permissions the flux tenant service account is running as - current example is cluster-admin within that namespace.
>
> In this model platform admins may have to lean on admission controllers to block that path to privilege escalation.

> **Member (@pjbgf, Dec 13, 2021):** I added some recommendations on how the tenant can be structured to protect their service accounts in the RFC004 (xref: #2086 (comment)) without requiring admission controllers.
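
The `GitRepository` in the example manifests references a secret named `tenant1-git-auth`, which is not part of the listing. As a rough sketch of what `flux create secret git --export` would be expected to produce for an SSH URL, the secret carries the private key, the public key and the host key of the Git server. All values below are placeholders, not real key material.

```yaml
# Sketch only: placeholder values, never commit real private keys unencrypted.
apiVersion: v1
kind: Secret
metadata:
  name: tenant1-git-auth
  namespace: tenant1
stringData:
  identity: |
    <PEM-encoded private key>
  identity.pub: |
    <public key, to be registered as a deploy key on the tenant repository>
  known_hosts: |
    <host key of the Git server, e.g. the output of ssh-keyscan github.com>
```

Since this secret contains a private key, platform admins would typically encrypt it before placing it in Git or create it on the cluster out of band; which approach fits is a decision this sketch does not cover.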

Note that the [cluster-admin](https://kubernetes.io/docs/reference/access-authn-authz/rbac/#user-facing-roles)
role is bound with a `RoleBinding`, which only gives full control over the resources in the role binding's namespace.
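
To illustrate the boundary this creates (see Story 3), consider a tenant adding a cluster-scoped resource to their repository. The manifest below is a hypothetical example; the `tenant1-read-secrets` name is made up for illustration.

```yaml
# Sketch only: a cluster-scoped resource committed to the tenant1 repository.
# When Flux impersonates system:serviceaccount:tenant1:flux, the RoleBinding
# above grants permissions inside the tenant1 namespace only, so creating a
# ClusterRole is not authorized and the apply is expected to be rejected.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tenant1-read-secrets
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list", "watch"]
```

The Kubernetes API server would be expected to answer the impersonated request with a forbidden error, which Flux can surface in the status of the tenant's `Kustomization` and in its events.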

Once the tenants' repositories are registered on the cluster(s), the tenants can configure their app delivery
in Git using Kubernetes namespace-scoped resources such as `Deployments`, `Services`, Flagger `Canaries`,
Flux `Kustomizations`, `HelmReleases`, `ImageUpdateAutomations`, `Alerts`, `Receivers`, etc.
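
As a hedged sketch of what a tenant might keep in their own repository, the following notification setup stays entirely inside the tenant namespace, so it does not depend on cross-namespace references. The Slack provider, the channel and the `slack-webhook` secret name are assumptions for illustration.

```yaml
# Sketch only: tenant-owned notification resources in the tenant1 namespace.
apiVersion: notification.toolkit.fluxcd.io/v1beta1
kind: Provider
metadata:
  name: slack
  namespace: tenant1
spec:
  type: slack
  channel: tenant1-alerts
  secretRef:
    name: slack-webhook  # hypothetical secret holding the webhook address
---
apiVersion: notification.toolkit.fluxcd.io/v1beta1
kind: Alert
metadata:
  name: tenant1
  namespace: tenant1
spec:
  providerRef:
    name: slack
  eventSeverity: info
  # Event sources omit the namespace field and therefore resolve to tenant1.
  eventSources:
    - kind: GitRepository
      name: tenant1
    - kind: Kustomization
      name: tenant1
```

An event source pointing at another namespace would be denied when `--no-cross-namespace-refs=true` is set on the notification-controller.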

## Alternatives

Instead of introducing the security profile flag to `flux bootstrap`,
we could document how to patch each controller deployment with Kustomize as described in the
[multi-tenancy lockdown documentation](https://fluxcd.io/docs/installation/#multi-tenancy-lockdown).

Having an easy way of locking down Flux with a single flag makes users aware of the security implications
and improves the user experience.

## Implementation History

- Support for disabling cross-namespace access and for setting a default service account was first released in flux2 **v0.26.0**.