Most clusters drift toward cluster-admin because it is the path of least resistance: someone hits a Forbidden, a binding gets widened, and nobody ever narrows it back. This is a playbook for doing the opposite — scoping permissions deliberately, binding to groups instead of people, and proving on a schedule that the model still holds.
1. RBAC primitives, revisited
Four object kinds do all the work. The split that trips people up is scope, not function.
| Kind | Scope | Grants permissions in |
|---|---|---|
| Role | namespaced | its own namespace |
| ClusterRole | cluster-wide | all namespaces, plus cluster-scoped resources and non-resource URLs |
| RoleBinding | namespaced | one namespace (subject ← Role or ClusterRole) |
| ClusterRoleBinding | cluster-wide | every namespace |
The non-obvious combination: a RoleBinding can reference a ClusterRole. That lets you author a permission set once as a ClusterRole and grant it per namespace via RoleBinding — the workhorse pattern for multi-tenant clusters.
RBAC is purely additive. There are no deny rules; effective permission is the union of every binding a subject matches. You reduce access by removing or narrowing bindings, never by adding a denial.
A rule is apiGroups × resources × verbs, optionally narrowed by resourceNames:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deploy-restart
  namespace: team-payments
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "patch"]   # patch = rollout restart
  - apiGroups: ["apps"]
    resources: ["deployments/scale"]            # subresource is a separate grant
    verbs: ["update", "patch"]
Subjects are not Kubernetes objects. Users and groups are opaque strings asserted by the authenticator (your OIDC provider or client cert). The API server never validates that a user “exists” — it only matches the string. ServiceAccounts are real namespaced objects.
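Because subjects are matched as plain strings, it can help to see the exact strings a ServiceAccount presents to the authorizer. The sketch below builds them. The helper names `sa_username` and `sa_groups` are hypothetical; the `system:serviceaccount:` and `system:serviceaccounts` formats are the ones Kubernetes assigns.

```shell
#!/bin/sh
# Hypothetical helpers: print the identity strings RBAC matches for a ServiceAccount.

# Username form, usable with `kubectl auth can-i --as=...`
sa_username() {   # usage: sa_username <namespace> <name>
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# ServiceAccount-specific groups: all SAs, and all SAs in one namespace
sa_groups() {     # usage: sa_groups <namespace>
  printf '%s\n' "system:serviceaccounts" "system:serviceaccounts:$1"
}

# Example (needs a cluster, so it is left commented out):
# kubectl auth can-i --list --as="$(sa_username team-payments ci-deployer)"
```

A binding to the group `system:serviceaccounts:team-payments` therefore reaches every ServiceAccount in that namespace, which is rarely what a least-privilege model wants.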
2. Map personas to scoped roles
Start from who is asking, not from a default role. A workable baseline for a shared cluster:
| Persona | Scope | Core verbs |
|---|---|---|
| App developer | own namespace | read all; create/patch/delete deployments, configmaps, services; get pod logs; create pods/exec (gated) |
| SRE / on-call | cluster-wide | read everything; delete pods; patch nodes (cordon/drain); read events |
| CI pipeline (SA) | own namespace | apply app resources; no secrets read, no exec |
| Tenant team | own namespace | the developer set, bound only in their namespace |
Author each as a ClusterRole so it is reusable, then bind with a namespaced RoleBinding:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: app-developer
rules:
  - apiGroups: ["", "apps", "batch", "networking.k8s.io"]
    resources: ["deployments", "replicasets", "pods", "services",
                "configmaps", "jobs", "cronjobs", "ingresses"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "list"]
  # NOTE: secrets and pods/exec are deliberately omitted
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: payments-developers
  namespace: team-payments
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole            # reuse the cluster role, scope it here
  name: app-developer
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: "eng-payments"       # an OIDC group, not a person
Note that pods/log and pods/exec are separate subresources. Granting log access does not grant exec — keep exec on its own, more tightly bound role.
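The same binding can be rendered from the command line instead of hand-written, which keeps per-namespace bindings uniform. This is a minimal sketch: `render_group_binding` and its derived binding name are hypothetical, while `kubectl create rolebinding` with `--clusterrole`, `--group`, and `--dry-run=client -o yaml` is standard kubectl.

```shell
#!/bin/sh
# Hypothetical helper: render a RoleBinding that grants a ClusterRole to a
# group in one namespace. --dry-run=client prints YAML without touching the cluster.
render_group_binding() {   # usage: render_group_binding <namespace> <clusterrole> <group>
  # The binding name "<group>-<clusterrole>" is an illustrative convention.
  kubectl create rolebinding "$3-$2" \
    --namespace="$1" \
    --clusterrole="$2" \
    --group="$3" \
    --dry-run=client -o yaml
}

# Example: review the output, commit it to Git, then apply.
# render_group_binding team-payments app-developer eng-payments > payments-developers.yaml
```

Rendering to a file rather than applying directly keeps the change reviewable before it reaches the cluster.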
3. Aggregated ClusterRoles for maintainable sets
Hand-maintaining the rule list above across a dozen teams is how drift starts. Use ClusterRole aggregation: a controller automatically merges the rules of any ClusterRole whose labels match an aggregation selector.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: monitoring-reader      # the aggregate: leave rules empty
aggregationRule:
  clusterRoleSelectors:
    - matchLabels:
        rbac.kv.io/aggregate-to-monitoring: "true"
rules: []                      # filled in by the controller; do not hand-edit
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: monitoring-prometheus
  labels:
    rbac.kv.io/aggregate-to-monitoring: "true"   # contributes into the aggregate
rules:
  - apiGroups: [""]
    resources: ["nodes/metrics", "services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
Bind a subject to monitoring-reader; later, drop in a new labeled ClusterRole and the grant expands with zero edits to bindings. This is exactly how the built-in admin, edit, and view roles absorb CRD permissions — operators ship roles labeled rbac.authorization.k8s.io/aggregate-to-edit: "true" and they merge automatically.
Aggregation is convenient and a footgun. Any actor who can create a ClusterRole carrying a matching label silently widens every aggregate that selects on that label. Treat create on clusterroles as a privileged grant (more on this in section 7).
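One way to act on that warning is to ask the authorizer directly whether a given subject can write ClusterRoles at all. The sketch below assumes nothing beyond `kubectl auth can-i`; the function name `can_write_clusterroles` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report whether a subject can create or modify
# ClusterRoles, and so could feed rules into a label-selected aggregate.
can_write_clusterroles() {   # usage: can_write_clusterroles <subject>
  for verb in create update patch; do
    # can-i exits non-zero when the answer is "no", so ignore the status
    # and keep only the printed answer.
    answer=$(kubectl auth can-i "$verb" clusterroles --as="$1" 2>/dev/null || true)
    printf '%s clusterroles: %s\n' "$verb" "${answer:-unknown}"
  done
}

# Example (needs a cluster and impersonation rights):
# can_write_clusterroles system:serviceaccount:team-payments:ci-deployer
```

Any "yes" for a tenant or CI subject is worth a closer look before more aggregates are introduced.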
4. ServiceAccount hygiene
Every pod runs as a ServiceAccount. If you do nothing, the default SA token is mounted into every pod at /var/run/secrets/kubernetes.io/serviceaccount/ — a ready-made credential for anyone who lands RCE in a container.
Disable automount unless the workload calls the API. Set it on the SA (and you can override per-pod):
apiVersion: v1
kind: ServiceAccount
metadata:
  name: web-frontend
  namespace: team-payments
automountServiceAccountToken: false        # frontend never talks to the API
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend
  namespace: team-payments
spec:
  template:
    spec:
      serviceAccountName: web-frontend
      automountServiceAccountToken: false  # belt-and-suspenders at pod level
Give each workload its own SA and bind only the verbs it needs. Never reuse default, and never bind anything to the default SA. Find pods still riding the default SA:
kubectl get pods -A -o json | jq -r '
.items[]
| select((.spec.serviceAccountName // "default") == "default")
| "\(.metadata.namespace)/\(.metadata.name)"'
Since Kubernetes 1.24, SAs no longer auto-create long-lived Secret tokens; pods get short-lived bound tokens via the TokenRequest API, auto-rotated and audience-scoped. If you find a manually created kubernetes.io/service-account-token Secret, treat it as a static credential to eliminate.
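The cleanup described above can be sketched as two small helpers: one to find static token Secrets, one to mint a short-lived replacement through the TokenRequest API. The function names and the ten-minute lifetime are illustrative assumptions; `kubectl create token` requires kubectl 1.24 or newer.

```shell
#!/bin/sh
# Hypothetical helpers for hunting static ServiceAccount credentials.

# List Secrets that hold static ServiceAccount tokens, across all namespaces.
static_sa_tokens() {
  kubectl get secrets --all-namespaces \
    --field-selector type=kubernetes.io/service-account-token \
    -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name
}

# Mint a short-lived, audience-scoped token for the rare job that needs
# a token outside a pod. The 10m duration is an arbitrary example value.
short_lived_token() {   # usage: short_lived_token <namespace> <serviceaccount> <audience>
  kubectl create token "$2" --namespace="$1" --audience="$3" --duration=10m
}

# Example (needs a cluster):
# static_sa_tokens
```

Before deleting a Secret the first helper reports, check whether something outside the cluster still authenticates with it.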
5. Bind to OIDC groups, not individuals
Per-user bindings are unauditable and they outlive the user. Wire the API server to your IdP (Entra ID, Okta, or Dex fronting either) and bind to groups.
API server flags (managed control planes expose these as cluster settings — e.g. AKS/EKS OIDC integration — rather than raw flags):
kube-apiserver \
--oidc-issuer-url=https://login.microsoftonline.com/<tenant-id>/v2.0 \
--oidc-client-id=<app-client-id> \
--oidc-username-claim=sub \
--oidc-username-prefix="oidc:" \
--oidc-groups-claim=groups \
--oidc-groups-prefix="oidc:"
The prefixes namespace external identities so they cannot collide with built-ins like system:masters. Subjects then reference the prefixed group:
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: "oidc:eng-payments"
Now access control lives in your IdP: add a user to the eng-payments group and they get exactly the bound role; remove them and access is gone at next token refresh. No cluster change, full IdP audit trail. For local CLI auth, kubectl uses the oidc exec/auth plugin (or kubelogin for Entra) to fetch and refresh tokens.
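When a group binding does not take effect, the usual culprit is a token that lacks the expected claim. The sketch below decodes the payload of an ID token so you can check for the claim named by --oidc-groups-claim. It assumes GNU `base64 -d`, does not verify the signature, and the function name `jwt_payload` is hypothetical.

```shell
#!/bin/sh
# Hypothetical debugging helper: print the JSON payload of an OIDC ID token.
# It does NOT verify the signature; use it only to inspect your own token.
jwt_payload() {   # usage: jwt_payload <id-token>
  # The payload is the second dot-separated segment, base64url-encoded.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops padding; restore it so `base64 -d` accepts the input.
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}

# Example: look for the "groups" array in the output.
# jwt_payload "$ID_TOKEN"
```

The API server applies --oidc-groups-prefix after reading the claim, so the token shows eng-payments while the binding must name oidc:eng-payments.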
6. Find over-permissioned subjects
kubectl auth can-i is the built-in primitive — and the only one you should trust as ground truth, because it asks the API server’s actual authorizer:
# Impersonate a subject and ask a specific question
kubectl auth can-i delete secrets -n team-payments \
--as="oidc:alice@corp.com" --as-group="oidc:eng-payments"
# List everything a ServiceAccount can do
kubectl auth can-i --list \
--as=system:serviceaccount:team-payments:ci-deployer
For the reverse question — who can do something — reach for community tools. They parse RBAC objects, so confirm hits with auth can-i:
# krew plugins
kubectl krew install who-can rbac-tool access-matrix
# Who can read secrets anywhere?
kubectl who-can get secrets -A
# Resolve a subject's full effective permissions (transitively)
kubectl rbac-tool lookup oidc:alice@corp.com
kubectl rbac-tool policy-rules -e '^system:serviceaccount:.*'
# rakkess: per-resource access matrix for the current (or impersonated) subject
kubectl access-matrix --as=system:serviceaccount:team-payments:ci-deployer
rbac-tool can also generate a least-privilege policy from observed audit-log activity (rbac-tool auditgen), a strong starting point when retrofitting a permissive SA.
7. Catch privilege-escalation paths
Some grants are dangerous regardless of how narrow they look, because they let a subject grant themselves more. Hunt these specifically:
| Verb / resource | Why it is an escalation path |
|---|---|
| escalate on roles/clusterroles | create/update a role with more rights than you hold (bypasses the built-in escalation check) |
| bind on roles/clusterroles | bind an existing high-priv role to yourself |
| impersonate on users/groups/serviceaccounts | act as any subject, effectively a superuser |
| create on pods (+ a privileged SA in ns) | launch a pod mounting another SA’s token, or a hostPath/privileged pod |
| get/list on secrets | read SA tokens, then authenticate as those SAs |
| pods/exec, pods/attach | execute inside a running pod, inheriting its identity |
| update on */status or nodes | tamper with scheduling / admission outcomes |
Normally the API server blocks you from creating a role more powerful than your own. The escalate and bind verbs opt out of that check — so granting them, even scoped, hands over the keys. Find every subject holding them:
# Any binding granting escalate/bind/impersonate
kubectl rbac-tool policy-rules -o wide \
| grep -Ei 'escalate|impersonate|(^|[^a-z])bind([^a-z]|$)'
# Wildcards are almost always over-grants — surface them
kubectl get clusterroles -o json | jq -r '
.items[]
| select([.rules[]? | select((.verbs[]?=="*") or (.resources[]?=="*"))] | length > 0)
| .metadata.name'
Reading Secrets is the most underrated escalation. Anyone holding a “read-only” role that includes secrets can read every SA token stored as a Secret in the namespace (the static tokens from section 4) and then authenticate as those SAs. Exclude secrets from broad read roles and grant specific secrets by resourceNames only.
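A narrowly scoped Secret grant can be rendered with `kubectl create role` and its `--resource-name` flag. This is a sketch under stated assumptions: the function name `render_secret_reader` and the example Secret names are hypothetical, and only get is granted.

```shell
#!/bin/sh
# Hypothetical helper: render a Role that can `get` only the named Secrets.
render_secret_reader() {   # usage: render_secret_reader <namespace> <role-name> <secret>...
  ns=$1; role=$2; shift 2
  flags=""
  # Build one --resource-name flag per Secret name.
  for s in "$@"; do flags="$flags --resource-name=$s"; done
  # $flags is left unquoted on purpose so each flag becomes its own argument.
  kubectl create role "$role" --namespace="$ns" \
    --verb=get --resource=secrets $flags \
    --dry-run=client -o yaml
}

# Example: render, review, and commit rather than applying directly.
# render_secret_reader team-payments db-reader db-creds api-key
```

Bind the resulting Role to the single ServiceAccount that needs those Secrets, not to a whole group.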
8. Continuous RBAC auditing
A correct model on Tuesday means nothing if Friday’s incident-fix binding never gets reverted. Make auditing continuous.
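Enable the audit log and capture authorization decisions. Requests that match no rule are not logged, so add a catch-all Metadata rule if you also want denials on other resources. A minimal policy that records RBAC changes at RequestResponse level and Secret access at Metadata level: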
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "clusterroles", "rolebindings", "clusterrolebindings"]
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]   # who read what, without logging values
authorization.k8s.io/decision: forbid annotations in the log are gold — a steady stream of denials for one subject is a missing (legitimate) grant or an attacker probing. Ship the log to your SIEM and alert on:
- any write to clusterrolebindings,
- new subjects bound to cluster-admin or system:masters,
- spikes in forbid decisions per subject.
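Before the SIEM rule exists, the per-subject denial count can be approximated straight from the log file. The sketch below assumes the JSON-lines log backend with compact JSON, one event per line, and username as the first key of the user object; a JSON-aware tool is more robust if your format differs. The function name `forbid_counts` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count forbidden requests per user in an audit log.
# Prints "<username> <count>", most-denied subject first.
forbid_counts() {   # usage: forbid_counts <audit-log-file>
  grep -F '"authorization.k8s.io/decision":"forbid"' "$1" \
    | sed -nE 's/.*"user":\{"username":"([^"]*)".*/\1/p' \
    | sort | uniq -c | sort -rn \
    | awk '{ print $2, $1 }'
}

# Example (the path depends on your --audit-log-path setting):
# forbid_counts /var/log/kubernetes/audit.log | head
```

A subject at the top of this list either needs a legitimate grant or deserves a closer look.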
Policy-as-code stops bad grants before they merge. With Kyverno or Gatekeeper/OPA, reject dangerous bindings at admission. A Kyverno policy blocking new cluster-admin ClusterRoleBindings:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-cluster-admin
spec:
  validationFailureAction: Enforce
  rules:
    - name: no-cluster-admin-binding
      match:
        any:
          - resources:
              kinds: ["ClusterRoleBinding"]
      validate:
        message: "Binding to cluster-admin is not allowed; request a scoped role."
        deny:
          conditions:
            any:
              - key: "{{ request.object.roleRef.name }}"
                operator: Equals
                value: "cluster-admin"
Keep all Roles/Bindings in Git, reconciled by Argo CD or Flux. GitOps gives you the missing pieces: drift detection (the cluster is corrected back to Git), peer review on every permission change, and a full history of who widened what and when.
Enterprise scenario
A fintech platform team running multi-tenant EKS thought their tenants were boxed into their own namespaces. Each tenant was supposed to get the app-developer ClusterRole bound via a namespaced RoleBinding, the canonical pattern from section 2. During a quarterly access review, rbac-tool flagged that one tenant’s CI ServiceAccount could read Secrets throughout its namespace, something app-developer deliberately omits. The grant looked scoped, so nobody believed it until auth can-i confirmed it.
The cause had two parts. A platform engineer, building a self-service “give tenants edit in their namespace” flow, had bound tenants to the built-in edit ClusterRole instead of app-developer. Meanwhile, the EKS-managed aws-node and a vendor monitoring operator both shipped ClusterRoles labeled rbac.authorization.k8s.io/aggregate-to-edit: "true", so aggregation silently merged the vendor’s secrets: ["get","list"] rule into edit cluster-wide. The stock edit role already permits Secret access on its own, and the aggregated rules only added to it. Because the tenant binding used edit, every tenant inherited Secret read in their namespace, including any SA tokens stored as Secrets, which they could then use to authenticate as those SAs.
The fix was to stop binding the aggregated built-ins for tenants and pin an explicit, non-aggregating ClusterRole instead, plus a Kyverno policy rejecting any new ClusterRole carrying an aggregation label unless it lives in an allowlisted namespace prefix.
# Prove the blast radius before and after the fix
kubectl auth can-i list secrets -n tenant-acme \
--as=system:serviceaccount:tenant-acme:ci # was: yes -> now: no
# Audit which ClusterRoles feed the built-in edit aggregate
kubectl get clusterroles -l rbac.authorization.k8s.io/aggregate-to-edit=true
Lesson: binding aggregated roles is binding a moving target. Any operator you install can widen them.
Verify
Prove the model end to end before you call it done.
# 1. The least-priv personas can do their job...
kubectl auth can-i patch deployments -n team-payments \
--as=oidc:alice@corp.com --as-group=oidc:eng-payments # expect: yes
# 2. ...and cannot do what they must not
kubectl auth can-i get secrets -n team-payments \
--as=oidc:alice@corp.com --as-group=oidc:eng-payments # expect: no
kubectl auth can-i create pods/exec -n team-payments \
--as=oidc:alice@corp.com --as-group=oidc:eng-payments # expect: no
# 3. CI cannot read secrets or exec
kubectl auth can-i get secrets \
--as=system:serviceaccount:team-payments:ci-deployer # expect: no
# 4. No unexpected holders of escalate/bind/impersonate
kubectl rbac-tool policy-rules -o wide | grep -Ei 'escalate|impersonate'
# 5. Nobody outside the break-glass list binds cluster-admin
kubectl get clusterrolebindings -o json | jq -r '
.items[] | select(.roleRef.name=="cluster-admin")
| "\(.metadata.name): \([.subjects[]?.name] | join(", "))"'
# 6. No pods silently using the default SA
kubectl get pods -A -o json | jq -r '
.items[] | select((.spec.serviceAccountName // "default")=="default")
| "\(.metadata.namespace)/\(.metadata.name)"'
Checklist
Pitfalls
- * in any rule. A wildcard verb or resource silently absorbs future API types (including CRDs). Enumerate explicitly.
- The RoleBinding versus ClusterRoleBinding mental slip. A RoleBinding scopes a ClusterRole to one namespace, but a ClusterRoleBinding grants it everywhere. The wrong one turns “view in dev” into “view in production.”
- Forgetting RBAC is additive. You cannot deny; an over-broad binding is not cancelled by a narrow one. Audit the union, not the latest object.
- Trusting community tooling as ground truth. who-can, rbac-tool, and rakkess read RBAC objects, not the live authorizer; webhook authorizers and Node authorization can change the real answer. Confirm with kubectl auth can-i.
- Leaving break-glass loose. Keep one audited, alerting-wired cluster-admin path. The goal is least privilege by default, not zero privilege when the cluster is on fire.