Designing Least-Privilege RBAC in Kubernetes: Roles, Aggregation & Auditing at Scale

Most clusters drift toward cluster-admin because it is the path of least resistance: someone hits a Forbidden, a binding gets widened, and nobody ever narrows it back. This is a playbook for doing the opposite — scoping permissions deliberately, binding to groups instead of people, and proving on a schedule that the model still holds.

1. RBAC primitives, revisited

Four object kinds do all the work. The split that trips people up is scope, not function.

Kind	Scope	Grants permissions in
`Role`	namespaced	its own namespace
`ClusterRole`	cluster-wide	all namespaces, plus cluster-scoped & non-resource URLs
`RoleBinding`	namespaced	one namespace (subject ← `Role` or `ClusterRole`)
`ClusterRoleBinding`	cluster-wide	every namespace, plus cluster-scoped resources and non-resource URLs

The non-obvious combination: a RoleBinding can reference a ClusterRole. That lets you author a permission set once as a ClusterRole and grant it per namespace via RoleBinding — the workhorse pattern for multi-tenant clusters.

RBAC is purely additive. There are no deny rules; effective permission is the union of every binding a subject matches. You reduce access by removing or narrowing bindings, never by adding a denial.

A rule is apiGroups × resources × verbs, optionally narrowed by resourceNames:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deploy-restart
  namespace: team-payments
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "patch"]   # patch = rollout restart
  - apiGroups: ["apps"]
    resources: ["deployments/scale"]            # subresource is a separate grant
    verbs: ["update", "patch"]
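
To make the matching semantics concrete, here is a minimal Python sketch of how one rule is compared against a request. It is a simplified model under stated assumptions, not the API server's implementation: Rule and rule_allows are hypothetical names, non-resource URLs are ignored, and only the plain "*" wildcard is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Simplified stand-in for one entry under `rules:` (hypothetical type)."""
    api_groups: tuple
    resources: tuple
    verbs: tuple
    resource_names: tuple = ()  # empty means "any object name"


def _matches(allowed, value):
    # "*" is the RBAC wildcard; otherwise the match is an exact string compare.
    return "*" in allowed or value in allowed


def rule_allows(rule, api_group, resource, verb, name=""):
    """Return True if this one rule permits the request."""
    if not (_matches(rule.api_groups, api_group)
            and _matches(rule.resources, resource)
            and _matches(rule.verbs, verb)):
        return False
    # resourceNames narrows the rule to specific object names.
    return not rule.resource_names or name in rule.resource_names


restart = Rule(("apps",), ("deployments",), ("get", "list", "watch", "patch"))
scale = Rule(("apps",), ("deployments/scale",), ("update", "patch"))

print(rule_allows(restart, "apps", "deployments", "patch"))        # True
print(rule_allows(restart, "apps", "deployments/scale", "patch"))  # False
print(rule_allows(scale, "apps", "deployments/scale", "patch"))    # True
```

The second call is the point of the sketch: a grant on deployments does not match deployments/scale, because the subresource is compared as its own resource string.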

Subjects are not Kubernetes objects. Users and groups are opaque strings asserted by the authenticator (your OIDC provider or client cert). The API server never validates that a user “exists” — it only matches the string. ServiceAccounts are real namespaced objects.

2. Map personas to scoped roles

Start from who is asking, not from a default role. A workable baseline for a shared cluster:

Persona	Scope	Core verbs
App developer	own namespace	read all; `create/patch/delete` deployments, configmaps, services; `get` pod logs; `create` pods/exec (gated)
SRE / on-call	cluster-wide	read everything; `delete` pods; `patch` nodes (cordon/drain); read events
CI pipeline (SA)	own namespace	`get/create/update/patch` app resources (what `kubectl apply` needs); no secrets read, no exec
Tenant team	own namespace	the `developer` set, bound only in their namespace

Author each as a ClusterRole so it is reusable, then bind with a namespaced RoleBinding:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: app-developer
rules:
  - apiGroups: ["", "apps", "batch", "networking.k8s.io"]
    resources: ["deployments", "replicasets", "pods", "services",
                "configmaps", "jobs", "cronjobs", "ingresses"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "list"]
  # NOTE: secrets and pods/exec are deliberately omitted
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: payments-developers
  namespace: team-payments
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole          # reuse the cluster role, scope it here
  name: app-developer
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: "eng-payments"     # an OIDC group, not a person

Note that pods/log and pods/exec are separate subresources. Granting log access does not grant exec — keep exec on its own, more tightly bound role.
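
The scoping behaviour of that binding can be sketched in a few lines of Python. This is an illustrative model only: permissions are reduced to (verb, resource) pairs, and the pod-exec role, the eng-payments-oncall group, and the team-search namespace are hypothetical names that do not appear elsewhere in this article.

```python
# Permissions are modelled as (verb, resource) pairs; a deliberate simplification.
CLUSTER_ROLES = {
    "app-developer": {("patch", "deployments"), ("get", "pods/log")},
    "pod-exec": {("create", "pods/exec")},  # hypothetical, tightly bound role
}

# Each binding: (kind, namespace or None, role name, bound group).
BINDINGS = [
    ("RoleBinding", "team-payments", "app-developer", "eng-payments"),
    ("RoleBinding", "team-payments", "pod-exec", "eng-payments-oncall"),
]


def can_i(groups, verb, resource, namespace):
    """Allowed if ANY matching binding grants it: RBAC is a union, with no deny."""
    for kind, binding_ns, role, group in BINDINGS:
        if group not in groups:
            continue
        # A RoleBinding only applies inside its own namespace, even when it
        # references a ClusterRole; a ClusterRoleBinding applies everywhere.
        if kind == "RoleBinding" and binding_ns != namespace:
            continue
        if (verb, resource) in CLUSTER_ROLES[role]:
            return True
    return False


dev = {"eng-payments"}
print(can_i(dev, "patch", "deployments", "team-payments"))  # True
print(can_i(dev, "patch", "deployments", "team-search"))    # False
print(can_i(dev, "create", "pods/exec", "team-payments"))   # False
```

Swapping the first binding's kind to ClusterRoleBinding would flip the second answer to True, which is the difference between a namespaced grant and a cluster-wide one.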

3. Aggregated ClusterRoles for maintainable sets

Hand-maintaining the rule list above across a dozen teams is how drift starts. Use ClusterRole aggregation: a controller automatically merges the rules of any ClusterRole whose labels match an aggregation selector.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: monitoring-reader        # the aggregate — leave rules empty
aggregationRule:
  clusterRoleSelectors:
    - matchLabels:
        rbac.kv.io/aggregate-to-monitoring: "true"
rules: []                        # filled in by the controller; do not hand-edit
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: monitoring-prometheus
  labels:
    rbac.kv.io/aggregate-to-monitoring: "true"   # contributes into the aggregate
rules:
  - apiGroups: [""]
    resources: ["nodes/metrics", "services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]

Bind a subject to monitoring-reader; later, drop in a new labeled ClusterRole and the grant expands with zero edits to bindings. This is exactly how the built-in admin, edit, and view roles absorb CRD permissions — operators ship roles labeled rbac.authorization.k8s.io/aggregate-to-edit: "true" and they merge automatically.

Aggregation is convenient and a footgun. Any actor who can create or relabel a ClusterRole with the magic label silently widens every aggregate whose selector matches it, and with it every subject bound to that aggregate. Treat create, update, and patch on clusterroles as privileged grants (more on this in section 7).
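
A rough Python model of what the aggregation controller does may help when reasoning about blast radius. It is a sketch, not the controller's code: aggregate_rules is a hypothetical function, rules are reduced to plain strings, and the vendor-addon role is invented for the example.

```python
def aggregate_rules(selectors, cluster_roles):
    """Union the rules of every ClusterRole whose labels match any selector.

    selectors: list of matchLabels dicts.
    cluster_roles: list of dicts with "name", "labels", and "rules"
    (rules are kept as plain strings for brevity).
    """
    merged = []
    for role in cluster_roles:
        labels = role.get("labels", {})
        if any(all(labels.get(k) == v for k, v in sel.items()) for sel in selectors):
            for rule in role["rules"]:
                if rule not in merged:
                    merged.append(rule)
    return merged


monitoring_selectors = [{"rbac.kv.io/aggregate-to-monitoring": "true"}]
roles = [
    {"name": "monitoring-prometheus",
     "labels": {"rbac.kv.io/aggregate-to-monitoring": "true"},
     "rules": ["get,list,watch nodes/metrics,services,endpoints,pods"]},
    {"name": "unrelated", "labels": {}, "rules": ["get configmaps"]},
]
print(aggregate_rules(monitoring_selectors, roles))

# A newly installed, labelled role widens the aggregate with no binding change.
roles.append({"name": "vendor-addon",
              "labels": {"rbac.kv.io/aggregate-to-monitoring": "true"},
              "rules": ["get,list secrets"]})
print(aggregate_rules(monitoring_selectors, roles))
```

The second print includes the Secret rule even though nothing about monitoring-reader or its bindings was edited, which is why label control matters as much as binding control.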

4. ServiceAccount hygiene

Every pod runs as a ServiceAccount. If you do nothing, the default SA token is mounted into every pod at /var/run/secrets/kubernetes.io/serviceaccount/ — a ready-made credential for anyone who lands RCE in a container.

Disable automount unless the workload calls the API. Set it on the SA (and you can override per-pod):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: web-frontend
  namespace: team-payments
automountServiceAccountToken: false   # frontend never talks to the API
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend
  namespace: team-payments
spec:
  template:
    spec:
      serviceAccountName: web-frontend
      automountServiceAccountToken: false   # belt-and-suspenders at pod level
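
The precedence between the two settings is easy to get wrong, so here is a small Python sketch of the resolution order: the pod-level field wins, then the ServiceAccount's, and the default is to mount. The function name effective_automount is hypothetical; None stands for "field not set".

```python
def effective_automount(pod_setting, sa_setting):
    """Resolve whether the SA token is mounted into the pod.

    Pod-level value wins; then the ServiceAccount's; the default is True.
    """
    if pod_setting is not None:
        return pod_setting
    if sa_setting is not None:
        return sa_setting
    return True


print(effective_automount(None, None))    # True: nothing set, token is mounted
print(effective_automount(None, False))   # False: SA opts out
print(effective_automount(True, False))   # True: pod overrides the SA
```

The last case is why the pod-level false is worth setting explicitly: a pod spec that sets true re-enables the mount even when the SA has opted out.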

Give each workload its own SA and bind only the verbs it needs. Never reuse default, and never bind anything to the default SA. Find pods still riding the default SA:

kubectl get pods -A -o json | jq -r '
  .items[]
  | select((.spec.serviceAccountName // "default") == "default")
  | "\(.metadata.namespace)/\(.metadata.name)"'

Since Kubernetes 1.24, SAs no longer auto-create long-lived Secret tokens; pods get short-lived bound tokens via the TokenRequest API, auto-rotated and audience-scoped. If you find a manually created kubernetes.io/service-account-token Secret, treat it as a static credential to eliminate.
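
One way to triage a token you have found is to look at its claims. The Python sketch below decodes a JWT payload without verifying the signature (inspection only) and uses the absence of an exp claim as a heuristic for a legacy, non-expiring token. The helper names are hypothetical, and the two tokens are unsigned fakes built purely for the demo.

```python
import base64
import json


def jwt_claims(token):
    """Decode the (unverified) payload segment of a JWT."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def looks_long_lived(token):
    """Heuristic: legacy Secret-based SA tokens carry no `exp` claim."""
    return "exp" not in jwt_claims(token)


def _fake_token(claims):
    # Demo helper: builds an unsigned JWT-shaped string, never a real credential.
    def enc(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{enc({'alg': 'none'})}.{enc(claims)}.sig"


subject = "system:serviceaccount:team-payments:ci-deployer"
legacy = _fake_token({"iss": "kubernetes/serviceaccount", "sub": subject})
bound = _fake_token({"sub": subject,
                     "aud": ["https://kubernetes.default.svc"],
                     "exp": 1700003600})

print(looks_long_lived(legacy))  # True
print(looks_long_lived(bound))   # False
```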

5. Bind to OIDC groups, not individuals

Per-user bindings are unauditable and they outlive the user. Wire the API server to your IdP (Entra ID, Okta, or Dex fronting either) and bind to groups.

API server flags (managed control planes expose these as cluster settings — e.g. AKS/EKS OIDC integration — rather than raw flags):

kube-apiserver \
  --oidc-issuer-url=https://login.microsoftonline.com/<tenant-id>/v2.0 \
  --oidc-client-id=<app-client-id> \
  --oidc-username-claim=sub \
  --oidc-username-prefix="oidc:" \
  --oidc-groups-claim=groups \
  --oidc-groups-prefix="oidc:"

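The prefixes namespace external identities so they cannot collide with built-ins like system:masters. Note that with --oidc-username-claim=sub the username is the IdP's opaque subject ID; the oidc:alice@corp.com usernames in later examples assume a human-readable claim such as email instead. Subjects then reference the prefixed group: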

subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: "oidc:eng-payments"

Now access control lives in your IdP: add a user to the eng-payments group and they get exactly the bound role; remove them and access lapses once their current ID token expires (tokens are not revoked mid-lifetime, so keep lifetimes short). No cluster change, full IdP audit trail. For local CLI auth, kubectl relies on an exec credential plugin such as kubelogin to fetch and refresh tokens.
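
The claim-to-subject mapping can be sketched as follows. This Python function is a simplified, hypothetical model of what the flags above configure (it assumes explicit prefixes and ignores defaults and claim validation); the sample claims are invented.

```python
def to_k8s_identity(claims, username_claim="sub", username_prefix="oidc:",
                    groups_claim="groups", groups_prefix="oidc:"):
    """Map ID-token claims to the username and groups RBAC will match on."""
    username = username_prefix + str(claims[username_claim])
    groups = [groups_prefix + g for g in claims.get(groups_claim, [])]
    return username, groups


claims = {
    "sub": "opaque-subject-id",        # invented value
    "email": "alice@corp.com",
    "groups": ["eng-payments"],
}

user, groups = to_k8s_identity(claims, username_claim="email")
print(user, groups)

# A RoleBinding subject matches only on the exact, prefixed string.
print("oidc:eng-payments" in groups)   # True
print("eng-payments" in groups)        # False
```

The last line shows the common mistake: a binding that names the unprefixed group matches nobody once a groups prefix is configured.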

6. Find over-permissioned subjects

kubectl auth can-i is the built-in primitive — and the only one you should trust as ground truth, because it asks the API server’s actual authorizer:

# Impersonate a subject and ask a specific question
kubectl auth can-i delete secrets -n team-payments \
  --as="oidc:alice@corp.com" --as-group="oidc:eng-payments"

# List what a ServiceAccount can do in one namespace (--list is evaluated per namespace)
kubectl auth can-i --list -n team-payments \
  --as=system:serviceaccount:team-payments:ci-deployer

For the reverse question — who can do something — reach for community tools. They parse RBAC objects, so confirm hits with auth can-i:

# krew plugins
kubectl krew install who-can rbac-tool access-matrix

# Who can read secrets anywhere?
kubectl who-can get secrets -A

# Resolve a subject's full effective permissions (transitively)
kubectl rbac-tool lookup oidc:alice@corp.com
kubectl rbac-tool policy-rules -e '^system:serviceaccount:.*'

# rakkess: per-resource access matrix for the current (or impersonated) subject
kubectl access-matrix --as=system:serviceaccount:team-payments:ci-deployer

rbac-tool can also generate least-privilege roles from observed audit-log activity (rbac-tool auditgen), a strong starting point when retrofitting a permissive SA.

7. Catch privilege-escalation paths

Some grants are dangerous regardless of how narrow they look, because they let a subject grant themselves more. Hunt these specifically:

Verb / resource	Why it is an escalation path
`escalate` on roles/clusterroles	create/update a role with more rights than you hold (bypasses the built-in escalation check)
`bind` on roles/clusterroles	bind an existing high-priv role to yourself
`impersonate` on users/groups/serviceaccounts	act as any subject — effectively a superuser
`create` on `pods` (+ a privileged SA in ns)	launch a pod mounting another SA’s token, or a hostPath/privileged pod
`get`/`list` on `secrets`	read SA tokens, then authenticate as those SAs
`pods/exec`, `pods/attach`	execute inside a running pod, inheriting its identity
`update` on `*/status` or `nodes`	tamper with scheduling / admission outcomes

Normally the API server blocks you from creating a role more powerful than your own. The escalate and bind verbs opt out of that check — so granting them, even scoped, hands over the keys. Find every subject holding them:

# Any binding granting escalate/bind/impersonate
kubectl rbac-tool policy-rules -o wide \
  | grep -Ei 'escalate|impersonate|(^|[^a-z])bind([^a-z]|$)'

# Wildcards are almost always over-grants — surface them
kubectl get clusterroles -o json | jq -r '
  .items[]
  | select([.rules[]? | select((.verbs[]?=="*") or (.resources[]?=="*"))] | length > 0)
  | .metadata.name'

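Reading Secrets is the most underrated escalation. A “read-only” role that includes secrets can read every credential stored in the namespace, including any Secret-based SA tokens (legacy or manually created), and then authenticate as those SAs. Note that list and watch return Secret contents too, not just names. Exclude secrets from broad read roles and grant specific secrets by resourceNames only.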
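
The checks in this section can be combined into one pass over exported roles. The Python sketch below assumes role dicts shaped like the items from kubectl get clusterroles -o json; the function name, finding strings, and sample roles are all hypothetical, and the result is a hint list to confirm with auth can-i, not a verdict.

```python
RISKY_VERBS = {"escalate", "bind", "impersonate"}
READ_VERBS = {"get", "list", "watch"}


def risky_findings(role):
    """Return human-readable findings for one Role/ClusterRole dict."""
    findings = []
    for rule in role.get("rules") or []:  # aggregated roles may have no rules
        verbs = set(rule.get("verbs", []))
        resources = set(rule.get("resources", []))
        if "*" in verbs or "*" in resources:
            findings.append("wildcard verb or resource")
        for verb in sorted(verbs & RISKY_VERBS):
            findings.append(f"grants {verb}")
        if ("secrets" in resources and verbs & READ_VERBS
                and not rule.get("resourceNames")):
            findings.append("reads secrets without resourceNames")
        if resources & {"pods/exec", "pods/attach"}:
            findings.append("allows exec/attach into pods")
    return findings


broad_reader = {"metadata": {"name": "broad-reader"},
                "rules": [{"apiGroups": [""],
                           "resources": ["pods", "secrets"],
                           "verbs": ["get", "list"]}]}
role_manager = {"metadata": {"name": "role-manager"},
                "rules": [{"apiGroups": ["rbac.authorization.k8s.io"],
                           "resources": ["clusterroles"],
                           "verbs": ["bind", "escalate"]}]}

for role in (broad_reader, role_manager):
    print(role["metadata"]["name"], risky_findings(role))
```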

8. Continuous RBAC auditing

A correct model on Tuesday means nothing if Friday’s incident-fix binding never gets reverted. Make auditing continuous.

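Enable the audit log and capture authorization decisions. The minimal policy below records RBAC changes with full request and response bodies and Secret access at Metadata level. Requests that match no rule are not logged, so add a broader Metadata rule if you also want denials on other resources: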

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "clusterroles", "rolebindings", "clusterrolebindings"]
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]   # who read what, without logging values

authorization.k8s.io/decision: forbid annotations in the log are gold — a steady stream of denials for one subject is a missing (legitimate) grant or an attacker probing. Ship the log to your SIEM and alert on:

any write to clusterrolebindings,
new subjects bound to cluster-admin or system:masters,
spikes in forbid decisions per subject.
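
Before the SIEM rule exists, the third alert can be prototyped offline. This Python sketch counts forbid decisions per username from audit log lines in JSON-lines form; the function names and threshold are hypothetical, the sample events are invented, and it assumes counting only ResponseComplete events is enough to see each request once.

```python
import json
from collections import Counter


def forbid_counts(lines):
    """Count forbidden requests per username from audit log JSON lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("stage") != "ResponseComplete":
            continue  # assumption: one ResponseComplete event per request
        annotations = event.get("annotations") or {}
        if annotations.get("authorization.k8s.io/decision") == "forbid":
            username = (event.get("user") or {}).get("username", "<unknown>")
            counts[username] += 1
    return counts


def noisy_subjects(counts, threshold):
    """Subjects at or above the alert threshold, noisiest first."""
    return [user for user, n in counts.most_common() if n >= threshold]


denied = json.dumps({
    "stage": "ResponseComplete",
    "user": {"username": "system:serviceaccount:team-payments:ci-deployer"},
    "annotations": {"authorization.k8s.io/decision": "forbid"},
})
allowed = json.dumps({
    "stage": "ResponseComplete",
    "user": {"username": "oidc:alice@corp.com"},
    "annotations": {"authorization.k8s.io/decision": "allow"},
})
sample = [denied] * 3 + [allowed]

counts = forbid_counts(sample)
print(noisy_subjects(counts, threshold=3))
```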

Policy-as-code stops bad grants before they take effect. With Kyverno or Gatekeeper/OPA, reject dangerous bindings at admission. A Kyverno policy blocking new cluster-admin ClusterRoleBindings:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-cluster-admin
spec:
  validationFailureAction: Enforce
  rules:
    - name: no-cluster-admin-binding
      match:
        any:
          - resources:
              kinds: ["ClusterRoleBinding"]
      validate:
        message: "Binding to cluster-admin is not allowed; request a scoped role."
        deny:
          conditions:
            any:
              - key: "{{ request.object.roleRef.name }}"
                operator: Equals
                value: "cluster-admin"

Keep all Roles/Bindings in Git, reconciled by Argo CD or Flux. GitOps gives you the missing pieces: drift detection (the cluster is corrected back to Git), peer review on every permission change, and a full history of who widened what and when.

Enterprise scenario

A fintech platform team running multi-tenant EKS thought their tenants were boxed into their own namespaces. Each tenant was meant to get the app-developer ClusterRole bound via a namespaced RoleBinding, the canonical pattern from section 2. During a quarterly access review, rbac-tool flagged that one tenant’s CI ServiceAccount could read every Secret in its namespace. The grant looked scoped, so nobody believed it until auth can-i confirmed it.

The cause was the EKS-managed aws-node and a vendor monitoring operator both shipping ClusterRoles labeled rbac.authorization.k8s.io/aggregate-to-edit: "true". A platform engineer, building a self-service “give tenants edit in their namespace” flow, had bound tenants to the built-in edit ClusterRole instead of app-developer. Aggregation silently merged the vendor’s secrets: ["get","list"] rule into edit cluster-wide, on top of the Secret access that edit already grants by default. Because the tenant binding used edit, every tenant inherited Secret read in their namespace, including any Secret-stored SA tokens they could then use to authenticate as those SAs.

The fix was to stop binding the aggregated built-ins for tenants and pin an explicit, non-aggregating ClusterRole instead, plus a Kyverno policy rejecting any new ClusterRole carrying an aggregation label unless its name matches an allowlisted prefix (ClusterRoles are cluster-scoped, so the allowlist keys on name rather than namespace).

# Prove the blast radius before and after the fix
kubectl auth can-i list secrets -n tenant-acme \
  --as=system:serviceaccount:tenant-acme:ci   # was: yes  ->  now: no

# Audit which ClusterRoles feed the built-in edit aggregate
kubectl get clusterroles -l rbac.authorization.k8s.io/aggregate-to-edit=true

Lesson: binding aggregated roles is binding a moving target. Any operator you install can widen them.

Verify

Prove the model end to end before you call it done.

# 1. The least-priv personas can do their job...
kubectl auth can-i patch deployments -n team-payments \
  --as=oidc:alice@corp.com --as-group=oidc:eng-payments        # expect: yes

# 2. ...and cannot do what they must not
kubectl auth can-i get secrets -n team-payments \
  --as=oidc:alice@corp.com --as-group=oidc:eng-payments        # expect: no
kubectl auth can-i create pods/exec -n team-payments \
  --as=oidc:alice@corp.com --as-group=oidc:eng-payments        # expect: no

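# 3. CI cannot read secrets or exec
kubectl auth can-i get secrets -n team-payments \
  --as=system:serviceaccount:team-payments:ci-deployer         # expect: no
kubectl auth can-i create pods/exec -n team-payments \
  --as=system:serviceaccount:team-payments:ci-deployer         # expect: no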

# 4. No unexpected holders of escalate/bind/impersonate
kubectl rbac-tool policy-rules -o wide \
  | grep -Ei 'escalate|impersonate|(^|[^a-z])bind([^a-z]|$)'

# 5. Nobody outside the break-glass list binds cluster-admin
kubectl get clusterrolebindings -o json | jq -r '
  .items[] | select(.roleRef.name=="cluster-admin")
  | "\(.metadata.name): \([.subjects[]?.name] | join(", "))"'

# 6. No pods silently using the default SA
kubectl get pods -A -o json | jq -r '
  .items[] | select((.spec.serviceAccountName // "default")=="default")
  | "\(.metadata.namespace)/\(.metadata.name)"'
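
These checks are worth keeping as a repeatable regression suite. The Python sketch below builds the same kubectl auth can-i invocations from a table of expectations and reports mismatches. The function names and table layout are hypothetical; the demo swaps in a stub so the block runs without a cluster, and a real run would use the default kubectl_answer (which needs kubectl on PATH and impersonation rights).

```python
import subprocess


def can_i_cmd(verb, resource, namespace, user, groups=()):
    """Build the kubectl argv for one impersonated access check."""
    cmd = ["kubectl", "auth", "can-i", verb, resource,
           "-n", namespace, f"--as={user}"]
    cmd += [f"--as-group={g}" for g in groups]
    return cmd


def kubectl_answer(cmd):
    """Run kubectl and return the first word of its stdout ("yes" or "no")."""
    # check=False: kubectl exits non-zero when the answer is "no".
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    words = result.stdout.split()
    return words[0] if words else ""


def failed_checks(expectations, ask=kubectl_answer):
    """Return (args, expected, actual) for every expectation that is not met."""
    failures = []
    for args, expected in expectations:
        actual = ask(can_i_cmd(*args))
        if actual != expected:
            failures.append((args, expected, actual))
    return failures


ALICE = ("oidc:alice@corp.com", ("oidc:eng-payments",))
CI = ("system:serviceaccount:team-payments:ci-deployer",)
EXPECTATIONS = [
    (("patch", "deployments", "team-payments", *ALICE), "yes"),
    (("get", "secrets", "team-payments", *ALICE), "no"),
    (("get", "secrets", "team-payments", *CI), "no"),
]


def stub_everything_allowed(cmd):
    # Stand-in for kubectl in this demo: pretends every request is allowed.
    return "yes"


for args, expected, actual in failed_checks(EXPECTATIONS, ask=stub_everything_allowed):
    print(f"FAIL {args[0]} {args[1]} as {args[3]}: expected {expected}, got {actual}")
```

Run against a real cluster, an empty result from failed_checks(EXPECTATIONS) means every persona can do its job and nothing more, at least for the cases listed.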

Checklist

Permission sets authored once as ClusterRole, scoped per namespace via RoleBinding
Aggregation used for reusable sets; create clusterroles treated as privileged
Subjects are OIDC groups, with username/group prefixes set on the API server
Every workload has its own SA; automountServiceAccountToken: false unless needed
secrets and pods/exec excluded from broad roles; secrets granted by resourceNames only, exec on its own tightly bound role
escalate / bind / impersonate inventoried and justified per holder
Audit log capturing RBAC writes + forbid decisions, shipped to a SIEM with alerts
Admission policy (Kyverno/Gatekeeper) blocking cluster-admin bindings and wildcards
All RBAC in Git, reconciled by Argo CD/Flux with drift correction on

Pitfalls

Wildcards (*) in any rule. A wildcard verb or resource silently absorbs future API types (including CRDs). Enumerate explicitly.
The RoleBinding vs. ClusterRoleBinding mental slip. A RoleBinding scopes a ClusterRole to one namespace, but a ClusterRoleBinding grants it everywhere. The wrong one turns “view in dev” into “view in production.”
Forgetting RBAC is additive. You cannot deny; an over-broad binding is not cancelled by a narrow one. Audit the union, not the latest object.
Trusting community tooling as ground truth. who-can, rbac-tool, and rakkess read RBAC objects, not the live authorizer — webhook authorizers and Node authorization can change the real answer. Confirm with kubectl auth can-i.
Leaving break-glass loose. Keep one audited, alerting-wired cluster-admin path. The goal is least privilege by default, not zero privilege when the cluster is on fire.

Written by Vinod
