Every object you have ever created in Kubernetes — every Pod, every Deployment, every Secret — passed through a checkpoint you probably never saw. After the API server has worked out who you are and whether you are allowed, but before it writes anything to etcd, the request runs a gauntlet of admission controllers. They can quietly rewrite your object, or reject it outright with a message of their choosing. This is the layer that injects sidecars, sets default storage classes, stamps labels, enforces “no privileged containers”, blocks unsigned images, and refuses Pods without resource limits. If authentication answers “who?” and authorization answers “can they?”, admission answers the far more interesting question: “is this specific object acceptable, and should we change it first?”
Admission control is where almost all production policy lives, and it is the single most exam-probed area of the Certified Kubernetes Security Specialist (CKS) curriculum. This lesson takes it apart completely: the exact place admission sits on the request path, why mutation always runs before validation, the built-in controllers that ship enabled, the two dynamic webhook types with every configuration field explained, the newer in-tree CEL policies that let you write admission rules without running a webhook server at all, and finally how the three tools you will actually meet in the wild — Pod Security Admission, Kyverno and OPA Gatekeeper — all hook into this same machinery.
Learning objectives
By the end of this lesson you will be able to:
- Place admission control precisely on the API request path and explain why it runs after authentication and authorization but before persistence and quota.
- Distinguish the mutating phase from the validating phase, explain why ordering is guaranteed in that direction, and reason about reinvocation.
- Name the important built-in (compiled-in) admission controllers and say what each one does and why disabling some is dangerous.
- Configure a MutatingWebhookConfiguration and a ValidatingWebhookConfiguration field by field (rules, failurePolicy, matchPolicy, namespaceSelector, objectSelector, sideEffects, timeoutSeconds, reinvocationPolicy, admissionReviewVersions, matchConditions) and explain the trade-off and gotcha of each.
- Write a ValidatingAdmissionPolicy using CEL and bind it with a ValidatingAdmissionPolicyBinding, and describe the matching MutatingAdmissionPolicy.
- Explain exactly how Pod Security Admission, Kyverno and OPA Gatekeeper plug in to this layer, and choose between them.
Prerequisites & where this fits
You should be comfortable applying objects with kubectl, reading YAML, and have a rough mental model of the API server, RBAC and namespaces. Helpful but not required: having seen a Pod securityContext and knowing what a Service is. This lesson sits in the Security module of the Kubernetes Zero-to-Hero course, immediately after Kubernetes StatefulSets, In Depth: Stable Identity, Ordered Lifecycle & Per-Pod Storage and before Kubernetes CRDs, Controllers & the Operator Pattern, In Depth (Fundamentals). Admission control is the natural bridge between “I can run workloads” and “I can govern a multi-tenant cluster”, and it is foundational to everything in the policy-as-code lessons that follow.
Core concepts
An admission controller is a piece of code in the API server’s request-handling pipeline that can intercept a request to create, update, delete, or connect to an object after the request is authenticated and authorized, and act on it. There are two kinds of action, and so two kinds of controller:
- A mutating admission controller can modify the incoming object — add fields, set defaults, inject containers, attach annotations. It changes what eventually gets stored.
- A validating admission controller can only accept or reject the object. It cannot change it; it returns a verdict and, on rejection, an error message that the user sees.
Some controllers do both. Crucially, admission controllers only see write-shaped requests against the API. They run on CREATE, UPDATE, DELETE, and CONNECT (sub-resource connections such as pods/exec and pods/attach). They do not run on plain reads (GET, LIST, WATCH) — reads are an authorization concern, not an admission concern. This single fact answers a surprising number of “why didn’t my policy fire?” questions: if the action is a read, admission never engaged.
There are two delivery mechanisms for admission logic, and keeping them straight is the key to the whole topic:
| Delivery mechanism | What it is | Configured by | Lives where |
|---|---|---|---|
| Built-in (compiled-in) controllers | A fixed set of controllers compiled into the API server binary | --enable-admission-plugins / --disable-admission-plugins flags on the API server | Inside kube-apiserver |
| Dynamic admission control | Pluggable controllers you add at runtime without recompiling | MutatingWebhookConfiguration / ValidatingWebhookConfiguration objects (webhooks), or ValidatingAdmissionPolicy / MutatingAdmissionPolicy objects (CEL) | Webhooks: external HTTPS servers. CEL policies: evaluated inside the API server |
The built-ins are the foundation Kubernetes ships with. Dynamic admission is how you extend the cluster — historically only by running a webhook server, and since Kubernetes 1.30 (GA) increasingly by writing CEL expressions that the API server evaluates itself, with no server to run.
The mental model to lock in: mutation happens first and can change the object; validation happens last and only judges it. Everything else in this lesson hangs off that sentence.
The API request path: where admission sits
When kubectl apply -f pod.yaml runs, the request travels through the API server in a strict, well-defined order. Admission is one stage in that pipeline, and its position is deliberate.
- TLS termination & decoding. The HTTPS request is terminated and the body decoded into an internal object.
- Authentication (authn). The API server establishes who the caller is (a user, group, or ServiceAccount) via client certificates, bearer tokens, OIDC, or similar. Failure here is 401 Unauthorized. Admission has not run yet.
- Authorization (authz). RBAC (or another authorizer) decides whether that identity may perform this verb on this resource. Failure here is 403 Forbidden. Admission still has not run.
- Mutating admission. All applicable mutating admission controllers run, in turn. Each may modify the object. Mutating webhooks are called here.
- Object schema validation. The (possibly mutated) object is validated against the built-in schema / OpenAPI / CRD structural schema (types, required fields, enums). This is structural validation, distinct from policy validation.
- Validating admission. All applicable validating admission controllers run. Each may reject. Validating webhooks and ValidatingAdmissionPolicy are evaluated here.
- Persistence to etcd. Only now is the final object written to the backing store. Quota enforcement (ResourceQuota) is itself a built-in validating controller; it runs at the very end of the validating phase, just before this write, so usage is not counted for requests that are later rejected.
The position matters for three reasons that interviewers love. First, admission runs only on requests that already passed authz — you cannot use admission as a substitute for RBAC, because an attacker who is not authorized never reaches it. Second, mutation precedes the schema check, so a mutating controller can legitimately produce an object that only becomes valid after its changes (for example, injecting a required field). Third, validation is the last gate before etcd, which is why “reject unsigned images” or “deny privileged Pods” belongs in validating admission: nothing it approves can be altered afterwards within the same request.
Reads are not in this list. GET/LIST/WATCH go authn → authz → serve. There is no admission stage on the read path, full stop.
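The ordering above can be sketched as a toy pipeline. This is only an illustration of the stage order under simplifying assumptions, not how kube-apiserver is implemented: ToyRequest, handle, add_team_label and require_team_label are hypothetical names, schema validation is not modelled, and the outcome strings are labels chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Optional

# Verbs that reach admission; plain reads (GET/LIST/WATCH) never do.
ADMISSION_VERBS = {"CREATE", "UPDATE", "DELETE", "CONNECT"}


@dataclass
class ToyRequest:
    user: Optional[str]      # None models a failed authentication
    authorized: bool         # result of the authorization (RBAC) check
    verb: str
    obj: dict = field(default_factory=dict)


def handle(req, mutators=(), validators=()):
    """Walk a toy request through the stages and report how far it got."""
    stages = ["authn"]
    if req.user is None:
        return "401 Unauthorized", stages
    stages.append("authz")
    if not req.authorized:
        return "403 Forbidden", stages
    if req.verb not in ADMISSION_VERBS:
        stages.append("serve")             # read path: no admission stage
        return "served", stages
    stages.append("mutating-admission")
    for mutate in mutators:
        mutate(req.obj)                    # mutators may change the object
    stages.append("schema-validation")     # structural check, not modelled here
    stages.append("validating-admission")
    for validate in validators:
        if not validate(req.obj):          # validators only accept or reject
            return "denied by admission", stages
    stages.append("persist")
    return "persisted", stages


def add_team_label(obj):
    """Hypothetical mutator: default a team label if none is set."""
    obj.setdefault("labels", {}).setdefault("team", "payments")


def require_team_label(obj):
    """Hypothetical validator: the final object must carry a team label."""
    return "team" in obj.get("labels", {})
```

Calling handle(ToyRequest("alice", True, "GET")) stops at the serve stage without touching admission, while an unauthorized CREATE stops at authz and never reaches any mutator or validator.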
The two phases: mutating, then validating
Dynamic admission, and the pipeline as a whole, runs in two ordered phases:
- The mutating phase. Every applicable built-in mutating controller and every matching mutating webhook runs. Order within this phase is not something you should depend on for correctness — built-in mutators run in a fixed compiled order, and webhooks run in an order the API server chooses (effectively unordered from your perspective). The output of this phase is a single, merged, mutated object.
- The validating phase. Object schema validation runs, then every applicable built-in validating controller, every matching validating webhook, and every matching ValidatingAdmissionPolicy runs. These run in parallel from the API server’s point of view, because none of them can change the object; order is irrelevant when nobody mutates. If any of them rejects, the whole request fails and nothing is persisted.
The guarantee you can rely on is the ordering between the phases: all mutation finishes before any validation begins. This is what makes validating policy trustworthy. A validating webhook that says “every Pod must have a team label” can assume that if a mutating webhook was supposed to add that label, it already has — because mutation is complete before validation looks. If the two phases interleaved, no validating policy could ever be sure it was judging the final object.
Reinvocation: why mutation can run twice
There is a subtlety. Because mutating webhooks can run in any order, webhook A might mutate the object after webhook B has already run, and webhook B might have wanted to react to A’s change. To handle this, the API server supports reinvocation. A mutating webhook can declare reinvocationPolicy: IfNeeded, which tells the API server: if any other admission plugin modified the object after I ran in this pass, call me again. In practice this is one extra pass over the mutating webhooks, so a webhook is typically invoked at most twice, but the API documentation does not guarantee exactly one additional invocation, so do not build on that count. The default, Never, calls each webhook exactly once.
Reinvocation has two firm rules you must design around:
- Webhooks must be idempotent. Because a webhook may be called twice on the same request, applying it twice must produce the same result as applying it once. A sidecar-injector that blindly appends a container would inject it twice under reinvocation — the correct design checks “is my sidecar already present?” before adding it.
- Reinvocation does not guarantee ordering or convergence. It is one extra pass, not a fixed-point loop. Do not build webhooks that depend on a precise mutation order; design each to reach the right state from whatever it is handed.
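To make the idempotency rule concrete, here is a minimal sketch of a first pass followed by one reinvocation pass. It is a toy model and an assumption about the general shape of the behaviour, not the API server's actual algorithm; run_mutating_phase, naive_injector, idempotent_injector and add_label are hypothetical names.

```python
import copy


def naive_injector(pod):
    """Blindly appends the sidecar: unsafe if called twice."""
    pod["containers"].append("proxy")


def idempotent_injector(pod):
    """Checks for the sidecar first, so a second call is a no-op."""
    if "proxy" not in pod["containers"]:
        pod["containers"].append("proxy")


def add_label(pod):
    """A second webhook that mutates the object after the injector ran."""
    pod["labels"]["team"] = "payments"


def run_mutating_phase(pod, webhooks):
    """webhooks is a list of (function, reinvocationPolicy) pairs."""
    pod = copy.deepcopy(pod)               # leave the caller's object untouched
    seen_after = {}
    for index, (webhook, _policy) in enumerate(webhooks):
        webhook(pod)
        seen_after[index] = copy.deepcopy(pod)   # what this webhook last saw
    # One extra pass: re-call IfNeeded webhooks whose view is now stale.
    for index, (webhook, policy) in enumerate(webhooks):
        if policy == "IfNeeded" and pod != seen_after[index]:
            webhook(pod)
    return pod


pod = {"containers": ["app"], "labels": {}}
print(run_mutating_phase(pod, [(naive_injector, "IfNeeded"), (add_label, "Never")]))
print(run_mutating_phase(pod, [(idempotent_injector, "IfNeeded"), (add_label, "Never")]))
```

In this model the naive injector ends up with two proxy containers because the label webhook changed the object after it ran, while the idempotent version converges on exactly one.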
Built-in admission controllers
Kubernetes ships a long list of admission controllers compiled into the API server. A curated default set is enabled out of the box; you can adjust the set with --enable-admission-plugins and --disable-admission-plugins, but most of the defaults should never be turned off — several are load-bearing for correctness, not merely policy. (Managed control planes such as EKS, AKS and GKE manage these flags for you and restrict what you can change.)
Two special controllers act as the plumbing for everything dynamic:
| Controller | Phase | What it does |
|---|---|---|
| MutatingAdmissionWebhook | Mutating | The built-in controller that calls out to all your MutatingWebhookConfiguration webhooks. Without it enabled, mutating webhooks do nothing. |
| ValidatingAdmissionWebhook | Validating | The built-in controller that calls out to all your ValidatingWebhookConfiguration webhooks. Without it enabled, validating webhooks do nothing. |
| ValidatingAdmissionPolicy | Validating | The built-in controller that evaluates in-tree CEL ValidatingAdmissionPolicy objects. GA and enabled by default since 1.30. |
These three are why “dynamic admission control” works at all: the mechanism is a built-in controller; the policy is the object you create. Beyond the plumbing, the controllers you should know by name and purpose:
| Controller | Phase | What it does | Why it matters |
|---|---|---|---|
| NamespaceLifecycle | Validating | Rejects creation of objects in a namespace that is being deleted; prevents deletion of the default, kube-system, kube-public namespaces. | Stops objects leaking into half-deleted namespaces. Disabling breaks cleanup. |
| LimitRanger | Mutating + Validating | Applies LimitRange defaults to Pods/containers (default requests/limits) and rejects ones outside the allowed bounds. | This is how a namespace LimitRange actually takes effect. |
| ResourceQuota | Validating | Enforces ResourceQuota objects; rejects requests that would exceed a namespace’s quota. | The enforcement half of quotas; tracks usage as objects are admitted. |
| ServiceAccount | Mutating + Validating | Injects the default ServiceAccount and its API-token projection / image-pull secrets into Pods that do not specify one. | Pods would have no identity without it. |
| PodSecurity | Validating | Enforces the Pod Security Standards (privileged/baseline/restricted) per namespace via labels; the built-in PodSecurityPolicy successor. | The primary built-in workload-hardening control (see below). |
| DefaultStorageClass | Mutating | Sets the default StorageClass on a PVC that omits one. | “Why did my PVC get that storage class?” lives here. |
| DefaultIngressClass | Mutating | Sets the default IngressClass on an Ingress that omits one. | Routes Ingresses to the cluster’s default controller. |
| DefaultTolerationSeconds | Mutating | Adds default tolerations for the not-ready / unreachable node taints (300s) to Pods that lack them. | Controls how long Pods linger on failed nodes. |
| TaintNodesByCondition | Mutating | Taints new nodes based on conditions (e.g. NotReady) so the scheduler avoids them until ready. | Node-readiness gating. |
| PersistentVolumeClaimResize | Validating | Gates PVC expansion to StorageClasses that allow it. | Enforces allowVolumeExpansion. |
| Priority | Mutating + Validating | Resolves a Pod’s priorityClassName to its numeric priority value and validates it. | Makes PriorityClasses work. |
| RuntimeClass | Mutating + Validating | Applies pod overhead and scheduling constraints from a RuntimeClass. | Accounts for sandbox/runtime overhead. |
| CertificateApproval / CertificateSigning / CertificateSubjectRestriction | Validating | Guard the CertificateSigningRequest workflow. | Protect cluster PKI. |
A few rules of thumb. MutatingAdmissionWebhook, ValidatingAdmissionWebhook and ValidatingAdmissionPolicy must be in the enabled set or your dynamic policy silently does nothing — this is a classic “my webhook never fires” cause on self-managed clusters. NamespaceLifecycle, ServiceAccount, LimitRanger, ResourceQuota and PodSecurity are effectively mandatory for a sane, secure cluster. The compiled order of the mutating built-ins is fixed and chosen so that, for example, the ServiceAccount injector runs before validation; you do not control it and should not need to.
Dynamic admission: the webhook model
A dynamic admission webhook is your own HTTPS server that the API server calls during admission. You register it by creating a MutatingWebhookConfiguration or ValidatingWebhookConfiguration object — a cluster-scoped resource that tells the API server which requests to forward and where to send them. When a matching request arrives, the API server serialises the request into an AdmissionReview JSON object, POSTs it to your endpoint over TLS, and waits for an AdmissionReview response containing the verdict (and, for mutating webhooks, a base64-encoded JSON Patch describing the changes).
The flow is identical for both types; only the response differs:
- A validating webhook responds with allowed: true/false and, on denial, a status message that surfaces to the user.
- A mutating webhook responds with allowed: true plus an optional patch (JSON Patch, RFC 6902) and patchType: JSONPatch. The API server applies the patch to the object.
Both configurations share almost all their fields; the only structural difference is that each entry under webhooks[] in a MutatingWebhookConfiguration has mutation semantics and may set reinvocationPolicy, while a ValidatingWebhookConfiguration does not mutate and has no reinvocationPolicy. Let us go through every field, because the exam, and production, live in these details.
Every field of a webhook configuration
Here is a fully-annotated ValidatingWebhookConfiguration; a mutating one is identical except for the kind and the addition of reinvocationPolicy.
```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: require-team-label
webhooks:
  - name: require-team-label.kloudvin.dev   # MUST be a fully-qualified domain-style name
    admissionReviewVersions: ["v1"]         # AdmissionReview versions this webhook understands
    sideEffects: None                       # does calling it change state outside the request?
    failurePolicy: Fail                     # what to do if the webhook is unreachable/errors
    matchPolicy: Equivalent                 # match equivalent API groups/versions, not just exact
    timeoutSeconds: 10                      # how long the API server waits (1–30)
    namespaceSelector:                      # only namespaces matching this label selector
      matchLabels:
        kloudvin.dev/governed: "true"
    objectSelector: {}                      # only objects matching this label selector
    matchConditions:                        # CEL pre-filters evaluated by the API server
      - name: exclude-kube-system
        expression: "request.namespace != 'kube-system'"
    rules:                                  # which operations/resources to intercept
      - apiGroups: ["apps", ""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments", "pods"]
        scope: "Namespaced"                 # Namespaced | Cluster | * (all)
    clientConfig:                           # where the API server sends the AdmissionReview
      service:                              # in-cluster Service (preferred)
        namespace: webhooks
        name: policy-webhook
        path: /validate
        port: 443
      caBundle: <base64 PEM>                # CA that signed the webhook's serving cert
```
The fields, one by one:
| Field | What it controls | Choices / format | Default | When to set it / gotcha |
|---|---|---|---|---|
| name | Identifier for this webhook within the configuration. | A DNS-style FQDN (e.g. x.example.com). | none (required) | Must be unique per configuration and must contain a dot. Used in error messages and logs. |
| clientConfig.service | The in-cluster Service to call. | namespace, name, path, port. | none | Strongly preferred over url; resolves through the cluster network and respects Service routing. |
| clientConfig.url | An absolute external URL to call instead of a Service. | https://host:port/path. | none | For webhooks outside the cluster. Avoid localhost/127.0.0.1 unless the webhook runs on every host that runs an API server. Choose exactly one of service or url. |
| clientConfig.caBundle | The PEM CA bundle the API server uses to verify the webhook’s TLS cert. | base64-encoded PEM. | none | If wrong/expired, every call fails TLS → behaviour decided by failurePolicy. The #1 cause of broken webhooks. |
| rules | Which operations on which resources trigger this webhook. | List of {apiGroups, apiVersions, operations, resources, scope}. | none → matches nothing useful | "*" is a wildcard for groups/versions/resources. operations may include CREATE, UPDATE, DELETE, CONNECT, or "*". Over-broad rules (*/*/*) are a stability risk; see below. |
| rules[].scope | Restrict to namespaced or cluster-scoped resources. | Namespaced, Cluster, *. | * | Useful to avoid firing on cluster-scoped objects you do not care about. |
| failurePolicy | What happens if the webhook errors, times out, or is unreachable. | Fail, Ignore. | Fail | The most consequential field. Fail = closed (request rejected): secure, but a down webhook can wedge the cluster. Ignore = open (request proceeds): available, but a policy gap. See the dedicated section. |
| matchPolicy | How rules match when the same resource is served under multiple API versions. | Exact, Equivalent. | Equivalent | Equivalent (the sane default) matches a request even if it arrives under a different but equivalent group/version than you listed, by converting it. Exact matches only the literal versions in rules, a foot-gun that lets requests slip past after an API version bump. |
| namespaceSelector | Restrict by namespace labels. | A label selector (matchLabels/matchExpressions). | empty = all namespaces | The clean way to scope policy to opted-in namespaces and, critically, to exclude kube-system so a broken webhook cannot break the control plane. |
| objectSelector | Restrict by the object’s own labels. | A label selector. | empty = all objects | Lets a sidecar-injector fire only on Pods labelled inject=true, sparing every other Pod the round-trip. |
| matchConditions | Fine-grained CEL pre-filters the API server evaluates before calling the webhook. | List of {name, expression} CEL returning bool. | none | All must be true to call the webhook. Cheaper and more expressive than selectors; e.g. skip requests from a specific ServiceAccount. (Stable since 1.30.) |
| sideEffects | Declares whether the webhook has side effects outside the request, which decides how dry-run requests are handled. | None, NoneOnDryRun, Some (deprecated), Unknown (deprecated). | none (required) | None = the webhook never changes external state, so dry-run calls it normally. NoneOnDryRun = it may have side effects, but it promises to suppress them when the AdmissionReview carries dryRun: true; the API server still calls it, so the webhook must honour request.dryRun. Lying here breaks kubectl --dry-run=server. |
| timeoutSeconds | How long the API server waits for a response. | 1–30 seconds. | 10 | Keep it low (1–5s). Combined with failurePolicy: Fail, a slow webhook adds latency to every matching write and can stall the cluster. |
| admissionReviewVersions | Which AdmissionReview schema versions the webhook accepts. | Ordered list, e.g. ["v1"]. | none (required) | The API server uses the first version both sides support. List ["v1"] for any modern webhook. |
| reinvocationPolicy (mutating only) | Whether to call this webhook a second time if later webhooks mutated the object. | Never, IfNeeded. | Never | IfNeeded enables reinvocation (in practice one extra pass); requires the webhook to be idempotent. Absent on validating configs. |
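The gotchas in the table lend themselves to a small pre-flight check. The sketch below is a hypothetical linter (lint_webhook is not part of any Kubernetes tooling) that applies only the constraints stated in the table to one webhooks[] entry loaded as a Python dict; the real API server performs stricter validation than this.

```python
def lint_webhook(webhook):
    """Return a list of problems for one webhooks[] entry (empty means clean)."""
    problems = []
    if "." not in webhook.get("name", ""):
        problems.append("name: must be a DNS-style fully-qualified name")
    if not webhook.get("admissionReviewVersions"):
        problems.append("admissionReviewVersions: required, e.g. ['v1']")
    # Some/Unknown are the deprecated values from the table; flag them too.
    if webhook.get("sideEffects") not in ("None", "NoneOnDryRun"):
        problems.append("sideEffects: set None or NoneOnDryRun")
    timeout = webhook.get("timeoutSeconds", 10)          # default is 10
    if not 1 <= timeout <= 30:
        problems.append("timeoutSeconds: must be between 1 and 30")
    if webhook.get("failurePolicy", "Fail") not in ("Fail", "Ignore"):
        problems.append("failurePolicy: must be Fail or Ignore")
    client = webhook.get("clientConfig", {})
    if ("service" in client) == ("url" in client):       # both set, or neither
        problems.append("clientConfig: set exactly one of service or url")
    if not webhook.get("rules"):
        problems.append("rules: empty, so the webhook matches nothing")
    return problems


example = {
    "name": "require-team-label.kloudvin.dev",
    "admissionReviewVersions": ["v1"],
    "sideEffects": "None",
    "timeoutSeconds": 10,
    "clientConfig": {"service": {"namespace": "webhooks", "name": "policy-webhook"}},
    "rules": [{"operations": ["CREATE"], "resources": ["pods"]}],
}
print(lint_webhook(example))
```

A check like this could run in CI before a configuration is applied, which is cheaper than discovering a bad timeoutSeconds or a missing rules list from a rejected kubectl apply.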
failurePolicy: the fail-open vs fail-closed decision
failurePolicy deserves its own treatment because it is the field that turns a security control into an availability risk, or vice versa.
| Value | Behaviour when the webhook is unreachable / errors / times out | Security posture | Availability posture |
|---|---|---|---|
| Fail (default) | The API request is rejected with a webhook-unavailable error. | Fail-closed: no object slips past while the policy engine is down. | Fragile: if the webhook is down (or scoped too broadly), legitimate writes are blocked, potentially including the very Pods that are the webhook. |
| Ignore | The API request proceeds as if the webhook had not matched. | Fail-open: a policy gap opens whenever the webhook is unavailable. | Robust: cluster keeps accepting writes regardless of webhook health. |
The classic catastrophe: a failurePolicy: Fail webhook with rules matching pods cluster-wide, whose own pods then get evicted. The API server cannot admit the replacement pods because it cannot reach the (now-zero) webhook replicas — a self-inflicted, cluster-wide deadlock. The defences are all in this lesson: scope tightly with namespaceSelector to exclude kube-system and the webhook’s own namespace; use objectSelector/matchConditions to match only what you must; keep timeoutSeconds small; and run the webhook with multiple replicas and a PodDisruptionBudget. For security-critical policies you genuinely want Fail; for best-effort conveniences (a sidecar nicety), Ignore is kinder.
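The fail-open versus fail-closed choice reduces to a few lines of control flow. This is a simplified sketch of the decision, assuming the webhook call is modelled as a Python callable that either returns a verdict or raises; admit, healthy_deny and webhook_down are hypothetical names.

```python
def admit(call_webhook, failure_policy="Fail"):
    """Return True if the request proceeds, False if it is rejected."""
    try:
        return call_webhook()              # the webhook's own allow/deny verdict
    except (TimeoutError, ConnectionError):
        # Webhook unreachable or too slow: failurePolicy decides the outcome.
        return failure_policy == "Ignore"


def healthy_deny():
    """A reachable webhook that rejects the request."""
    return False


def webhook_down():
    """Models zero available replicas."""
    raise ConnectionError("no webhook replicas available")


print(admit(webhook_down, "Fail"))     # False: fail-closed blocks the write
print(admit(webhook_down, "Ignore"))   # True: fail-open lets it through
print(admit(healthy_deny, "Ignore"))   # False: a real denial is always honoured
```

Note that Ignore only changes what happens on failure; a reachable webhook that denies the request still blocks it under either policy.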
The mutating webhook in depth
A mutating webhook returns a JSON Patch that the API server applies to the incoming object. The response looks like this:
```json
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "response": {
    "uid": "<copied from the request>",
    "allowed": true,
    "patchType": "JSONPatch",
    "patch": "<base64 of: [{\"op\":\"add\",\"path\":\"/metadata/labels/team\",\"value\":\"payments\"}]>"
  }
}
```
Three design rules make mutating webhooks safe:
- Echo the uid. The response must copy the request’s uid so the API server can correlate it. A response without it is rejected.
- Be idempotent. Especially under reinvocationPolicy: IfNeeded, the webhook may run twice. Check before you add. A sidecar injector should look for its container by name and no-op if present.
- Patch defensively. A JSON Patch add to /spec/containers/- appends a container; an add to a map key that already exists replaces it. Test patches against objects that already have the field.
Mutating webhooks should do the minimum necessary mutation and leave judgement to the validating phase. A common, clean split is: a mutating webhook injects defaults (a sidecar, a label, a default securityContext), and a separate validating webhook (or a ValidatingAdmissionPolicy) enforces that the result is acceptable. This keeps each webhook simple and lets validation assume mutation already happened.
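The three design rules can be seen together in a handler that builds the response shown above. This is a minimal sketch of the handler logic only (no HTTP server or TLS), assuming the AdmissionReview has already been parsed into a dict; mutate_review is a hypothetical name, and the team/payments label is the example value from the response above.

```python
import base64
import json


def mutate_review(review):
    """Build an AdmissionReview response that defaults a team label."""
    request = review["request"]
    labels = request["object"].get("metadata", {}).get("labels")

    patch = []
    if labels is None:
        # Patch defensively: "add" needs the parent to exist, so create the map.
        patch.append({"op": "add", "path": "/metadata/labels",
                      "value": {"team": "payments"}})
    elif "team" not in labels:
        patch.append({"op": "add", "path": "/metadata/labels/team",
                      "value": "payments"})
    # else: label already present, so a second invocation is a no-op.

    response = {"uid": request["uid"], "allowed": True}   # echo the uid
    if patch:
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(patch).encode()).decode()
    return {"apiVersion": "admission.k8s.io/v1",
            "kind": "AdmissionReview",
            "response": response}


review = {"request": {"uid": "abc-123",
                      "object": {"metadata": {"name": "web"}}}}
reply = mutate_review(review)
print(json.loads(base64.b64decode(reply["response"]["patch"])))
```

Omitting the patch entirely when there is nothing to change is one reasonable way to keep the handler idempotent: a Pod that already carries the label comes back allowed and untouched.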
The validating webhook in depth
A validating webhook is simpler: it returns a verdict and never a patch. The denial message is the field that determines whether your users curse you or thank you:
```json
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "response": {
    "uid": "<copied from the request>",
    "allowed": false,
    "status": {
      "code": 403,
      "message": "Pod 'web' rejected: container 'app' must set resources.limits.memory (policy: require-limits)"
    }
  }
}
```
A good message names the object, the exact field at fault, and the policy that fired, so the user can fix it without opening a ticket. A validating webhook can also return non-fatal warnings (surfaced by kubectl as Warning: lines) to nudge users toward better config without blocking them — the same mechanism PSA uses for its warn mode.
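A handler that produces this kind of message, plus a non-blocking warning, might look like the following. It is a sketch of the decision logic only, assuming a parsed AdmissionReview dict for a Pod; validate_review is a hypothetical name, and the CPU warning is an invented illustration of the warnings field rather than a rule from this lesson.

```python
POLICY = "require-limits"   # policy name used in the denial message


def validate_review(review):
    """Deny Pods whose containers lack a memory limit; warn on missing CPU limit."""
    request = review["request"]
    pod = request["object"]
    pod_name = pod.get("metadata", {}).get("name", "<unnamed>")

    response = {"uid": request["uid"], "allowed": True}
    warnings = []
    for container in pod.get("spec", {}).get("containers", []):
        limits = container.get("resources", {}).get("limits", {})
        if "memory" not in limits:
            response["allowed"] = False
            response["status"] = {
                "code": 403,
                "message": (f"Pod '{pod_name}' rejected: container "
                            f"'{container['name']}' must set "
                            f"resources.limits.memory (policy: {POLICY})"),
            }
            break
        if "cpu" not in limits:
            # Non-fatal: surfaced to the user as a Warning line, not a denial.
            warnings.append(f"container '{container['name']}' sets no "
                            "resources.limits.cpu")
    if warnings:
        response["warnings"] = warnings
    return {"apiVersion": "admission.k8s.io/v1",
            "kind": "AdmissionReview",
            "response": response}


review = {"request": {"uid": "abc-123", "object": {
    "metadata": {"name": "web"},
    "spec": {"containers": [{"name": "app"}]}}}}
print(validate_review(review)["response"]["status"]["message"])
```

The message is assembled from the object name, the failing field and the policy name, which are the three things the paragraph above asks a good denial to carry.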
ValidatingAdmissionPolicy: webhooks without a server (CEL)
Running a webhook server is a lot of operational weight for what is often a one-line rule (“every Deployment must have ≥ 2 replicas”). You must build it, deploy it, expose it via a Service, manage its TLS cert and caBundle, keep it highly available, and accept that it adds a network round-trip and a single point of failure to every matching write. Since Kubernetes 1.30, the in-tree ValidatingAdmissionPolicy makes most validation possible without any of that. You write the rule as a CEL (Common Expression Language) expression, and the API server evaluates it itself — no server, no network call, no certificate.
There are two objects, deliberately split so one policy can be reused with different scopes:
- ValidatingAdmissionPolicy: the what. It holds the CEL validations, the resources it can apply to (matchConstraints), optional variables, matchConditions, and the failurePolicy.
- ValidatingAdmissionPolicyBinding: the where. It binds a policy to actual namespaces/objects via matchResources, and chooses the enforcement action (validationActions: Deny, Warn, Audit).
```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: require-min-replicas
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments"]
  variables:
    - name: replicas
      expression: "object.spec.replicas"
  validations:
    - expression: "variables.replicas >= 2"
      message: "Deployments must run at least 2 replicas for availability."
      reason: Invalid
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: require-min-replicas-binding
spec:
  policyName: require-min-replicas
  validationActions: ["Deny"]   # Deny | Warn | Audit (combinable, except Deny with Warn)
  matchResources:
    namespaceSelector:
      matchLabels:
        kloudvin.dev/governed: "true"
```
CEL gives you a rich, sandboxed expression language with access to:
| CEL variable | What it is |
|---|---|
| object | The incoming object (the new state). null on DELETE. |
| oldObject | The existing object (the prior state). null on CREATE. Use for transition rules: “you may not increase replicas beyond 10”. |
| request | The AdmissionRequest metadata: request.operation, request.namespace, request.userInfo, request.dryRun. |
| params | A custom resource referenced by the binding’s paramRef, letting one policy be parameterised per namespace. |
| namespaceObject | The full Namespace object of the request, for reading namespace labels/annotations. |
| variables.<name> | Reusable sub-expressions declared in spec.variables, evaluated lazily and cached. |
| authorizer | A CEL helper to make authorization checks from within the policy. |
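The split between policy (the rule) and binding (the enforcement action) can be modelled in a few lines. In this sketch a plain Python function stands in for the CEL expression variables.replicas >= 2; it is not a CEL evaluator, and min_replicas_ok and apply_binding are hypothetical names used only to show how validationActions change the outcome of the same failed check.

```python
MESSAGE = "Deployments must run at least 2 replicas for availability."


def min_replicas_ok(obj):
    """Python stand-in for the CEL expression: variables.replicas >= 2."""
    return obj["spec"]["replicas"] >= 2


def apply_binding(obj, validation_actions):
    """Evaluate the policy, then let the binding's actions shape the result."""
    outcome = {"allowed": True, "warnings": [], "audit": []}
    if min_replicas_ok(obj):
        return outcome
    if "Deny" in validation_actions:
        outcome["allowed"] = False         # request is rejected
        outcome["message"] = MESSAGE
    if "Warn" in validation_actions:
        outcome["warnings"].append(MESSAGE)   # request proceeds, user is warned
    if "Audit" in validation_actions:
        outcome["audit"].append(MESSAGE)      # recorded, invisible to the user
    return outcome


deployment = {"spec": {"replicas": 1}}
print(apply_binding(deployment, ["Deny"]))
print(apply_binding(deployment, ["Warn", "Audit"]))
```

Because the action lives in the binding, the same policy can be rolled out with Warn or Audit in one set of namespaces and switched to Deny in another without touching the rule itself.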
The trade-offs versus webhooks are worth memorising:
| Aspect | ValidatingAdmissionPolicy (CEL) | Validating webhook |
|---|---|---|
| Server to run | None — evaluated in-process | Yes — you build, deploy, secure it |
| TLS / caBundle | None | Required, and the top failure cause |
| Latency | Negligible, in-process | A network round-trip per matching write |
| Availability risk | None (no external dependency) | A down/slow webhook can block writes |
| Expressiveness | CEL only — no I/O, no external lookups, bounded | Arbitrary code — can call other systems, do crypto (e.g. verify image signatures) |
| Mutation | MutatingAdmissionPolicy (newer; CEL ApplyConfiguration/JSON Patch) | Full JSON Patch |
When can you not use CEL? When the decision needs information the API server does not have — verifying a container image’s cryptographic signature against a registry, calling an external policy service, or doing anything with side effects. Those still need a webhook. But for the enormous class of “this field must / must not be X relative to that field”, CEL is now the right default: less to run, nothing to break, no certificate to expire.
There is also a MutatingAdmissionPolicy (alpha in 1.32, advancing toward beta), which brings the same webhook-free, CEL-driven approach to the mutating phase — letting you set defaults and apply changes via CEL ApplyConfiguration semantics without a mutating webhook server. Together, the two CEL policy types are positioned as the long-term, server-free successors to the webhook pair for the common cases.
How PSA, Kyverno and Gatekeeper plug in
This is the question that ties the lesson to reality: the three policy tools you will actually meet all hook into the exact machinery above, but at different points.
The diagram traces a write request through authentication and authorization, then through the mutating phase (built-in mutators and mutating webhooks, with reinvocation), then through the validating phase (built-in validators, validating webhooks, and CEL policies), and finally into etcd — and shows where each of the three tools below attaches.
Pod Security Admission (PSA) is the simplest: it is a built-in validating admission controller (PodSecurity), enabled by default. There is nothing to install and no webhook. You configure it purely with namespace labels that select a Pod Security Standard level (privileged, baseline, restricted) and a mode (enforce, audit, warn):
```shell
kubectl label namespace payments \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/warn=restricted
```
Because it runs in the validating phase, it can only reject non-conforming Pods (those that set privileged: true, share host namespaces, run as root under restricted, and so on) — it never mutates them. It covers exactly the Pod Security Standards and nothing else; for anything beyond those fields you reach for Kyverno or Gatekeeper. PSA is covered end-to-end in Migrating to Pod Security Admission: Enforcing Baseline and Restricted Profiles Without Breaking Workloads.
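How the three modes combine can be sketched as a lookup over the namespace labels. This is a deliberately simplified model: it assumes the strictest standard a Pod satisfies has already been determined elsewhere (the real PodSecurity controller inspects the Pod spec itself), and it assumes a mode with no label falls back to privileged. psa_modes and psa_outcome are hypothetical names.

```python
LEVELS = ["privileged", "baseline", "restricted"]   # least to most strict
PREFIX = "pod-security.kubernetes.io/"


def psa_modes(namespace_labels):
    """Read the level for each mode; assume privileged when a label is absent."""
    return {mode: namespace_labels.get(PREFIX + mode, "privileged")
            for mode in ("enforce", "audit", "warn")}


def psa_outcome(namespace_labels, pod_level):
    """pod_level is the strictest standard the Pod satisfies (given, not computed)."""
    modes = psa_modes(namespace_labels)

    def violates(mode):
        # A Pod violates a mode when the namespace demands a stricter level.
        return LEVELS.index(pod_level) < LEVELS.index(modes[mode])

    return {"rejected": violates("enforce"),
            "warned": violates("warn"),
            "audited": violates("audit")}


payments = {PREFIX + "enforce": "restricted", PREFIX + "warn": "restricted"}
print(psa_outcome(payments, "baseline"))
print(psa_outcome(payments, "restricted"))
```

With the labels from the kubectl command above, a Pod that only meets baseline is both rejected and warned about, while the unlabelled audit mode stays silent.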
Kyverno is a policy engine that installs as a set of webhook servers. When you install it, it registers its own MutatingWebhookConfiguration and ValidatingWebhookConfiguration pointing at the Kyverno deployment, and then dynamically manages those configurations based on the ClusterPolicy/Policy custom resources you create — adding the relevant rules so the API server only forwards request types some policy actually cares about. Its headline advantage is that policies are written in YAML, not a separate language, and it can do all three of validate, mutate, and generate (create companion objects, e.g. a default NetworkPolicy per new namespace) — plus image-signature verification, which needs a webhook because it calls out to a registry. It plugs into both the mutating and validating phases via its webhooks. See Deploying Kyverno: Policy-as-Code for Image Signing, Limits & Pod Security.
OPA Gatekeeper is the Kubernetes-native packaging of Open Policy Agent. It installs a validating (and optional mutating) webhook server backed by OPA, and you express policy in Rego via two CRDs: a ConstraintTemplate (the reusable Rego logic, which generates a new CRD) and Constraint objects (instances that parameterise and scope the template). The API server forwards matching requests to Gatekeeper’s webhook, which evaluates the constraints and returns allow/deny. Gatekeeper adds an audit mode that periodically re-scans existing objects against constraints — finding violations that predate the policy, which admission alone (only firing on new writes) cannot. It plugs in chiefly at the validating phase via its webhook. See OPA Gatekeeper: Policy-as-Code Admission Gating.
| Tool | How it attaches | Policy language | Mutate? | Generate? | Audit existing? | Best for |
|---|---|---|---|---|---|---|
| Pod Security Admission | Built-in PodSecurity validating controller | Namespace labels (no language) | No | No | No (admission-time only) | Baseline Pod hardening with zero install |
| ValidatingAdmissionPolicy | Built-in CEL validating controller | CEL | No (see MutatingAdmissionPolicy) | No | No (the Audit action only annotates the audit event at admission time) | Simple in-cluster field rules, no server |
| Kyverno | Its own mutating + validating webhooks | YAML | Yes | Yes | Via background scans / reports | Teams wanting YAML policy, mutation, generation, image signing |
| OPA Gatekeeper | Its own validating (+ mutating) webhook | Rego | Yes (mutation feature) | No | Yes (audit) | Org-wide policy with Rego, strong audit of existing objects |
The decision in one breath: PSA for the Pod Security Standards (free, built-in); ValidatingAdmissionPolicy for simple field rules you want without running anything; Kyverno when you want approachable YAML policy plus mutation/generation/image-signing; Gatekeeper when your organisation has standardised on Rego and wants strong audit of pre-existing resources. All four are first-class citizens of the same admission pipeline.
Hands-on lab
We will create a local cluster, register a ValidatingAdmissionPolicy that requires every Deployment in a governed namespace to have at least two replicas, and prove that it both blocks and allows. Adding a default via a CEL MutatingAdmissionPolicy is out of scope for a default kind build, so we will instead use a built-in mutator to illustrate the mutating phase. Everything here is free and local; no cloud account is needed.
1. Create a cluster
kind create cluster --name admission-lab --image kindest/node:v1.30.0
kubectl cluster-info --context kind-admission-lab
kind runs a full Kubernetes 1.30 control plane in Docker, so ValidatingAdmissionPolicy is GA and enabled by default. Verify the API is up before continuing.
2. Label a governed namespace
kubectl create namespace governed
kubectl label namespace governed kloudvin.dev/governed=true
3. Apply the policy and its binding
cat <<'EOF' | kubectl apply -f -
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: require-min-replicas
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments"]
  validations:
    - expression: "object.spec.replicas >= 2"
      message: "Deployments in a governed namespace must run at least 2 replicas."
      reason: Invalid
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: require-min-replicas-binding
spec:
  policyName: require-min-replicas
  validationActions: ["Deny"]
  matchResources:
    namespaceSelector:
      matchLabels:
        kloudvin.dev/governed: "true"
EOF
Expected output:
validatingadmissionpolicy.admissionregistration.k8s.io/require-min-replicas created
validatingadmissionpolicybinding.admissionregistration.k8s.io/require-min-replicas-binding created
4. Prove it rejects a single-replica Deployment
kubectl -n governed create deployment nginx --image=nginx --replicas=1
Expected — the request is denied before anything is created, with your message:
error: failed to create deployment: deployments.apps "nginx" is forbidden:
ValidatingAdmissionPolicy 'require-min-replicas' with binding
'require-min-replicas-binding' denied request: Deployments in a governed
namespace must run at least 2 replicas.
5. Prove it allows a compliant Deployment
kubectl -n governed create deployment nginx --image=nginx --replicas=2
kubectl -n governed get deploy nginx
Expected:
deployment.apps/nginx created
NAME READY UP-TO-DATE AVAILABLE AGE
nginx 2/2 2 2 15s
6. Prove the binding is scoped (control namespace is unaffected)
kubectl create namespace ungoverned
kubectl -n ungoverned create deployment nginx --image=nginx --replicas=1
The single-replica Deployment succeeds here, because ungoverned lacks the kloudvin.dev/governed=true label the binding selects on — proving that scoping is driven by the binding’s namespaceSelector, exactly as a webhook’s namespaceSelector would.
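The scoping rule behind this result is small enough to model directly. The sketch below covers only `matchLabels` (not `matchExpressions`), and the helper name `match_labels` is an assumption for illustration; it mirrors the documented semantics that every key/value pair in the selector must be present on the namespace, and that an empty selector matches everything.

```python
from typing import Dict


def match_labels(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """True when every key/value pair in the selector is present in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


binding_selector = {"kloudvin.dev/governed": "true"}

# Namespace labels as they would look in this lab; kubernetes.io/metadata.name
# is added to every namespace automatically.
governed = {"kubernetes.io/metadata.name": "governed", "kloudvin.dev/governed": "true"}
ungoverned = {"kubernetes.io/metadata.name": "ungoverned"}

print(match_labels(binding_selector, governed))    # binding applies
print(match_labels(binding_selector, ungoverned))  # binding does not apply
```

Note that the label value is the string `"true"`, not a boolean, which is why the manifest quotes it.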
7. Observe the built-in mutating phase
Watch a built-in mutator in action: create a bare Pod and see the ServiceAccount admission controller inject the default service account and its token projection — a mutation, applied before validation.
kubectl -n governed run probe --image=busybox --restart=Never --command -- sleep 3600
kubectl -n governed get pod probe -o jsonpath='{.spec.serviceAccountName}{"\n"}'
kubectl -n governed get pod probe -o jsonpath='{.spec.volumes[*].name}{"\n"}'
Expected — you never set these; the mutating phase did:
default
kube-api-access-xxxxx
8. Cleanup
kubectl delete validatingadmissionpolicybinding require-min-replicas-binding
kubectl delete validatingadmissionpolicy require-min-replicas
kubectl delete namespace governed ungoverned
kind delete cluster --name admission-lab
Cost note: entirely free. kind runs in local Docker; nothing is provisioned in any cloud, so there is nothing to bill and nothing to leak. Deleting the cluster reclaims all resources.
Common mistakes & troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| Webhook never fires; objects pass unchecked | The MutatingAdmissionWebhook / ValidatingAdmissionWebhook plugin is disabled, or rules do not match the resource/operation, or matchPolicy: Exact misses a different API version | Confirm the plugins are enabled; widen/correct rules; prefer matchPolicy: Equivalent |
| All writes suddenly rejected with “webhook unavailable” | failurePolicy: Fail webhook is down, unreachable, or matched too broadly (including its own namespace) | Restore the webhook; scope with namespaceSelector to exclude kube-system and the webhook’s own namespace; consider Ignore for non-critical policies |
| TLS / x509 errors calling the webhook | clientConfig.caBundle is wrong, missing, or the serving cert expired/rotated | Regenerate the cert, re-encode the CA into caBundle (cert-manager’s CA injector automates this) |
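| kubectl --dry-run=server triggers real side effects | sideEffects: None declared on a webhook that does mutate external state, or the webhook ignores the dryRun flag in the request | Declare sideEffects: NoneOnDryRun and have the webhook suppress its side effects when the AdmissionReview request carries dryRun: true; the API server still calls the webhook on dry-run |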
| Sidecar injected twice / label set twice | Non-idempotent mutating webhook under reinvocationPolicy: IfNeeded (or two webhooks both injecting) | Make the webhook check-before-add; ensure idempotency |
| Every write is slow | High timeoutSeconds plus a sluggish webhook on a broad rules match | Lower timeoutSeconds (1–5s), narrow rules, add objectSelector/matchConditions, scale the webhook |
| CEL policy applies everywhere, or has no effect at all | The binding has no matchResources (so it covers everything the policy’s matchConstraints match), or no binding was created at all | Scope via the ValidatingAdmissionPolicyBinding’s matchResources; a policy with no binding does nothing |
| Policy passes objects that violate it after an upgrade | matchConstraints/rules list a now-stale apiVersions | Use Equivalent matching / wildcards and keep version lists current |
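The "check-before-add" fix for double injection can be sketched as follows. This is a minimal illustration, not a real webhook: the sidecar name and image are hypothetical, and `apply_patch` is a tiny stand-in that handles only the append operation used here, rather than a full JSON Patch implementation.

```python
import json
from typing import List

# Hypothetical sidecar; the name and image are placeholders.
SIDECAR = {"name": "log-shipper", "image": "example.com/log-shipper:1.0"}


def sidecar_patch(pod: dict) -> List[dict]:
    """Return JSONPatch ops that add SIDECAR, or [] if it is already present."""
    containers = pod.get("spec", {}).get("containers", [])
    if any(c.get("name") == SIDECAR["name"] for c in containers):
        return []  # check-before-add: a second invocation is a no-op
    # "/spec/containers/-" appends to the end of the array (RFC 6902).
    return [{"op": "add", "path": "/spec/containers/-", "value": SIDECAR}]


def apply_patch(pod: dict, ops: List[dict]) -> dict:
    """Stand-in for the API server applying the append op(s) above."""
    patched = json.loads(json.dumps(pod))  # deep copy
    for op in ops:
        patched["spec"]["containers"].append(op["value"])
    return patched


pod = {"spec": {"containers": [{"name": "app", "image": "nginx:1.27"}]}}
first = apply_patch(pod, sidecar_patch(pod))
second = apply_patch(first, sidecar_patch(first))  # simulated reinvocation
print([c["name"] for c in second["spec"]["containers"]])
```

The second pass produces an empty patch, so the container list still holds exactly one sidecar. A webhook that appended unconditionally would end up with two.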
Best practices
- Validate in CEL, mutate only when you must. Reach for ValidatingAdmissionPolicy before standing up a webhook server; you remove a network hop, a certificate, and a single point of failure.
- Scope every webhook tightly. Always set a namespaceSelector (or matchConditions) that excludes kube-system and the webhook’s own namespace; use objectSelector so you only intercept objects you actually act on.
- Choose failurePolicy deliberately, per policy. Fail for security-critical gates, Ignore for best-effort conveniences. Never default everything to Fail and forget about it.
- Keep timeouts low. 1–5 seconds. Admission latency is added to every matching write.
- Make mutating webhooks idempotent. Assume they run twice; check before you add.
- Run webhooks for availability. Multiple replicas, a PodDisruptionBudget, anti-affinity, and health probes: a webhook on the write path is as critical as the API server it gates.
- Declare sideEffects honestly so dry-run and diff work.
- Pin admissionReviewVersions: ["v1"] and use apiVersion: admissionregistration.k8s.io/v1.
- Layer, do not duplicate. PSA for the Pod Security Standards, a policy engine (Kyverno/Gatekeeper) or CEL for the rest. Avoid two tools enforcing the same rule with conflicting messages.
- Test policies in Warn/Audit before Deny. Measure blast radius on real traffic before you turn a gate on.
Security notes
Admission control is a security mechanism, so its own failure modes are security-relevant. Treat the webhook server as part of the control plane’s trust boundary: it can mutate any object it intercepts, so a compromised mutating webhook can inject a malicious sidecar or strip a securityContext from every Pod. Lock down who can create or edit MutatingWebhookConfiguration/ValidatingWebhookConfiguration objects via RBAC — the verb to guard is write access on admissionregistration.k8s.io — because anyone who can register a webhook can intercept and rewrite cluster writes. Remember admission runs after authorization, so it is a complement to RBAC, never a replacement: never rely on a webhook to enforce something RBAC should deny, because an unauthorized caller never reaches admission. Prefer failurePolicy: Fail for controls that block dangerous configurations (you want them closed when the engine is down), but pair that with high availability so “closed” does not become “cluster wedged”. Keep webhook serving certificates short-lived and automatically rotated, and verify CEL policies cannot be bypassed by submitting the resource under an alternative API version (matchPolicy: Equivalent, wildcard versions). Finally, for image-provenance enforcement (signature verification), use a tool that does it in a webhook (Kyverno, Sigstore policy-controller, Gatekeeper with an external data provider) — CEL cannot reach out to a registry, so “only signed images” is not a job for ValidatingAdmissionPolicy.
Interview & exam questions
- Where does admission control sit on the API request path, and what runs before and after it? After authentication and authorization, and after the request is decoded, but before persistence to etcd. Order: authn → authz → mutating admission → schema validation → validating admission → etcd (with quota around persistence). It only runs on write-shaped verbs (CREATE/UPDATE/DELETE/CONNECT), never on reads.
- Why does the mutating phase always run before the validating phase? So validation can judge the final object. Mutation can add or default fields; if validation ran first or interleaved, no validating policy could be sure it was evaluating the object that will actually be stored. The phase ordering (all mutation, then all validation) is the only ordering guarantee you get.
- What is reinvocation and why must mutating webhooks be idempotent? With reinvocationPolicy: IfNeeded, the API server makes one extra pass over mutating webhooks if a later webhook changed the object after a given webhook ran, so a webhook can be called twice on the same request. If it is not idempotent (e.g. it appends a sidecar without checking), it duplicates its effect on the second pass.
- Explain failurePolicy: Fail vs Ignore and the danger of Fail. Fail (default) rejects the request if the webhook is unreachable or errors: fail-closed and secure, but a down or over-broad webhook can block legitimate writes, including its own replacement pods (cluster deadlock). Ignore lets the request proceed: fail-open and available, but it opens a policy gap while the webhook is down.
- What do namespaceSelector, objectSelector and matchConditions each scope, and why prefer them? namespaceSelector filters by namespace labels; objectSelector by the object’s own labels; matchConditions are CEL pre-filters the API server evaluates before calling the webhook. They reduce blast radius and load; critically, excluding kube-system and the webhook’s own namespace prevents self-inflicted outages.
- What does sideEffects declare, and what breaks if it is wrong? Whether the webhook changes state outside the admitted object, and whether it is safe to call on dry-run. None = no side effects; NoneOnDryRun = has side effects, but suppresses them when the request carries dryRun: true. The API server still calls the webhook on dry-run requests, so declaring None on a webhook that mutates external state makes kubectl --dry-run=server cause real changes.
- matchPolicy: Exact vs Equivalent: which is the default and why does it matter? Equivalent is the default and the safe choice: it matches a request even if it arrives under a different-but-equivalent API group/version than your rules list, by converting it. Exact matches only the literal versions listed, so requests can slip past a policy after an API version bump, a real security foot-gun.
- What is a ValidatingAdmissionPolicy and when do you use it over a webhook? An in-tree (GA since 1.30), CEL-based validating policy evaluated inside the API server: no server, no TLS cert, no network hop, no availability risk. Use it for field-level rules (“replicas ≥ 2”, “image not :latest”, transition rules via oldObject). Use a webhook when you need I/O the API server cannot do: image-signature verification, external lookups, side effects.
- What is the role of the ValidatingAdmissionPolicyBinding, and what are validationActions? The binding attaches a policy to specific namespaces/objects (matchResources) and selects the enforcement action: Deny, Warn, or Audit (combinable). The policy is the reusable logic; the binding is the scope and the verdict behaviour. A policy with no binding does nothing.
- How do PSA, Kyverno and Gatekeeper each plug into admission? PSA is a built-in validating controller configured by namespace labels: no install, validate-only, Pod Security Standards only. Kyverno installs its own mutating and validating webhooks and manages them from YAML ClusterPolicy/Policy resources; it can validate, mutate, generate, and verify image signatures. Gatekeeper installs a validating (and optional mutating) webhook backed by OPA, with policy in Rego via ConstraintTemplate/Constraint, and adds periodic audit of existing objects.
- Why can’t admission control replace RBAC? Admission runs after authorization, so an unauthorized caller never reaches it. RBAC decides whether the verb is allowed at all; admission decides whether the specific object is acceptable. They are complementary layers, and security-critical “this caller may not do X” belongs in RBAC.
- A failurePolicy: Fail Pod webhook took down the cluster after a node drain. Why, and how do you prevent it? The webhook’s own pods were evicted; with Fail, the API server could not admit replacement pods because it could not reach the (zero-replica) webhook, which is a deadlock. Prevent it by excluding kube-system and the webhook’s namespace via namespaceSelector, running multiple replicas with a PodDisruptionBudget, keeping timeoutSeconds low, and using Ignore for non-critical policies.
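The fail-closed versus fail-open trade-off in the failurePolicy and node-drain questions can be modelled in a few lines. This is a deliberately simplified sketch: `call_webhook` and `unreachable` are hypothetical names, a Python `ConnectionError` stands in for every way a webhook call can fail, and the real API server also treats timeouts and malformed responses as errors.

```python
from typing import Callable


def call_webhook(webhook: Callable[[dict], bool], request: dict, failure_policy: str) -> bool:
    """Return True if the request is admitted.

    Toy model: `webhook` returns True/False for allow/deny and raises
    ConnectionError when the webhook server cannot be reached.
    """
    try:
        return webhook(request)
    except ConnectionError:
        if failure_policy == "Ignore":
            return True   # fail-open: the write proceeds unchecked
        if failure_policy == "Fail":
            return False  # fail-closed: the write is rejected
        raise ValueError(f"unknown failurePolicy: {failure_policy}")


def unreachable(_request: dict) -> bool:
    """Simulates a webhook whose pods have all been evicted."""
    raise ConnectionError("no endpoints available for webhook service")


# Hypothetical replacement Pod for the webhook itself.
replacement_pod = {"kind": "Pod", "namespace": "policy-system"}

print(call_webhook(unreachable, replacement_pod, "Fail"))
print(call_webhook(unreachable, replacement_pod, "Ignore"))
```

Under `Fail`, the Pod that would bring the webhook back is itself rejected, which is the deadlock described above. Note that `Ignore` only changes what happens on errors: a healthy webhook that returns a denial still blocks the request.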
Quick check
- On which API verbs does admission control run, and on which does it not?
- True or false: a validating webhook can modify the incoming object.
- What is the default value of failurePolicy, and what does it do?
- Which CEL variable holds the previous state of an object, for transition rules?
- Which of PSA, Kyverno, and Gatekeeper requires no installation because it is built into the API server?
Answers
- It runs on write-shaped verbs: CREATE, UPDATE, DELETE, and CONNECT (sub-resource connections like pods/exec). It does not run on reads (GET, LIST, WATCH).
- False. Validating webhooks can only accept or reject and return a message; only mutating webhooks return a patch that changes the object.
- The default is Fail (fail-closed): if the webhook is unreachable or errors, the API request is rejected.
- oldObject: it is null on CREATE and holds the prior state on UPDATE/DELETE.
- Pod Security Admission: it is the built-in PodSecurity validating controller, enabled by default and configured purely with namespace labels.
Exercise
On a local kind cluster running Kubernetes 1.30+, build a small policy suite and prove each behaviour:
- Write a ValidatingAdmissionPolicy that denies any Pod or Deployment whose container image uses the :latest tag (hint: CEL endsWith, iterate object.spec.template.spec.containers for Deployments and object.spec.containers for Pods; you may use two policies or a variables block to normalise). Bind it with validationActions: ["Warn", "Audit"] first.
- Apply a Deployment using nginx:latest and confirm you get a warning but the object is created (because the action is Warn/Audit, not Deny).
- Flip the binding to validationActions: ["Deny"], re-apply, and confirm the same Deployment is now rejected with your message.
- Add a matchConditions (or namespaceSelector) so the policy applies only to a governed namespace, and prove an identical Deployment in a different namespace is unaffected.
- Stretch: add a transition rule using oldObject that forbids increasing a Deployment’s replica count above 10 on UPDATE, and prove it blocks a scale-up from 8 to 12 while allowing 8 to 9.
Write down, for each step, which phase fired and why — and which of these you could not have done in CEL (answer: none of them; they are all in-cluster field comparisons, which is exactly CEL’s sweet spot).
Certification mapping
- CKS (Certified Kubernetes Security Specialist), Cluster Setup and Cluster Hardening: admission control is a core CKS domain. Expect to enable/inspect admission plugins, configure Pod Security Admission, and reason about webhook-based policy (Kyverno/Gatekeeper/OPA) and ValidatingAdmissionPolicy. The failurePolicy, scoping, and “admission complements RBAC” points are prime exam material.
- CKA (Certified Kubernetes Administrator), Cluster Architecture, Installation & Configuration: understanding the request path (authn → authz → admission), the role of built-in controllers (ResourceQuota, LimitRanger, NamespaceLifecycle, ServiceAccount), and how ResourceQuota/LimitRange are enforced via admission.
Glossary
- Admission controller: code in the API server pipeline that intercepts authenticated, authorized write requests to mutate or validate the object before it is persisted.
- Mutating admission: the phase/controllers that can modify the incoming object.
- Validating admission: the phase/controllers that can only accept or reject the object.
- Built-in (compiled-in) controller: an admission controller shipped inside kube-apiserver, toggled with --enable-admission-plugins/--disable-admission-plugins.
- Dynamic admission control: pluggable admission added at runtime via webhook configurations or CEL policy objects, without recompiling the API server.
- AdmissionReview: the JSON envelope the API server sends to (and receives from) a webhook, carrying the request and the response verdict/patch.
- MutatingWebhookConfiguration / ValidatingWebhookConfiguration: cluster-scoped objects registering webhook endpoints, their match rules, and their behaviour.
- failurePolicy: either Fail (reject on webhook error, fail-closed) or Ignore (proceed, fail-open).
- matchPolicy: either Equivalent (match across equivalent API versions, default) or Exact (literal versions only).
- sideEffects: declares whether a webhook has no effects outside the admitted object (None) or has such effects but suppresses them on dry-run requests (NoneOnDryRun); the webhook is still called on dry-run.
- reinvocationPolicy: when set to IfNeeded, allows one extra pass over a mutating webhook if later webhooks changed the object; requires idempotency.
- matchConditions: CEL expressions the API server evaluates to decide whether to call a webhook (or apply a policy) at all.
- CEL (Common Expression Language): the sandboxed, non-Turing-complete expression language used by ValidatingAdmissionPolicy/MutatingAdmissionPolicy.
- ValidatingAdmissionPolicy / ValidatingAdmissionPolicyBinding: in-tree, server-free CEL validation, split into the policy (logic) and the binding (scope + Deny/Warn/Audit action).
- Pod Security Admission (PSA): the built-in PodSecurity validating controller enforcing the Pod Security Standards via namespace labels.
- Pod Security Standards: the three profiles (privileged, baseline, restricted) PSA enforces.
- Kyverno: a policy engine that installs as webhook servers and enforces validate/mutate/generate policies written in YAML.
- OPA Gatekeeper: the Kubernetes integration of Open Policy Agent, a webhook plus ConstraintTemplate/Constraint CRDs, with policy in Rego and audit of existing objects.
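To make the AdmissionReview entry concrete, here is a minimal sketch of the response half of that envelope as a webhook would build it. The helper name `review_response`, the UIDs, and the example label are hypothetical; the field names (`uid`, `allowed`, `status.message`, `patchType`, `patch`) follow the `admission.k8s.io/v1` shape, in which a mutating webhook returns its JSONPatch base64-encoded.

```python
import base64
import json
from typing import List, Optional


def review_response(request_uid: str, allowed: bool, message: str = "",
                    patch_ops: Optional[List[dict]] = None) -> dict:
    """Build an admission.k8s.io/v1 AdmissionReview response body."""
    # The response must echo the uid of the request it answers.
    response: dict = {"uid": request_uid, "allowed": allowed}
    if not allowed and message:
        response["status"] = {"message": message}
    if allowed and patch_ops:
        # Only mutating webhooks return a patch; it is a base64-encoded JSONPatch.
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(patch_ops).encode()).decode()
    return {"apiVersion": "admission.k8s.io/v1", "kind": "AdmissionReview", "response": response}


# Hypothetical examples: a validating denial and a mutating label patch.
deny = review_response("uid-123", allowed=False, message="image must not use :latest")
mutate = review_response("uid-456", allowed=True, patch_ops=[
    {"op": "add", "path": "/metadata/labels/team", "value": "payments"},
])

print(json.dumps(deny, indent=2))
print(json.dumps(mutate, indent=2))
```

A validating webhook only ever produces the first shape (a verdict and an optional message); the `patch` fields are what distinguish a mutating response.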
Next steps
- Continue the course with Kubernetes CRDs, Controllers & the Operator Pattern, In Depth (Fundamentals) — admission webhooks and operators are the two halves of extending the API server, and CRDs underpin Kyverno and Gatekeeper policy objects.
- Put admission into practice with policy-as-code: Deploying Kyverno: Policy-as-Code for Image Signing, Limits & Pod Security and OPA Gatekeeper: Policy-as-Code Admission Gating.
- Master the built-in path with Migrating to Pod Security Admission: Enforcing Baseline and Restricted Profiles Without Breaking Workloads.
- Revisit where this layer lives in the bigger picture via the StatefulSets deep dive’s companion, Kubernetes StatefulSets, In Depth: Stable Identity, Ordered Lifecycle & Per-Pod Storage.