Kubernetes Admission Control, In Depth: Validating & Mutating Webhooks + ValidatingAdmissionPolicy

Every object you have ever created in Kubernetes — every Pod, every Deployment, every Secret — passed through a checkpoint you probably never saw. After the API server has worked out who you are and whether you are allowed, but before it writes anything to etcd, the request runs a gauntlet of admission controllers. They can quietly rewrite your object, or reject it outright with a message of their choosing. This is the layer that injects sidecars, sets default storage classes, stamps labels, enforces “no privileged containers”, blocks unsigned images, and refuses Pods without resource limits. If authentication answers “who?” and authorization answers “can they?”, admission answers the far more interesting question: “is this specific object acceptable, and should we change it first?”

Admission control is where almost all production policy lives, and it is the single most exam-probed area of the Certified Kubernetes Security Specialist (CKS) curriculum. This lesson takes it apart completely: the exact place admission sits on the request path, why mutation always runs before validation, the built-in controllers that ship enabled, the two dynamic webhook types with every configuration field explained, the newer in-tree CEL policies that let you write admission rules without running a webhook server at all, and finally how the three tools you will actually meet in the wild — Pod Security Admission, Kyverno and OPA Gatekeeper — all hook into this same machinery.

Learning objectives

By the end of this lesson you will be able to:

Place admission control precisely on the API request path and explain why it runs after authentication and authorization but before persistence, with quota enforcement as one of its built-in controllers.
Distinguish the mutating phase from the validating phase, explain why ordering is guaranteed in that direction, and reason about reinvocation.
Name the important built-in (compiled-in) admission controllers and say what each one does and why disabling some is dangerous.
Configure a MutatingWebhookConfiguration and a ValidatingWebhookConfiguration field by field — rules, failurePolicy, matchPolicy, namespaceSelector, objectSelector, sideEffects, timeoutSeconds, reinvocationPolicy, admissionReviewVersions, matchConditions — and explain the trade-off and gotcha of each.
Write a ValidatingAdmissionPolicy using CEL and bind it with a ValidatingAdmissionPolicyBinding, and describe the matching MutatingAdmissionPolicy.
Explain exactly how Pod Security Admission, Kyverno and OPA Gatekeeper plug in to this layer, and choose between them.

Prerequisites & where this fits

You should be comfortable applying objects with kubectl, reading YAML, and have a rough mental model of the API server, RBAC and namespaces. Helpful but not required: having seen a Pod securityContext and knowing what a Service is. This lesson sits in the Security module of the Kubernetes Zero-to-Hero course, immediately after Kubernetes StatefulSets, In Depth: Stable Identity, Ordered Lifecycle & Per-Pod Storage and before Kubernetes CRDs, Controllers & the Operator Pattern, In Depth (Fundamentals). Admission control is the natural bridge between “I can run workloads” and “I can govern a multi-tenant cluster”, and it is foundational to everything in the policy-as-code lessons that follow.

Core concepts

An admission controller is a piece of code in the API server’s request-handling pipeline that can intercept a request to create, update, delete, or connect to an object after the request is authenticated and authorized, and act on it. There are two kinds of action, and so two kinds of controller:

A mutating admission controller can modify the incoming object — add fields, set defaults, inject containers, attach annotations. It changes what eventually gets stored.
A validating admission controller can only accept or reject the object. It cannot change it; it returns a verdict and, on rejection, an error message that the user sees.

Some controllers do both. Crucially, admission controllers only see write-shaped requests against the API. They run on CREATE, UPDATE, DELETE, and CONNECT (sub-resource connections such as pods/exec and pods/attach). They do not run on plain reads (GET, LIST, WATCH) — reads are an authorization concern, not an admission concern. This single fact answers a surprising number of “why didn’t my policy fire?” questions: if the action is a read, admission never engaged.

There are two delivery mechanisms for admission logic, and keeping them straight is the key to the whole topic:

Delivery mechanism	What it is	Configured by	Lives where
Built-in (compiled-in) controllers	A fixed set of controllers compiled into the API server binary	`--enable-admission-plugins` / `--disable-admission-plugins` flags on the API server	Inside `kube-apiserver`
Dynamic admission control	Pluggable controllers you add at runtime without recompiling	`MutatingWebhookConfiguration` / `ValidatingWebhookConfiguration` objects (webhooks), or `ValidatingAdmissionPolicy` / `MutatingAdmissionPolicy` objects (CEL)	Webhooks: external HTTPS servers. CEL policies: evaluated inside the API server

The built-ins are the foundation Kubernetes ships with. Dynamic admission is how you extend the cluster — historically only by running a webhook server, and since Kubernetes 1.30 (GA) increasingly by writing CEL expressions that the API server evaluates itself, with no server to run.

The mental model to lock in: mutation happens first and can change the object; validation happens last and only judges it. Everything else in this lesson hangs off that sentence.

The API request path: where admission sits

When kubectl apply -f pod.yaml runs, the request travels through the API server in a strict, well-defined order. Admission is one stage in that pipeline, and its position is deliberate.

TLS termination & decoding. The HTTPS request is terminated and the body decoded into an internal object.
Authentication (authn). The API server establishes who the caller is — a user, group, or ServiceAccount — via client certificates, bearer tokens, OIDC, or similar. Failure here is 401 Unauthorized. Admission has not run yet.
Authorization (authz). RBAC (or another authorizer) decides whether that identity may perform this verb on this resource. Failure here is 403 Forbidden. Admission still has not run.
Mutating admission. All applicable mutating admission controllers run, in turn. Each may modify the object. Mutating webhooks are called here.
Object schema validation. The (possibly mutated) object is validated against the built-in schema / OpenAPI / CRD structural schema — types, required fields, enums. This is structural validation, distinct from policy validation.
Validating admission. All applicable validating admission controllers run. Each may reject. Validating webhooks and ValidatingAdmissionPolicy are evaluated here.
Persistence to etcd. Only now is the final object written to the backing store. Quota (ResourceQuota) is not a separate later stage: it is enforced by a built-in admission controller at the end of the validating phase, just before this write.

The position matters for three reasons that interviewers love. First, admission runs only on requests that already passed authz — you cannot use admission as a substitute for RBAC, because an attacker who is not authorized never reaches it. Second, mutation precedes the schema check, so a mutating controller can legitimately produce an object that only becomes valid after its changes (for example, injecting a required field). Third, validation is the last gate before etcd, which is why “reject unsigned images” or “deny privileged Pods” belongs in validating admission: nothing it approves can be altered afterwards within the same request.

Reads are not in this list. GET/LIST/WATCH go authn → authz → serve. There is no admission stage on the read path, full stop.
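The ordering above can be captured in a small toy model. This is only a sketch: the stage names and the `stages_for` helper are hypothetical labels chosen for illustration, not identifiers from the API server.

```python
# Toy model of the request path described above.
# Stage names are illustrative labels, not real API server identifiers.

WRITE_VERBS = {"CREATE", "UPDATE", "DELETE", "CONNECT"}
READ_VERBS = {"GET", "LIST", "WATCH"}


def stages_for(verb):
    """Return the ordered list of stages a request with this verb passes through."""
    verb = verb.upper()
    common = ["decode", "authentication", "authorization"]
    if verb in READ_VERBS:
        # Reads are served straight after authorization: no admission stage.
        return common + ["serve"]
    if verb in WRITE_VERBS:
        return common + [
            "mutating admission",
            "schema validation",
            "validating admission",
            "persistence",
        ]
    raise ValueError(f"unknown verb: {verb}")


for verb in ("CREATE", "GET"):
    print(verb, "->", " > ".join(stages_for(verb)))
```

The model treats every write-shaped verb the same way, which is a simplification (a CONNECT, for example, does not store an object), but it makes the key point visible: the admission stages simply do not exist on the read path.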

The two phases: mutating, then validating

Dynamic admission, and the pipeline as a whole, runs in two ordered phases:

The mutating phase. Every applicable built-in mutating controller and every matching mutating webhook runs. Order within this phase is not something you should depend on for correctness — built-in mutators run in a fixed compiled order, and webhooks run in an order the API server chooses (effectively unordered from your perspective). The output of this phase is a single, merged, mutated object.
The validating phase. Object schema validation runs, then every applicable built-in validating controller, every matching validating webhook, and every matching ValidatingAdmissionPolicy runs. These run in parallel from the API server’s point of view, because none of them can change the object — order is irrelevant when nobody mutates. If any of them rejects, the whole request fails and nothing is persisted.

The guarantee you can rely on is the ordering between the phases: all mutation finishes before any validation begins. This is what makes validating policy trustworthy. A validating webhook that says “every Pod must have a team label” can assume that if a mutating webhook was supposed to add that label, it already has — because mutation is complete before validation looks. If the two phases interleaved, no validating policy could ever be sure it was judging the final object.
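A minimal simulation of the two phases, assuming objects are plain dictionaries. The `admit`, `add_team_label` and `require_team_label` functions are hypothetical stand-ins for the API server and two webhooks, not real Kubernetes code.

```python
import copy


def add_team_label(obj):
    """Mutator: default the 'team' label if it is missing."""
    labels = obj.setdefault("metadata", {}).setdefault("labels", {})
    labels.setdefault("team", "unassigned")
    return obj


def require_team_label(obj):
    """Validator: return an error string, or None to accept."""
    if "team" not in obj.get("metadata", {}).get("labels", {}):
        return "every Pod must have a team label"
    return None


def admit(obj, mutators, validators):
    """Run every mutator, then every validator, on a copy of obj.

    Returns (allowed, final_object_or_None, errors).
    """
    candidate = copy.deepcopy(obj)
    for mutate in mutators:  # phase 1: mutators run one after another
        candidate = mutate(candidate)
    # Phase 2: validators only judge the finished object, so order is irrelevant.
    errors = [e for e in (check(candidate) for check in validators) if e]
    if errors:
        return False, None, errors
    return True, candidate, []
```

With the mutator registered, a Pod that arrives without labels is admitted because the validator sees the already-mutated object. With the mutator removed, the same Pod is rejected and nothing is returned for persistence.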

Reinvocation: why mutation can run twice

There is a subtlety. Because mutating webhooks can run in any order, webhook A might mutate the object after webhook B has already run, and webhook B might have wanted to react to A’s change. To handle this, the API server supports reinvocation. A mutating webhook can declare reinvocationPolicy: IfNeeded, which tells the API server: if any other admission plugin modified the object after I ran in this pass, call me again. In practice the API server makes one extra pass over the mutating webhooks, so expect a webhook to be invoked up to twice, but the number of additional invocations is not guaranteed to be exactly one. The default, Never, calls each webhook exactly once.

Reinvocation has two firm rules you must design around:

Webhooks must be idempotent. Because a webhook may be called twice on the same request, applying it twice must produce the same result as applying it once. A sidecar-injector that blindly appends a container would inject it twice under reinvocation — the correct design checks “is my sidecar already present?” before adding it.
Reinvocation does not guarantee ordering or convergence. It is one extra pass, not a fixed-point loop. Do not build webhooks that depend on a precise mutation order; design each to reach the right state from whatever it is handed.
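The idempotency rule is easiest to see side by side with its counter-example. This is a sketch over plain dictionaries; the sidecar name and image are hypothetical placeholders.

```python
import copy

# Hypothetical sidecar definition, used only for illustration.
SIDECAR = {"name": "log-shipper", "image": "example.invalid/log-shipper:1.0"}


def inject_sidecar(pod):
    """Idempotent: return a copy of pod with SIDECAR present exactly once."""
    result = copy.deepcopy(pod)
    containers = result.setdefault("spec", {}).setdefault("containers", [])
    if any(c.get("name") == SIDECAR["name"] for c in containers):
        return result  # already injected, so this call is a no-op
    containers.append(dict(SIDECAR))
    return result


def inject_sidecar_naive(pod):
    """Counter-example: appends blindly, so a second call duplicates the sidecar."""
    result = copy.deepcopy(pod)
    result.setdefault("spec", {}).setdefault("containers", []).append(dict(SIDECAR))
    return result
```

Calling `inject_sidecar` twice yields the same Pod as calling it once, which is exactly the property reinvocation relies on. The naive version grows the container list on every call.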

Built-in admission controllers

Kubernetes ships a long list of admission controllers compiled into the API server. A curated default set is enabled out of the box; you can adjust the set with --enable-admission-plugins and --disable-admission-plugins, but most of the defaults should never be turned off — several are load-bearing for correctness, not merely policy. (Managed control planes such as EKS, AKS and GKE manage these flags for you and restrict what you can change.)

Three special controllers act as the plumbing for everything dynamic:

Controller	Phase	What it does
`MutatingAdmissionWebhook`	Mutating	The built-in controller that calls out to all your `MutatingWebhookConfiguration` webhooks. Without it enabled, mutating webhooks do nothing.
`ValidatingAdmissionWebhook`	Validating	The built-in controller that calls out to all your `ValidatingWebhookConfiguration` webhooks. Without it enabled, validating webhooks do nothing.
`ValidatingAdmissionPolicy`	Validating	The built-in controller that evaluates in-tree CEL `ValidatingAdmissionPolicy` objects. GA and enabled by default since 1.30.

These three are why “dynamic admission control” works at all: the mechanism is a built-in controller; the policy is the object you create. Beyond the plumbing, the controllers you should know by name and purpose:

Controller	Phase	What it does	Why it matters
`NamespaceLifecycle`	Both	Rejects creation of objects in a namespace that is being deleted; prevents deletion of the `default`, `kube-system`, `kube-public` namespaces.	Stops objects leaking into half-deleted namespaces. Disabling breaks cleanup.
`LimitRanger`	Mutating + Validating	Applies `LimitRange` defaults to Pods/containers (default requests/limits) and rejects ones outside the allowed bounds.	This is how a namespace `LimitRange` actually takes effect.
`ResourceQuota`	Validating	Enforces `ResourceQuota` objects; rejects requests that would exceed a namespace’s quota.	The enforcement half of quotas; tracks usage as objects are admitted.
`ServiceAccount`	Mutating	Injects the default ServiceAccount and its API-token projection / image-pull secrets into Pods that do not specify one.	Pods would have no identity without it.
`PodSecurity`	Validating	Enforces the Pod Security Standards (privileged/baseline/restricted) per namespace via labels — the built-in PodSecurityPolicy successor.	The primary built-in workload-hardening control (see below).
`DefaultStorageClass`	Mutating	Sets the default StorageClass on a PVC that omits one.	“Why did my PVC get that storage class?” lives here.
`DefaultIngressClass`	Mutating	Sets the default IngressClass on an Ingress that omits one.	Routes Ingresses to the cluster’s default controller.
`DefaultTolerationSeconds`	Mutating	Adds default tolerations for the `not-ready` / `unreachable` node taints (300s) to Pods that lack them.	Controls how long Pods linger on failed nodes.
`TaintNodesByCondition`	Mutating	Taints new nodes based on conditions (e.g. `NotReady`) so the scheduler avoids them until ready.	Node-readiness gating.
`PersistentVolumeClaimResize`	Validating	Gates PVC expansion to StorageClasses that allow it.	Enforces `allowVolumeExpansion`.
`Priority`	Mutating + Validating	Resolves a Pod’s `priorityClassName` to its numeric `priority` value and validates it.	Makes PriorityClasses work.
`RuntimeClass`	Mutating + Validating	Applies pod overhead and scheduling constraints from a `RuntimeClass`.	Accounts for sandbox/runtime overhead.
`CertificateApproval` / `CertificateSigning` / `CertificateSubjectRestriction`	Validating	Guard the CertificateSigningRequest workflow.	Protect cluster PKI.

A few rules of thumb. MutatingAdmissionWebhook, ValidatingAdmissionWebhook and ValidatingAdmissionPolicy must be in the enabled set or your dynamic policy silently does nothing — this is a classic “my webhook never fires” cause on self-managed clusters. NamespaceLifecycle, ServiceAccount, LimitRanger, ResourceQuota and PodSecurity are effectively mandatory for a sane, secure cluster. The compiled order of the mutating built-ins is fixed and chosen so that, for example, the ServiceAccount injector runs before validation; you do not control it and should not need to.

Dynamic admission: the webhook model

A dynamic admission webhook is your own HTTPS server that the API server calls during admission. You register it by creating a MutatingWebhookConfiguration or ValidatingWebhookConfiguration object — a cluster-scoped resource that tells the API server which requests to forward and where to send them. When a matching request arrives, the API server serialises the request into an AdmissionReview JSON object, POSTs it to your endpoint over TLS, and waits for an AdmissionReview response containing the verdict (and, for mutating webhooks, a base64-encoded JSON Patch describing the changes).

The flow is identical for both types; only the response differs:

A validating webhook responds with allowed: true/false and, on denial, a status message that surfaces to the user.
A mutating webhook responds with allowed: true plus an optional patch (JSON Patch, RFC 6902) and patchType: JSONPatch. The API server applies the patch to the object.
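The two response shapes can be sketched as pure functions that take a decoded AdmissionReview request and return the response body. This assumes a CREATE or UPDATE request (so `request.object` is present) and leaves out the HTTPS server around it; the function names and the `unassigned` default are hypothetical.

```python
import base64
import json


def build_review(request_review, allowed, message=None, patch_ops=None):
    """Wrap a verdict in an AdmissionReview response envelope."""
    response = {
        "uid": request_review["request"]["uid"],  # echo the request uid
        "allowed": allowed,
    }
    if not allowed and message:
        response["status"] = {"code": 403, "message": message}
    if allowed and patch_ops:
        # The patch travels as base64-encoded JSON Patch (RFC 6902).
        raw = json.dumps(patch_ops).encode("utf-8")
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(raw).decode("ascii")
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


def _labels(request_review):
    metadata = request_review["request"]["object"].get("metadata") or {}
    return metadata.get("labels")


def validate_team_label(request_review):
    """Validating flavour: a verdict only, never a patch."""
    if "team" in (_labels(request_review) or {}):
        return build_review(request_review, True)
    return build_review(request_review, False,
                        message="metadata.labels.team is required")


def mutate_team_label(request_review):
    """Mutating flavour: always allowed, plus a patch when a change is needed."""
    labels = _labels(request_review)
    if labels is None:
        ops = [{"op": "add", "path": "/metadata/labels",
                "value": {"team": "unassigned"}}]
    elif "team" not in labels:
        ops = [{"op": "add", "path": "/metadata/labels/team",
                "value": "unassigned"}]
    else:
        ops = []
    return build_review(request_review, True, patch_ops=ops)
```

Keeping the decision logic separate from the HTTP layer like this also makes a webhook straightforward to unit-test without a cluster.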

Both configurations share almost all their fields; the only structural difference is that each webhooks[] entry in a MutatingWebhookConfiguration has mutation semantics and may set reinvocationPolicy, while a ValidatingWebhookConfiguration does not mutate and has no reinvocationPolicy. Let us go through every field, because the exam, and production, live in these details.

Every field of a webhook configuration

Here is a fully-annotated ValidatingWebhookConfiguration; a mutating one is identical except for the kind and the addition of reinvocationPolicy.

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: require-team-label
webhooks:
  - name: require-team-label.kloudvin.dev   # MUST be a fully-qualified domain-style name
    admissionReviewVersions: ["v1"]          # AdmissionReview versions this webhook understands
    sideEffects: None                        # does calling it change state outside the request?
    failurePolicy: Fail                      # what to do if the webhook is unreachable/errors
    matchPolicy: Equivalent                  # match equivalent API groups/versions, not just exact
    timeoutSeconds: 10                        # how long the API server waits (1–30)
    namespaceSelector:                        # only namespaces matching this label selector
      matchLabels:
        kloudvin.dev/governed: "true"
    objectSelector: {}                        # only objects matching this label selector
    matchConditions:                          # CEL pre-filters evaluated by the API server
      - name: exclude-kube-system
        expression: "request.namespace != 'kube-system'"
    rules:                                    # which operations/resources to intercept
      - apiGroups:   ["apps", ""]
        apiVersions: ["v1"]
        operations:  ["CREATE", "UPDATE"]
        resources:   ["deployments", "pods"]
        scope:       "Namespaced"            # Namespaced | Cluster | * (all)
    clientConfig:                             # where the API server sends the AdmissionReview
      service:                                # in-cluster Service (preferred)
        namespace: webhooks
        name: policy-webhook
        path: /validate
        port: 443
      caBundle: <base64 PEM>                  # CA that signed the webhook's serving cert

The fields, one by one:

Field	What it controls	Choices / format	Default	When to set it / gotcha
`name`	Identifier for this webhook within the configuration.	A DNS-style FQDN (e.g. `x.example.com`).	none (required)	Must be unique per configuration and must contain a dot. Used in error messages and logs.
`clientConfig.service`	The in-cluster Service to call.	`namespace`, `name`, `path`, `port`.	—	Strongly preferred over `url`; resolves through the cluster network and respects Service routing.
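`clientConfig.url`	An absolute external URL to call instead of a Service.	`https://host:port/path`.	—	For webhooks outside the cluster. Using `localhost`/`127.0.0.1` as the host is risky and non-portable, because the webhook would then have to run on every host that runs an API server. Choose one of `service` or `url`.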
`clientConfig.caBundle`	The PEM CA bundle the API server uses to verify the webhook’s TLS cert.	base64-encoded PEM.	—	If wrong/expired, every call fails TLS → behaviour decided by `failurePolicy`. The #1 cause of broken webhooks.
`rules`	Which operations on which resources trigger this webhook.	List of `{apiGroups, apiVersions, operations, resources, scope}`.	none → matches nothing useful	`"*"` is the wildcard for groups/versions/resources, while `""` in `apiGroups` means the core group. `operations` may include `CREATE`, `UPDATE`, `DELETE`, `CONNECT`, or `"*"`. Over-broad rules (`*/*/*`) are a stability risk; see below.
`rules[].scope`	Restrict to namespaced or cluster-scoped resources.	`Namespaced`, `Cluster`, `*`.	`*`	Useful to avoid firing on cluster-scoped objects you do not care about.
`failurePolicy`	What happens if the webhook errors, times out, or is unreachable.	`Fail`, `Ignore`.	`Fail`	The most consequential field. `Fail` = closed (request rejected) — secure but a down webhook can wedge the cluster. `Ignore` = open (request proceeds) — available but a policy gap. See the dedicated section.
`matchPolicy`	How `rules` match when the same resource is served under multiple API versions.	`Exact`, `Equivalent`.	`Equivalent`	`Equivalent` (the sane default) matches a request even if it arrives under a different but equivalent group/version than you listed, by converting it. `Exact` matches only the literal versions in `rules` — a foot-gun that lets requests slip past after an API version bump.
`namespaceSelector`	Restrict by namespace labels.	A label selector (`matchLabels`/`matchExpressions`).	empty = all namespaces	The clean way to scope policy to opted-in namespaces and, critically, to exclude `kube-system` so a broken webhook cannot break the control plane.
`objectSelector`	Restrict by the object’s own labels.	A label selector.	empty = all objects	Lets a sidecar-injector fire only on Pods labelled `inject=true`, sparing every other Pod the round-trip.
`matchConditions`	Fine-grained CEL pre-filters the API server evaluates before calling the webhook.	List of `{name, expression}` CEL returning bool.	none	All must be true to call the webhook. Cheaper and more expressive than selectors; e.g. skip requests from a specific ServiceAccount. (Stable since 1.30.)
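`sideEffects`	Declares whether calling this webhook has side effects outside the request, which decides how dry-run requests are handled.	`None`, `NoneOnDryRun` (`Some` and `Unknown` are accepted only by the legacy `v1beta1` API).	none (required)	`None` = the webhook never changes external state. `NoneOnDryRun` = it may have side effects, but it suppresses them when the AdmissionReview carries `dryRun: true`, so the API server still calls it on dry-run requests. A dry-run request that matches a webhook declaring `Some` or `Unknown` is failed automatically. Lying here breaks `kubectl --dry-run=server`.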
`timeoutSeconds`	How long the API server waits for a response.	1–30 seconds.	10	Keep it low (1–5s). Combined with `failurePolicy: Fail`, a slow webhook adds latency to every matching write and can stall the cluster.
`admissionReviewVersions`	Which `AdmissionReview` schema versions the webhook accepts.	Ordered list, e.g. `["v1"]`.	none (required)	The API server uses the first version both sides support. List `["v1"]` for any modern webhook.
`reinvocationPolicy` (mutating only)	Whether to call this webhook a second time if later webhooks mutated the object.	`Never`, `IfNeeded`.	`Never`	`IfNeeded` enables one extra pass; requires the webhook to be idempotent. Absent on validating configs.
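To make the `rules` semantics concrete, here is a deliberately simplified matcher. It assumes `"*"` is the wildcard and `""` names the core API group, and it ignores subresources, `scope`, and `matchPolicy` conversion; `rule_matches` is a hypothetical helper, not how the API server is implemented.

```python
def _matches(patterns, value):
    """A list matches if it contains the value itself or the "*" wildcard."""
    return "*" in patterns or value in patterns


def rule_matches(rule, group, version, resource, operation):
    """Simplified check of one rules[] entry against a request's coordinates."""
    return (
        _matches(rule["apiGroups"], group)
        and _matches(rule["apiVersions"], version)
        and _matches(rule["resources"], resource)
        and _matches(rule["operations"], operation)
    )


# The rule from the annotated configuration in this section.
RULE = {
    "apiGroups": ["apps", ""],  # "" is the core group (Pods), not a wildcard
    "apiVersions": ["v1"],
    "operations": ["CREATE", "UPDATE"],
    "resources": ["deployments", "pods"],
}

MATCH_EVERYTHING = {
    "apiGroups": ["*"],
    "apiVersions": ["*"],
    "operations": ["*"],
    "resources": ["*"],
}
```

Under this model `RULE` fires for a Pod CREATE and a Deployment UPDATE, but not for a Deployment DELETE or anything in the `batch` group, while `MATCH_EVERYTHING` fires on every write, which is the over-broad shape to avoid.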

failurePolicy: the fail-open vs fail-closed decision

failurePolicy deserves its own treatment because it is the field that turns a security control into an availability risk, or vice versa.

Value	Behaviour when the webhook is unreachable / errors / times out	Security posture	Availability posture
`Fail` (default)	The API request is rejected with a webhook-unavailable error.	Fail-closed — no object slips past while the policy engine is down.	Fragile — if the webhook is down (or scoped too broadly), legitimate writes are blocked, potentially including the very Pods that are the webhook.
`Ignore`	The API request proceeds as if the webhook had not matched.	Fail-open — a policy gap opens whenever the webhook is unavailable.	Robust — cluster keeps accepting writes regardless of webhook health.

The classic catastrophe: a failurePolicy: Fail webhook with rules matching pods cluster-wide, whose own pods then get evicted. The API server cannot admit the replacement pods because it cannot reach the (now-zero) webhook replicas — a self-inflicted, cluster-wide deadlock. The defences are all in this lesson: scope tightly with namespaceSelector to exclude kube-system and the webhook’s own namespace; use objectSelector/matchConditions to match only what you must; keep timeoutSeconds small; and run the webhook with multiple replicas and a PodDisruptionBudget. For security-critical policies you genuinely want Fail; for best-effort conveniences (a sidecar nicety), Ignore is kinder.
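The fail-open versus fail-closed behaviour reduces to a small truth table. In this sketch, `None` is a hypothetical marker for "the call itself failed" (timeout, TLS error, unreachable endpoint), as opposed to the webhook answering with a verdict.

```python
def admission_outcome(webhook_result, failure_policy="Fail"):
    """Decide the fate of a request given one webhook's result.

    webhook_result: True (allowed), False (denied), or None if the call failed.
    """
    if failure_policy not in ("Fail", "Ignore"):
        raise ValueError(f"unknown failurePolicy: {failure_policy}")
    if webhook_result is None:
        # Only a failed call consults failurePolicy.
        return "rejected" if failure_policy == "Fail" else "admitted"
    # A webhook that answered is always obeyed, whatever the policy.
    return "admitted" if webhook_result else "rejected"
```

Note the last branch: `Ignore` forgives an unreachable webhook, not an explicit denial. A webhook that responds with `allowed: false` still blocks the request.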

The mutating webhook in depth

A mutating webhook returns a JSON Patch that the API server applies to the incoming object. The response looks like this:

{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "response": {
    "uid": "<copied from the request>",
    "allowed": true,
    "patchType": "JSONPatch",
    "patch": "<base64 of: [{\"op\":\"add\",\"path\":\"/metadata/labels/team\",\"value\":\"payments\"}]>"
  }
}

Three design rules make mutating webhooks safe:

Echo the uid. The response must copy the request’s uid so the API server can correlate it. A response that omits it is rejected.
Be idempotent. Especially under reinvocationPolicy: IfNeeded, the webhook may run twice. Check before you add. A sidecar injector should look for its container by name and no-op if present.
Patch defensively. A JSON Patch add to /spec/containers/- appends a container; an add to a map key that already exists replaces it. Test patches against objects that already have the field.
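A sketch of what patching defensively can look like for a defaulting label, assuming the object always has a `metadata` map. Two details are easy to miss: a JSON Patch `add` fails when the parent map does not exist yet, and label keys containing `/` must be escaped in the path as JSON Pointer tokens (RFC 6901). The helper names are hypothetical.

```python
import base64
import json


def escape_pointer_token(token):
    """Escape one JSON Pointer reference token: '~' becomes '~0', '/' becomes '~1'."""
    return token.replace("~", "~0").replace("/", "~1")


def default_label_patch(obj, key, value):
    """Return JSON Patch ops that default a label, or [] if it is already set."""
    labels = (obj.get("metadata") or {}).get("labels")
    if labels is None:
        # No labels map yet: create it, because adding a child key would fail.
        return [{"op": "add", "path": "/metadata/labels", "value": {key: value}}]
    if key in labels:
        return []  # never overwrite a value the user chose
    path = "/metadata/labels/" + escape_pointer_token(key)
    return [{"op": "add", "path": path, "value": value}]


def encode_patch(ops):
    """Serialise ops the way the AdmissionReview 'patch' field expects."""
    return base64.b64encode(json.dumps(ops).encode("utf-8")).decode("ascii")
```

Returning an empty list when the label already exists keeps the mutation idempotent and respects user intent at the same time.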

Mutating webhooks should do the minimum necessary mutation and leave judgement to the validating phase. A common, clean split is: a mutating webhook injects defaults (a sidecar, a label, a default securityContext), and a separate validating webhook (or a ValidatingAdmissionPolicy) enforces that the result is acceptable. This keeps each webhook simple and lets validation assume mutation already happened.

The validating webhook in depth

A validating webhook is simpler: it returns a verdict and never a patch. The denial message is the field that determines whether your users curse you or thank you:

{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "response": {
    "uid": "<copied from the request>",
    "allowed": false,
    "status": {
      "code": 403,
      "message": "Pod 'web' rejected: container 'app' must set resources.limits.memory (policy: require-limits)"
    }
  }
}

A good message names the object, the exact field at fault, and the policy that fired, so the user can fix it without opening a ticket. A validating webhook can also return non-fatal warnings (surfaced by kubectl as Warning: lines) to nudge users toward better config without blocking them — the same mechanism PSA uses for its warn mode.
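As an illustration of both ideas, this sketch builds a denial that names the object, field and policy, and attaches a non-blocking warning through the response's `warnings` list. The policy itself (memory limits required, `:latest` tags discouraged) and the function names are hypothetical examples.

```python
def check_pod(pod):
    """Return (errors, warnings) for a Pod given as a dictionary."""
    name = pod.get("metadata", {}).get("name", "<unnamed>")
    errors, warnings = [], []
    for container in pod.get("spec", {}).get("containers", []):
        cname = container.get("name", "<unnamed>")
        limits = (container.get("resources") or {}).get("limits") or {}
        if "memory" not in limits:
            errors.append(
                f"Pod '{name}' rejected: container '{cname}' must set "
                "resources.limits.memory (policy: require-limits)"
            )
        if container.get("image", "").endswith(":latest"):
            warnings.append(f"container '{cname}' uses the mutable ':latest' tag")
    return errors, warnings


def build_response(uid, errors, warnings):
    """Turn check results into an AdmissionReview response."""
    response = {"uid": uid, "allowed": not errors}
    if errors:
        response["status"] = {"code": 403, "message": "; ".join(errors)}
    if warnings:
        response["warnings"] = warnings  # shown to the user, never blocking
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

Collecting every violation before responding, rather than stopping at the first, lets the user fix the whole manifest in one round trip.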

ValidatingAdmissionPolicy: webhooks without a server (CEL)

Running a webhook server is a lot of operational weight for what is often a one-line rule (“every Deployment must have ≥ 2 replicas”). You must build it, deploy it, expose it via a Service, manage its TLS cert and caBundle, keep it highly available, and accept that it adds a network round-trip and a single point of failure to every matching write. Since Kubernetes 1.30, the in-tree ValidatingAdmissionPolicy makes most validation possible without any of that. You write the rule as a CEL (Common Expression Language) expression, and the API server evaluates it itself — no server, no network call, no certificate.

There are two objects, deliberately split so one policy can be reused with different scopes:

ValidatingAdmissionPolicy — the what: the CEL validations, the resources it can apply to (matchConstraints), optional variables, matchConditions, and the failurePolicy.
ValidatingAdmissionPolicyBinding — the where: binds a policy to actual namespaces/objects via matchResources, and chooses the enforcement action (validationActions: Deny, Warn, Audit).

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: require-min-replicas
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups:   ["apps"]
        apiVersions: ["v1"]
        operations:  ["CREATE", "UPDATE"]
        resources:   ["deployments"]
  variables:
    - name: replicas
      expression: "object.spec.replicas"
  validations:
    - expression: "variables.replicas >= 2"
      message: "Deployments must run at least 2 replicas for availability."
      reason: Invalid
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: require-min-replicas-binding
spec:
  policyName: require-min-replicas
  validationActions: ["Deny"]          # Deny | Warn | Audit (Audit combines with either; Deny and Warn cannot be used together)
  matchResources:
    namespaceSelector:
      matchLabels:
        kloudvin.dev/governed: "true"
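How the binding's validationActions shape the result of a failed validation can be modelled in a few lines. This is a toy model, not the API server's code: `apply_validation_actions` and the shape of the returned dictionary are hypothetical, and it assumes that Deny and Warn may not be listed together.

```python
def apply_validation_actions(validation_failed, actions, message):
    """Model what a binding's validationActions do with one validation result."""
    chosen = set(actions)
    if not chosen or not chosen <= {"Deny", "Warn", "Audit"}:
        raise ValueError("validationActions must be a non-empty subset of Deny, Warn, Audit")
    if {"Deny", "Warn"} <= chosen:
        raise ValueError("Deny and Warn cannot be used together")
    outcome = {"allowed": True, "warnings": [], "audit": []}
    if not validation_failed:
        return outcome
    if "Deny" in chosen:
        outcome["allowed"] = False          # the request is rejected
    if "Warn" in chosen:
        outcome["warnings"].append(message)  # returned to the client as a warning
    if "Audit" in chosen:
        outcome["audit"].append(message)     # recorded with the request's audit event
    return outcome
```

A common rollout path is to bind a new policy with `["Warn", "Audit"]` first, watch what it would have blocked, and only then switch the binding to `["Deny"]`.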

CEL gives you a rich, sandboxed expression language with access to:

CEL variable	What it is
`object`	The incoming object (the new state). `null` on `DELETE`.
`oldObject`	The existing object (the prior state). `null` on `CREATE`. Use for transition rules: “you may not increase replicas beyond 10”.
`request`	The `AdmissionRequest` metadata — `request.operation`, `request.namespace`, `request.userInfo`, `request.dryRun`.
`params`	A custom resource referenced by the binding’s `paramRef`, letting one policy be parameterised per namespace.
`namespaceObject`	The full Namespace object of the request, for reading namespace labels/annotations.
`variables.<name>`	Reusable sub-expressions declared in `spec.variables`, evaluated lazily and cached.
`authorizer`	A CEL helper to make authorization checks from within the policy.

The trade-offs versus webhooks are worth memorising:

Aspect	ValidatingAdmissionPolicy (CEL)	Validating webhook
Server to run	None — evaluated in-process	Yes — you build, deploy, secure it
TLS / `caBundle`	None	Required, and the top failure cause
Latency	Negligible, in-process	A network round-trip per matching write
Availability risk	None (no external dependency)	A down/slow webhook can block writes
Expressiveness	CEL only — no I/O, no external lookups, bounded	Arbitrary code — can call other systems, do crypto (e.g. verify image signatures)
Mutation	`MutatingAdmissionPolicy` (newer; CEL `ApplyConfiguration`/JSON Patch)	Full JSON Patch

When can you not use CEL? When the decision needs information the API server does not have — verifying a container image’s cryptographic signature against a registry, calling an external policy service, or doing anything with side effects. Those still need a webhook. But for the enormous class of “this field must / must not be X relative to that field”, CEL is now the right default: less to run, nothing to break, no certificate to expire.

There is also a MutatingAdmissionPolicy (alpha in 1.32, advancing toward beta), which brings the same webhook-free, CEL-driven approach to the mutating phase — letting you set defaults and apply changes via CEL ApplyConfiguration semantics without a mutating webhook server. Together, the two CEL policy types are positioned as the long-term, server-free successors to the webhook pair for the common cases.

How PSA, Kyverno and Gatekeeper plug in

This is the question that ties the lesson to reality: the three policy tools you will actually meet all hook into the exact machinery above, but at different points.

Figure: Kubernetes admission control chain

The diagram traces a write request through authentication and authorization, then through the mutating phase (built-in mutators and mutating webhooks, with reinvocation), then through the validating phase (built-in validators, validating webhooks, and CEL policies), and finally into etcd — and shows where each of the three tools below attaches.

Pod Security Admission (PSA) is the simplest: it is a built-in validating admission controller (PodSecurity), enabled by default. There is nothing to install and no webhook. You configure it purely with namespace labels that select a Pod Security Standard level (privileged, baseline, restricted) and a mode (enforce, audit, warn):

kubectl label namespace payments \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/warn=restricted

Because it runs in the validating phase, it can only reject non-conforming Pods (those that set privileged: true, share host namespaces, run as root under restricted, and so on) — it never mutates them. It covers exactly the Pod Security Standards and nothing else; for anything beyond those fields you reach for Kyverno or Gatekeeper. PSA is covered end-to-end in Migrating to Pod Security Admission: Enforcing Baseline and Restricted Profiles Without Breaking Workloads.

Kyverno is a policy engine that installs as a set of webhook servers. When you install it, it registers its own MutatingWebhookConfiguration and ValidatingWebhookConfiguration pointing at the Kyverno deployment, and then dynamically manages those configurations based on the ClusterPolicy/Policy custom resources you create — adding the relevant rules so the API server only forwards request types some policy actually cares about. Its headline advantage is that policies are written in YAML, not a separate language, and it can do all three of validate, mutate, and generate (create companion objects, e.g. a default NetworkPolicy per new namespace) — plus image-signature verification, which needs a webhook because it calls out to a registry. It plugs into both the mutating and validating phases via its webhooks. See Deploying Kyverno: Policy-as-Code for Image Signing, Limits & Pod Security.

OPA Gatekeeper is the Kubernetes-native packaging of Open Policy Agent. It installs a validating (and optional mutating) webhook server backed by OPA, and you express policy in Rego via two CRDs: a ConstraintTemplate (the reusable Rego logic, which generates a new CRD) and Constraint objects (instances that parameterise and scope the template). The API server forwards matching requests to Gatekeeper’s webhook, which evaluates the constraints and returns allow/deny. Gatekeeper adds an audit mode that periodically re-scans existing objects against constraints — finding violations that predate the policy, which admission alone (only firing on new writes) cannot. It plugs in chiefly at the validating phase via its webhook. See OPA Gatekeeper: Policy-as-Code Admission Gating.
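The template/constraint split is easier to see in a manifest. The sketch below follows the widely used required-labels pattern; the constraint name and the team label are hypothetical, and the API versions shown assume a reasonably recent Gatekeeper release.

```yaml
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels     # the new CRD kind this template generates
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels

        # A violation is reported when any required label is missing.
        violation[{"msg": msg, "details": {"missing_labels": missing}}] {
          provided := {label | input.review.object.metadata.labels[label]}
          required := {label | label := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("you must provide labels: %v", [missing])
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: ns-must-have-team           # hypothetical name
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["team"]
```

The template carries the Rego once; each Constraint instance only supplies the scope (match) and the parameters, which is what lets one piece of logic serve many policies.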

Tool	How it attaches	Policy language	Mutate?	Generate?	Audit existing?	Best for
Pod Security Admission	Built-in `PodSecurity` validating controller	Namespace labels (no language)	No	No	No (admission-time only)	Baseline Pod hardening with zero install
ValidatingAdmissionPolicy	Built-in CEL validating controller	CEL	No (see MutatingAdmissionPolicy)	No	No (the `Audit` action only annotates audit events for admission requests)	Simple in-cluster field rules, no server
Kyverno	Its own mutating + validating webhooks	YAML	Yes	Yes	Via background scans / reports	Teams wanting YAML policy, mutation, generation, image signing
OPA Gatekeeper	Its own validating (+ mutating) webhook	Rego	Yes (mutation feature)	No	Yes (audit)	Org-wide policy with Rego, strong audit of existing objects

The decision in one breath: PSA for the Pod Security Standards (free, built-in); ValidatingAdmissionPolicy for simple field rules you want without running anything; Kyverno when you want approachable YAML policy plus mutation/generation/image-signing; Gatekeeper when your organisation has standardised on Rego and wants strong audit of pre-existing resources. All four are first-class citizens of the same admission pipeline.

Hands-on lab

We will create a local cluster, register a ValidatingAdmissionPolicy that requires every Deployment in a governed namespace to have at least two replicas, and prove that it both blocks and allows. A CEL MutatingAdmissionPolicy is out of scope for a default kind build, so we will instead use a built-in mutator to illustrate the mutating phase. Everything here is free and local; no cloud account is needed.

1. Create a cluster

kind create cluster --name admission-lab --image kindest/node:v1.30.0
kubectl cluster-info --context kind-admission-lab

kind runs a full Kubernetes 1.30 control plane in Docker, so ValidatingAdmissionPolicy is GA and enabled by default. Verify the API is up before continuing.

2. Label a governed namespace

kubectl create namespace governed
kubectl label namespace governed kloudvin.dev/governed=true

3. Apply the policy and its binding

cat <<'EOF' | kubectl apply -f -
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: require-min-replicas
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups:   ["apps"]
        apiVersions: ["v1"]
        operations:  ["CREATE", "UPDATE"]
        resources:   ["deployments"]
  validations:
    - expression: "object.spec.replicas >= 2"
      message: "Deployments in a governed namespace must run at least 2 replicas."
      reason: Invalid
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: require-min-replicas-binding
spec:
  policyName: require-min-replicas
  validationActions: ["Deny"]
  matchResources:
    namespaceSelector:
      matchLabels:
        kloudvin.dev/governed: "true"
EOF

Expected output:

validatingadmissionpolicy.admissionregistration.k8s.io/require-min-replicas created
validatingadmissionpolicybinding.admissionregistration.k8s.io/require-min-replicas-binding created

4. Prove it rejects a single-replica Deployment

kubectl -n governed create deployment nginx --image=nginx --replicas=1

Expected — the request is denied before anything is created, with your message:

error: failed to create deployment: deployments.apps "nginx" is forbidden:
ValidatingAdmissionPolicy 'require-min-replicas' with binding
'require-min-replicas-binding' denied request: Deployments in a governed
namespace must run at least 2 replicas.

5. Prove it allows a compliant Deployment

kubectl -n governed create deployment nginx --image=nginx --replicas=2
kubectl -n governed get deploy nginx

Expected:

deployment.apps/nginx created
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
nginx   2/2     2            2           15s

6. Prove the binding is scoped (control namespace is unaffected)

kubectl create namespace ungoverned
kubectl -n ungoverned create deployment nginx --image=nginx --replicas=1

The single-replica Deployment succeeds here, because ungoverned lacks the kloudvin.dev/governed=true label the binding selects on — proving that scoping is driven by the binding’s namespaceSelector, exactly as a webhook’s namespaceSelector would.

7. Observe the built-in mutating phase

Watch a built-in mutator in action: create a bare Pod and see the ServiceAccount admission controller inject the default service account and its token projection — a mutation, applied before validation.

kubectl -n governed run probe --image=busybox --restart=Never --command -- sleep 3600
kubectl -n governed get pod probe -o jsonpath='{.spec.serviceAccountName}{"\n"}'
kubectl -n governed get pod probe -o jsonpath='{.spec.volumes[*].name}{"\n"}'

Expected — you never set these; the mutating phase did:

default
kube-api-access-xxxxx

8. Cleanup

kubectl delete validatingadmissionpolicybinding require-min-replicas-binding
kubectl delete validatingadmissionpolicy require-min-replicas
kubectl delete namespace governed ungoverned
kind delete cluster --name admission-lab

Cost note: entirely free. kind runs in local Docker; nothing is provisioned in any cloud, so there is nothing to bill and nothing to leak. Deleting the cluster reclaims all resources.

Common mistakes & troubleshooting

Symptom	Likely cause	Fix
Webhook never fires; objects pass unchecked	The `MutatingAdmissionWebhook` / `ValidatingAdmissionWebhook` plugin is disabled, or `rules` do not match the resource/operation, or `matchPolicy: Exact` misses a different API version	Confirm the plugins are enabled; widen/correct `rules`; prefer `matchPolicy: Equivalent`
All writes suddenly rejected with “webhook unavailable”	`failurePolicy: Fail` webhook is down, unreachable, or matched too broadly (including its own namespace)	Restore the webhook; scope with `namespaceSelector` to exclude `kube-system` and the webhook’s own namespace; consider `Ignore` for non-critical policies
TLS / x509 errors calling the webhook	`clientConfig.caBundle` is wrong, missing, or the serving cert expired/rotated	Regenerate the cert, re-encode the CA into `caBundle` (cert-manager’s CA injector automates this)
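`kubectl --dry-run=server` triggers real side effects	`sideEffects: None` declared on a webhook that does mutate external state	Make the webhook honour `dryRun: true` in the AdmissionReview request and declare `sideEffects: NoneOnDryRun`; the API server still calls it on dry-run and relies on it to suppress the side effects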
Sidecar injected twice / label set twice	Non-idempotent mutating webhook under `reinvocationPolicy: IfNeeded` (or two webhooks both injecting)	Make the webhook check-before-add; ensure idempotency
Every write is slow	High `timeoutSeconds` plus a sluggish webhook on a broad `rules` match	Lower `timeoutSeconds` (1–5s), narrow `rules`, add `objectSelector`/`matchConditions`, scale the webhook
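CEL policy applies everywhere, or never fires	The binding omits `matchResources`, so the policy applies wherever its `matchConstraints` match (scope and `validationActions` belong on the binding, not the policy); or no binding exists, so the policy is never enforced	Scope via `matchResources` on the `ValidatingAdmissionPolicyBinding`; a policy with no binding does nothing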
Policy passes objects that violate it after an upgrade	`matchConstraints`/`rules` list a now-stale `apiVersions`	Use `Equivalent` matching / wildcards and keep version lists current

Best practices

Validate in CEL, mutate only when you must. Reach for ValidatingAdmissionPolicy before standing up a webhook server; you remove a network hop, a certificate, and a single point of failure.
Scope every webhook tightly. Always set a namespaceSelector (or matchConditions) that excludes kube-system and the webhook’s own namespace; use objectSelector so you only intercept objects you actually act on.
Choose failurePolicy deliberately, per policy. Fail for security-critical gates, Ignore for best-effort conveniences. Never default everything to Fail and forget about it.
Keep timeouts low. 1–5 seconds. Admission latency is added to every matching write.
Make mutating webhooks idempotent. Assume they run twice; check before you add.
Run webhooks for availability. Multiple replicas, a PodDisruptionBudget, anti-affinity, and health probes — a webhook on the write path is as critical as the API server it gates.
Declare sideEffects honestly so dry-run and diff work.
Pin admissionReviewVersions: ["v1"] and use apiVersion: admissionregistration.k8s.io/v1.
Layer, do not duplicate. PSA for the Pod Security Standards, a policy engine (Kyverno/Gatekeeper) or CEL for the rest. Avoid two tools enforcing the same rule with conflicting messages.
Test policies in Warn/Audit before Deny. Measure blast radius on real traffic before you turn a gate on.
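Several of these practices can be read off a single webhook registration. The following is a sketch only: the webhook name, the policy-system namespace, the policy-webhook Service, and the /validate path are all hypothetical, and the caBundle is left out on the assumption that tooling such as cert-manager's CA injector supplies it.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: example-policy-webhook            # hypothetical name
webhooks:
  - name: validate.pods.example.com       # hypothetical name
    admissionReviewVersions: ["v1"]       # pin the review version
    sideEffects: None                     # declared honestly: no external state is touched
    failurePolicy: Fail                   # chosen deliberately for a security-critical gate
    matchPolicy: Equivalent               # match equivalent API versions too
    timeoutSeconds: 3                     # keep admission latency low
    clientConfig:
      service:
        namespace: policy-system          # hypothetical namespace
        name: policy-webhook              # hypothetical Service
        path: /validate
        port: 443
      # caBundle omitted here; assumed to be injected by certificate tooling.
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
        scope: Namespaced
    # Never intercept kube-system or the webhook's own namespace.
    namespaceSelector:
      matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values: ["kube-system", "policy-system"]
    # CEL pre-filter evaluated by the API server before any network call.
    matchConditions:
      - name: exclude-node-requests
        expression: '!("system:nodes" in request.userInfo.groups)'
```

The namespaceSelector relies on the kubernetes.io/metadata.name label that the API server sets on every namespace, so the exclusion needs no manual labelling.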

Security notes

Admission control is a security mechanism, so its own failure modes are security-relevant. Treat the webhook server as part of the control plane’s trust boundary: it can mutate any object it intercepts, so a compromised mutating webhook can inject a malicious sidecar or strip a securityContext from every Pod. Lock down who can create or edit MutatingWebhookConfiguration/ValidatingWebhookConfiguration objects via RBAC. The permissions to guard are the write verbs (create, update, patch, delete) on resources in the admissionregistration.k8s.io API group, because anyone who can register a webhook can intercept and rewrite cluster writes. Remember that admission runs after authorization, so it is a complement to RBAC, never a replacement: never rely on a webhook to enforce something RBAC should deny, because an unauthorized caller never reaches admission. Prefer failurePolicy: Fail for controls that block dangerous configurations (you want them closed when the engine is down), but pair that with high availability so “closed” does not become “cluster wedged”. Keep webhook serving certificates short-lived and automatically rotated, and verify CEL policies cannot be bypassed by submitting the resource under an alternative API version (matchPolicy: Equivalent, wildcard versions). Finally, for image-provenance enforcement (signature verification), use a tool that does it in a webhook (Kyverno, Sigstore policy-controller, Gatekeeper with an external data provider). CEL cannot reach out to a registry, so “only signed images” is not a job for ValidatingAdmissionPolicy.
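One way to apply the RBAC point is to give most operators read-only visibility into admission configuration and reserve write verbs for a small, audited group. The ClusterRole below is a sketch under that assumption; its name is hypothetical and the resource list is illustrative, not exhaustive.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: admission-config-viewer     # hypothetical name
rules:
  - apiGroups: ["admissionregistration.k8s.io"]
    resources:
      - mutatingwebhookconfigurations
      - validatingwebhookconfigurations
      - validatingadmissionpolicies
      - validatingadmissionpolicybindings
    # Read-only: create, update, patch and delete are deliberately absent.
    verbs: ["get", "list", "watch"]
```

To check who holds the dangerous permission, a query such as kubectl auth can-i create mutatingwebhookconfigurations --as=<user> can be run for each identity under review.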

Interview & exam questions

Where does admission control sit on the API request path, and what runs before and after it? After authentication and authorization, and after the request is decoded — but before persistence to etcd. Order: authn → authz → mutating admission → schema validation → validating admission → etcd (with quota around persistence). It only runs on write-shaped verbs (CREATE/UPDATE/DELETE/CONNECT), never on reads.
Why does the mutating phase always run before the validating phase? So validation can judge the final object. Mutation can add or default fields; if validation ran first or interleaved, no validating policy could be sure it was evaluating the object that will actually be stored. The phase ordering (all mutation, then all validation) is the only ordering guarantee you get.
What is reinvocation and why must mutating webhooks be idempotent? With reinvocationPolicy: IfNeeded, the API server makes one extra pass over mutating webhooks if a later webhook changed the object after a given webhook ran — so a webhook can be called twice on the same request. If it is not idempotent (e.g. it appends a sidecar without checking), it duplicates its effect on the second pass.
Explain failurePolicy: Fail vs Ignore and the danger of Fail. Fail (default) rejects the request if the webhook is unreachable or errors — fail-closed, secure, but a down or over-broad webhook can block legitimate writes, including its own replacement pods (cluster deadlock). Ignore lets the request proceed — fail-open, available, but opens a policy gap while the webhook is down.
What do namespaceSelector, objectSelector and matchConditions each scope, and why prefer them? namespaceSelector filters by namespace labels; objectSelector by the object’s own labels; matchConditions are CEL pre-filters the API server evaluates before calling the webhook. They reduce blast radius and load — critically, excluding kube-system and the webhook’s own namespace prevents self-inflicted outages.
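What does sideEffects declare, and what breaks if it is wrong? Whether the webhook changes state outside the admitted object, and whether it behaves safely on dry-run. None = the webhook has no side effects at all; NoneOnDryRun = the webhook may have side effects, but it honours dryRun: true in the AdmissionReview request and suppresses them. The API server calls the webhook in both cases; in admissionregistration.k8s.io/v1 these are the only two accepted values. Declaring None on a webhook that mutates external state makes kubectl --dry-run=server cause real changes.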
matchPolicy: Exact vs Equivalent — which is the default and why does it matter? Equivalent is the default and the safe choice: it matches a request even if it arrives under a different-but-equivalent API group/version than your rules list, by converting it. Exact matches only the literal versions listed, so requests can slip past a policy after an API version bump — a real security foot-gun.
What is a ValidatingAdmissionPolicy and when do you use it over a webhook? An in-tree (GA since 1.30), CEL-based validating policy evaluated inside the API server — no server, no TLS cert, no network hop, no availability risk. Use it for field-level rules (“replicas ≥ 2”, “image not :latest”, transition rules via oldObject). Use a webhook when you need I/O the API server cannot do — image-signature verification, external lookups, side effects.
What is the role of the ValidatingAdmissionPolicyBinding, and what are validationActions? The binding attaches a policy to specific namespaces/objects (matchResources) and selects the enforcement action: Deny, Warn, or Audit (combinable). The policy is the reusable logic; the binding is the scope and the verdict behaviour. A policy with no binding does nothing.
How do PSA, Kyverno and Gatekeeper each plug into admission? PSA is a built-in validating controller configured by namespace labels — no install, validate-only, Pod Security Standards only. Kyverno installs its own mutating and validating webhooks and manages them from YAML ClusterPolicy/Policy resources; it can validate, mutate, generate, and verify image signatures. Gatekeeper installs a validating (and optional mutating) webhook backed by OPA, with policy in Rego via ConstraintTemplate/Constraint, and adds periodic audit of existing objects.
Why can’t admission control replace RBAC? Admission runs after authorization, so an unauthorized caller never reaches it. RBAC decides whether the verb is allowed at all; admission decides whether the specific object is acceptable. They are complementary layers, and security-critical “this caller may not do X” belongs in RBAC.
A failurePolicy: Fail Pod webhook took down the cluster after a node drain. Why, and how do you prevent it? The webhook’s own pods were evicted; with Fail, the API server could not admit replacement pods because it could not reach the (zero-replica) webhook — deadlock. Prevent it by excluding kube-system and the webhook’s namespace via namespaceSelector, running multiple replicas with a PodDisruptionBudget, keeping timeoutSeconds low, and using Ignore for non-critical policies.
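As an illustration of the transition rules mentioned above, the sketch below uses oldObject to make a label immutable once set. The policy and binding names and the owner label are hypothetical; the namespace label is the one used in the hands-on lab. It matches UPDATE only, because oldObject is null on CREATE.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: owner-label-immutable             # hypothetical name
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups:   ["apps"]
        apiVersions: ["v1"]
        operations:  ["UPDATE"]           # oldObject is only populated on UPDATE/DELETE
        resources:   ["deployments"]
  validations:
    # Pass if the old object had no "owner" label,
    # or if the new object keeps the same value.
    - expression: >-
        !has(oldObject.metadata.labels) ||
        !("owner" in oldObject.metadata.labels) ||
        (has(object.metadata.labels) &&
        "owner" in object.metadata.labels &&
        object.metadata.labels["owner"] == oldObject.metadata.labels["owner"])
      message: "The owner label cannot be changed or removed once set."
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: owner-label-immutable-binding     # hypothetical name
spec:
  policyName: owner-label-immutable
  validationActions: ["Deny"]
  matchResources:
    namespaceSelector:
      matchLabels:
        kloudvin.dev/governed: "true"
```

The has() and in guards matter: without them the expression would raise an evaluation error on objects with no labels, and under failurePolicy: Fail that error would itself reject the request.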

Quick check

On which API verbs does admission control run, and on which does it not?
True or false: a validating webhook can modify the incoming object.
What is the default value of failurePolicy, and what does it do?
Which CEL variable holds the previous state of an object, for transition rules?
Which of PSA, Kyverno, and Gatekeeper requires no installation because it is built into the API server?

Answers

It runs on write-shaped verbs — CREATE, UPDATE, DELETE, and CONNECT (sub-resource connections like pods/exec). It does not run on reads (GET, LIST, WATCH).
False. Validating webhooks can only accept or reject and return a message; only mutating webhooks return a patch that changes the object.
The default is Fail (fail-closed): if the webhook is unreachable or errors, the API request is rejected.
oldObject — it is null on CREATE and holds the prior state on UPDATE/DELETE.
Pod Security Admission — it is the built-in PodSecurity validating controller, enabled by default and configured purely with namespace labels.

Exercise

On a local kind cluster running Kubernetes 1.30+, build a small policy suite and prove each behaviour:

Write a ValidatingAdmissionPolicy that denies any Pod or Deployment whose container image uses the :latest tag (hint: CEL endsWith, iterate object.spec.template.spec.containers for Deployments and object.spec.containers for Pods — you may use two policies or a variables block to normalise). Bind it with validationActions: ["Warn", "Audit"] first.
Apply a Deployment using nginx:latest and confirm you get a warning but the object is created (because the action is Warn/Audit, not Deny).
Flip the binding to validationActions: ["Deny"], re-apply, and confirm the same Deployment is now rejected with your message.
Add a matchConditions (or namespaceSelector) so the policy applies only to a governed namespace, and prove an identical Deployment in a different namespace is unaffected.
Stretch: add a transition rule using oldObject that forbids increasing a Deployment’s replica count above 10 on UPDATE, and prove it blocks a scale-up from 8 to 12 while allowing 8 to 9.

Write down, for each step, which phase fired and why — and which of these you could not have done in CEL (answer: none of them; they are all in-cluster field comparisons, which is exactly CEL’s sweet spot).

Certification mapping

CKS (Certified Kubernetes Security Specialist) — Cluster Setup and Cluster Hardening: admission control is a core CKS domain. Expect to enable/inspect admission plugins, configure Pod Security Admission, and reason about webhook-based policy (Kyverno/Gatekeeper/OPA) and ValidatingAdmissionPolicy. The failurePolicy, scoping, and “admission complements RBAC” points are prime exam material.
CKA (Certified Kubernetes Administrator) — Cluster Architecture, Installation & Configuration: understanding the request path (authn → authz → admission), the role of built-in controllers (ResourceQuota, LimitRanger, NamespaceLifecycle, ServiceAccount), and how ResourceQuota/LimitRange are enforced via admission.

Glossary

Admission controller — code in the API server pipeline that intercepts authenticated, authorized write requests to mutate or validate the object before it is persisted.
Mutating admission — the phase/controllers that can modify the incoming object.
Validating admission — the phase/controllers that can only accept or reject the object.
Built-in (compiled-in) controller — an admission controller shipped inside kube-apiserver, toggled with --enable-admission-plugins / --disable-admission-plugins.
Dynamic admission control — pluggable admission added at runtime via webhook configurations or CEL policy objects, without recompiling the API server.
AdmissionReview — the JSON envelope the API server sends to (and receives from) a webhook, carrying the request and the response verdict/patch.
MutatingWebhookConfiguration / ValidatingWebhookConfiguration — cluster-scoped objects registering webhook endpoints, their match rules, and their behaviour.
failurePolicy — Fail (reject on webhook error, fail-closed) or Ignore (proceed, fail-open).
matchPolicy — Equivalent (match across equivalent API versions, default) or Exact (literal versions only).
sideEffects — declares whether a webhook changes external state: None (it never does) or NoneOnDryRun (it may, but it suppresses those side effects when the request carries dryRun: true).
reinvocationPolicy — IfNeeded allows one extra pass over a mutating webhook if later webhooks changed the object; requires idempotency.
matchConditions — CEL expressions the API server evaluates to decide whether to call a webhook (or apply a policy) at all.
CEL (Common Expression Language) — the sandboxed, non-Turing-complete expression language used by ValidatingAdmissionPolicy/MutatingAdmissionPolicy.
ValidatingAdmissionPolicy / ValidatingAdmissionPolicyBinding — in-tree, server-free CEL validation: the policy (logic) and the binding (scope + Deny/Warn/Audit action).
Pod Security Admission (PSA) — the built-in PodSecurity validating controller enforcing the Pod Security Standards via namespace labels.
Pod Security Standards — the three profiles (privileged, baseline, restricted) PSA enforces.
Kyverno — a policy engine that installs as webhook servers and enforces validate/mutate/generate policies written in YAML.
OPA Gatekeeper — the Kubernetes integration of Open Policy Agent: a webhook plus ConstraintTemplate/Constraint CRDs, with policy in Rego and audit of existing objects.

Next steps

Continue the course with Kubernetes CRDs, Controllers & the Operator Pattern, In Depth (Fundamentals) — admission webhooks and operators are the two halves of extending the API server, and CRDs underpin Kyverno and Gatekeeper policy objects.
Put admission into practice with policy-as-code: Deploying Kyverno: Policy-as-Code for Image Signing, Limits & Pod Security and OPA Gatekeeper: Policy-as-Code Admission Gating.
Master the built-in path with Migrating to Pod Security Admission: Enforcing Baseline and Restricted Profiles Without Breaking Workloads.
Revisit where this layer lives in the bigger picture via the StatefulSets deep dive’s companion, Kubernetes StatefulSets, In Depth: Stable Identity, Ordered Lifecycle & Per-Pod Storage.