Deploy Kyverno Policies to Enforce Image Signing, Resource Limits, and Pod Security

A payments platform team gets the finding back from their first real supply-chain audit: anyone with kubectl apply can run :latest from an arbitrary public registry, half the pods have no CPU/memory limits so one bad deploy noisy-neighbours an entire node, and a third of workloads run as root with hostPath mounts. The CISO’s instruction is blunt — “nothing runs in production unless it is our signed image, it stays inside its limits, and it cannot get root on the node.” You can chase that with code review and good intentions, or you can make the cluster itself refuse the bad manifest at the API server. This guide does the latter with Kyverno, the Kubernetes-native policy engine, enforcing three controls as a single admission gate: image signature verification (Cosign), resource limits (mutate + validate), and the restricted Pod Security Standard. Every command below is real and runnable against any conformant cluster (AKS, EKS, GKE, or vanilla).

Prerequisites

A Kubernetes cluster, v1.27+ (Kyverno 1.13 tracks recent APIs), with cluster-admin via kubectl.
helm 3.12+, kubectl, cosign 2.x, and jq on your workstation.
An OCI registry you control (GHCR, ECR, ACR, Artifact Registry). Examples use ghcr.io/kloudvin.
A CI system that builds and signs images — examples use GitHub Actions; Jenkins works identically with the Cosign CLI.
Cluster egress (or a mirror) to ghcr.io/kyverno for the controller images.
Optional but assumed in the operating model: HashiCorp Vault (or a KMS) holding the Cosign private key, Argo CD for GitOps delivery of policies, and Wiz / a SIEM consuming policy reports.

Target topology

Figure: target topology (Deploy Kyverno Policies to Enforce Image Signing, Resource Limits, and Pod Security)

Kyverno installs as a set of controllers in the kyverno namespace and registers two kinds of webhook with the API server: ValidatingWebhookConfigurations (deny on policy violation) and MutatingWebhookConfigurations (inject defaults, verify-and-rewrite image digests). Every CREATE/UPDATE of a Pod-bearing resource flows API server → Kyverno admission controller → your ClusterPolicy rules → allow / mutate / deny. A separate reports controller writes PolicyReport objects continuously so you have a posture view even for resources admitted before a policy existed. Three independent control planes feed in:

CI (GitHub Actions / Jenkins) builds the image and signs it with Cosign, whose private key is issued just-in-time from HashiCorp Vault (or keyless signing via OIDC) — so the signing secret never lives in a runner.
GitOps (Argo CD) delivers the Kyverno policies themselves from a Git repo, making the policy set auditable and revertable, and Terraform/Ansible provisions the cluster add-on and namespace labels underneath.
Posture/SOC: Wiz correlates the admission policies with cloud posture and flags drift, PolicyReports stream to Datadog/Dynatrace dashboards, and a hard denial can open a ServiceNow ticket. CrowdStrike Falcon stays on the nodes for runtime defense — Kyverno gates admission, Falcon watches what runs after.

1. Install Kyverno

Install via the official Helm chart. Run admission in high availability (3 replicas) for any cluster that matters — a single Kyverno pod is a single point of admission failure.

helm repo add kyverno https://kyverno.github.io/kyverno/
helm repo update

helm install kyverno kyverno/kyverno \
  --namespace kyverno --create-namespace \
  --version 3.3.4 \
  --set admissionController.replicas=3 \
  --set backgroundController.replicas=2 \
  --set reportsController.replicas=2 \
  --set cleanupController.replicas=2

Wait for the controllers and confirm the webhooks registered:

kubectl -n kyverno rollout status deploy/kyverno-admission-controller
kubectl get pods -n kyverno
kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations | grep kyverno

A critical safety setting before you write any policy: decide what happens if Kyverno itself is down. The default failurePolicy: Fail means admission requests are rejected when the webhook is unreachable — safe, but it can wedge a cluster. Set it deliberately per policy (below). Also confirm Kyverno excludes its own and system namespaces so you cannot deadlock the control plane:

kubectl get configmap kyverno -n kyverno -o jsonpath='{.data.webhooks}' ; echo
# Expect kube-system / kyverno excluded by namespaceSelector
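To see which failure mode each registered webhook actually carries, rather than inferring it from policy YAML, a small helper along these lines can help. It is a sketch: the function name is made up for illustration, and it assumes Kyverno's webhook configurations have "kyverno" in their names, as the grep earlier in this step does.

```shell
#!/usr/bin/env bash
# kyverno_failure_policies (hypothetical helper): print "<webhook name><TAB><failurePolicy>"
# for every webhook in every webhook configuration whose name contains "kyverno".
kyverno_failure_policies() {
  local kind cfg
  for kind in validatingwebhookconfigurations mutatingwebhookconfigurations; do
    # "-o name" yields type/name references that can be passed straight back to kubectl get
    kubectl get "$kind" -o name | grep kyverno | while read -r cfg; do
      kubectl get "$cfg" \
        -o jsonpath='{range .webhooks[*]}{.name}{"\t"}{.failurePolicy}{"\n"}{end}'
    done
  done
}
```

Call kyverno_failure_policies once after the install and again after applying policies, since the list of webhooks can change as policies are added. Any line ending in Fail is a webhook that rejects requests while Kyverno is unreachable.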

2. Set up Cosign signing in CI

Image-signature enforcement is worthless if your own images are unsigned, so build the signing side first. Generate a key pair, or — preferred — use keyless signing where Cosign gets a short-lived certificate from Fulcio bound to your CI’s OIDC identity, leaving no long-lived key to leak.

Key-based, with the private key stored in HashiCorp Vault (never in the repo or a plain CI secret):

# One-time: generate and push the public half to the registry/Git; private half to Vault
cosign generate-key-pair
vault kv put secret/cosign/payments cosign.key=@cosign.key password='<passphrase>'
shred -u cosign.key            # do not keep the private key on disk

The CI job pulls the key from Vault at build time and signs the digest (never a tag):

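# .github/workflows/build-sign.yml  (GitHub Actions)
on: push
permissions:
  contents: read
  packages: write          # push the image and its signature to GHCR
  id-token: write          # required for keyless / Vault OIDC auth
jobs:
  build-sign:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v6
        id: build
        with: { push: true, tags: "ghcr.io/kloudvin/api:${{ github.sha }}" }
      - uses: sigstore/cosign-installer@v3
      # Option A: keyless (recommended). Identity is the GitHub OIDC token
      - run: |
          cosign sign --yes \
            "ghcr.io/kloudvin/api@${{ steps.build.outputs.digest }}"
      # Option B: key read from the Vault KV path written above. Assumes the job has
      # already authenticated to Vault; a hashivault:// reference would instead
      # require the key to live in Vault's Transit engine, not KV.
      # - run: |
      #     export COSIGN_PRIVATE_KEY="$(vault kv get -field=cosign.key secret/cosign/payments)"
      #     export COSIGN_PASSWORD="$(vault kv get -field=password secret/cosign/payments)"
      #     cosign sign --yes --key env://COSIGN_PRIVATE_KEY \
      #       "ghcr.io/kloudvin/api@${{ steps.build.outputs.digest }}"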

Verify locally so you know the exact identity strings the cluster policy must match:

cosign verify \
  --certificate-identity-regexp "https://github.com/kloudvin/.+" \
  --certificate-oidc-issuer "https://token.actions.githubusercontent.com" \
  ghcr.io/kloudvin/api@<digest> | jq '.[0].optional.Subject'
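Before copying the subject into a policy, it can help to test candidate patterns against the string cosign printed. The sketch below uses shell globbing as a stand-in for Kyverno's wildcard matching, which is an approximation rather than the engine's own matcher. The function name and the sample subject are hypothetical, although the subject follows the workflow-reference shape that GitHub Actions certificates use.

```shell
#!/usr/bin/env bash
# subject_matches PATTERN SUBJECT (hypothetical helper)
# Returns 0 when SUBJECT matches the glob PATTERN ('*' matches any run of characters).
subject_matches() {
  local pattern=$1 subject=$2
  # $pattern is left unquoted on purpose so the shell treats it as a glob.
  case "$subject" in
    $pattern) return 0 ;;
    *)        return 1 ;;
  esac
}

# Illustrative subject only; use the value printed by cosign verify.
subject='https://github.com/kloudvin/api/.github/workflows/build-sign.yml@refs/heads/main'

subject_matches 'https://github.com/kloudvin/*' "$subject" \
  && echo "org-wide pattern: match"
subject_matches 'https://github.com/kloudvin/api/*@refs/heads/release' "$subject" \
  || echo "release-only pattern: no match"
```

The org-wide pattern accepts any workflow on any branch in the organisation. Tightening it to a repository, workflow file, and ref narrows who can produce an acceptable signature, at the cost of more patterns to maintain.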

3. Enforce image signatures with verifyImages

Now the gate. This ClusterPolicy uses Kyverno’s verifyImages rule to require a valid Cosign signature for any image from your registry. Start in Audit so you can see the blast radius before you block anything.

# policies/verify-images.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-signed-images
  annotations:
    policies.kyverno.io/severity: high
spec:
  validationFailureAction: Audit        # flip to Enforce in step 7
  failurePolicy: Fail
  webhookTimeoutSeconds: 30             # signature checks are slower than plain validation
  background: false                     # verifyImages cannot run as a background scan
  rules:
    - name: verify-ghcr-cosign-keyless
      match:
        any:
          - resources:
              kinds: [Pod]
      verifyImages:
        - imageReferences:
            - "ghcr.io/kloudvin/*"      # only OUR registry; pin public ones separately
          failureAction: Audit
          mutateDigest: true            # rewrite the verified tag to an immutable @sha256
          required: true
          attestors:
            - count: 1
              entries:
                - keyless:
                    subject: "https://github.com/kloudvin/*"
                    issuer: "https://token.actions.githubusercontent.com"
                    rekor:
                      url: https://rekor.sigstore.dev

If you signed with a Vault/KMS key instead of keyless, swap the attestor entry for the public key:

              entries:
                - keys:
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      MFkwEwYHKoZIzj0CAQ...your cosign.pub...
                      -----END PUBLIC KEY-----
                    rekor:
                      url: https://rekor.sigstore.dev

Apply it and watch the reports:

kubectl apply -f policies/verify-images.yaml
kubectl get clusterpolicy require-signed-images
kubectl get policyreport -A | head        # PASS/FAIL counts per namespace

mutateDigest: true is doing quiet, important work: once verified, Kyverno rewrites :tag to the pinned @sha256:... digest in the pod spec, so what runs is provably the bytes you signed — closing the tag-mutation window where an attacker re-pushes a tag after verification.

4. Mutate in default resource limits

A pod with no limits can starve a node. Use a mutate rule to inject sane defaults when the author omits them — non-destructive, and far better adoption than rejecting every under-specified deployment on day one.

# policies/default-resources.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-resources
spec:
  rules:
    - name: set-default-requests-limits
      match:
        any:
          - resources:
              kinds: [Pod]
      mutate:
        foreach:
          - list: "request.object.spec.containers"
            patchStrategicMerge:
              spec:
                containers:
                  - name: "{{ element.name }}"
                    resources:
                      requests:
                        +(memory): "128Mi"     # +(...) = add only if absent
                        +(cpu): "100m"
                      limits:
                        +(memory): "512Mi"
                        +(cpu): "500m"

The +(...) anchor means Kyverno only adds the field if it is missing — it never overwrites an explicit value the author set on purpose.

5. Require resource limits with validate

Defaulting is a safety net, not a rule. Pair it with a validate rule so a container that explicitly omits limits in a namespace you care about is rejected outright — defence in depth against someone setting limits: null.

# policies/require-limits.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Audit        # flip to Enforce in step 7
  background: true
  rules:
    - name: require-cpu-mem-limits
      match:
        any:
          - resources:
              kinds: [Pod]
      validate:
        message: "CPU and memory limits are required on every container."
        foreach:
          - list: "request.object.spec.containers"
            deny:
              conditions:
                any:
                  - key: "{{ element.resources.limits.memory || '' }}"
                    operator: Equals
                    value: ""
                  - key: "{{ element.resources.limits.cpu || '' }}"
                    operator: Equals
                    value: ""

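Order matters: the API server calls mutating admission webhooks before validating ones, so the step-4 defaults are applied first and only a container that still lacks limits after mutation (one that could not be defaulted, or one admitted while the mutate policy is absent) trips this deny.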

6. Enforce restricted Pod Security

PodSecurityPolicy was removed in Kubernetes v1.25, so on the v1.27+ clusters this guide targets, use Kyverno’s podSecurity subrule instead; it maps directly to the upstream Pod Security Standards. This single rule enforces the entire restricted profile: no root, no privilege escalation, dropped capabilities, seccomp, no host namespaces.

# policies/pod-security-restricted.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: pod-security-restricted
spec:
  validationFailureAction: Audit        # flip to Enforce in step 7
  background: true
  rules:
    - name: restricted-profile
      match:
        any:
          - resources:
              kinds: [Pod]
      validate:
        podSecurity:
          level: restricted
          version: latest
          # Targeted, auditable exemptions instead of a blanket opt-out:
          exclude:
            - controlName: "Capabilities"
              images: ["ghcr.io/kloudvin/net-tools:*"]

Why Kyverno over the built-in Pod Security Admission: PSA applies a fixed level per namespace, its exemptions are limited to users, runtime classes, and namespaces, and it cannot mutate or report centrally. Kyverno gives you per-image, per-control exemptions, the same PolicyReport stream as your other controls, and a single place for security to review. Apply all the policies through Argo CD rather than kubectl in production so the policy set is the Git-tracked source of truth:

kubectl apply -f policies/        # or sync the Argo CD Application
kubectl get cpol                  # all four ClusterPolicies, READY=true

7. Promote from Audit to Enforce

Never go straight to Enforce on a live cluster. Run in Audit, read the reports, fix the offenders, then flip. Find what would be blocked:

# Aggregate failing rules across the cluster
kubectl get policyreport -A -o json \
  | jq -r '.items[].results[] | select(.result=="fail")
      | "\(.policy)/\(.rule)\t\(.resources[0].namespace)/\(.resources[0].name)"' \
  | sort | uniq -c | sort -rn
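To turn "down to known exemptions" into something a pipeline can check, one option is to compare the failing results against an explicit exemption list. This is a minimal sketch: the function name and the exemptions file (one namespace/name per line) are hypothetical, and the input is the tab-separated "policy/rule, namespace/name" lines that the jq expression above emits before sort and uniq.

```shell
#!/usr/bin/env bash
# unexpected_failures EXEMPT_FILE (hypothetical helper)
# stdin:  lines of "policy/rule<TAB>namespace/name"
# stdout: the lines whose namespace/name is NOT listed in EXEMPT_FILE
# exit:   1 if any such line exists, 0 otherwise
unexpected_failures() {
  local exempt_file=$1
  awk -F'\t' -v exempt_file="$exempt_file" '
    BEGIN {
      # Load the exemption list once, skipping blank lines
      while ((getline line < exempt_file) > 0)
        if (line != "") exempt[line] = 1
    }
    NF && !($2 in exempt) { print; bad = 1 }
    END { exit (bad ? 1 : 0) }
  '
}

# Demo with made-up data: one exempted workload, one real offender
exempt="$(mktemp)"
printf '%s\n' 'platform/net-tools' > "$exempt"
printf 'pod-security-restricted/restricted-profile\tplatform/net-tools\nrequire-resource-limits/require-cpu-mem-limits\tpayments/api\n' \
  | unexpected_failures "$exempt" || echo "unexpected failures remain; stay in Audit"
rm -f "$exempt"
```

Keeping the exemption file in Git next to the policies means every exception is reviewed the same way a policy change is.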

Once the failures are down to known exemptions, flip each policy to enforcing:

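for p in require-signed-images require-resource-limits pod-security-restricted; do
  kubectl patch cpol "$p" --type merge -p '{"spec":{"validationFailureAction":"Enforce"}}'
done
# The verifyImages entry has its own failureAction; flip that as well:
kubectl patch cpol require-signed-images --type json \
  -p '[{"op":"replace","path":"/spec/rules/0/verifyImages/0/failureAction","value":"Enforce"}]'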

failurePolicy is set per policy, not per namespace. Keep failurePolicy: Fail on policies that guard high-risk workloads; for policies that match application namespaces during rollout, Ignore avoids an outage if Kyverno blips. Make that choice consciously for each policy.

Validation

Prove the gate works with deliberately bad pods. Once policies are enforcing, cases 1 and 3 must be rejected, case 2 must come back with defaults injected, and case 4 must be admitted:

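# 1. Unsigned image from our registry -> blocked by verifyImages
#    (unsigned-test is a placeholder for an image you pushed without signing;
#    nginx:latest would not match imageReferences, so verifyImages would skip it)
kubectl run bad-unsigned --image=ghcr.io/kloudvin/unsigned-test:latest
# Error: ... require-signed-images: image is not signed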

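# 2. Signed image with NO limits -> defaulted by mutate
#    (kubectl run sets no securityContext, so run this while pod-security-restricted
#    is still in Audit, or the restricted profile denies the pod first)
kubectl run noreq --image=ghcr.io/kloudvin/api@<digest> --dry-run=server -o yaml \
  | grep -A6 resources                      # see injected requests/limits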

# 3. Root / privileged pod -> blocked by restricted profile
kubectl run rooty --image=ghcr.io/kloudvin/api@<digest> \
  --privileged --dry-run=server
# Error: ... pod-security-restricted: privileged containers are not allowed

# 4. A correctly signed, limited, non-root pod -> ADMITTED
kubectl apply -f tests/good-pod.yaml        # should succeed
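The guide references tests/good-pod.yaml without showing it. One plausible shape for that fixture is sketched below; it is an assumption, not the author's file. The UID, the container name, and the <tag> placeholder are illustrative, and the tag must point at an image your CI has already signed.

```shell
#!/usr/bin/env bash
# Write a hypothetical tests/good-pod.yaml intended to satisfy all three controls:
# signed image from our registry, explicit limits, restricted Pod Security settings.
mkdir -p tests
cat > tests/good-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: good-pod
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001              # assumed non-root UID; the image must run as this user
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: api
      image: ghcr.io/kloudvin/api:<tag>   # placeholder: a tag your CI has signed
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
      resources:
        requests: { cpu: 100m, memory: 128Mi }
        limits:   { cpu: 500m, memory: 512Mi }
EOF
echo "wrote tests/good-pod.yaml"
```

The fixture deliberately uses a tag rather than a digest so that the digest-rewrite check has something to rewrite.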

Confirm the digest rewrite actually happened on the admitted pod:

kubectl get pod good-pod -o jsonpath='{.spec.containers[0].image}'; echo
# Expect ghcr.io/kloudvin/api@sha256:...  (a digest, not a tag)
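The same check can be widened from one pod to every running image. A minimal sketch, assuming image references arrive one per line on stdin; the function name is hypothetical and the digest in the demo is a dummy value.

```shell
#!/usr/bin/env bash
# unpinned_images (hypothetical helper): read image references on stdin, one per
# line, and print those that do NOT end in a full sha256 digest. Blank lines are skipped.
unpinned_images() {
  grep -Ev '(^$|@sha256:[0-9a-f]{64}$)' || true   # grep exits 1 when nothing is printed
}

# Demo with made-up references: one tag, one dummy 64-hex-character digest
printf '%s\n' \
  'ghcr.io/kloudvin/api:1.4.2' \
  'ghcr.io/kloudvin/api@sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef' \
  | unpinned_images

# Against a cluster (not run here):
# kubectl get pods -A \
#   -o jsonpath='{range .items[*]}{range .spec.containers[*]}{.image}{"\n"}{end}{end}' \
#   | sort -u | unpinned_images
```

Images from registries outside imageReferences will show up here too, since verifyImages never rewrites them; that makes the list a quick view of what the signing policy does not yet cover.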

Run Kyverno’s own test harness in CI so policy changes are unit-tested before they ship via Argo CD:

kyverno test ./policies/          # asserts expected pass/fail per fixture
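For kyverno test to assert anything, it needs a test manifest and resource fixtures alongside the policies. The sketch below writes one such fixture for the step-5 policy. The directory layout and pod names are hypothetical, and the manifest fields follow the Kyverno CLI test schema (cli.kyverno.io/v1alpha1, kind: Test) as best understood here, so check them against the CLI version you run.

```shell
#!/usr/bin/env bash
# Write a hypothetical fixture set under policies/tests/ for `kyverno test ./policies/`.
mkdir -p policies/tests

# Two pods: one with limits (expected pass), one without (expected fail).
cat > policies/tests/resources.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: with-limits
spec:
  containers:
    - name: api
      image: ghcr.io/kloudvin/api:test
      resources:
        limits: { cpu: 500m, memory: 512Mi }
---
apiVersion: v1
kind: Pod
metadata:
  name: without-limits
spec:
  containers:
    - name: api
      image: ghcr.io/kloudvin/api:test
EOF

# The test manifest ties policy and resources to the expected result per rule.
cat > policies/tests/kyverno-test.yaml <<'EOF'
apiVersion: cli.kyverno.io/v1alpha1
kind: Test
metadata:
  name: require-resource-limits
policies:
  - ../require-limits.yaml
resources:
  - resources.yaml
results:
  - policy: require-resource-limits
    rule: require-cpu-mem-limits
    kind: Pod
    resources: [with-limits]
    result: pass
  - policy: require-resource-limits
    rule: require-cpu-mem-limits
    kind: Pod
    resources: [without-limits]
    result: fail
EOF
echo "fixtures written under policies/tests/"
```

Because kubectl apply -f policies/ does not recurse into subdirectories by default, keeping fixtures in policies/tests/ keeps them out of the cluster while the test command still finds them.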

Rollback / teardown

Policies are declarative, so rollback is fast — switch back to Audit first if a policy is over-blocking in production, then remove if needed:

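# Soft rollback: stop denying, keep reporting
for p in require-signed-images require-resource-limits pod-security-restricted; do
  kubectl patch cpol "$p" --type merge -p '{"spec":{"validationFailureAction":"Audit"}}'
done
# require-signed-images also carries a rule-level failureAction; set it back too
kubectl patch cpol require-signed-images --type json \
  -p '[{"op":"replace","path":"/spec/rules/0/verifyImages/0/failureAction","value":"Audit"}]'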

# Remove a single policy
kubectl delete clusterpolicy pod-security-restricted

# Full uninstall (also removes both webhooks, so admission stops gating)
helm uninstall kyverno -n kyverno
kubectl delete ns kyverno

If you delivered policies via Argo CD, do the rollback in Git (revert the commit) and let the sync remove them — never kubectl delete out of band, or Argo will flag drift and may re-create them.

Common pitfalls

Cluster wedge from failurePolicy: Fail. If Kyverno is unreachable and a policy fails closed, new pods (including Kyverno’s own on a cold start) can be blocked. Always exclude kube-system and kyverno, and keep policies that match application namespaces on Ignore during rollout.
verifyImages can’t background-scan. Signature rules only run at admission (background: false); existing pods are not retro-verified. Re-roll deployments after enabling, or pods admitted earlier keep running unsigned images.
Identity string mismatch. Keyless subject/issuer must match the certificate exactly — a wrong issuer URL or a stray branch in the subject regex silently fails every verify. Confirm with cosign verify first (step 2).
Forgetting to pin digests. Without mutateDigest: true, the pod spec keeps the mutable tag, so an attacker who re-pushes that tag after verification changes what nodes pull later. Always rewrite to a digest.
Mutate vs validate ordering surprises. Defaults from a mutate rule appear before your validate rule runs; test the combined result with --dry-run=server, not the policies in isolation.
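initContainers and ephemeralContainers. A foreach over spec.containers misses init and ephemeral containers, so the step-4/5 limit rules never see them. Add initContainers to the foreach list explicitly, or unbounded work can hide there. Ephemeral containers cannot declare resources at all, so they need a different control; the podSecurity rule in step 6 already checks every container type.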

Security notes

This is a Zero-Trust admission control: the cluster trusts no image it cannot cryptographically tie to your CI identity, runs nothing as root, and pins every workload to a signed digest. Keep the Cosign private key in HashiCorp Vault or use keyless signing so there is no long-lived secret to steal; rotate the key and update the policy’s public key together. Feed every PolicyReport to Wiz (to correlate admission posture with cloud misconfig and attack paths) and your SIEM, and let a hard denial open a ServiceNow incident so security gets a ticket, not just a log line. Remember the boundary: Kyverno gates admission — CrowdStrike Falcon on the nodes covers runtime (a compromise after a pod is admitted), and the two together close the gap.

Cost notes

Kyverno’s own footprint is small — the HA controllers run comfortably in roughly 0.5 vCPU / 512Mi per replica, a rounding error against the workloads they protect. The real saving is indirect: the step-4/5 resource-limit policies stop unbounded pods from triggering node autoscale events and cluster overprovisioning, which is usually a far larger line item than the controller. Watch one operational cost — verifyImages adds a Rekor/registry round-trip per new image, so size webhookTimeoutSeconds (step 3) generously and run an in-cluster registry mirror if your image pull volume is high, both to cut latency and to avoid public-registry rate limits.