Deploy Istio Ambient Mesh Waypoint Proxies for L7 Authorization Policies

A payments platform team has run Istio in sidecar mode for two years and is paying for it: every pod carries an Envoy sidecar that adds ~120 MB of memory and 30–50 ms of cold-start latency, and a mesh-wide upgrade means restarting 3,000 pods across forty teams in a coordinated, weekend-long change window that the on-call rotation has come to dread. Their actual security requirement is narrower than the cost they pay for it — they need mTLS everywhere plus L7 authorization on exactly the dozen services that handle card data (only the accounts service may call POST /ledger/v1/debit, and only with a valid Entra-issued JWT carrying the right scope). Istio ambient mode is built for precisely this asymmetry: it gives every workload mTLS and L4 policy through a per-node ztunnel with zero sidecars, and lets you bolt on a waypoint proxy — a standalone Envoy — only for the namespaces or services that genuinely need L7 rules. This guide deploys ambient on an existing cluster, stands up waypoints for the sensitive namespace, and enforces real L7 AuthorizationPolicy resources, end to end.

Prerequisites

A Kubernetes cluster on v1.28+ (AKS, EKS, or GKE) with at least three nodes, and kubectl context pointed at it. Ambient needs the Istio CNI, so use a managed cluster that lets you run a privileged DaemonSet.
Helm 3.15+ and istioctl 1.24+ on your workstation (istioctl version --remote=false to confirm). Ambient went GA for L7/waypoints in the 1.24 line; do not attempt this on 1.22 or earlier.
Cluster-admin RBAC for the install (CRDs + a CNI DaemonSet), and the Kubernetes Gateway API CRDs (v1.2.0) installed — waypoints are provisioned through the Gateway API, not Istio’s legacy Gateway.
An OIDC issuer you already trust for service and user identity. Here that is Microsoft Entra ID (workforce SSO is brokered from Okta to Entra, so the JWTs your services validate are first-class Entra tokens). Have the issuer URL and JWKS URI handy.
HashiCorp Vault reachable from the cluster for any application secrets the waypoint’d services need — it is not in the mTLS path (Istio’s own CA handles workload certs), but it holds the third-party API tokens those services use.
A demo namespace payments with at least two deployments (accounts, ledger) and a curl-capable client pod, so the policies have something real to gate.

Target topology

[Figure: target topology for the Istio ambient mesh waypoint deployment]

The mesh splits into two planes that ambient deliberately keeps separate. The secure overlay (L4) is delivered by ztunnel, a Rust DaemonSet running one instance per node; it transparently captures pod traffic and gives every workload in an ambient namespace mTLS and identity (SPIFFE) without anything injected into the pod. Above it sits the L7 plane: a waypoint proxy — a normal Envoy deployment you scale and place yourself — that ztunnel routes through only for namespaces or services you have opted in. L4 authorization (who may connect to whom, on which port) lives in ztunnel; L7 authorization (which HTTP method, path, and JWT claim) lives in the waypoint. Traffic from a client pod flows: client → its node’s ztunnel → (if the destination is waypoint-enabled) the payments waypoint Envoy, where the AuthorizationPolicy and RequestAuthentication rules run → destination’s node ztunnel → destination pod. Everything is observed by Dynatrace (OneAgent + the Istio/Envoy integration scraping waypoint and ztunnel metrics) and Datadog as the second pane for the mesh dashboards; CrowdStrike Falcon sensors run on every node for runtime threat detection on the ztunnel and waypoint pods themselves.

1. Install the Gateway API and Istio in ambient mode

Waypoints are Gateway API objects, so those CRDs must exist before Istio. Install them, then install Istio with the ambient profile via Helm (the profile that wires up ztunnel and the CNI; istioctl install --set profile=ambient works too, but Helm is what your GitHub Actions / Argo CD pipeline will template).

# 1a. Gateway API CRDs (pinned, not 'latest')
kubectl apply -f \
  "https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.2.0/standard-install.yaml"

# 1b. Istio Helm repo
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update

# 1c. Base CRDs into istio-system
kubectl create namespace istio-system
helm install istio-base istio/base -n istio-system --version 1.24.2 --wait

# 1d. The CNI in ambient mode (handles traffic redirection, replaces init-container hacks)
helm install istio-cni istio/cni -n istio-system --version 1.24.2 \
  --set profile=ambient --wait

# 1e. The istiod control plane, ambient profile
helm install istiod istio/istiod -n istio-system --version 1.24.2 \
  --set profile=ambient --wait

# 1f. ztunnel — the per-node L4 secure overlay DaemonSet
helm install ztunnel istio/ztunnel -n istio-system --version 1.24.2 --wait

Verify the data plane came up. You want istiod, the istio-cni-node DaemonSet, and the ztunnel DaemonSet all ready, one ztunnel pod per node:

kubectl get pods -n istio-system
kubectl get daemonset -n istio-system   # istio-cni-node and ztunnel: DESIRED == READY
istioctl version                        # control plane + data plane on 1.24.2
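
The DESIRED == READY check can also be scripted so that a pipeline fails fast when the data plane is incomplete. The following is a minimal sketch rather than part of the Istio tooling: ds_ready is a hypothetical helper name, and it assumes kubectl is already pointed at the cluster and that the DaemonSets live in istio-system.

```shell
# Hypothetical helper: succeeds only when a DaemonSet reports as many
# ready pods as it wants scheduled (DESIRED == READY, and more than zero).
ds_ready() {
  name=$1
  ns=${2:-istio-system}
  # desiredNumberScheduled and numberReady are standard DaemonSet status fields
  status=$(kubectl get daemonset "$name" -n "$ns" \
    -o jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}') || return 2
  desired=${status% *}
  ready=${status#* }
  if [ -n "$desired" ] && [ "$desired" = "$ready" ] && [ "$desired" -gt 0 ]; then
    echo "ok $name $ready/$desired"
  else
    echo "not-ready $name ${ready:-0}/${desired:-0}"
    return 1
  fi
}
```

Calling ds_ready ztunnel && ds_ready istio-cni-node would then gate a later pipeline stage on both DaemonSets.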

If you manage clusters as code (you should), this same release is expressed as a Helm release in Terraform (helm_release resources) or an Argo CD Application so the mesh version is GitOps-pinned and an upgrade is a reviewed pull request — not the hand-run, 3,000-pod restart that sidecar mode forced. Ansible handles any node-level prerequisites (kernel modules, the privileged-container policy) on self-managed nodes before the chart lands.

2. Enroll a namespace into the ambient data plane

Adding a workload to ambient is a single label on its namespace — no pod restart, no sidecar injection, no redeploy. This is the headline operational win: existing pods join the secure overlay in place.

# Opt the payments namespace into ambient (L4 mTLS via ztunnel)
kubectl label namespace payments istio.io/dataplane-mode=ambient

# Confirm — running pods are now in the mesh WITHOUT having restarted
kubectl get pods -n payments -o wide
istioctl ztunnel-config workloads --workload-namespace payments

That last command lists every workload ztunnel now sees, each with a SPIFFE identity like spiffe://cluster.local/ns/payments/sa/accounts. At this point you already have mTLS between every pod in payments and L4 identity — but no L7 rules yet, and no waypoint. Sidecar mode could not give you this without injecting into and restarting all of them.
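
To confirm enrollment pod by pod rather than by eye, one option is to read the annotation that the Istio CNI places on captured pods. This is a sketch under assumptions: pods_not_redirected is a hypothetical helper, and it relies on the ambient.istio.io/redirection annotation being set to enabled on enrolled pods, which is worth verifying against your Istio version.

```shell
# Hypothetical helper: prints the pods in a namespace that do NOT carry
# the ambient redirection annotation (assumed value: "enabled").
pods_not_redirected() {
  ns=$1
  kubectl get pods -n "$ns" -o \
    jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.ambient\.istio\.io/redirection}{"\n"}{end}' |
    awk -F'\t' '$1 != "" && $2 != "enabled" { print $1 }'
}
```

An empty result from pods_not_redirected payments suggests every running pod has been captured; any names it prints are pods to investigate before relying on L4 policy.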

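A quick connectivity check before any policy: exec into a client and call ledger. A 200 shows that enrolling the namespace did not break traffic; the request now travels over ztunnel's mTLS tunnel even though nothing changed in the pod spec. The status code alone does not prove encryption, so confirm it in the ztunnel logs (kubectl logs -n istio-system -l app=ztunnel), where the connection should appear with the SPIFFE identities of both ends.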

kubectl exec -n payments deploy/accounts -- \
  curl -s -o /dev/null -w "%{http_code}\n" http://ledger:8080/healthz

3. Lock down L4 with a default-deny ztunnel policy

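Before adding L7, establish a zero-trust L4 baseline: deny all traffic into payments, then explicitly allow only the identities that should connect. These AuthorizationPolicy resources match only L4 attributes (source identity and destination port, no HTTP methods or paths), so ztunnel can enforce them: cheap, sidecar-free, and in force for every enrolled workload whether or not a waypoint exists. One interaction to plan for: once a waypoint fronts a service (step 4), the destination's ztunnel sees the waypoint, not the original client, as the connecting identity, so the L4 allow-list for that service must also admit the waypoint's service account.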

# default-deny everything entering the payments namespace (L4)
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: payments-default-deny
  namespace: payments
spec:
  {}                      # empty spec == deny-all for the namespace
---
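# allow 'accounts' to reach 'ledger' on 8080; the waypoint identity is listed too,
# because after step 4 ztunnel sees the waypoint as the caller
# (assumes the waypoint's service account is named after its Gateway)
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: ledger-allow-accounts-l4
  namespace: payments
spec:
  selector:
    matchLabels:
      app: ledger
  action: ALLOW
  rules:
    - from:
        - source:
            principals:
              - "cluster.local/ns/payments/sa/accounts"
              - "cluster.local/ns/payments/sa/payments-waypoint"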
      to:
        - operation:
            ports: ["8080"]

kubectl apply -f l4-policies.yaml
# A pod with a different service account is now refused at L4 by ztunnel:
kubectl run probe -n payments --rm -it --image=curlimages/curl --restart=Never -- \
  curl -s -o /dev/null -w "%{http_code}\n" http://ledger:8080/healthz   # expect connection reset / 000

L4 policy is necessary but blunt — it cannot say “only POST /ledger/v1/debit.” For that you need L7, and for L7 you need a waypoint.

4. Deploy a waypoint proxy for the namespace

istioctl waypoint generates a Gateway API Gateway resource of class istio-waypoint; istiod sees it and provisions a dedicated Envoy deployment. Bind it to the whole payments namespace so all services in it can carry L7 policy. Crucially, a waypoint is just a Deployment — you size and scale it like any service, the antithesis of one sidecar per pod.

# Generate and apply a namespace-scoped waypoint named 'payments-waypoint'
istioctl waypoint apply -n payments \
  --name payments-waypoint \
  --for service \
  --enroll-namespace          # label the namespace to route its services via this waypoint

# Inspect what was created (a Gateway + an Envoy Deployment/Service)
kubectl get gateway -n payments
kubectl get pods -n payments -l gateway.networking.k8s.io/gateway-name=payments-waypoint
istioctl waypoint list -n payments

--enroll-namespace stamps the namespace with istio.io/use-waypoint: payments-waypoint, so ztunnel now routes traffic destined for services in payments through this Envoy before delivery. To scope the waypoint to a single service instead of the whole namespace, label just that service:

# Alternative: route ONLY the 'ledger' service through the waypoint
kubectl label service ledger -n payments istio.io/use-waypoint=payments-waypoint

Scale the waypoint for production. It is in the request path for the sensitive services, so give it a replica floor and an HPA here, and add a PodDisruptionBudget so that a node drain cannot evict every replica at once:

kubectl -n payments scale deploy/payments-waypoint --replicas=3
kubectl -n payments autoscale deploy/payments-waypoint --min=3 --max=10 --cpu-percent=70
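
The commands above cover the replica floor and the HPA but not the PodDisruptionBudget. A minimal sketch of one follows; render_waypoint_pdb is a hypothetical helper, the minAvailable value of 2 is an assumption to tune against your replica floor, and the selector reuses the gateway-name label that the inspection command in this step already queries.

```shell
# Hypothetical helper: prints a PodDisruptionBudget manifest for a waypoint.
# Arguments: waypoint name, namespace, minAvailable (all optional).
render_waypoint_pdb() {
  name=${1:-payments-waypoint}
  ns=${2:-payments}
  min=${3:-2}   # assumption: keep at least 2 of the 3 replicas during drains
  cat <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ${name}-pdb
  namespace: ${ns}
spec:
  minAvailable: ${min}
  selector:
    matchLabels:
      gateway.networking.k8s.io/gateway-name: ${name}
EOF
}
```

Piping the output into kubectl apply -f - would create the budget; committing the rendered YAML alongside the other manifests keeps it under the same review path.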

5. Validate JWTs at the waypoint with RequestAuthentication

L7 authorization on a token requires Istio to first authenticate the JWT. RequestAuthentication tells the waypoint which issuer and JWKS to trust — here Microsoft Entra ID, the IdP that workforce logins from Okta are federated into, so a token minted for a user or a service principal validates natively. This resource only parses and verifies the token; it does not deny anything on its own.

apiVersion: security.istio.io/v1
kind: RequestAuthentication
metadata:
  name: payments-entra-jwt
  namespace: payments
spec:
  targetRefs:
    - kind: Service
      group: ""
      name: ledger
  jwtRules:
    - issuer: "https://login.microsoftonline.com/<TENANT_ID>/v2.0"
      jwksUri: "https://login.microsoftonline.com/<TENANT_ID>/discovery/v2.0/keys"
      audiences:
        - "api://payments-ledger"
      forwardOriginalToken: true     # pass the JWT on to the app for its own audit log

kubectl apply -f request-auth.yaml

A subtle but critical point: RequestAuthentication alone makes invalid tokens rejected but missing tokens allowed (the request is simply treated as unauthenticated). The deny happens in the next step.

6. Enforce the L7 AuthorizationPolicy

Now the payoff: an AuthorizationPolicy enforced by the waypoint (its targetRefs bind it to the ledger Service, which the waypoint fronts, and its HTTP to rules and JWT when conditions need L7 visibility) that says: only the accounts workload identity, presenting a valid Entra JWT carrying scope ledger.debit, may call POST /ledger/v1/debit. Everything else is denied.

# 6a. Deny any request to ledger that does not carry a valid JWT (no request principal)
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: ledger-require-jwt
  namespace: payments
spec:
  targetRefs:
    - kind: Service
      group: ""
      name: ledger
  action: DENY
  rules:
    - from:
        - source:
            notRequestPrincipals: ["*"]   # deny anything WITHOUT a valid JWT
---
# 6b. Allow ONLY accounts -> POST /ledger/v1/debit with the right scope
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: ledger-debit-allow
  namespace: payments
spec:
  targetRefs:
    - kind: Service
      group: ""
      name: ledger
  action: ALLOW
  rules:
    - from:
        - source:
            principals: ["cluster.local/ns/payments/sa/accounts"]
      to:
        - operation:
            methods: ["POST"]
            paths: ["/ledger/v1/debit"]
      when:
        - key: request.auth.claims[scp]
          values: ["ledger.debit"]

kubectl apply -f l7-authz.yaml
istioctl waypoint status -n payments     # waypoint Gateway is programmed and ready

You now have method-, path-, identity-, and claim-scoped authorization running in a standalone Envoy that touches only the services you opted in — and not one sidecar anywhere in the cluster.

Validation

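Prove each rule does what you claimed. Mint two test tokens from Entra (the right-scope one and a wrong-scope one) and save them in $GOOD / $BAD. The tokens must carry the scp claim that step 6b matches on, which Entra puts in delegated (user-context) tokens; app-only tokens from a client-credentials grant carry roles instead, so a pipeline that uses that grant needs the when condition keyed on request.auth.claims[roles]. Entra also emits scp as one space-separated string, so the exact-match value in 6b is only certain to match when the token carries that single scope.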

# From the accounts pod (correct identity), WITH a valid scoped JWT -> 200
kubectl exec -n payments deploy/accounts -- sh -c \
  'curl -s -o /dev/null -w "%{http_code}\n" -X POST \
   -H "Authorization: Bearer '"$GOOD"'" http://ledger:8080/ledger/v1/debit'   # 200

# Same identity, NO token -> 403 (DENY from 6a)
kubectl exec -n payments deploy/accounts -- \
  curl -s -o /dev/null -w "%{http_code}\n" -X POST http://ledger:8080/ledger/v1/debit  # 403

# Valid token but WRONG scope -> 403 (fails the 'when' claim check)
kubectl exec -n payments deploy/accounts -- sh -c \
  'curl -s -o /dev/null -w "%{http_code}\n" -X POST \
   -H "Authorization: Bearer '"$BAD"'" http://ledger:8080/ledger/v1/debit'   # 403

# Correct identity + token but a method/path NOT allowed -> 403
kubectl exec -n payments deploy/accounts -- sh -c \
  'curl -s -o /dev/null -w "%{http_code}\n" -X DELETE \
   -H "Authorization: Bearer '"$GOOD"'" http://ledger:8080/ledger/v1/debit'  # 403
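
The four checks above can be folded into a small assertion helper so that they run the same way in a pipeline as by hand. This is a sketch: expect_status is a hypothetical name, and it assumes the accounts deployment in payments has curl available, as the commands above already do.

```shell
# Hypothetical helper: runs curl from the accounts pod with the given
# arguments and compares the HTTP status code with the expected one.
# Usage: expect_status <expected-code> <curl args...>
expect_status() {
  want=$1
  shift
  got=$(kubectl exec -n payments deploy/accounts -- \
    curl -s -o /dev/null -w '%{http_code}' "$@")
  # Arguments are deliberately not echoed, so bearer tokens stay out of logs.
  if [ "$got" = "$want" ]; then
    echo "PASS want=$want got=$got"
  else
    echo "FAIL want=$want got=$got"
    return 1
  fi
}
```

For example, expect_status 403 -X POST http://ledger:8080/ledger/v1/debit covers the no-token case, and adding -H "Authorization: Bearer $GOOD" with an expected 200 covers the happy path.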

Watch the decisions live in the waypoint’s Envoy logs, and confirm the metrics are flowing to your observability stack:

# RBAC allow/deny decisions in the waypoint
kubectl logs -n payments deploy/payments-waypoint -f | grep -i "rbac"

# Envoy/waypoint metrics that Dynatrace and Datadog scrape
kubectl exec -n payments deploy/payments-waypoint -- \
  pilot-agent request GET stats/prometheus | grep -E "istio_requests_total|rbac"

In Dynatrace you should see the payments-waypoint service with per-route request counts and a denied-request rate; Datadog’s Istio integration shows the same istio_requests_total{response_code="403"} series, which you alert on. Wiz (and Wiz Code scanning the manifests in the repo before merge) flags any namespace labelled dataplane-mode=ambient that has no default-deny AuthorizationPolicy, or a waypoint exposed without a RequestAuthentication — posture gaps the YAML review should never let through.

Rollback / teardown

Ambient is reversible at every layer, in order of blast radius — peel off L7 first, then the namespace, then the mesh. Removing a waypoint instantly drops L7 enforcement but leaves L4 mTLS intact (ztunnel is untouched), which is exactly the graceful-degradation path you want during an incident.

# 1. Remove L7 policy + waypoint (L4 mTLS via ztunnel stays on)
kubectl delete authorizationpolicy ledger-debit-allow ledger-require-jwt -n payments
kubectl delete requestauthentication payments-entra-jwt -n payments
kubectl label namespace payments istio.io/use-waypoint-                # stop routing via waypoint
istioctl waypoint delete payments-waypoint -n payments

# 2. Remove the namespace from ambient entirely (back to plain pods, no restart)
kubectl delete authorizationpolicy --all -n payments
kubectl label namespace payments istio.io/dataplane-mode-

# 3. Full mesh uninstall (only if abandoning Istio)
helm uninstall ztunnel istiod istio-cni istio-base -n istio-system
kubectl delete namespace istio-system

Roll these back through the same Argo CD / GitHub Actions path you rolled them out with, so a teardown is an auditable revert and ServiceNow carries the change record — the mesh team raises a normal change ticket, and a guardrail trip (a spike in waypoint 403s, or Wiz finding an ambient namespace with no deny policy) auto-opens a ServiceNow incident rather than living only in a log line.

Common pitfalls

Forgetting the Gateway API CRDs. istioctl waypoint apply fails or the Gateway stays Unprogrammed if v1.2.0 of the Gateway API CRDs is not installed first. This is the single most common ambient mistake.
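Expecting L7 policy without a waypoint. ztunnel only enforces L4. A targetRefs-bound policy with HTTP methods/paths does nothing for a Service that no waypoint fronts, and a selector-bound one handed to ztunnel fails safe by ignoring the L7 attributes, which makes it more restrictive than written (for example, an ALLOW rule that names a path never matches). If a path-scoped rule “isn’t working,” confirm the waypoint is up (istioctl waypoint status -n payments) and that the namespace or service carries the istio.io/use-waypoint label.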
The missing-vs-invalid JWT gap. RequestAuthentication rejects bad tokens but lets no-token requests through as unauthenticated. You must add the explicit notRequestPrincipals: ["*"] DENY (step 6a) or anonymous callers sail past.
Waypoint as an unscaled singleton. It defaults to one replica and sits in the request path; under load it becomes a bottleneck or a SPOF. Give it an HPA and a PodDisruptionBudget (step 4).
Wrong targetRef granularity. Binding a policy to the namespace when you meant a service (or vice versa) changes what it covers. Match the targetRefs kind/name to the scope you enrolled the waypoint for.
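audiences mismatch. If the aud claim Entra puts in the token does not equal the audiences value, every token is rejected with a confusing 401 at the waypoint. Check a real token before pinning the string: v2.0 access tokens (the /v2.0 issuer used in step 5) carry the API's application (client) ID in aud, while v1.0 tokens can carry the App ID URI (api://payments-ledger).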
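
For the audiences and missing-claim pitfalls above, it often helps to look at what a token actually contains before touching the policy. The sketch below decodes the payload segment of a JWT locally without verifying the signature; jwt_payload is a hypothetical helper, and it assumes a base64 that accepts -d (GNU coreutils and recent macOS).

```shell
# Hypothetical helper: prints the JSON payload (claims) of a JWT.
# It does NOT verify the signature; use it for inspection only.
jwt_payload() {
  # Second dot-separated segment, converted from base64url to base64
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that base64url encoding strips
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}
```

Running jwt_payload "$GOOD" and comparing the iss, aud, and scp (or roles) values with the RequestAuthentication and the when condition usually pinpoints a 401 or 403 faster than reading waypoint logs.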

Security notes

Ambient is zero-trust by construction: mTLS and SPIFFE identity for every enrolled workload via ztunnel, default-deny L4, and JWT-gated L7 only where it matters — all without sidecars to exploit or restart. Keep the trust boundary honest: validate tokens against Entra ID (federated from Okta for human callers) at the waypoint, never trust an unauthenticated request, and stamp policies to the narrowest identity + method + path + claim that works. Istio’s own CA issues and rotates the workload certificates, so HashiCorp Vault stays out of the mesh-cert path and is used only for the application secrets (third-party API tokens) the gated services consume. Run CrowdStrike Falcon sensors on every node so the ztunnel and waypoint pods themselves are under runtime threat detection, and let Wiz / Wiz Code continuously verify that no ambient namespace is missing its default-deny policy and no waypoint is missing a RequestAuthentication — the posture backstop behind the in-cluster controls. For any north-south traffic, Akamai terminates TLS and applies WAF/bot protection at the edge before requests reach the cluster’s ingress, with the waypoints enforcing east-west L7 authorization once inside.

Cost notes

The economic case is the whole point. Sidecar mode costs one Envoy per pod: at 3,000 pods, ~360 GB of memory and 3,000 proxy restarts per upgrade. Ambient costs one ztunnel per node (a few dozen, lightweight Rust) plus one waypoint Deployment per opted-in scope (here, three replicas for payments). On a forty-node cluster that is roughly 40 ztunnels + 3 waypoint pods versus 3,000 sidecars, about 70 times fewer proxy instances, and upgrades no longer restart application pods at all. The memory and CPU saving follows the same direction, though its exact size depends on how the ztunnels and waypoints are sized. You pay for L7 Envoys only where you enforce L7, so a cluster where ten of forty namespaces need HTTP policy runs ten small waypoint Deployments instead of meshing everything. Right-size each waypoint to its real RPS with the HPA above rather than over-provisioning, keep ztunnel on every node (it is the cheap, mandatory L4 layer), and treat any namespace that doesn’t need L7 as waypoint-free, which is the single biggest lever ambient gives you over the old sidecar bill.
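
The headline numbers are simple enough to recompute for your own cluster. The sketch below uses the figures quoted in this guide (3,000 pods, ~120 MB per sidecar, 40 nodes, 3 waypoint replicas); proxy_footprint is a hypothetical helper, and it counts proxy instances only, since the per-ztunnel and per-waypoint memory figures are not given here.

```shell
# Hypothetical helper: compares proxy counts for sidecar vs ambient mode.
# Arguments: pods, sidecar memory in MB, nodes, waypoint replicas.
proxy_footprint() {
  pods=$1
  sidecar_mb=$2
  nodes=$3
  waypoints=$4
  sidecar_gb=$(( pods * sidecar_mb / 1000 ))   # total sidecar memory, GB
  ambient=$(( nodes + waypoints ))             # one ztunnel per node + waypoints
  echo "sidecar_memory_gb=$sidecar_gb ambient_proxies=$ambient ratio=$(( pods / ambient ))"
}

proxy_footprint 3000 120 40 3
# prints: sidecar_memory_gb=360 ambient_proxies=43 ratio=69
```

The ratio uses integer division, so 69 here corresponds to the "about 70 times" figure above.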