Application Gateway for Containers: Gateway API on AKS with Traffic Splitting, mTLS, and Header Routing

If you have run AGIC (the Application Gateway Ingress Controller) at any scale, you already know its failure mode: a single pod reconciling the entire Application Gateway config, mutating an ARM resource on every Ingress change, and grinding through multi-minute control-plane updates while one noisy namespace starves everyone else. Application Gateway for Containers (AGC) is Microsoft’s clean break from that design. The data plane is a managed, regional, near-real-time proxy fleet; the control plane is the ALB Controller running inside your cluster; and, in the change that reorganises everything, it is built around the Kubernetes Gateway API rather than the legacy Ingress API. The result is routing that propagates in seconds, a blast radius scoped to a single Gateway object instead of the whole shared Application Gateway, and first-class weighted splitting, backend re-encryption, mTLS and header/path/query routing expressed as portable Kubernetes objects.

This guide walks the full production path and treats AGC as an operable system, not a demo. You will install the ALB Controller against workload identity, provision a managed AGC from an ApplicationLoadBalancer CRD, expose it with Gateway and HTTPRoute, then layer on weighted traffic splitting, BackendTLSPolicy re-encryption and mTLS, custom HealthCheckPolicy probes, and header/path/query routing, with every manifest and command shown in its Gateway API form. Because AGC has a precise mapping from Gateway API objects to AGC constructs, a precise set of RBAC roles, and a precise set of status conditions that tell you exactly where a request stalls, this is also a reference you keep open mid-incident: every CRD field, every role, every status.conditions value, every error string and every limit is laid out as a scannable table beside the prose and the YAML.

By the end you will stop treating AGC as a black box. When a route stays Accepted=False, a split lands 50/50 instead of 90/10, a re-encrypt leg throws 502, or the GatewayClass never goes Accepted, you will know which CRD field, which controller log line, or which federated-credential subject is the cause — and the exact kubectl or az command that confirms it. AGC does still support a legacy Ingress path, but if you are adopting AGC in 2026 you adopt Gateway API: that is where all the routing capability lives, and it is the only path Microsoft is investing behind.

What problem this solves

AGIC’s architecture made a category of pain inevitable. Every Ingress change anywhere in the cluster triggered a full Application Gateway configuration push to ARM, so a single team’s frequent deploys produced 4–7 minute propagation windows during which unrelated services saw stale routing. The controller was a single point of reconciliation; a malformed Ingress could wedge the whole pipeline; and because the gateway was a Standard_v2 ARM resource, every change paid ARM’s control-plane latency and throttling. Worse, sophisticated L7 routing (header matches, weighted canaries, per-service backend mTLS) was either impossible or expressed through a sprawl of annotations that no two engineers spelled the same way.

What breaks without AGC’s model: canary releases that should be a one-line weight edit become deploy-time gymnastics; a PCI requirement to re-encrypt to the cardholder workload gets quietly skipped because AGIC’s backend-mTLS story was awkward, leaving an audit finding waiting to happen; and the noisy-neighbour reconcile storms turn one team’s churn into everyone’s latency. Teams paper over it by sharding gateways (one AGIC per namespace, multiplying cost and operational surface) or by freezing deploys during business hours.

Who hits this: any platform team running ingress for more than a handful of namespaces on AKS, anyone who needs progressive delivery (Argo Rollouts / Flagger) wired to a real traffic-splitting data plane, and any regulated estate that must prove encryption all the way to the pod. AGC fixes the architecture: near-real-time programming from an in-cluster controller, per-Gateway blast radius, native weighted splits, and BackendTLSPolicy-driven re-encryption/mTLS — all in Gateway API manifests that travel with the workload.

To frame the whole field before the deep dive, here is what AGC changes versus AGIC, the symptom each change removes, and where in this article you act on it:

Dimension	AGIC (the old way)	AGC (this article)	Symptom it removes	Where you act
Data plane	`Standard_v2` Application Gateway (ARM)	Managed regional proxy fleet	ARM throttling on every change	§AGC architecture
Control plane	One pod mutating ARM per `Ingress`	ALB Controller writing a config plane	Single-point reconcile wedge	§ALB Controller install
Propagation	4–7 min full config push	Single-digit seconds	Stale routing during peer deploys	§First HTTPRoute
API	`Ingress` + annotation sprawl	Gateway API CRDs	Inconsistent annotation routing	§Gateway + HTTPRoute
Blast radius	Whole gateway per change	Per-`Gateway` object	Noisy-neighbour reconcile storms	§Enterprise scenario
Traffic split	Awkward / annotation-driven	Native `backendRefs` weights	Canary as deploy gymnastics	§Weighted splitting
Backend mTLS	Limited / awkward	`BackendTLSPolicy` + clientCert	PCI re-encrypt gap	§Backend TLS & mTLS

Learning objectives

By the end of this article you can:

Explain how AGC differs from AGIC at the data-plane and control-plane level, and choose managed vs bring-your-own (BYO) deployment for greenfield versus governed estates.
Install the ALB Controller against OIDC + workload identity, federating a user-assigned managed identity to the azure-alb-system:alb-controller-sa service account and granting the purpose-built AppGw for Containers Configuration Manager role.
Provision a managed AGC from the ApplicationLoadBalancer CRD against a delegated /24 subnet, and read its provisioning status.conditions to confirm success.
Expose workloads with the Gateway API: a Gateway with an HTTPS listener terminating a cert from a Secret, and HTTPRoute objects bound to it, reading Programmed, Accepted and ResolvedRefs to localise failures.
Ship weighted traffic splitting and canary ramps with backendRefs weights (including weight: 0 drain), driven by Argo Rollouts/Flagger through the Gateway API provider.
Enforce backend re-encryption and mTLS with BackendTLSPolicy (sni, verify, caCertificateRef, subjectAltName, clientCertificateRef) and tune health with HealthCheckPolicy.
Compose header, path and query routing with Gateway API matches and filters (URLRewrite, RequestHeaderModifier), and reason about match precedence.
Diagnose every common AGC failure — GatewayClass not Accepted, listener cert errors, skewed splits, re-encrypt 502s, cross-namespace ResolvedRefs=False — from the exact kubectl/az command that confirms each.

Prerequisites & where this fits

You need an AKS cluster you can administer, with OIDC issuer and workload identity enabled (we turn them on idempotently below), and a dedicated subnet — minimum /24, delegated to Microsoft.ServiceNetworking/trafficControllers — that AGC injects its data plane into. You should be comfortable with kubectl, Helm, and reading Kubernetes object status.conditions, and have az configured with rights to create identities and role assignments in the cluster’s resource groups. Familiarity with the Gateway API object model (GatewayClass / Gateway / HTTPRoute) is assumed at a conceptual level; if it is new, read Kubernetes Gateway API: HTTPRoute, Traffic Splitting & Ingress Migration first — this article is the Azure-managed implementation of exactly those primitives.

This sits in the AKS networking & ingress track. Upstream of it is the managed-Kubernetes decision in Understanding Managed Kubernetes: AKS vs EKS vs GKE Compared and the broader cluster-networking picture in Production AKS: Networking & Observability. It is the containers-native cousin of the classic data-plane covered in Application Gateway v2 with WAF, L7 Routing & TLS in Production and the end-to-end TLS patterns in Application Gateway with WAF, mTLS & End-to-End TLS. The identity mechanism it depends on is detailed in Azure Key Vault & Workload Identity for Secrets, and it pairs naturally with a mesh — compare with AKS Istio Service Mesh Add-on: mTLS, Ingress & Egress when you need pod-to-pod mTLS behind the gateway.

A quick map of who owns what during an AGC incident, so you escalate to the right team fast:

Layer	What lives here	Who usually owns it	Failure classes it can cause
DNS / client	CNAME to AGC FQDN, TLS	Frontend / SRE	No resolution; cert name mismatch
AGC data plane	Managed proxy, listeners, routing rules	Microsoft (managed)	502/503 if backend unhealthy; rule eval
ALB Controller	Reconciles CRDs → config plane	Platform / cluster team	Nothing programmed; `Accepted=False`
Workload identity	Federated cred, role assignment	Platform + identity	Controller 401; provisioning stalls
Delegated subnet	`/24`, `trafficControllers` delegation	Network team	AGC won’t inject; association fails
Gateway API CRDs	`Gateway`, `HTTPRoute`, policies	App + platform	Routing, splits, mTLS misconfig
Backend pods / Services	Workloads, TLS, health paths	App / dev team	Re-encrypt 502; probe eviction

Core concepts

Five mental-model shifts make every later step obvious.

The proxy is not an Application Gateway v2. AGC is a separate product backed by Microsoft.ServiceNetworking/trafficControllers, not Microsoft.Network/applicationGateways. There is no Standard_v2 SKU, no per-Ingress ARM mutation, and — critically — no WAF policy built in. Routing changes propagate in seconds because the controller writes to a managed config plane rather than re-deploying a gateway resource. If you need a WAF in front of AGC today, you place it upstream (Front Door, or a classic Application Gateway v2 fronting the AGC FQDN), not on the AGC itself.

The control plane lives in your cluster. The ALB Controller is a Helm-installed deployment in the azure-alb-system namespace. It watches Gateway API objects (and its own AGC CRDs), and programs the managed data plane. It authenticates to Azure as a user-assigned managed identity federated to its Kubernetes service account — no secrets, no service-principal passwords. The GatewayClass named azure-alb-external is what the chart registers; Accepted=True on it is your green light that the controller is alive and authorised.

Gateway API objects map cleanly onto AGC constructs. This mapping is worth memorising because every diagnosis traces back to it:

Gateway API object	AGC construct it becomes	Carries	Status to watch
`GatewayClass` (`azure-alb-external`)	The AGC integration itself	Controller binding	`Accepted=True`
`Gateway`	An AGC frontend + its listeners	Hostnames, ports, TLS	`Programmed=True`, an address
`HTTPRoute`	AGC routing rules	path/header/query matches	`Accepted`, `ResolvedRefs`
`backendRefs` with `weight`	A weighted traffic split	Relative weights	(reflected in `ResolvedRefs`)
`BackendTLSPolicy`	Backend re-encryption / mTLS	SAN, CA, client cert	`Accepted` on the policy
`HealthCheckPolicy`	Per-backend health probe	path, interval, codes	(reflected in backend health)

Two deployment flavours, two ownership models. AGC can be created and lifecycle-managed by the controller from an in-cluster CRD (managed mode), or provisioned by you via ARM/Bicep/Terraform with the controller only referencing it (BYO mode). Managed mode is faster and keeps everything in cluster manifests; BYO mode fits enterprises where a platform-networking team must own the AGC, its subnet delegation and its Private Link surface independently of any cluster. We deploy managed mode end to end, then show the BYO association, because most regulated estates land there.

Mode	Who creates the AGC + association	Lifecycle owner	Use when
Managed by ALB Controller	The controller, from an `ApplicationLoadBalancer` CRD	Kubernetes manifests	Greenfield, GitOps-driven, lifecycle in-cluster
Bring your own (BYO)	You, via ARM/Bicep/Terraform	The network team’s IaC	Central team owns subnet/RBAC/Private Link governance

Weights are relative, not percentages. A traffic split is multiple backendRefs under one HTTPRoute rule, each with a weight. AGC sends each backend its own weight divided by the sum of all weights in the rule, so 90/10 and 9/1 behave identically, and weight: 0 drains a backend to zero without deleting the ref (keeping rollback one edit away). Because propagation is near-real-time, a canary ramp is just a sequence of kubectl apply runs, which is exactly why Argo Rollouts and Flagger drive AGC through the Gateway API provider.
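
To make the relative-weight rule concrete, here is a minimal sketch that turns a list of weights into effective traffic shares. The helper name weight_shares is hypothetical, and it only does the arithmetic described above; it does not query AGC or the cluster.

```shell
# weight_shares: print each weight's share of the total, one percentage per line.
# Hypothetical helper for illustration; it only does arithmetic.
weight_shares() {
  printf '%s\n' "$@" | awk '
    { w[NR] = $1; sum += $1 }
    END {
      if (sum == 0) { print "no traffic: all weights are 0"; exit 1 }
      for (i = 1; i <= NR; i++) printf "%.1f%%\n", 100 * w[i] / sum
    }'
}

weight_shares 90 10   # stable / canary
weight_shares 9 1     # same split expressed with smaller numbers
weight_shares 100 0   # canary drained
```

The first two calls print the same shares, which is the point: only the ratio between weights matters.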

The vocabulary in one table

Before the deep sections, pin down every moving part. The glossary repeats these for lookup; this is the mental model side by side:

Term	One-line definition	Where it lives	Why it matters
AGC	Managed L7 proxy fleet (`trafficControllers`)	Azure (regional)	The data plane; replaces AGIC’s gateway
ALB Controller	In-cluster reconciler that programs AGC	`azure-alb-system` ns	No controller → nothing routes
`GatewayClass`	The `azure-alb-external` binding	Cluster-scoped	`Accepted=True` = controller live
`Gateway`	Frontend + listeners (hostnames, TLS)	App namespace	Emits the AGC FQDN address
`HTTPRoute`	Routing rules + weighted backends	App namespace	Where splits and matches live
`ApplicationLoadBalancer`	CRD that creates a managed AGC	Infra namespace	Managed-mode provisioning
`BackendTLSPolicy`	Re-encrypt / mTLS to pods	App namespace	End-to-end encryption
`HealthCheckPolicy`	Per-Service probe override	App namespace	Replaces default `GET /`
Workload identity	Federated SA → managed identity	Azure + cluster	How the controller authenticates
Delegated subnet	`/24` for `trafficControllers`	The VNet	Where AGC injects its data plane
`ReferenceGrant`	Cross-namespace ref permission	Target namespace	Lets a route reach a foreign Secret/Service
Weight	Relative share of a backend	`HTTPRoute` rule	`0` = drain; relative not %

AGC architecture, and how it differs from AGIC

Two architectural facts drive everything operational. First, the data plane is managed and regional: you never patch it, scale it, or pay ARM latency to change it. The controller writes a desired-state config and the fleet converges in seconds. Second, the control plane is in your cluster and identity-bound: the ALB Controller is the only thing with rights to program the AGC, and it earns those rights through a federated managed identity, not a stored secret.

The practical consequence is a different operational posture than AGIC. With AGIC you debugged ARM deployments and Application Gateway config; with AGC you debug Kubernetes objects and a controller. Here is the side-by-side that matters when you are deciding whether to migrate and what to expect:

Property	AGIC	AGC	Operational consequence
Backing resource	`Microsoft.Network/applicationGateways`	`Microsoft.ServiceNetworking/trafficControllers`	Different ARM API, different RBAC
Reconcile target	ARM gateway config	Managed config plane	Seconds vs minutes
API surface	`Ingress` + annotations	Gateway API CRDs	Portable, typed routing
WAF	Built-in WAF_v2 policy	None on AGC (put upstream)	WAF moves to Front Door / AppGW v2
Subnet	AppGW subnet	`/24` delegated to `trafficControllers`	New delegation requirement
Identity	AAD pod identity / MSI	Workload identity (federated)	Secretless, OIDC-based
Private exposure	Private frontend IP	Private Link to frontend	Private endpoint + Private DNS
Multi-tenancy	One gateway, shared config	Per-`Gateway`, scoped blast radius	One namespace can’t stall another
Splitting	Annotation / awkward	Native `backendRefs` weights	First-class canary

What you lose moving off AGIC is the integrated WAF and the familiarity of Ingress; what you gain is propagation speed, blast-radius isolation, and the full Gateway API routing vocabulary. The WAF gap is the one to plan for deliberately — most teams front AGC with Front Door (Premium, with managed WAF) or keep a thin Application Gateway v2 + WAF_v2 hop for the public edge, then let AGC own all the L7 routing and backend TLS inside. The capability decision in one grid:

If you need…	AGIC could…	AGC does…	Recommendation
Built-in WAF at the same hop	Yes	No	Front AGC with Front Door / AppGW v2 WAF
Sub-10s routing changes	No (4–7 min)	Yes	AGC, native
Per-namespace blast radius	No	Yes	AGC, one `Gateway` per team
Weighted canary, edit-to-shift	Awkward	Yes	AGC `backendRefs` weights
Backend mTLS to pods	Limited	Yes	AGC `BackendTLSPolicy`
Header/path/query routing	Annotation soup	Typed `matches`	AGC Gateway API
Central-team-owned data plane	Possible	Yes (BYO)	AGC BYO mode

Prerequisites and the ALB Controller install

You need OIDC issuer and workload identity on the cluster, plus the delegated subnet. Turn the cluster features on idempotently:

RG=rg-agc-prod
AKS=aks-agc-prod
LOCATION=eastus2

# Ensure OIDC + workload identity are on (idempotent on an existing cluster)
az aks update -g "$RG" -n "$AKS" \
  --enable-oidc-issuer \
  --enable-workload-identity

OIDC_ISSUER=$(az aks show -g "$RG" -n "$AKS" \
  --query "oidcIssuerProfile.issuerUrl" -o tsv)

The infrastructure prerequisites are unforgiving in specific ways — a /25 subnet or a missing delegation fails provisioning with errors that don’t always name the real cause. Confirm each against this checklist before you install anything:

Prerequisite	Exact requirement	How to verify	Failure if wrong
OIDC issuer	Enabled on the cluster	`az aks show --query oidcIssuerProfile.enabled`	Federation has no issuer to trust
Workload identity	Add-on enabled	`az aks show --query securityProfile.workloadIdentity`	SA token not projected; controller 401
Subnet size	`/24` minimum	`az network vnet subnet show --query addressPrefix`	AGC injection fails (too few IPs)
Subnet delegation	`Microsoft.ServiceNetworking/trafficControllers`	`... --query delegations`	Association cannot bind the subnet
Subnet emptiness	No conflicting resources	Subnet has free address space	Injection / association errors
Helm	v3.8+ for OCI charts	`helm version`	OCI `oci://` pull unsupported
Operator RBAC	Rights to create role assignments (for example Owner or User Access Administrator)	`az role assignment create` succeeds	Controller cannot be granted Config Manager
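
The subnet-size row is easy to check mechanically. The sketch below assumes the prefix comes back as a single CIDR string; subnet_large_enough is a hypothetical helper that inspects only the prefix length and does not call Azure.

```shell
# subnet_large_enough: succeed when a CIDR has a prefix length of 24 or shorter
# (that is, at least a /24). Hypothetical helper; it checks the string only.
subnet_large_enough() {
  prefix_len=${1##*/}                      # text after the last "/"
  case "$prefix_len" in
    ''|*[!0-9]*) echo "not a CIDR: $1" >&2; return 2 ;;
  esac
  [ "$prefix_len" -le 24 ]
}

# Usage against the live subnet (not run here):
#   prefix=$(az network vnet subnet show -g "$RG" --vnet-name vnet-agc \
#              --name subnet-alb --query addressPrefix -o tsv)
#   subnet_large_enough "$prefix" || echo "subnet too small for AGC"
subnet_large_enough 10.225.0.0/24 && echo "ok: /24"
subnet_large_enough 10.225.0.0/25 || echo "too small: /25"
```

The 10.225.0.0 addresses are placeholders; the delegation check still needs the separate delegations query from the table.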

The controller authenticates as a user-assigned managed identity federated to its service account. Create the identity, grant it the purpose-built role on the node resource group (where the controller manages the AGC), and Network Contributor on the subnet’s resource group so it can join the delegated subnet:

IDENTITY=alb-controller-identity
az identity create -g "$RG" -n "$IDENTITY" -l "$LOCATION"

PRINCIPAL_ID=$(az identity show -g "$RG" -n "$IDENTITY" --query principalId -o tsv)
CLIENT_ID=$(az identity show -g "$RG" -n "$IDENTITY" --query clientId -o tsv)

MC_RG=$(az aks show -g "$RG" -n "$AKS" --query nodeResourceGroup -o tsv)
MC_RG_ID=$(az group show -n "$MC_RG" --query id -o tsv)

# The controller manages AGC inside the node resource group
az role assignment create \
  --assignee-object-id "$PRINCIPAL_ID" \
  --assignee-principal-type ServicePrincipal \
  --scope "$MC_RG_ID" \
  --role "AppGw for Containers Configuration Manager"

# Network Contributor on the subnet's resource group so it can join the delegated subnet
az role assignment create \
  --assignee-object-id "$PRINCIPAL_ID" \
  --assignee-principal-type ServicePrincipal \
  --scope "$(az group show -n "$RG" --query id -o tsv)" \
  --role "Network Contributor"

The AppGw for Containers Configuration Manager role is purpose-built for AGC. Do not substitute Contributor — least privilege here is auditable, and Microsoft scopes the built-in role exactly to the trafficControllers and association operations the controller needs. The exact roles, their scope, and why each is required:

Role	Scope	Why the controller needs it	Substitute?
`AppGw for Containers Configuration Manager`	Node resource group (or AGC scope in BYO)	Create/update AGC, frontends, associations, routing config	No — purpose-built, least privilege
`Network Contributor`	Subnet’s resource group	Join/associate the delegated subnet	Narrow to the subnet if your policy demands
`Reader` (implicit via above)	Same	Read VNet/subnet to validate delegation	Covered by Network Contributor
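
If your policy demands the narrower scope mentioned in the table, one possible shape is to assign Network Contributor on the subnet's resource ID instead of its resource group. This is a sketch under that assumption; grant_subnet_join is a hypothetical wrapper around the same az role assignment create call used above.

```shell
# grant_subnet_join: assign Network Contributor on one subnet only.
# $1 = principal (object) ID of the managed identity, $2 = subnet resource ID.
grant_subnet_join() {
  az role assignment create \
    --assignee-object-id "$1" \
    --assignee-principal-type ServicePrincipal \
    --scope "$2" \
    --role "Network Contributor"
}
# Usage (needs Azure access, so it is not run here):
#   SUBNET_ID=$(az network vnet subnet show -g "$RG" --vnet-name vnet-agc \
#                 --name subnet-alb --query id -o tsv)
#   grant_subnet_join "$PRINCIPAL_ID" "$SUBNET_ID"
```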

Federate the identity to the controller’s service account (namespace azure-alb-system, service account alb-controller-sa). The subject string must match exactly — a typo here is the single most common “controller starts but gets 401” cause:

az identity federated-credential create \
  --name alb-controller-fedcred \
  --identity-name "$IDENTITY" \
  -g "$RG" \
  --issuer "$OIDC_ISSUER" \
  --subject "system:serviceaccount:azure-alb-system:alb-controller-sa" \
  --audience api://AzureADTokenExchange

The federated-credential fields and the exact value each must take:

Field	Required value	Consequence if wrong
`--issuer`	The cluster’s OIDC issuer URL	Token issuer not trusted → 401
`--subject`	`system:serviceaccount:azure-alb-system:alb-controller-sa`	Subject mismatch → 401 (most common)
`--audience`	`api://AzureADTokenExchange`	Audience rejected → token exchange fails
`--identity-name`	The UAMI you created	Credential federated to wrong identity
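
Because the subject mismatch is the most common failure, it can help to build the expected string rather than type it. The sketch below is illustrative only; fedcred_subject and subject_matches are hypothetical helpers, and the az query in the usage comment assumes the identity has a single federated credential.

```shell
# fedcred_subject: build the subject string workload identity expects for a
# Kubernetes service account. $1 = namespace, $2 = service account name.
fedcred_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# subject_matches: compare an existing credential's subject to the expected one.
subject_matches() {
  [ "$1" = "$(fedcred_subject azure-alb-system alb-controller-sa)" ]
}

# Usage against Azure (not run here):
#   actual=$(az identity federated-credential list --identity-name "$IDENTITY" \
#              -g "$RG" --query "[0].subject" -o tsv)
#   subject_matches "$actual" || echo "subject mismatch: $actual"
fedcred_subject azure-alb-system alb-controller-sa
```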

Install the controller via Helm, passing the identity client ID. Pin the chart version explicitly so a helm upgrade is deliberate, not whatever floats at the tag:

az aks get-credentials -g "$RG" -n "$AKS" --overwrite-existing

helm upgrade --install alb-controller \
  oci://mcr.microsoft.com/application-lb/charts/alb-controller \
  --version 1.7.9 \
  --namespace azure-alb-system --create-namespace \
  --set albController.namespace=azure-alb-system \
  --set albController.podIdentity.clientID="$CLIENT_ID"

Confirm both the controller and its webhook are healthy, and that the GatewayClass is accepted, before going further:

kubectl get pods -n azure-alb-system
kubectl get gatewayclass azure-alb-external -o yaml | grep -A5 status:

azure-alb-external is the GatewayClass the chart registers; ACCEPTED=True on it is your green light. The Helm values you actually touch, and what each controls:

Helm value	Purpose	Default / typical	When to change
`albController.podIdentity.clientID`	The UAMI client ID for workload identity	(required)	Always set
`albController.namespace`	Namespace the controller runs in	`azure-alb-system`	Rarely; keep the default
`--version`	Chart/controller version	pin explicitly	Deliberate upgrades only
`albController.replicaCount`	Controller replicas (HA)	2	Raise for resilience, not throughput
`albController.logLevel`	Controller verbosity	info	`debug` while diagnosing
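
In a pipeline you may prefer to gate on the same two checks rather than read them by eye. A minimal sketch, assuming kubectl wait is acceptable in your environment; alb_controller_ready is a hypothetical function name and the 120s default is an arbitrary choice.

```shell
# alb_controller_ready: block until the controller pods are Ready and the
# GatewayClass reports Accepted, or fail after the timeout (default 120s).
alb_controller_ready() {
  timeout=${1:-120s}
  kubectl wait --for=condition=Ready pod --all \
    -n azure-alb-system --timeout="$timeout" &&
  kubectl wait --for=condition=Accepted \
    gatewayclass/azure-alb-external --timeout="$timeout"
}
# Usage (needs cluster access, so it is not run here):
#   alb_controller_ready 180s || kubectl get pods -n azure-alb-system
```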

Provision the AGC (managed mode)

In managed mode you declare the AGC and its association as CRDs and let the controller build them. Create an infra namespace and the ApplicationLoadBalancer object, pointing at the delegated subnet:

SUBNET_ID=$(az network vnet subnet show \
  -g "$RG" --vnet-name vnet-agc --name subnet-alb \
  --query id -o tsv)

kubectl create namespace alb-infra

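# alb.yaml (the association below is the subnet ID captured in $SUBNET_ID; replace <SUB_ID> with your subscription ID)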
apiVersion: alb.networking.azure.io/v1
kind: ApplicationLoadBalancer
metadata:
  name: alb-prod
  namespace: alb-infra
spec:
  associations:
    - /subscriptions/<SUB_ID>/resourceGroups/rg-agc-prod/providers/Microsoft.Network/virtualNetworks/vnet-agc/subnets/subnet-alb

kubectl apply -f alb.yaml
# Watch provisioning; Deployment.Succeeded means the managed AGC + association exist
kubectl get applicationloadbalancer alb-prod -n alb-infra -o yaml | grep -A10 conditions

This step creates the actual Microsoft.ServiceNetworking/trafficControllers resource and a frontend in the node resource group. Provisioning takes a few minutes the first time. The ApplicationLoadBalancer spec fields and what each does:

Field	Meaning	Required	Notes
`spec.associations[]`	Full resource ID of the delegated subnet	Yes	The subnet must be `/24` + delegated
`metadata.namespace`	Infra namespace holding the CRD	Yes	Convention: `alb-infra`
`metadata.name`	Logical AGC name referenced by `Gateway` annotations	Yes	Used in `alb-name` annotation

The provisioning status.conditions you read to know where you are — this is the managed-mode equivalent of watching an ARM deployment:

Condition `type`	`status` you want	Meaning	If not
`Deployment`	`Succeeded` / `True`	AGC + association created	Check subnet delegation + Config Manager role
`Available`	`True`	Data plane reachable	Wait; then check controller logs
(any) `Reason`	`*Succeeded`	No error reason attached	A `Reason` like `SubnetDelegationMissing` names the fix

Expose a Gateway and the first HTTPRoute

The Gateway references the GatewayClass and ties to your ApplicationLoadBalancer via annotation. Here is an HTTPS listener terminating a cert from a Kubernetes Secret:

# gateway.yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: gw-prod
  namespace: app
  annotations:
    alb.networking.azure.io/alb-namespace: alb-infra
    alb.networking.azure.io/alb-name: alb-prod
spec:
  gatewayClassName: azure-alb-external
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      hostname: "app.kloudvin.com"
      tls:
        mode: Terminate
        certificateRefs:
          - kind: Secret
            name: app-tls
      allowedRoutes:
        namespaces:
          from: Same

kubectl apply -f gateway.yaml
# AGC publishes a generated FQDN; read it back from the Gateway address
kubectl get gateway gw-prod -n app \
  -o jsonpath='{.status.addresses[0].value}{"\n"}'

Point your DNS CNAME for app.kloudvin.com at that generated FQDN (the *.fzXX.alb.azure.com name). The Gateway listener fields you set, and the choices behind each:

Listener field	Values	Default / typical	When to change	Gotcha
`protocol`	`HTTP`, `HTTPS`	`HTTPS` in prod	HTTP only for redirect listeners	HTTP listener serves cleartext
`port`	any TCP port	443 (HTTPS), 80 (HTTP)	Match your edge	Must align with DNS/clients
`hostname`	FQDN or wildcard	the app host	SNI-based routing	Empty = match all (loosens routing)
`tls.mode`	`Terminate`, `Passthrough`	`Terminate`	Passthrough for end-to-end at pod	Passthrough skips L7 routing
`tls.certificateRefs`	Secret ref(s)	`app-tls` Secret	Rotate by replacing the Secret	Secret must be in the Gateway’s namespace
`allowedRoutes.namespaces.from`	`Same`, `All`, `Selector`	`Same`	Multi-team gateways	`All` widens attach surface
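
The listener above expects a Secret named app-tls to exist in the app namespace. One way to create it, sketched here with hypothetical certificate file names, is kubectl create secret tls; create_listener_secret is only a thin illustrative wrapper.

```shell
# create_listener_secret: create the TLS Secret the listener's certificateRefs
# points at. $1 = namespace, $2 = Secret name, $3 = cert file, $4 = key file.
create_listener_secret() {
  kubectl create secret tls "$2" --namespace "$1" --cert="$3" --key="$4"
}
# Usage (needs cluster access and real files, so it is not run here):
#   create_listener_secret app app-tls ./app.kloudvin.com.crt ./app.kloudvin.com.key
```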

Now bind a route. The minimal HTTPRoute sends all traffic for the host to one Service:

# route-basic.yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: rt-app
  namespace: app
spec:
  parentRefs:
    - name: gw-prod
  hostnames:
    - "app.kloudvin.com"
  rules:
    - backendRefs:
        - name: app-svc
          port: 80

Apply it with kubectl apply -f route-basic.yaml. Once status.parents[].conditions shows Accepted=True and ResolvedRefs=True, traffic flows. The reconcile takes seconds, not the multi-minute ARM churn AGIC inflicted. The status conditions across the three objects make up your single most-used diagnostic table:

Object	Condition	`True` means	Common `False` cause
`GatewayClass`	`Accepted`	Controller bound + authorised	Controller down / RBAC missing
`Gateway`	`Accepted`	Spec valid, class matched	Bad `gatewayClassName`
`Gateway`	`Programmed`	Data plane configured; has an address	Cert Secret missing; AGC not ready
`HTTPRoute`	`Accepted`	Route is valid + attached to parent	`parentRefs` wrong; not allowed by `Gateway`
`HTTPRoute`	`ResolvedRefs`	All `backendRefs`/Secret refs resolve	Service/Secret missing or cross-ns w/o `ReferenceGrant`
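
The two HTTPRoute rows can be read in one command. The sketch below prints every parent condition and then checks the pair that matters; route_conditions and route_ready are hypothetical helper names, and the check is deliberately strict (any False fails it).

```shell
# route_conditions: print "Type=Status (Reason)" for every parent of a route.
# $1 = route name, $2 = namespace.
route_conditions() {
  kubectl get httproute "$1" -n "$2" -o \
    jsonpath='{range .status.parents[*].conditions[*]}{.type}={.status} ({.reason}){"\n"}{end}'
}

# route_ready: read those lines on stdin; succeed only if both Accepted and
# ResolvedRefs are True and nothing reports False.
route_ready() {
  awk '
    /^Accepted=True/     { accepted = 1 }
    /^ResolvedRefs=True/ { resolved = 1 }
    /=False/             { bad = 1; print "failing: " $0 }
    END { exit !(accepted && resolved && !bad) }'
}
# Usage (needs cluster access, so it is not run here):
#   route_conditions rt-app app | route_ready && echo "rt-app is serving"
```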

The HTTPRoute backendRefs fields you’ll set on every route:

`backendRef` field	Meaning	Required	Notes
`name`	Target `Service` name	Yes	Must exist in the route’s namespace (or grant cross-ns)
`port`	Service port	Yes	The Service’s exposed port, not the container’s
`weight`	Relative split share	No (default 1)	`0` drains; relative, not percent
`kind`	`Service` (default)	No	AGC routes to Services
`namespace`	Cross-namespace target	No	Requires a `ReferenceGrant` in the target ns
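
For the cross-namespace row, the grant lives in the namespace that owns the Service. A minimal sketch follows, assuming the standard Gateway API ReferenceGrant at v1beta1; reference_grant is a hypothetical helper, and shared-backends is an invented namespace used only for illustration.

```shell
# reference_grant: print a ReferenceGrant that lets HTTPRoutes in one namespace
# reference Services in another. $1 = route namespace, $2 = Service namespace.
reference_grant() {
  cat <<EOF
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-routes-from-$1
  namespace: $2
spec:
  from:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      namespace: $1
  to:
    - group: ""
      kind: Service
EOF
}
# Usage (needs cluster access, so the apply is not run here):
#   reference_grant app shared-backends | kubectl apply -f -
reference_grant app shared-backends
```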

Weighted traffic splitting and canary

This is where Gateway API earns its keep. Splitting is native: multiple backendRefs under one rule, each with a weight. AGC distributes requests proportionally. A 90/10 canary:

# canary.yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: rt-canary
  namespace: app
spec:
  parentRefs:
    - name: gw-prod
  hostnames:
    - "app.kloudvin.com"
  rules:
    - backendRefs:
        - name: app-svc-stable
          port: 80
          weight: 90
        - name: app-svc-canary
          port: 80
          weight: 10

Weights are relative, not percentages — 90/10 and 9/1 behave identically. Progress a release by editing weights and re-applying; because propagation is near-real-time, a canary ramp is just a sequence of applies. Set a weight to 0 to drain a backend without deleting the ref, which keeps the rollback path one edit away. A typical ramp, what each step means, and the rollback at every stage:

Stage	stable / canary weights	Effective canary share	Watch before advancing	Rollback
Baseline	100 / 0	0%	Canary deployed, healthy, drained	(already safe)
Smoke	95 / 5	~5%	Error rate, p95 latency on canary	Set canary → 0
Early	80 / 20	~20%	Business metrics, saturation	Set canary → 0
Half	50 / 50	~50%	Full load parity	Set canary → 0
Cutover	0 / 100	100%	Soak, then retire stable	Swap weights back
Drain	100 / 0	0%	Decommission canary deployment	n/a

Because each step is one kubectl apply with sub-10s propagation, a controller (Argo Rollouts or Flagger) using the Gateway API provider can drive the whole ramp from metric analysis. The integration shape:

Tool	How it drives AGC	Weight mechanism	Promotion trigger
Argo Rollouts	Gateway API plugin edits the `HTTPRoute`	`backendRefs` weights	`AnalysisTemplate` metric checks pass
Flagger	Gateway API provider patches the route	`backendRefs` weights	Prometheus metric thresholds
Manual / GitOps	`kubectl apply` of weight edits	`backendRefs` weights	Human / pipeline gate
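
For the manual row, a weight edit does not have to be a full re-apply. The sketch below builds a JSON Patch for the rt-canary route shown earlier, assuming the stable ref is listed first and the canary second in the first rule; weight_patch and set_weights are hypothetical helper names.

```shell
# weight_patch: print a JSON Patch that sets the weights of the first two
# backendRefs in the first rule. $1 = stable weight, $2 = canary weight.
weight_patch() {
  printf '[{"op":"replace","path":"/spec/rules/0/backendRefs/0/weight","value":%d},' "$1"
  printf '{"op":"replace","path":"/spec/rules/0/backendRefs/1/weight","value":%d}]\n' "$2"
}

# set_weights: apply that patch to the rt-canary route in the app namespace.
set_weights() {
  kubectl patch httproute rt-canary -n app --type=json -p "$(weight_patch "$1" "$2")"
}
# Usage (needs cluster access, so it is not run here):
#   set_weights 95 5    # smoke
#   set_weights 100 0   # rollback: drain the canary
weight_patch 95 5
```

In a GitOps flow you would commit the weight change instead, since a direct patch is reverted on the next sync.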

For the broader pattern and the metric-analysis side, see Progressive Delivery with Argo Rollouts: Canary Metrics. The split-specific failure modes you’ll actually hit:

Symptom	Likely cause	Confirm	Fix
Canary takes ~50% not 10%	No `weight` on either `backendRef` (each defaults to 1, so the split is 1:1)	The manifest you applied has refs with no `weight` (the live object may show the defaulted `weight: 1`)	Set explicit integer weights on every ref
Split ignored entirely	Two rules instead of one (more-specific match wins)	Inspect `rules[]` — weights must share a rule	Put weighted refs under one rule
Drain not draining	`weight: 0` not applied / typo	`ResolvedRefs` + the live YAML	Re-apply; confirm propagation
Sticky to one backend	Client/session affinity upstream	Sample many requests, not one	Don’t conclude from a single curl
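
Since a missing weight defaults to 1, a cheap pre-apply lint on the manifest can catch skewed splits before they ship. This is a rough sketch, not a YAML parser: refs_missing_weight is a hypothetical helper that assumes each backendRefs entry starts with "- name:" as in the examples in this article.

```shell
# refs_missing_weight: read an HTTPRoute manifest on stdin and report how many
# backendRefs entries have no weight. Line-based heuristic, not a YAML parser.
refs_missing_weight() {
  awk '
    /^[ ]*(- )?backendRefs:/                            { in_refs = 1; next }
    /^[ ]*(- )?(parentRefs|hostnames|matches|filters):/ { in_refs = 0 }
    in_refs && /^[ ]*- name:/                           { refs++ }
    in_refs && /^[ ]*weight:/                           { weights++ }
    END {
      missing = refs - weights
      printf "%d backendRefs, %d without weight\n", refs, missing
      exit (missing > 0)
    }'
}

# Example: the canary ref below has no weight, so the lint fails.
refs_missing_weight <<'EOF' || echo "add explicit weights before applying"
  rules:
    - backendRefs:
        - name: app-svc-stable
          port: 80
          weight: 90
        - name: app-svc-canary
          port: 80
EOF
```

Run it against the file in your repository (for example refs_missing_weight < canary.yaml) rather than the live object, which may already carry defaulted weights.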

Backend TLS, mTLS, and health probes

By default AGC speaks HTTP to your pods. For end-to-end encryption (re-encrypting to the backend) and for mTLS, where AGC also presents a client certificate, you use a BackendTLSPolicy targeting the Service. The policy below validates the backend’s certificate against a CA you trust and pins its hostname; the clientCertificateRef entry is the only part specific to mTLS:

# backend-tls.yaml
apiVersion: alb.networking.azure.io/v1
kind: BackendTLSPolicy
metadata:
  name: btls-app
  namespace: app
spec:
  targetRef:
    group: ""
    kind: Service
    name: app-svc
  default:
    sni: backend.app.svc.cluster.local
    ports:
      - port: 443
    clientCertificateRef:
      name: alb-client-cert        # omit for one-way TLS; include for mTLS
    verify:
      caCertificateRef:
        name: backend-ca
      subjectAltName: backend.app.svc.cluster.local

The clientCertificateRef is what turns this into mutual TLS: AGC presents that certificate to the backend, and a backend (an Istio sidecar, an NGINX terminating mTLS, etc.) validates it. Drop that field and you get standard one-way re-encryption. The verify block makes AGC validate the backend’s certificate against backend-ca and pin the SAN — skip it only in non-production. Every BackendTLSPolicy field, what it does, and the trade-off:

Field	What it does	Required	Omit when	Gotcha if wrong
`targetRef` (Service)	Attaches the policy to a Service	Yes	—	Wrong kind/name → policy never applies
`sni`	SNI sent to the backend	Yes for TLS	—	Mismatch with cert → handshake fail
`ports[].port`	Backend TLS port	Yes	—	Wrong port → connection refused
`verify.caCertificateRef`	CA that signs the backend cert	Prod: yes	non-prod only	Missing CA → cannot validate → 502
`verify.subjectAltName`	SAN to pin on the backend cert	Prod: yes	non-prod only	SAN mismatch → re-encrypt 502
`clientCertificateRef`	Client cert AGC presents (mTLS)	Only for mTLS	one-way TLS	Rotated out → backend rejects

The three backend-encryption postures, side by side, so you pick deliberately:

Posture	`verify`	`clientCertificateRef`	Use when	Security
Cleartext (default)	n/a	n/a	Internal, low-trust, non-regulated	None on the backend leg
One-way re-encrypt	present	absent	Most production; encrypt to pod	Server authenticated, encrypted
Mutual TLS (mTLS)	present	present	PCI/Zero-Trust; backend verifies AGC	Both ends authenticated

Health probes default to GET / on the backend port. Override per-Service with a HealthCheckPolicy:

# health.yaml
apiVersion: alb.networking.azure.io/v1
kind: HealthCheckPolicy
metadata:
  name: hc-app
  namespace: app
spec:
  targetRef:
    group: ""
    kind: Service
    name: app-svc
  default:
    interval: 5s
    timeout: 3s
    healthyThreshold: 1
    unhealthyThreshold: 3
    http:
      host: app.kloudvin.com
      path: /healthz
      match:
        statusCodes:
          - start: 200
            end: 299

Both policies attach by targetRef to the Service, so they travel with the workload, not the gateway — exactly the separation of concerns you want when app and platform teams own different manifests. The HealthCheckPolicy knobs and how to reason about each:

Field	What it does	Default	Typical	When to change
`interval`	Probe frequency	(managed)	5s	Faster detect vs more probe load
`timeout`	Per-probe timeout	(managed)	3s	Slow backends need headroom
`healthyThreshold`	Successes to mark healthy	1	1	Raise to debounce flapping
`unhealthyThreshold`	Failures to mark unhealthy	3	3	Lower for fast eviction
`http.path`	Probe path	`/`	`/healthz`	Always a shallow readiness path
`http.host`	Host header on the probe	(none)	the app host	Backends that route by host
`http.match.statusCodes`	Codes that count as healthy	200–399	200–299	Tighten to real success codes
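
Before blaming the policy, it is worth reproducing what the probe sees. The function below is a rough sketch of the same check a HealthCheckPolicy describes: one GET with a Host header, judged against a status range. probe_ok is a hypothetical helper, and the URL you pass it is assumed to be reachable from where you run it (for example through a port-forward to the Service).

```shell
# Return 0 if the HTTP status of one GET falls inside [start, end],
# mirroring http.match.statusCodes. probe_ok is a hypothetical helper.
# Arguments: URL, Host header, range start (default 200), range end (default 299).
probe_ok() {
  local url=$1 host=$2 start=${3:-200} end=${4:-299} code
  # --max-time 3 matches the 3s timeout used in the policy example above.
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 3 -H "Host: ${host}" "$url")
  echo "status=${code}"
  [ "$code" -ge "$start" ] && [ "$code" -le "$end" ]
}
```

A possible invocation, assuming a port-forward on local port 8080, is `probe_ok http://127.0.0.1:8080/healthz app.kloudvin.com 200 299`. A non-zero exit with a status such as 404 or 503 suggests the path or the status range in the policy is the problem rather than AGC itself.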

The re-encryption/mTLS failure modes — this is where 502s hide:

Symptom	Root cause	Confirm	Fix
502 on the backend leg only	SAN/host mismatch vs backend cert	Controller events; `BackendTLSPolicy` `verify`	Pin `subjectAltName` to the cert SAN
502 “untrusted”	`caCertificateRef` wrong/missing	The referenced CA Secret content	Upload the correct backend root CA
Backend rejects AGC	`clientCertificateRef` rotated/invalid	Backend (sidecar) TLS logs	Re-issue the client cert Secret
Policy never applies	`targetRef` wrong kind/name	`kubectl describe backendtlspolicy`	Match `kind: Service` + exact name
All backends unhealthy	Probe path 5xx / wrong codes	`HealthCheckPolicy` + app `/healthz`	Shallow path; correct `statusCodes`
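
For the SAN-mismatch rows above, the quickest confirmation is to read the SANs off the certificate the backend actually serves and compare them with subjectAltName. The sketch below assumes OpenSSL 1.1.1 or later (for the -ext option) and a PEM file you have already saved; cert_sans and san_matches are hypothetical helper names, and wildcard SANs are compared literally rather than expanded.

```shell
# List the DNS SANs on a PEM certificate, one per line.
# Assumes OpenSSL 1.1.1+ for "x509 -ext". Hypothetical helper name.
cert_sans() {
  openssl x509 -in "$1" -noout -ext subjectAltName 2>/dev/null |
    tr ',' '\n' | sed -n 's/^[[:space:]]*DNS://p'
}

# Succeed only if NAME is exactly one of the certificate's DNS SANs.
# Arguments: PEM file, name pinned in subjectAltName.
san_matches() {
  cert_sans "$1" | grep -Fxq -- "$2"
}
```

One way to obtain the file is `openssl s_client -connect HOST:443 -servername NAME </dev/null 2>/dev/null | openssl x509 -outform PEM > backend.pem`, where HOST and NAME are placeholders for however you reach the backend. Then `san_matches backend.pem backend.app.svc.cluster.local` exits non-zero when the pinned name is not on the certificate.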

Header, path, and query routing

Gateway API matches give you composable L7 rules. Match types combine with AND semantics within a single match block; AGC evaluates more specific matches first, so ordering behaves intuitively. A few production patterns in one route:

# routing-advanced.yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: rt-advanced
  namespace: app
spec:
  parentRefs:
    - name: gw-prod
  hostnames:
    - "app.kloudvin.com"
  rules:
    # Beta cohort: header-based dark launch
    - matches:
        - headers:
            - name: x-cohort
              value: beta
              type: Exact
      backendRefs:
        - name: app-svc-beta
          port: 80

    # API v2 by path prefix, with the prefix rewritten off
    - matches:
        - path:
            type: PathPrefix
            value: /api/v2
      filters:
        - type: URLRewrite
          urlRewrite:
            path:
              type: ReplacePrefixMatch
              replacePrefixMatch: /
      backendRefs:
        - name: api-v2-svc
          port: 8080

    # Query-param routing for a debug build
    - matches:
        - queryParams:
            - name: debug
              value: "true"
              type: Exact
      backendRefs:
        - name: app-svc-debug
          port: 80

    # Default
    - backendRefs:
        - name: app-svc-stable
          port: 80

The header and query rules win over the catch-all because they are more specific. The URLRewrite filter strips /api/v2 before the request reaches api-v2-svc. You can also inject or strip headers with RequestHeaderModifier/ResponseHeaderModifier filters — the clean way to add X-Forwarded-* or correlation headers without touching app code. The match types and their type values:

Match type	`type` values	Matches on	Notes
`path`	`PathPrefix`, `Exact`, `RegularExpression`	URL path	`PathPrefix` is the common case
`headers`	`Exact`, `RegularExpression`	Request header value	Multiple headers AND together
`queryParams`	`Exact`, `RegularExpression`	Query string param	Useful for debug/feature toggles
`method`	`GET`/`POST`/…	HTTP method	Split read vs write paths

The filters you compose with matches, and what each does:

Filter	Purpose	Key fields	Typical use
`URLRewrite`	Rewrite path/host upstream	`path.type`, `path.replacePrefixMatch`, `hostname`	Strip an API prefix
`RequestHeaderModifier`	Add/set/remove request headers	`add`, `set`, `remove`	Inject correlation/forwarded headers
`ResponseHeaderModifier`	Add/set/remove response headers	`add`, `set`, `remove`	Security headers (HSTS)
`RequestRedirect`	Issue an HTTP redirect	`scheme`, `statusCode`, `hostname`	HTTP→HTTPS, host moves
`RequestMirror`	Mirror traffic to a second backend	`backendRef`	Shadow a new build (no client impact)
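
The header-modifier filters are described above but not shown, so here is a hedged sketch of what such a rule could look like. It follows the lab's heredoc style; render_header_route, the route name rt-headers, and the x-request-source and x-internal-debug headers are all illustrative assumptions, while the filter field names come from the Gateway API v1 HTTPRoute schema.

```shell
# Print an HTTPRoute that rewrites request and response headers.
# render_header_route, rt-headers, and the x-* header names are hypothetical.
render_header_route() {
  cat <<'EOF'
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: rt-headers
  namespace: app
spec:
  parentRefs:
    - name: gw-prod
  hostnames:
    - "app.kloudvin.com"
  rules:
    - filters:
        - type: RequestHeaderModifier
          requestHeaderModifier:
            set:
              - name: x-request-source
                value: agc
            remove:
              - x-internal-debug
        - type: ResponseHeaderModifier
          responseHeaderModifier:
            set:
              - name: Strict-Transport-Security
                value: "max-age=31536000"
      backendRefs:
        - name: app-svc-stable
          port: 80
EOF
}
```

You would apply it with `render_header_route | kubectl apply -f -`. Using set rather than add overwrites any value the client sent, which is usually what you want for headers the backend trusts.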

Match precedence, made explicit so route ordering never surprises you:

Rule shape	Wins over	Why
`Exact` path	`PathPrefix` path	Exact is more specific
Longer `PathPrefix`	Shorter `PathPrefix`	Longest prefix wins
Match with header + path	Match with path only	More match criteria = more specific
Any explicit match	Catch-all (no `matches`)	Catch-all is least specific

Bring-your-own AGC and Private Link

When a central network team owns the AGC, you provision it with IaC and the cluster only references it. Create the AGC, a frontend, and a subnet association with Bicep:

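param location string = resourceGroup().location
param subnetId string // resource ID of the delegated subnet

resource agc 'Microsoft.ServiceNetworking/trafficControllers@2023-11-01' = {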
  name: 'agc-shared'
  location: location
}

resource frontend 'Microsoft.ServiceNetworking/trafficControllers/frontends@2023-11-01' = {
  parent: agc
  name: 'fe-prod'
  location: location
}

resource assoc 'Microsoft.ServiceNetworking/trafficControllers/associations@2023-11-01' = {
  parent: agc
  name: 'assoc-prod'
  location: location
  properties: {
    associationType: 'subnets'
    subnet: { id: subnetId }
  }
}

In BYO mode you skip the ApplicationLoadBalancer CRD and instead point the Gateway at the existing AGC: the alb.networking.azure.io/alb-id annotation carries the AGC’s resource ID, and an entry under spec.addresses names the frontend to use. Grant the controller identity the Configuration Manager role on that AGC’s scope. For private exposure, the AGC frontend is reachable over Azure Private Link: create a private endpoint against the frontend and resolve its FQDN through a Private DNS zone, so the generated *.fzXX.alb.azure.com name resolves to a private IP inside the spoke. That keeps north-south traffic off the public internet while preserving the same Gateway API manifests. Managed vs BYO across the dimensions that decide it:

Dimension	Managed mode	BYO mode
AGC created by	ALB Controller (CRD)	Your IaC (ARM/Bicep/Terraform)
Lifecycle in	Kubernetes manifests	Network team’s pipeline
`Gateway` references it via	`alb-namespace` + `alb-name` annotations	`alb-id` annotation (AGC resource ID) + frontend name under `spec.addresses`
Config Manager role scope	Node resource group	The AGC’s resource scope
Subnet ownership	Convenient, cluster-adjacent	Central, governed
Private Link	Possible	First-class (team-owned)
Best for	Greenfield, GitOps	Regulated, segregated duties
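
To make the BYO reference concrete, the sketch below renders a Gateway bound to an existing AGC. It assumes the alb.networking.azure.io/alb-id annotation takes the AGC resource ID and that the frontend is selected by name through an alb.networking.azure.io/alb-frontend address, as the AGC documentation describes for bring-your-own deployments; render_byo_gateway is a hypothetical helper and the listener is kept to plain HTTP for brevity.

```shell
# Print a Gateway that references an existing (BYO) AGC.
# Arguments: AGC resource ID, frontend name. render_byo_gateway is hypothetical.
render_byo_gateway() {
  local agc_id=$1 frontend=$2
  cat <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: gw-prod
  namespace: app
  annotations:
    alb.networking.azure.io/alb-id: ${agc_id}
spec:
  gatewayClassName: azure-alb-external
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      allowedRoutes:
        namespaces:
          from: Same
  addresses:
    - type: alb.networking.azure.io/alb-frontend
      value: ${frontend}
EOF
}
```

With the Bicep above, the frontend name would be fe-prod, so a plausible invocation is `render_byo_gateway "$AGC_ID" fe-prod | kubectl apply -f -`, where AGC_ID holds the resource ID of agc-shared as exported by the network team's pipeline.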

The Azure resources behind an AGC, regardless of mode, so you can read them in the portal/ARM:

Resource type	Role	Created in
`Microsoft.ServiceNetworking/trafficControllers`	The AGC itself	Node RG (managed) / chosen RG (BYO)
`.../trafficControllers/frontends`	Listener entry point (the FQDN)	Same
`.../trafficControllers/associations`	Binds the delegated subnet	Same
Delegated subnet	`/24` for data-plane injection	Your VNet
Private endpoint + Private DNS zone	Private exposure of the frontend	Spoke VNet (optional)

Architecture at a glance

Read the diagram left to right as a request actually travels, with the control plane feeding in from the side. A client resolves app.kloudvin.com to the AGC-generated FQDN (a *.fzXX.alb.azure.com CNAME) and opens HTTPS on 443 to the AGC data plane — the managed, regional proxy fleet. The fleet’s frontend listener terminates TLS using the cert from the app-tls Secret, then a routing rule evaluates the HTTPRoute matches (path, header, query — more-specific wins) and lands the request on the weighted split: backendRefs to the stable Service (weight 90) and the canary Service (weight 10), where weight: 0 is the drain lever. From the split, traffic is re-encrypted on a fresh TLS leg to the backend pods on 443, with the BackendTLSPolicy pinning the SAN and, when clientCertificateRef is set, presenting a client cert for mTLS that the pod’s sidecar verifies. Off to the side, the control plane — the ALB Controller in azure-alb-system, authorised by a workload identity (a federated service account holding the Configuration Manager role) — watches the Gateway API CRDs and programs the data plane in seconds.

Notice how every numbered failure point maps to one CRD or identity object, which is the whole operational story: if badge 1 (the GatewayClass not Accepted) is red, nothing downstream is programmed; badge 2 is the listener/cert leg (Programmed=False, no address); badge 3 is a skewed split (a missing weight); badge 4 is the re-encrypt 502 (SAN/CA mismatch, or a cross-namespace Service without a ReferenceGrant); and badge 5 is mTLS/identity drift (a wrong federated subject, or a rotated client cert). The diagnostic method is to walk the path left to right, find the first object whose status.conditions isn’t True, and read the legend for that badge’s confirm-and-fix.
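
The left-to-right walk can be scripted. The sketch below prints every condition on the GatewayClass, the Gateway, and the HTTPRoute, and flags any whose status is not True. walk_status is a hypothetical helper; it assumes the class is named azure-alb-external as elsewhere in this guide, and it treats any non-True condition as worth a look, which is a simplification.

```shell
# Most objects keep conditions at .status.conditions; an HTTPRoute reports
# them per parent under .status.parents[*].conditions.
OBJ_CONDS='{range .status.conditions[*]}{.type}={.status} {.reason}{"\n"}{end}'
ROUTE_CONDS='{range .status.parents[*].conditions[*]}{.type}={.status} {.reason}{"\n"}{end}'

# Walk GatewayClass -> Gateway -> HTTPRoute and mark non-True conditions.
# Arguments: namespace, Gateway name, HTTPRoute name. Hypothetical helper.
walk_status() {
  local ns=$1 gw=$2 route=$3
  {
    kubectl get gatewayclass azure-alb-external -o jsonpath="$OBJ_CONDS" | sed 's/^/GatewayClass /'
    kubectl get gateway "$gw" -n "$ns" -o jsonpath="$OBJ_CONDS" | sed 's/^/Gateway /'
    kubectl get httproute "$route" -n "$ns" -o jsonpath="$ROUTE_CONDS" | sed 's/^/HTTPRoute /'
  } | awk '{ printf "%-6s%s\n", ($2 ~ /=True$/ ? "ok" : "CHECK"), $0 }'
}
```

Calling `walk_status app gw-prod rt-app` prints one line per condition in path order, so the first line marked CHECK is the place to start reading the object's events.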

Real-world scenario

A fintech platform team I worked with ran AGIC fronting roughly 40 namespaces on one shared AKS cluster. Their pain was concrete and recurring: every Ingress change anywhere triggered a full Application Gateway config push, and a single team’s frequent deploys produced 4–7 minute propagation windows during which unrelated services saw stale routing. During those windows, a customer-facing payments microservice would intermittently route to a just-decommissioned pod set because the gateway hadn’t caught up — a Sev-2 they could not reliably reproduce. Worse, their PCI scope required re-encryption to the payment pods, but AGIC’s backend-mTLS story was awkward enough that they had quietly settled for TLS terminating at the edge and cleartext to the pod — an audit finding waiting to happen, and one the next QSA assessment would certainly flag.

They migrated to AGC in BYO mode so the network team kept ownership of the AGC, its delegated /24 subnet, and a Private Link frontend, all in Terraform. Each app namespace got its own Gateway bound to the shared AGC, which decoupled the reconcile blast radius — a deploy in one namespace no longer touched another’s routing, because the controller programs per-Gateway, not per gateway resource. The PCI gap closed with a BackendTLSPolicy carrying a clientCertificateRef, giving genuine mTLS from AGC to the payment service:

apiVersion: alb.networking.azure.io/v1
kind: BackendTLSPolicy
metadata:
  name: btls-payments
  namespace: payments
spec:
  targetRef:
    group: ""
    kind: Service
    name: payments-svc
  default:
    sni: payments.internal.kloudvin.com
    clientCertificateRef:
      name: agc-payments-client
    verify:
      caCertificateRef:
        name: payments-ca
      subjectAltName: payments.internal.kloudvin.com

The migration ran with both ingress paths live — AGC on a parallel hostname — and DNS weight-shifted over a week, so there was no cutover big bang. One real snag surfaced on day two: a shared app-tls Secret lived in a platform namespace while several Gateway listeners lived in team namespaces, so those listeners came up Programmed=False until they added a ReferenceGrant permitting the cross-namespace Secret reference. They also briefly chased a 502 on the payments leg that turned out to be a SAN mismatch — the cert’s SAN was payments.internal.kloudvin.com but an early BackendTLSPolicy pinned payments-svc.payments.svc.cluster.local; aligning subjectAltName to the cert fixed it in one apply.

The measurable outcome: routing propagation dropped from minutes to single-digit seconds, the noisy-neighbour reconcile storms disappeared, the intermittent payments mis-route Sev-2 stopped recurring, and the next QSA assessment recorded encryption all the way to the cardholder-data workload. Canary releases that used to be a deploy-time ritual became a weight edit driven by Argo Rollouts. The lesson on the wall: “AGC moves the gateway out of ARM and into Kubernetes objects — so every failure is now a status.conditions you can read, not an ARM deployment you wait on.”

Advantages and disadvantages

AGC’s managed-data-plane-plus-in-cluster-controller model is a clear win for ingress at scale, but it is not free of trade-offs — most notably the missing WAF. Weigh it honestly:

Advantages (why this model helps you)	Disadvantages (why it bites)
Near-real-time programming (seconds), no ARM throttling on every change	No built-in WAF — you must add Front Door / AppGW v2 upstream for L7 protection
Per-`Gateway` blast radius — one namespace can’t stall another’s routing	More moving parts (controller, federated identity, delegated subnet) to stand up correctly
Native weighted splitting — canary is a one-line weight edit	Gateway API + AGC CRDs are a learning curve vs familiar `Ingress`
First-class backend re-encryption and mTLS via `BackendTLSPolicy`	Cross-namespace refs need `ReferenceGrant` — an easy first-day trip-up
Secretless auth via workload identity (no SP passwords to rotate)	Federated-credential subject typos fail as opaque 401s
Policies (`BackendTLSPolicy`, `HealthCheckPolicy`) travel with the Service	Newer product — smaller community corpus than NGINX/AGIC
Portable, typed Gateway API manifests (multi-implementation)	Region availability and feature parity still maturing in places

The model is right for any AKS estate doing ingress for more than a few namespaces, anyone needing progressive delivery wired to a real splitting data plane, and regulated workloads that must prove encryption to the pod. It is less compelling if you need a WAF at the same hop with zero extra components (then a classic Application Gateway v2 + WAF_v2, possibly via AGIC, is simpler), or if your cluster runs a single app and the AGIC/Ingress familiarity outweighs AGC’s gains. The disadvantages are all manageable — but only if you stand the prerequisites up precisely, which is the point of the install section.

Hands-on lab

Stand up AGC end to end on an existing AKS cluster, ship a 90/10 split, verify it lands, then add header routing and tear down. Run in Cloud Shell (Bash) with kubectl pointed at a test cluster you can administer. This uses a small managed AGC and a couple of single-replica deployments — minutes of runtime, deleted at the end.

Step 1 — Variables and cluster features.

RG=rg-agc-lab
AKS=aks-agc-lab
LOC=eastus2
az aks update -g "$RG" -n "$AKS" --enable-oidc-issuer --enable-workload-identity -o table
OIDC=$(az aks show -g "$RG" -n "$AKS" --query oidcIssuerProfile.issuerUrl -o tsv)
az aks get-credentials -g "$RG" -n "$AKS" --overwrite-existing

Expected: the cluster shows oidcIssuerProfile.enabled = true and a securityProfile.workloadIdentity block.

Step 2 — Identity, role, federation, and the Helm install.

ID=alb-lab-id
az identity create -g "$RG" -n "$ID" -l "$LOC" -o table
PID=$(az identity show -g "$RG" -n "$ID" --query principalId -o tsv)
CID=$(az identity show -g "$RG" -n "$ID" --query clientId -o tsv)
MCID=$(az group show -n "$(az aks show -g "$RG" -n "$AKS" --query nodeResourceGroup -o tsv)" --query id -o tsv)
az role assignment create --assignee-object-id "$PID" --assignee-principal-type ServicePrincipal \
  --scope "$MCID" --role "AppGw for Containers Configuration Manager"
az identity federated-credential create --name alb-lab-fc --identity-name "$ID" -g "$RG" \
  --issuer "$OIDC" --subject "system:serviceaccount:azure-alb-system:alb-controller-sa" \
  --audience api://AzureADTokenExchange
helm upgrade --install alb-controller \
  oci://mcr.microsoft.com/application-lb/charts/alb-controller --version 1.7.9 \
  --namespace azure-alb-system --create-namespace \
  --set albController.podIdentity.clientID="$CID"

Step 3 — Confirm the controller and GatewayClass are healthy.

kubectl get pods -n azure-alb-system
kubectl get gatewayclass azure-alb-external \
  -o jsonpath='{.status.conditions[?(@.type=="Accepted")].status}{"\n"}'

Expected: controller pods Running, and Accepted prints True.

Step 4 — Provision a managed AGC against the delegated subnet.

kubectl create namespace alb-infra
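SUBNET=$(az network vnet subnet show -g "$RG" --vnet-name vnet-agc-lab --name subnet-alb --query id -o tsv)
# The controller identity also needs to join the association subnet
az role assignment create --assignee-object-id "$PID" --assignee-principal-type ServicePrincipal \
  --scope "$SUBNET" --role "Network Contributor"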
cat <<EOF | kubectl apply -f -
apiVersion: alb.networking.azure.io/v1
kind: ApplicationLoadBalancer
metadata: { name: alb-lab, namespace: alb-infra }
spec: { associations: [ "$SUBNET" ] }
EOF
kubectl get applicationloadbalancer alb-lab -n alb-infra \
  -o jsonpath='{range .status.conditions[*]}{.type}={.status} ({.reason}){"\n"}{end}'

Expected (after a few minutes): a Deployment condition reaching Succeeded.

Step 5 — Deploy two versioned backends and a Gateway, then split 90/10.

kubectl create namespace app
# stable + canary echo deployments that return their version on /version
kubectl create deployment app-stable -n app --image=mcr.microsoft.com/azuredocs/aks-helloworld:v1
kubectl create deployment app-canary -n app --image=mcr.microsoft.com/azuredocs/aks-helloworld:v2
kubectl expose deployment app-stable -n app --name=app-svc-stable --port=80 --target-port=80
kubectl expose deployment app-canary -n app --name=app-svc-canary --port=80 --target-port=80

cat <<'EOF' | kubectl apply -f -
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: gw-lab
  namespace: app
  annotations:
    alb.networking.azure.io/alb-namespace: alb-infra
    alb.networking.azure.io/alb-name: alb-lab
spec:
  gatewayClassName: azure-alb-external
  listeners:
    - { name: http, protocol: HTTP, port: 80, allowedRoutes: { namespaces: { from: Same } } }
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata: { name: rt-lab, namespace: app }
spec:
  parentRefs: [ { name: gw-lab } ]
  rules:
    - backendRefs:
        - { name: app-svc-stable, port: 80, weight: 90 }
        - { name: app-svc-canary, port: 80, weight: 10 }
EOF

Step 6 — Read the FQDN, confirm Programmed, and sample the split.

kubectl get gateway gw-lab -n app \
  -o jsonpath='{.status.addresses[0].value} {.status.conditions[?(@.type=="Programmed")].status}{"\n"}'
FQDN=$(kubectl get gateway gw-lab -n app -o jsonpath='{.status.addresses[0].value}')
for i in $(seq 1 50); do curl -s "http://${FQDN}/"; echo; done | sort | uniq -c

Expected: an FQDN plus True, and the 50-sample count lands roughly 45 stable / 5 canary (relative weights, so exact counts vary).

Step 7 — Add header routing for a beta cohort and confirm it.

kubectl patch httproute rt-lab -n app --type=json -p='[
  {"op":"add","path":"/spec/rules/0","value":{
    "matches":[{"headers":[{"name":"x-cohort","value":"beta","type":"Exact"}]}],
    "backendRefs":[{"name":"app-svc-canary","port":80}]}}]'
curl -s "http://${FQDN}/" -H "x-cohort: beta"   # should always hit canary (v2)

Expected: with the x-cohort: beta header you consistently reach the canary (v2) backend; without it you get the 90/10 split.

Step 8 — Teardown.

kubectl delete namespace app
kubectl delete applicationloadbalancer alb-lab -n alb-infra
kubectl delete namespace alb-infra
helm uninstall alb-controller -n azure-alb-system
az identity delete -g "$RG" -n "$ID"

The lab steps mapped to what each proves:

Step	What you did	What it proves
2	Federate identity + Helm install	Secretless control-plane auth
3	`GatewayClass Accepted=True`	The controller is live and authorised
4	`ApplicationLoadBalancer` CRD	Managed-mode AGC provisioning
5–6	`Gateway` + split, sample FQDN	Native weighted traffic split works
7	Header match patch	Composable L7 routing, seconds to apply
8	Delete CRDs + identity	Clean lifecycle in Kubernetes

Cost note. A managed AGC plus two single-replica pods for an hour is well under ₹100; deleting the namespaces, the ApplicationLoadBalancer, and the identity stops all AGC charges. The AKS cluster itself is the larger cost — reuse an existing test cluster rather than creating one for the lab.

Common mistakes & troubleshooting

This is the playbook — the part you bookmark. First as a scannable table for mid-incident, then the entries that bite hardest expanded with the full reasoning.

#	Symptom	Root cause	Confirm (exact cmd)	Fix
1	`GatewayClass azure-alb-external` never `Accepted`	Controller down, or identity lacks Config Manager role	`kubectl get pods -n azure-alb-system`; `kubectl describe gatewayclass azure-alb-external`	Fix Helm install; grant Config Manager on the AGC scope
2	Controller pod runs but logs 401 / token errors	Federated-credential subject wrong, or workload identity off	`kubectl logs deploy/alb-controller -n azure-alb-system`; `az identity federated-credential show`	Re-create fedcred with exact `system:serviceaccount:azure-alb-system:alb-controller-sa`
3	`ApplicationLoadBalancer` stuck, no AGC created	Subnet not `/24` or not delegated; missing role	`kubectl get applicationloadbalancer -o yaml` (Reason); `az network vnet subnet show --query delegations`	Delegate subnet to `trafficControllers`; size `/24`; grant role
4	`Gateway` has no address, `Programmed=False`	TLS Secret missing, or AGC not ready	`kubectl describe gateway gw-prod -n app`	Create the `app-tls` Secret in the listener ns; wait for AGC
5	`HTTPRoute` `Accepted=False`	`parentRefs` wrong, or `Gateway` `allowedRoutes` disallows it	`kubectl get httproute -o yaml` (parents conditions)	Fix `parentRefs`; widen `allowedRoutes.namespaces`
6	`HTTPRoute` `ResolvedRefs=False`	Service/Secret missing, or cross-namespace without grant	`kubectl describe httproute rt-app -n app`	Create the Service; add a `ReferenceGrant` in the target ns
7	Split lands ~50/50 not 90/10	Both `backendRefs` missing `weight` (each defaults to 1), or the refs sit in separate rules	`kubectl get httproute -o yaml`	Set explicit integer weights on every ref, under one rule
8	502 only on the re-encrypt leg	`BackendTLSPolicy` SAN/CA mismatch	controller events; the cert’s actual SAN	Pin `subjectAltName` to the cert SAN; correct `caCertificateRef`
9	Backend rejects AGC (mTLS)	`clientCertificateRef` rotated/invalid	backend (sidecar) TLS logs	Re-issue the client cert Secret
10	All backends marked unhealthy → 503	Probe path 5xx or wrong `statusCodes`	`HealthCheckPolicy`; curl the app `/healthz`	Shallow path; correct `match.statusCodes`
11	Header/query route never matches	A more specific rule shadows it, or wrong `type`	`kubectl get httproute -o yaml` rules order	Use `Exact`/`RegularExpression`; rely on specificity
12	Routing changes don’t propagate	Watching the wrong object; controller wedged	`kubectl logs deploy/alb-controller -n azure-alb-system`	Restart controller; check it reconciles the object
13	Private FQDN resolves publicly	Private DNS zone not linked, no private endpoint	`nslookup` the AGC FQDN from the spoke	Create PE on the frontend; link Private DNS zone
14	Cert rotation didn’t take effect	Replaced cert content but not the referenced Secret	`kubectl get secret app-tls -o yaml`; Gateway events	Update the exact Secret the listener references

The expanded form, for the entries that cost the most time:

1. GatewayClass azure-alb-external never reaches Accepted. Root cause: the ALB Controller isn’t running, or its identity lacks the AppGw for Containers Configuration Manager role, so it can’t bind the class. Confirm: kubectl get pods -n azure-alb-system (are they Running?), then kubectl describe gatewayclass azure-alb-external for the condition reason. Fix: re-check the Helm install (--set albController.podIdentity.clientID correct) and that the role assignment landed on the right scope (node RG for managed, AGC scope for BYO).

2. The controller pod runs but its logs show 401 / token-exchange errors. Root cause: the federated-credential subject is wrong (the single most common cause), or workload identity isn’t actually enabled so the SA token isn’t projected. Confirm: kubectl logs deploy/alb-controller -n azure-alb-system shows AAD token failures; az identity federated-credential show and compare the subject to system:serviceaccount:azure-alb-system:alb-controller-sa exactly. Fix: re-create the federated credential with the exact subject, issuer, and api://AzureADTokenExchange audience; confirm --enable-workload-identity on the cluster.

3. The ApplicationLoadBalancer CRD applies but no AGC is created. Root cause: the subnet is smaller than /24 or not delegated to Microsoft.ServiceNetworking/trafficControllers, or the controller lacks rights on the subnet’s RG. Confirm: kubectl get applicationloadbalancer alb-prod -n alb-infra -o yaml and read the condition Reason (e.g. SubnetDelegationMissing); az network vnet subnet show --query "{prefix:addressPrefix, deleg:delegations}". Fix: delegate the subnet, size it at /24 or larger (at least 256 addresses), and ensure Network Contributor on the subnet’s RG.

6. HTTPRoute shows ResolvedRefs=False. Root cause: a referenced Service or TLS Secret doesn’t exist, or it lives in another namespace without a ReferenceGrant permitting the reference. Confirm: kubectl describe httproute rt-app -n app names the unresolved ref; check the Service/Secret exists in the expected namespace. Fix: create the missing object, or add a ReferenceGrant in the target namespace allowing the route/listener’s namespace to reference it:

apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-app-to-platform-tls
  namespace: platform           # the namespace that OWNS the Secret/Service
spec:
  from:
    - group: gateway.networking.k8s.io
      kind: Gateway
      namespace: app            # the namespace that REFERENCES it
  to:
    - group: ""
      kind: Secret
      name: app-tls

7. The split lands ~50/50 when you configured 90/10. Root cause: weights are relative, and a backendRef with no weight defaults to 1. If both refs lost their weights the split is 1:1, which is the 50/50 you see; if only one did, 90 against a defaulted 1 is roughly 99/1, which is also not what you intended. More often, though, the two backends ended up in separate rules (most-specific-match wins, no split). Confirm: kubectl get httproute rt-canary -n app -o yaml and verify both refs share one rules[] entry and both carry explicit integer weights. Fix: put both weighted refs under a single rule with explicit weights; remember weights are relative.

8. A 502 appears only on the re-encrypt leg, not before the gateway. Root cause: BackendTLSPolicy verify doesn’t match the backend cert — wrong subjectAltName, or a caCertificateRef that doesn’t sign the backend’s cert. Confirm: controller events on the BackendTLSPolicy; inspect the backend cert’s real SAN (e.g. openssl s_client against the pod) and compare. Fix: set subjectAltName to the cert’s actual SAN and caCertificateRef to the CA that signed it.

12. Routing changes stop propagating. Root cause: the controller is wedged (a bad object earlier in the watch, or it lost its lease), or you’re editing an object the route doesn’t actually parent to. Confirm: kubectl logs deploy/alb-controller -n azure-alb-system for reconcile errors; confirm the HTTPRoute parentRefs points at the live Gateway. Fix: fix or remove the offending object; if needed restart the controller deployment to force a clean reconcile.
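
For entry 2, comparing the federated-credential subject by eye is error-prone, so a small check helps. The sketch below lists the subjects on the identity and looks for an exact match; fedcred_subject_ok is a hypothetical helper, and it assumes the service account name used throughout this guide.

```shell
# The exact subject the ALB Controller's service account token carries.
EXPECTED_SUBJECT='system:serviceaccount:azure-alb-system:alb-controller-sa'

# Check that one federated credential on the identity has the expected subject.
# Arguments: resource group, identity name. Hypothetical helper.
fedcred_subject_ok() {
  local rg=$1 id=$2 subjects
  subjects=$(az identity federated-credential list \
    --resource-group "$rg" --identity-name "$id" --query '[].subject' -o tsv)
  # -F fixed string, -x whole line: a trailing typo must not pass.
  if printf '%s\n' "$subjects" | grep -Fxq -- "$EXPECTED_SUBJECT"; then
    echo "subject OK"
  else
    echo "subject mismatch; found: ${subjects:-<none>}"
    return 1
  fi
}
```

In the lab that would be `fedcred_subject_ok "$RG" "$ID"`. A mismatch prints what is actually registered, which usually makes the typo (a wrong namespace, or a missing -sa suffix) obvious.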

Best practices

Adopt Gateway API, not the legacy Ingress path. All AGC routing capability — splits, header matches, backend mTLS — lives in Gateway API. Starting on Ingress means migrating later.
One Gateway per team/namespace. This is what buys you the per-Gateway blast radius; sharing one Gateway across teams re-creates AGIC’s noisy-neighbour coupling.
Pin the chart and controller version. helm upgrade --version <x> so upgrades are deliberate; never track a floating tag in production.
Use the purpose-built role, scoped tight. AppGw for Containers Configuration Manager on the AGC scope, not Contributor — least privilege here is auditable.
Always set explicit integer weights on every backendRef in a split, even the stable one — a defaulted weight is the classic skewed-canary bug.
Keep a weight: 0 drain in the manifest for the previous version during a rollout, so rollback is one edit, not a redeploy.
Re-encrypt to the pod with verify on in production. caCertificateRef + subjectAltName pinned; cleartext-to-pod is a finding in any regulated estate.
Put the WAF upstream deliberately. AGC has none — front it with Front Door (managed WAF) or a thin AppGW v2 + WAF_v2 hop; decide this at design time, not after a pen test.
Set a shallow HealthCheckPolicy path (/healthz, tight statusCodes) per Service; the default GET / can mark healthy backends unhealthy.
Pre-empt cross-namespace refs with ReferenceGrant. If a shared TLS Secret or Service lives elsewhere, the grant must exist or listeners come up Programmed=False.
Drive canaries from metrics, not a human. Wire Argo Rollouts/Flagger via the Gateway API provider so weight ramps gate on real error/latency analysis.
Read status.conditions first, always. GatewayClass.Accepted → Gateway.Programmed → HTTPRoute.Accepted/ResolvedRefs localises 90% of failures before you touch a log.
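
The weight: 0 drain practice above reduces rollback to a single patch. As a rough sketch, the helper below builds that JSON Patch; weight_patch is a hypothetical name, and it assumes the stable ref is backendRefs[0], the canary ref is backendRefs[1], and both already carry an explicit weight (a replace operation fails on a missing field).

```shell
# Build a JSON Patch that rewrites both weights in one HTTPRoute rule.
# Arguments: rule index, stable weight, canary weight. Hypothetical helper.
# Assumes backendRefs[0] is stable and backendRefs[1] is canary.
weight_patch() {
  local rule=$1 stable=$2 canary=$3
  printf '[{"op":"replace","path":"/spec/rules/%d/backendRefs/0/weight","value":%d},' "$rule" "$stable"
  printf '{"op":"replace","path":"/spec/rules/%d/backendRefs/1/weight","value":%d}]\n' "$rule" "$canary"
}
```

Draining the canary on the lab route would then be `kubectl patch httproute rt-lab -n app --type=json -p "$(weight_patch 0 100 0)"`. Note that once the header rule from Step 7 has been inserted at index 0, the weighted rule sits at index 1.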

The defaults to override on every new AGC, and what each prevents:

Default	Override to	Prevents
HTTP to backend (cleartext)	`BackendTLSPolicy` re-encrypt + `verify`	Plaintext-to-pod audit finding
`GET /` health probe	`HealthCheckPolicy` `/healthz`	Healthy backends marked unhealthy
Defaulted `backendRef` weight	Explicit integer weights	Skewed canary splits
`allowedRoutes.namespaces.from: All` (if set)	`Same` / a `Selector`	Unintended route attachment
Floating chart tag	Pinned `--version`	Surprise controller upgrades
No `ReferenceGrant`	Pre-created grants	`Programmed=False` on shared Secrets

Security notes

Secretless control-plane auth. The ALB Controller authenticates via workload identity (a federated service account), so there are no service-principal passwords to store or rotate. Keep the federation subject exact and the identity least-privileged.
Least-privilege role, scoped to the AGC. Grant AppGw for Containers Configuration Manager on the AGC’s resource scope only — not subscription-wide Contributor. In BYO mode this lets the network team hand the controller exactly the rights it needs and no more.
Encrypt to the pod, and verify the backend. Use BackendTLSPolicy with verify (caCertificateRef + subjectAltName) so AGC authenticates the backend cert; add clientCertificateRef for true mTLS where the backend must also authenticate AGC (PCI/Zero-Trust).
Private exposure via Private Link. For internal-only services, put a private endpoint on the AGC frontend and resolve its FQDN through a Private DNS zone, keeping north-south traffic off the public internet.
No WAF on AGC — plan the edge. Because AGC has no built-in WAF, front it with Front Door Premium (managed WAF) or a classic Application Gateway v2 + WAF_v2 to inspect L7 before traffic reaches AGC; never assume AGC is filtering OWASP-class attacks.
Guard the TLS Secrets. Listener and client-cert Secrets are sensitive; scope them with RBAC, prefer syncing from Key Vault (see AKS Secrets Store CSI: Key Vault Sync & Rotation), and use ReferenceGrant rather than copying Secrets across namespaces.
Constrain route attachment. Use allowedRoutes.namespaces (Same or a label Selector) so only intended namespaces can bind routes to a Gateway, preventing a foreign namespace from attaching an unwanted route.
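
The route-attachment constraint can be sketched with standard Gateway API fields. In this minimal example the Gateway name, namespace, Secret name, and the gateway-access label are all hypothetical, and the AGC association (however it was configured earlier in your deployment) is deliberately omitted.

```yaml
# Sketch only: only namespaces labelled gateway-access=shared may bind routes.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: shared-gateway           # hypothetical
  namespace: edge                # hypothetical
spec:
  gatewayClassName: azure-alb-external
  listeners:
  - name: https
    protocol: HTTPS
    port: 443
    tls:
      mode: Terminate
      certificateRefs:
      - kind: Secret
        name: wildcard-tls       # hypothetical listener certificate
    allowedRoutes:
      namespaces:
        from: Selector           # use "Same" to restrict to the Gateway's own namespace
        selector:
          matchLabels:
            gateway-access: shared
```

An HTTPRoute in a namespace without that label can still name this Gateway in parentRefs, but the listener will not accept it, so the route never receives traffic.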

The security controls that also harden routing, mapped to what each defends and prevents:

Control	Mechanism	Secures against	Also prevents
Workload identity	Federated SA → UAMI	Stored SP secrets	Credential-rotation breakage
Scoped Config Manager role	Built-in role on AGC scope	Over-privileged controller	Accidental cross-resource changes
`BackendTLSPolicy` + `verify`	Re-encrypt + CA/SAN pin	Cleartext-to-pod, MITM	Backend cert drift going unnoticed
Client cert (mTLS)	`clientCertificateRef`	Unauthenticated upstream to backend	Spoofed gateway traffic
Private Link frontend	PE + Private DNS	Public exposure	DNS leakage of internal services
`allowedRoutes` scoping	`Same` / `Selector`	Foreign route attachment	Route hijack across teams
Upstream WAF	Front Door / AppGW v2	OWASP-class L7 attacks	(AGC has no WAF of its own)
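
The BackendTLSPolicy and client-cert rows can be combined into one policy. The following is a sketch under stated assumptions: the Service, Secret names, and SAN are hypothetical, and the field layout follows the AGC BackendTLSPolicy CRD as commonly documented, so verify it against your installed CRD version (kubectl explain backendtlspolicy.spec).

```yaml
# Sketch only: re-encrypt to the pod, verify the backend cert, and present a client cert.
apiVersion: alb.networking.azure.io/v1
kind: BackendTLSPolicy
metadata:
  name: checkout-backend-tls
  namespace: shop
spec:
  targetRef:
    group: ""
    kind: Service
    name: checkout-stable                 # hypothetical backend Service
    namespace: shop
  default:
    sni: checkout.shop.internal           # hypothetical SNI sent to the backend
    ports:
    - port: 443
    clientCertificateRef:                 # remove this block for one-way re-encryption
      group: ""
      kind: Secret
      name: agc-client-cert               # hypothetical cert AGC presents
    verify:
      caCertificateRef:
        group: ""
        kind: Secret
        name: backend-ca-bundle           # hypothetical CA that signed the backend cert
      subjectAltName: checkout.shop.internal
```

Keeping the verify block in place is what turns re-encryption into authentication: without it the leg is encrypted, but AGC would accept any certificate the backend presents.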

Cost & sizing

The cost model is fundamentally different from AGIC’s per-gateway-hour-plus-capacity-units billing, and far simpler to reason about once you separate the data plane from what surrounds it.

AGC data plane bills on a managed-resource basis (an hourly component plus usage), independent of how many Gateway/HTTPRoute objects you create — so one AGC serving 40 namespaces is dramatically cheaper than 40 sharded AGICs, which was a real cost driver for teams that sharded gateways to dodge AGIC’s blast radius.
The AKS cluster is the larger line item and is unchanged by AGC; the ALB Controller is two small pods (negligible CPU/RAM).
Upstream WAF, if you add Front Door Premium or an Application Gateway v2 + WAF_v2 for L7 protection, is usually the biggest added cost of an AGC design — budget it deliberately, because it’s the price of the WAF AGC doesn’t include.
Private Link (a private endpoint on the frontend) adds a small hourly + per-GB charge when you expose AGC privately.
Data processing / egress scales with traffic as on any L7 proxy; re-encryption and mTLS add negligible cost (CPU on the managed fleet, which you don’t pay per-cycle).

The cost drivers and what each one buys you:

Cost driver	What you pay for	Rough INR / month	What it buys	Watch-out
AGC data plane	Managed proxy (hourly + usage)	~₹3,000–8,000 (traffic-dependent)	The whole L7 ingress fleet	Usage scales with traffic
ALB Controller	2 small pods on AKS	negligible	The control plane	Counts against node capacity
Front Door Premium (WAF)	Edge + managed WAF	~₹25,000+	OWASP protection AGC lacks	Often the biggest added cost
AppGW v2 + WAF_v2 (alt)	Gateway-hour + capacity units	~₹15,000–30,000	WAF at a thin edge hop	Reintroduces an ARM gateway
Private Link	PE hourly + per-GB	~₹1,500–3,000	Private-only exposure	Per-endpoint, per-spoke
Data processing / egress	Per-GB through the proxy	traffic-dependent	(the traffic itself)	Spikes during incidents/sales

Sizing rule of thumb: one AGC per cluster (or per environment) serves many teams via per-namespace Gateways — you almost never need multiple AGCs for capacity, only for hard isolation or BYO governance. Consolidating off sharded AGICs onto a single AGC was, for the fintech team above, a net cost reduction even after adding a Front Door WAF hop, because they collapsed dozens of gateway resources into one managed data plane. For broader AKS cost levers, see Kubernetes Cost Allocation & Rightsizing with Kubecost.

Interview & exam questions

1. How does Application Gateway for Containers differ architecturally from AGIC? AGIC ran a single in-cluster pod that mutated a Standard_v2 Application Gateway ARM resource on every Ingress change, producing multi-minute propagation. AGC has a managed, regional proxy data plane and an in-cluster ALB Controller that programs it via a config plane in seconds, speaks the Gateway API instead of Ingress, scopes blast radius per-Gateway, and has no built-in WAF.

2. Why does AGC use the Gateway API rather than the Ingress API? Because all of AGC’s routing capability — weighted traffic splitting via backendRefs weights, header/path/query matches, BackendTLSPolicy re-encryption and mTLS, HealthCheckPolicy — maps onto typed Gateway API objects, avoiding the annotation sprawl Ingress required. Gateway API is also portable across implementations and is where Microsoft is investing.

3. How does the ALB Controller authenticate to Azure? Via workload identity: a user-assigned managed identity is federated to the controller’s Kubernetes service account (azure-alb-system:alb-controller-sa), and the controller exchanges the projected SA token for an Azure token — no service-principal secret. The identity holds the AppGw for Containers Configuration Manager role on the AGC scope.

4. What are managed mode and BYO mode, and when do you pick each? In managed mode the controller creates the AGC and its subnet association from an ApplicationLoadBalancer CRD — best for greenfield, GitOps-driven estates. In BYO mode a central team provisions the AGC via IaC and the cluster only references it — best when a platform-networking team must own the AGC, subnet delegation, and Private Link independently of any cluster.

5. How do you ship a 90/10 canary on AGC, and what does weight: 0 do? Put two backendRefs (stable and canary) under one HTTPRoute rule with weight: 90 and weight: 10; AGC distributes proportionally. Weights are relative, not percentages. Setting a backend’s weight to 0 drains it to zero traffic without deleting the ref, keeping rollback one edit away.

6. A route shows ResolvedRefs=False. What are the two most likely causes? Either a referenced backend Service doesn’t exist in the route’s namespace, or it lives in another namespace without a ReferenceGrant permitting the cross-namespace reference. (The same condition on a Gateway listener points at its TLS Secret instead.) Confirm with kubectl describe httproute; fix by creating the object or adding a ReferenceGrant in the target namespace.

7. How do you enforce mTLS from AGC to a backend pod? Apply a BackendTLSPolicy targeting the Service with a clientCertificateRef (the cert AGC presents) plus a verify block (caCertificateRef + subjectAltName) so AGC also validates the backend. Drop clientCertificateRef for one-way re-encryption; keep verify on in production either way.

8. The GatewayClass azure-alb-external never reaches Accepted. What do you check? Whether the ALB Controller pods are Running in azure-alb-system, and whether its identity holds AppGw for Containers Configuration Manager on the AGC scope. A controller that’s down or unauthorised can’t bind the class. Then check kubectl describe gatewayclass for the condition reason.

9. Does AGC include a WAF? If not, how do you protect L7? No — AGC has no built-in WAF. You place protection upstream: Front Door Premium (managed WAF) or a classic Application Gateway v2 + WAF_v2 hop in front of the AGC FQDN, letting AGC own routing and backend TLS while the edge does OWASP-class inspection.

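10. A split you set to 90/10 is landing ~50/50. What’s wrong? Most likely the weights are not actually applied: a backendRef without a weight defaults to 1, so two unweighted refs split 1:1. (A single missing weight next to weight: 90 would give roughly 99/1, not 50/50.) Also check that both backends sit under one rule; if they ended up in separate rules[], rule precedence picks one of them and there is no split at all. Put both refs under a single rule with explicit integer weights.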

11. How does AGC achieve near-real-time routing changes when AGIC took minutes? AGC’s controller writes desired state to a managed config plane and the regional proxy fleet converges in seconds. AGIC instead re-deployed an ARM Application Gateway resource, paying ARM control-plane latency and throttling on every change.

12. What subnet requirements does AGC impose? A dedicated subnet of /24 or larger, delegated to Microsoft.ServiceNetworking/trafficControllers, into which AGC injects its data plane. A subnet that is smaller than /24 or not delegated fails provisioning (often surfaced as an ApplicationLoadBalancer condition reason).
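
The canary from question 5 can be sketched as a single HTTPRoute rule using standard Gateway API fields. The route, Gateway, Service names and port are hypothetical. Because weights are relative, each backend receives its weight divided by the sum for the rule: 90 / (90 + 10) = 90% and 10 / (90 + 10) = 10%, and weights of 9 and 1 would produce the same split.

```yaml
# Sketch only: both refs live under ONE rule, each with an explicit integer weight.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: checkout              # hypothetical
  namespace: shop             # hypothetical
spec:
  parentRefs:
  - name: shop-gateway        # hypothetical Gateway in the same namespace
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: checkout-stable
      port: 8080
      weight: 90              # 90 / (90 + 10) of requests
    - name: checkout-canary
      port: 8080
      weight: 10              # set to 0 to drain the canary without deleting the ref
```

Rolling back is then a one-line change to the canary weight rather than a redeploy.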

The questions above map primarily to AZ-700 (Designing and Implementing Microsoft Azure Networking Solutions), specifically load balancing and application delivery, and to the CKA/CKAD Gateway API and services/networking domains, with the workload-identity mechanics touching AZ-500. A compact cert-mapping for revision:

Question theme	Primary cert	Objective area
AGC vs AGIC, AGC architecture	AZ-700	Design & implement application delivery
Gateway API objects, routing, splits	CKA / CKAD	Services & networking; Gateway API
Workload identity, federation, roles	AZ-500 / AZ-700	Secure identity; secretless access
BackendTLSPolicy, mTLS, re-encrypt	AZ-700 / AZ-500	Secure connectivity; encryption in transit
BYO mode, Private Link, subnet delegation	AZ-700	Hybrid/private connectivity

Quick check

AGC has no built-in WAF. Name two ways to add L7 protection in front of it.
You set a 90/10 split but traffic lands ~50/50. What is the single most likely misconfiguration?
A Gateway listener’s TLS Secret lives in a different namespace and the listener is Programmed=False. What object fixes it?
How does the ALB Controller authenticate to Azure, and what is the one string that most often breaks it?
What does setting a backend’s weight to 0 accomplish, and why is it useful during a rollout?

Answers

Front AGC with Front Door Premium (managed WAF) or a thin Application Gateway v2 + WAF_v2 hop pointed at the AGC FQDN. AGC owns routing/backend TLS; the upstream hop does OWASP-class inspection.
The backendRefs are missing their explicit weights, so each defaults to 1 and traffic splits evenly. Also confirm the two backends are not in separate rules[], where there is no split at all. Put both refs under one rule with explicit integer weights.
A ReferenceGrant in the target namespace (the one that owns the Secret), permitting Gateways in the listener’s namespace to reference that Secret. Without it, cross-namespace refs are denied and the listener stays Programmed=False.
Via workload identity — a user-assigned managed identity federated to the azure-alb-system:alb-controller-sa service account. The string that most often breaks it is the federated-credential subject, which must be exactly system:serviceaccount:azure-alb-system:alb-controller-sa.
It drains the backend to zero traffic without deleting the backendRef, so the route and the backend stay in the manifest and rollback is a one-line weight edit rather than a redeploy.
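
The ReferenceGrant from the third answer, as a minimal sketch using the standard Gateway API object. The namespaces edge and certs and the Secret name wildcard-tls are hypothetical.

```yaml
# Sketch only: created in the namespace that OWNS the Secret, not the Gateway's namespace.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-edge-gateways-to-tls
  namespace: certs                       # hypothetical namespace holding the Secret
spec:
  from:
  - group: gateway.networking.k8s.io
    kind: Gateway
    namespace: edge                      # hypothetical namespace of the Gateway
  to:
  - group: ""
    kind: Secret
    name: wildcard-tls                   # optional: omit to allow every Secret in this namespace
```

Naming the Secret in the to entry keeps the grant as narrow as possible, so a Gateway in the other namespace cannot reference other Secrets that happen to live alongside it.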

Glossary

Application Gateway for Containers (AGC) — Azure’s managed, regional L7 proxy fleet for Kubernetes ingress, backed by Microsoft.ServiceNetworking/trafficControllers; the successor to AGIC.
AGIC — Application Gateway Ingress Controller; the older model that mutated a Standard_v2 Application Gateway ARM resource per Ingress.
ALB Controller — the in-cluster Helm-installed controller (namespace azure-alb-system) that programs the AGC data plane from Gateway API and AGC CRDs.
Gateway API — the Kubernetes API (GatewayClass/Gateway/HTTPRoute + policies) that AGC implements; replaces Ingress for AGC routing.
GatewayClass (azure-alb-external) — the class the ALB Controller registers; Accepted=True means the controller is live and authorised.
Gateway — a Gateway API object defining listeners (hostnames, ports, TLS); becomes an AGC frontend and emits the AGC FQDN address.
HTTPRoute — a Gateway API object defining routing rules (path/header/query matches), weighted backendRefs, and filters; becomes AGC routing rules.
ApplicationLoadBalancer (CRD) — the AGC-specific CRD that, in managed mode, makes the controller create the AGC and its subnet association.
BackendTLSPolicy — an AGC policy attaching to a Service to re-encrypt to the backend (verify with caCertificateRef/subjectAltName) and optionally present a client cert for mTLS (clientCertificateRef).
HealthCheckPolicy — an AGC policy overriding the default GET / backend probe with a path, interval, timeout, and accepted status codes.
ReferenceGrant — a Gateway API object in a target namespace that permits objects in another namespace to reference its Secrets/Services (required for cross-namespace refs).
Workload identity — federated authentication where a user-assigned managed identity is bound to a Kubernetes service account, exchanging the SA token for an Azure token (no secrets).
Federated credential subject — the system:serviceaccount:<ns>:<sa> string that ties the managed identity to the controller’s service account; must match exactly.
AppGw for Containers Configuration Manager — the purpose-built Azure role the controller identity needs on the AGC scope to program it.
Managed mode / BYO mode — whether the controller creates the AGC from a CRD (managed) or references an IaC-provisioned AGC (BYO).
Weight (relative) — a backendRef’s share of traffic, proportional to the sum of weights in the rule; 0 drains a backend without removing it.
Delegated subnet — the /24+ subnet delegated to Microsoft.ServiceNetworking/trafficControllers that AGC injects its data plane into.
Private Link frontend — a private endpoint on the AGC frontend, resolved via a Private DNS zone, exposing AGC on a private IP inside a spoke.

Next steps

You can now stand up AGC on AKS, drive ingress through Gateway API, and ship splits, mTLS and header routing in production. Build outward:

Next: Kubernetes Gateway API: HTTPRoute, Traffic Splitting & Ingress Migration — the portable API model AGC implements, and how to migrate off Ingress.
Related: Application Gateway v2 with WAF, L7 Routing & TLS in Production — the classic data plane and the WAF you’ll often front AGC with.
Related: Progressive Delivery with Argo Rollouts: Canary Metrics — wire weighted splits to metric-gated automated promotion.
Related: Azure Key Vault & Workload Identity for Secrets — the secretless identity pattern the ALB Controller depends on.
Related: AKS Istio Service Mesh Add-on: mTLS, Ingress & Egress — pod-to-pod mTLS behind the gateway, complementing AGC’s backend mTLS.
Related: Production AKS: Networking & Observability — the cluster-networking foundation AGC plugs into.