Application Gateway v2 is the workhorse L7 reverse proxy for Azure web estates: a regional, autoscaling, zone-redundant front door with an integrated Web Application Firewall. The product is easy to stand up and surprisingly easy to misconfigure — terminating TLS at the edge and forwarding plaintext to backends, running the WAF in Detection forever because Prevention “broke checkout,” or pasting in a custom rule that silently never matches.
This is the configuration that holds up under audit: TLS re-encrypted all the way to the backend, mutual TLS for partner callers, and a managed rule set tuned with surgical exclusions instead of being switched off. Everything here is the v2 SKU (Standard_v2 / WAF_v2) with a separate applicationGatewayWebApplicationFirewallPolicy resource, which is the only WAF model Microsoft still develops.
1. The request flow: listeners, rules, and backend settings
Before touching TLS or WAF, internalize how a request traverses App Gateway. Five components chain together, and almost every “it returns a 502” or “the wrong site answers” ticket is a break in this chain.
Client --> Frontend IP/Port --> Listener (host + cert) --> WAF policy
--> Routing rule --> [URL path map] --> Backend pool
--> Backend HTTP settings (probe, cert, SNI) --> Backend
- Frontend IP + port: the public (or private) socket traffic lands on.
- Listener: matches by port and, for multi-site, by hostName. For HTTPS it holds the server certificate.
- Routing rule: binds a listener to a backend pool and a backend HTTP setting. Basic rules are 1:1; path-based rules attach a URL path map.
- Backend pool: IPs, FQDNs, NICs, App Service, or VMSS.
- Backend HTTP settings: protocol to the backend, port, probe, connection draining, cookie affinity, and (critically) the trusted root cert and host-name behavior for re-encryption.
Rule priority matters on v2: every routing rule needs an explicit, unique priority (1–20000, lower wins). Listeners are evaluated multi-site first (host match) then basic, so a catch-all basic listener should sit on its own rule with a high priority number.
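Because priorities must be unique, a pre-deployment check for collisions can save a failed update. The following is a minimal sketch, not part of any Azure tooling: the helper name duplicate_priorities is hypothetical, and it assumes the rule names and priorities have already been exported as "name priority" lines (the commented az query is one way that might be done).

```shell
# Hypothetical helper: reads "rule-name<whitespace>priority" lines on stdin
# and prints every priority value that more than one rule is using.
duplicate_priorities() {
  awk 'NF >= 2 { count[$2]++ }
       END { for (p in count) if (count[p] > 1) print p }' | sort -n
}

# Sample input, shaped like the output of (assumed query):
#   az network application-gateway rule list -g rg-edge --gateway-name agw-prod \
#     --query "[].[name, priority]" -o tsv
printf '%s\n' \
  "rule-api 100" \
  "rule-www 200" \
  "rule-catchall 20000" \
  "rule-partners 100" \
  | duplicate_priorities
```

With the sample input the helper reports priority 100, which two rules share; an empty result means every rule has its own priority.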
2. End-to-end TLS: re-encryption, trusted roots, and SNI
“SSL offload” terminates TLS at the gateway and forwards HTTP. For anything regulated, that plaintext hop inside the VNet is a finding. End-to-end TLS re-encrypts: the gateway terminates the client’s session, inspects the request (so the WAF can read the body), then opens a fresh TLS session to the backend.
Two things make re-encryption work and trip people up constantly:
- Trusted root certificate: on v2 the gateway validates the backend’s certificate chain. Upload the backend’s root CA (.cer, base64/PEM) as a trustedRootCertificate and reference it from the backend setting. You can skip the upload only when the backend’s certificate chains to a well-known public CA and you opt into trusted-CA validation; for private PKI or self-signed backends you must upload the root.
- SNI / host name to the backend: the TLS handshake and probe must present a name the backend’s cert is issued for. Use pickHostNameFromBackendAddress when the pool is an FQDN whose cert matches, or pin hostName explicitly. A mismatch here is the classic silent 502 with BackendConnectionFailure.
# Upload the backend's root CA so the gateway trusts the re-encrypted leg
az network application-gateway root-cert create \
--gateway-name agw-prod --resource-group rg-edge \
--name backend-root-ca \
--cert-file ./backend-root-ca.cer
# Backend HTTP settings: HTTPS to backend, validate chain, fix SNI
az network application-gateway http-settings create \
--gateway-name agw-prod --resource-group rg-edge \
--name bes-https-app \
--protocol Https --port 443 \
--host-name api.internal.contoso.com \
--root-certs backend-root-ca \
--probe probe-https-health \
--connection-draining-timeout 30 \
--timeout 30
The custom probe should also speak HTTPS and accept the backend’s real health status codes, otherwise the gateway marks a healthy pool unhealthy:
az network application-gateway probe create \
--gateway-name agw-prod --resource-group rg-edge \
--name probe-https-health \
--protocol Https --host api.internal.contoso.com \
--path /healthz --interval 30 --timeout 30 --threshold 3 \
--match-status-codes 200 204
For the frontend (client-facing) certificate, reference Key Vault rather than uploading a PFX — the gateway pulls the cert via its managed identity and picks up renewals. Grant the identity get on secrets and certificates, then bind the versionless secret ID so rotation flows through without a redeploy:
az network application-gateway ssl-cert create \
--gateway-name agw-prod --resource-group rg-edge \
--name fe-cert-contoso \
--key-vault-secret-id "https://kv-edge.vault.azure.net/secrets/contoso-tls"
Reference the versionless secret identifier (no GUID after /secrets/contoso-tls). With a pinned version the gateway never sees a renewed cert and you are back to manual rotation, which is exactly what Key Vault integration exists to avoid.
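A deployment script can guard against a pinned version slipping in. This is a sketch under stated assumptions: the function name is_versionless_secret_id is hypothetical, and it only inspects the shape of the URL (a version shows up as an extra path segment after the secret name); it does not contact Key Vault.

```shell
# Hypothetical guard: succeeds only when the Key Vault secret ID carries
# no version segment after the secret name.
is_versionless_secret_id() {
  local id="${1%/}"                        # tolerate one trailing slash
  case "$id" in
    https://*/secrets/*/*) return 1 ;;     # extra segment = pinned version
    https://*/secrets/?*)  return 0 ;;     # name only = versionless
    *)                     return 1 ;;     # not a secret ID at all
  esac
}

is_versionless_secret_id "https://kv-edge.vault.azure.net/secrets/contoso-tls" \
  && echo "versionless: renewals flow through"

is_versionless_secret_id "https://kv-edge.vault.azure.net/secrets/contoso-tls/0123456789abcdef0123456789abcdef" \
  || echo "pinned version: renewals will be missed"
```

Running the check before the ssl-cert create step turns a silent rotation failure months later into an immediate pipeline error.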
Pair this with an SSL policy that drops legacy protocols. Use a predefined policy or a custom one; TLSv1_2 minimum is the floor, TLSv1_3 where your clients support it:
az network application-gateway ssl-policy set \
--gateway-name agw-prod --resource-group rg-edge \
--policy-type Custom \
--min-protocol-version TLSv1_2 \
--cipher-suites \
TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 \
TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
3. Enforcing mutual TLS with client-certificate profiles
mTLS on App Gateway v2 means the client presents a certificate during the handshake and the gateway validates it against an SSL profile that carries a trusted client CA chain. This is the right control for partner B2B endpoints and machine-to-machine APIs where bearer tokens alone are not enough.
The moving parts:
- A trusted client CA certificate chain (.cer, the full chain in one file) the gateway uses to validate presented client certs.
- An SSL profile that bundles that chain plus the client-auth configuration and an SSL policy.
- The HTTPS listener is associated with the SSL profile.
# Upload the CA chain that signs your clients' certificates
az network application-gateway client-cert add \
--gateway-name agw-prod --resource-group rg-edge \
--name partner-client-ca \
--data ./partner-client-ca-chain.cer
# SSL profile: require client cert + enforce TLS 1.2 floor
az network application-gateway ssl-profile add \
--gateway-name agw-prod --resource-group rg-edge \
--name profile-mtls-partners \
--trusted-client-certificates partner-client-ca \
--client-auth-configuration True \
--min-protocol-version TLSv1_2
# Attach the profile to the HTTPS listener
az network application-gateway http-listener update \
--gateway-name agw-prod --resource-group rg-edge \
--name listener-partners-443 \
--ssl-profile-id profile-mtls-partners
Setting --client-auth-configuration True makes the client certificate mandatory for that listener; the handshake fails without one. The gateway validates the chain but, by default, does not check certificate revocation, and it does not authorize by subject or thumbprint. Authorization is your job, so forward the client cert details to the backend with a rewrite rule and decide there:
{
"ruleSequence": 100,
"conditions": [],
"actionSet": {
"requestHeaderConfigurations": [
{
"headerName": "X-Client-Cert-Subject",
"headerValue": "{var_client_certificate_subject}"
},
{
"headerName": "X-Client-Cert-Fingerprint",
"headerValue": "{var_client_certificate_fingerprint}"
},
{
"headerName": "X-Client-Cert-Verify",
"headerValue": "{var_client_certificate_verification}"
}
]
}
}
The backend must trust only the gateway to set these headers. Lock the backend NSG/firewall to the gateway subnet so a caller cannot reach the backend directly and forge X-Client-Cert-*. mTLS at the edge is worthless if the backend is independently reachable.
4. WAF policy: modes, rule sets, and Detection vs Prevention
On v2 the WAF lives in its own WAF_v2 policy resource, associated globally to the gateway, per-listener, or per-URI path — the most specific association wins. Two knobs define behavior:
- Mode: Detection logs matches and passes traffic; Prevention blocks. Always burn in with Detection, mine the logs, tune, then flip to Prevention.
- Managed rule set: the OWASP Core Rule Set (CRS) and Microsoft’s Default Rule Set (DRS). Prefer DRS 2.1, which folds in the Microsoft Threat Intelligence collection. CRS 3.2 remains available for compatibility.
# Create a WAF_v2 policy in Detection while you tune
az network application-gateway waf-policy create \
--name wafp-prod --resource-group rg-edge
az network application-gateway waf-policy policy-setting update \
--policy-name wafp-prod --resource-group rg-edge \
--state Enabled --mode Detection \
--max-request-body-size-in-kb 128 \
--file-upload-limit-in-mb 100 \
--request-body-check true
Set the managed rule set version explicitly so an upstream default never shifts your posture:
az network application-gateway waf-policy managed-rule rule-set add \
--policy-name wafp-prod --resource-group rg-edge \
--type Microsoft_DefaultRuleSet --version 2.1
DRS 2.1 runs in anomaly scoring mode: rules contribute a score by severity (Critical 5, Error 4, Warning 3, Notice 2) rather than each blocking outright. In Prevention, a request is blocked when its cumulative score reaches the threshold (5 by default), so a single Critical match blocks on its own, while lower-severity matches block only in combination (for example Warning + Notice). This is why excluding or disabling one noisy rule often clears a false positive: its contribution disappears from the total, and unrelated rules keep scoring as before.
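The arithmetic is easy to check by hand. The sketch below only reproduces the severity weights and default threshold quoted above; the function names anomaly_score and would_block are hypothetical, and nothing here queries a real WAF.

```shell
# Sum the severity weights of every rule that matched one request.
anomaly_score() {
  local total=0 sev
  for sev in "$@"; do
    case "$sev" in
      Critical) total=$((total + 5)) ;;
      Error)    total=$((total + 4)) ;;
      Warning)  total=$((total + 3)) ;;
      Notice)   total=$((total + 2)) ;;
      *) echo "unknown severity: $sev" >&2; return 1 ;;
    esac
  done
  echo "$total"
}

# Succeeds when the cumulative score reaches the default threshold of 5.
would_block() {
  local threshold=5 score
  score=$(anomaly_score "$@") || return 2
  [ "$score" -ge "$threshold" ]
}

would_block Warning        && echo blocked || echo passed   # 3 < 5
would_block Warning Notice && echo blocked || echo passed   # 3 + 2 = 5
would_block Critical       && echo blocked || echo passed   # 5 on its own
```

The first call prints passed and the other two print blocked. Removing the Notice-level match from the second request (by exclusion or by disabling that rule) drops it back to 3, which is the mechanism behind most single-rule false-positive fixes.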
Associate the policy and only then move to Prevention:
AGW_ID=$(az network application-gateway show -g rg-edge -n agw-prod --query id -o tsv)
WAFP_ID=$(az network application-gateway waf-policy show -g rg-edge -n wafp-prod --query id -o tsv)
az network application-gateway update --ids "$AGW_ID" \
--set firewallPolicy.id="$WAFP_ID"
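Once the Detection logs are clean, the flip itself is a one-line change to the policy settings. A minimal sketch, reusing the policy-setting update command shown earlier; the wrapper name enable_prevention is hypothetical and exists only so a pipeline step can call it with explicit arguments.

```shell
# Hypothetical wrapper: switch an existing WAF policy to Prevention mode.
#   $1 = WAF policy name, $2 = resource group
enable_prevention() {
  local policy="$1" rg="$2"
  az network application-gateway waf-policy policy-setting update \
    --policy-name "$policy" --resource-group "$rg" \
    --state Enabled --mode Prevention
}

# Example call (left commented so nothing changes by accident):
# enable_prevention wafp-prod rg-edge
```

Keeping the mode change as its own step, separate from the association, makes it easy to gate behind a review of the firewall logs.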
5. Taming false positives: per-rule exclusions and request-attribute scoping
The wrong move under pressure is dropping the mode back to Detection. The right move is to find which rule fired on which request attribute and exclude that attribute, not the rule globally.
Exclusions strip a named request attribute from inspection before rules run. Typical cases are a JWT in the Authorization header that trips SQLi rules because of its base64 payload, or a rich-text field that looks like XSS. Scope by RequestHeaderNames, RequestCookieNames, RequestArgNames, or RequestArgKeys/RequestArgValues, combined with a selector match operator.
# Exclude the Authorization header from a specific SQLi rule only
az network application-gateway waf-policy managed-rule exclusion rule-set add \
--policy-name wafp-prod --resource-group rg-edge \
--type Microsoft_DefaultRuleSet --version 2.1 \
--group-name SQLI --rule-ids 942100 \
--match-variable RequestHeaderNames \
--selector-match-operator Equals \
--selector Authorization
Three scoping levels, narrowest first:
- Per-rule (--rule-ids 942100): exclude the attribute from a single rule. Default to this.
- Per-group (--group-name SQLI): exclude from a whole rule group when many rules in it misfire on the same field.
- Global (omit rule/group): exclude the attribute from the entire managed set. Reserve for fields you genuinely cannot inspect, like an opaque signed blob.
If a single rule is pure noise for your app and no exclusion fits, disable just that rule rather than the group:
az network application-gateway waf-policy managed-rule rule-set update \
--policy-name wafp-prod --resource-group rg-edge \
--type Microsoft_DefaultRuleSet --version 2.1 \
--group-name PHP --rules 933100 --rule-state Disabled
Every exclusion is a hole. Record why in IaC comments, scope to the tightest attribute, and review on each CRS/DRS version bump — an upgrade can renumber or retune the very rule you excluded.
6. Authoring custom rules: geo-match, rate limiting, and bot protection
Custom rules are evaluated in ascending priority order, before the managed rule sets; a match with Allow or Block ends evaluation there. Use them for coarse, cheap decisions: block by geography, rate-limit abusive IPs, allow-list partners.
A geo-match block keeping traffic to expected countries:
{
"name": "BlockNonAllowedGeos",
"priority": 10,
"ruleType": "MatchRule",
"action": "Block",
"matchConditions": [
{
"matchVariables": [{ "variableName": "RemoteAddr" }],
"operator": "GeoMatch",
"negationConditon": true,
"matchValues": ["US", "GB", "DE", "IN"]
}
]
}
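The negated GeoMatch is the part that is easy to get backwards: the list holds the countries you allow, and the negation turns the rule into "block everything else". The sketch below mirrors only that decision logic; geo_action is a hypothetical name, and the country code is passed in directly rather than resolved from an IP.

```shell
# Illustration of the rule's logic only.
#   $1 = source country code, remaining args = the rule's matchValues
# Prints Block when the country is NOT in the list (negated GeoMatch),
# otherwise Pass (the request continues to later rules).
geo_action() {
  local country="$1" allowed
  shift
  for allowed in "$@"; do
    if [ "$country" = "$allowed" ]; then
      echo Pass
      return 0
    fi
  done
  echo Block
}

geo_action DE US GB DE IN
geo_action FR US GB DE IN
```

The first call prints Pass and the second prints Block. Without the negation the same list would block the four listed countries and let everything else through.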
A rate-limit rule — v2-only RateLimitRule with a sliding window, keyed by client IP, sparing a short burst but capping sustained floods:
{
"name": "ThrottlePerClientIp",
"priority": 20,
"ruleType": "RateLimitRule",
"action": "Block",
"rateLimitThreshold": 100,
"rateLimitDuration": "OneMin",
"groupByUserSession": [
{ "groupByVariables": [{ "variableName": "ClientAddr" }] }
],
"matchConditions": [
{
"matchVariables": [{ "variableName": "RequestUri" }],
"operator": "Contains",
"matchValues": ["/api/"]
}
]
}
Enable the Bot Manager managed rule set alongside custom rules to act on Microsoft’s categorized bot intelligence (Bad / Good / Unknown):
az network application-gateway waf-policy managed-rule rule-set add \
--policy-name wafp-prod --resource-group rg-edge \
--type Microsoft_BotManagerRuleSet --version 1.1
rateLimitThreshold counts matched requests per rateLimitDuration per group key. Set the duration and group variable deliberately: ClientAddr keys on the true client IP, while GeoLocation or a header lets you throttle a population. Threshold too low and you rate-limit a corporate NAT; too high and you never engage.
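To build intuition for how threshold, duration, and group key interact, a toy counter helps. This is an illustration only and not the gateway's implementation: rate_limit_sim is a hypothetical name, the input format is invented for the example, and the model simply counts requests per key inside a trailing window and flags any request that pushes the count past the threshold.

```shell
# Toy sliding-window counter (an assumption-laden model, not Azure's algorithm).
#   stdin: "<epoch-seconds> <group-key>" per request, in time order
#   $1 = threshold, $2 = window length in seconds
rate_limit_sim() {
  awk -v limit="$1" -v window="$2" '
    {
      t = $1 + 0; key = $2
      n[key]++                          # requests seen so far for this key
      times[key, n[key]] = t            # remember each timestamp
      if (!(key in start)) start[key] = 1
      # slide the window start past requests older than the window
      while (times[key, start[key]] <= t - window) start[key]++
      in_window = n[key] - start[key] + 1
      print t, key, (in_window > limit ? "Block" : "Pass")
    }'
}

# Threshold 2 per 60 s: the third request inside the window is blocked,
# and the key recovers once older requests age out.
printf '%s\n' \
  "0 203.0.113.7" \
  "10 203.0.113.7" \
  "20 203.0.113.7" \
  "75 203.0.113.7" \
  | rate_limit_sim 2 60
```

The output marks the requests at 0 and 10 as Pass, the one at 20 as Block, and the one at 75 as Pass again. Feeding the model a log sample keyed by NAT address versus by individual client is a quick way to see why a threshold that looks generous per user can still throttle an entire office.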
7. Header rewrites, URL rewrites, and securing Set-Cookie
App Gateway rewrites HTTP headers and URLs at the rule level. The highest-value security use is hardening cookies the backend emits without the right flags — you fix them at the edge instead of waiting on an app change.
Rewrite Set-Cookie to add Secure, HttpOnly, and SameSite=Strict (the gateway uses a capturing regex over the existing header value, then re-emits it):
{
"name": "HardenCookies",
"ruleSequence": 200,
"conditions": [
{
"variable": "http_resp_Set-Cookie",
"pattern": "(.+)",
"ignoreCase": true,
"negate": false
}
],
"actionSet": {
"responseHeaderConfigurations": [
{
"headerName": "Set-Cookie",
"headerValue": "{http_resp_Set-Cookie_1}; Secure; HttpOnly; SameSite=Strict"
}
]
}
}
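The capture-and-append shape of that rule can be tried locally before touching the gateway. The sketch below uses sed as a stand-in for the rewrite engine, which is an assumption about equivalence for this simple pattern only; harden_cookie is a hypothetical name.

```shell
# Local stand-in for the rewrite: group 1 captures the whole existing value,
# then the flags are appended, as in {http_resp_Set-Cookie_1}; Secure; ...
harden_cookie() {
  printf '%s\n' "$1" | sed -E 's/^(.+)$/\1; Secure; HttpOnly; SameSite=Strict/'
}

harden_cookie "session=abc123; Path=/"
harden_cookie "sid=1; Secure"
```

The first call yields session=abc123; Path=/; Secure; HttpOnly; SameSite=Strict. The second yields sid=1; Secure; Secure; HttpOnly; SameSite=Strict, which shows the limitation of a blanket (.+) pattern: a cookie that already carries a flag gets it twice. If that matters for your clients, a narrower condition pattern is worth testing the same way first.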
Add standard response security headers in the same rule set, and strip ones that leak backend internals:
{
"responseHeaderConfigurations": [
{ "headerName": "Strict-Transport-Security", "headerValue": "max-age=31536000; includeSubDomains" },
{ "headerName": "X-Content-Type-Options", "headerValue": "nosniff" },
{ "headerName": "Server", "headerValue": "" }
]
}
For URL rewrite, transform the path sent to the backend without a client-visible redirect — useful when the public path differs from the backend route:
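# /v1/* on the edge -> /api/* on the backend, conditioned on a URL path map match
az network application-gateway rewrite-rule create \
--gateway-name agw-prod --resource-group rg-edge \
--rule-set-name rwset-prod --name rw-path-v1 \
--sequence 100 \
--modified-path '/api/{var_uri_path_1}'
# {var_uri_path_1} is a capture group, so the rule needs a condition that defines it
az network application-gateway rewrite-rule condition create \
--gateway-name agw-prod --resource-group rg-edge \
--rule-set-name rwset-prod --rule-name rw-path-v1 \
--variable var_uri_path --pattern '/v1/(.*)'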
Setting a header value to an empty string ("") deletes it; that is how you drop Server and X-Powered-By. Rewrites read server variables like {var_uri_path}, {var_client_ip}, and the {http_req_*} / {http_resp_*} families; capture groups from the condition regex are referenced by appending the group index, as in {http_resp_Set-Cookie_1}.
Verify
Confirm each layer end to end, not just that the page loads.
End-to-end TLS is actually re-encrypting — backend health must be green with the HTTPS probe, proving the gateway completed a TLS handshake to the backend:
az network application-gateway show-backend-health \
--name agw-prod --resource-group rg-edge \
--query "backendAddressPools[].backendHttpSettingsCollection[].servers[].{addr:address,health:health}" \
-o table
mTLS rejects requests without a client cert and accepts a valid one:
# Expect a TLS handshake failure (no client cert presented)
curl -sv https://partners.contoso.com/ 2>&1 | grep -iE "alert|handshake|certificate"
# Expect 200 with a valid client cert + key
curl -s -o /dev/null -w "%{http_code}\n" \
--cert ./partner.crt --key ./partner.key \
https://partners.contoso.com/healthz
The WAF blocks a probe once in Prevention (a benign canary, not a live exploit):
# A request that trips SQLi scoring; expect 403 in Prevention mode
curl -s -o /dev/null -w "%{http_code}\n" \
"https://www.contoso.com/?id=1%27%20OR%20%271%27=%271"
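Confirm which rule fired by reading the firewall log in KQL. Under anomaly scoring, the contributing rules are typically logged as Matched and only the final scoring decision as Blocked, so query both:
AGWFirewallLogs
| where TimeGenerated > ago(15m)
| where Action in ("Matched", "Blocked")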
| summarize hits = count() by RuleId = tostring(RuleId), Message, ClientIp
| order by hits desc
Enterprise scenario
A platform team fronting a payments API for a European retailer ran WAF_v2 in Prevention with DRS 2.1. Within hours of a partner go-live, legitimate POSTs to /v1/settlements started returning 403. The firewall logs pinned it to rule 942430 (restricted SQL characters, anomaly) firing on the request body — the partner’s payload embedded a base64-encoded signature whose +, /, and = runs read as SQL noise.
The constraint: they could not weaken SQLi protection on a payments endpoint, and PCI scope meant TLS had to be terminated and re-encrypted at the gateway, so the body was being inspected by design. Dropping to Detection was off the table.
They scoped a per-rule exclusion to exactly the signature field (RequestArgValues with the selector signature, so only the value of that argument is skipped) for rule 942430 only, leaving every other SQLi rule and every other field fully inspected. A path-specific WAF policy kept the exclusion off the rest of the site.
az network application-gateway waf-policy managed-rule exclusion rule-set add \
--policy-name wafp-payments --resource-group rg-edge \
--type Microsoft_DefaultRuleSet --version 2.1 \
--group-name SQLI --rule-ids 942430 \
--match-variable RequestArgValues \
--selector-match-operator Equals \
--selector signature
The 403s stopped, SQLi coverage stayed intact everywhere else, and the exclusion went into Terraform with a comment linking the partner ticket and the log query that justified it — so the next DRS bump gets a deliberate review instead of a surprise outage.