Azure Standard Load Balancer Deep Dive: Outbound Rules, HA Ports, and Cross-Region Load Balancing

Standard Load Balancer is the L4 plumbing almost every Azure network design sits on, and it is the layer people understand least. They reach for it as a “TCP load balancer,” wire up a rule, and never touch the parts that matter under load: explicit outbound rules that decide whether you get deterministic SNAT or a 2 a.m. exhaustion incident, HA Ports that make a firewall sandwich actually highly available, and probe thresholds that decide whether a deploy drains gracefully or black-holes connections. This is the engineering-grade walkthrough of those parts, ending with a global cross-region front end that fails a whole region over without a DNS change.

Everything here is the Standard SKU. Basic Load Balancer retires 30 September 2025 — no SLA, no zones, no outbound rules, no HA Ports — so if you are still on it, migration is the first task, not an optimization.

1. Standard vs Gateway vs cross-region, and when each fits

Azure ships three load balancer “shapes.” They are not interchangeable, and picking the wrong one shows up as a missing feature or a redesign.

SKU / type            | Scope                  | Primary job                                                                        | Outbound SNAT           | HA Ports
Standard (regional)   | One region, zone-aware | General L4 load balancing for VMs/VMSS                                             | Yes, via outbound rules | Yes
Gateway               | One region             | Transparent insertion of NVAs (firewalls, packet inspection) via service chaining | No (bump-in-the-wire)   | N/A
Cross-region (Global) | Multi-region           | Anycast global front end over regional Standard LBs                                | No                      | No

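The mental model: Standard balances traffic to your own VMs and scale sets inside one region, Gateway transparently inserts NVAs into the traffic path, and cross-region puts a single global front end over several regional Standard LBs.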

The rest of this article uses the regional Standard LB for Steps 2-6, then layers the cross-region LB on top in Step 7.

2. Backend pool design: NIC-based vs IP-based, and zone alignment

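A Standard LB backend pool can be defined two ways, NIC-based (members are NIC IP configurations) or IP-based (members are IP addresses in a virtual network), and the choice constrains the design.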

Zone alignment is the part that gets skipped. A Standard LB frontend is zone-redundant by default (its IP is served from all zones), but HA is only real if the backends span zones too. Spread VMSS instances across zones 1/2/3 and the frontend keeps serving from surviving zones when one fails.

LOC=eastus
RG=rg-lb-prod

# Zone-redundant public frontend IP (Standard SKU, all zones).
az network public-ip create \
  --resource-group $RG --name pip-lb-fe \
  --sku Standard --tier Regional \
  --allocation-method Static --zone 1 2 3

az network lb create \
  --resource-group $RG --name lb-app-prod \
  --sku Standard \
  --public-ip-address pip-lb-fe \
  --frontend-ip-name fe-public \
  --backend-pool-name bep-app

Zonal vs zone-redundant is a real decision. A zone-redundant frontend survives a single zone loss transparently. A zonal frontend (pinned with a single --zone) is occasionally required for latency-sensitive or co-location designs, but it dies with its zone. Default to zone-redundant unless you have a specific reason not to.

3. Outbound rules and explicit SNAT port allocation

This is the part that prevents incidents. By default a Standard LB does not give backends outbound internet access just for being in a pool — Standard is secure by default, and egress is opt-in. The two clean ways to provide it are an explicit outbound rule or a NAT Gateway on the subnet. (NAT Gateway is usually better for pure egress and has its own article; this is the LB-native path, which you want when the LB is already there or the egress IP must be the LB VIP.)

A SNAT port is one entry in a translation table keyed on the full 5-tuple, including the destination IP and port. You are not limited to 64K total connections — you are limited to ~64K simultaneous flows to the same destination IP:port. Exhaustion almost always means many flows to one upstream behind a single VIP.
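To check whether one destination dominates a backend's outbound flows, a rough sketch like the following can help. It assumes a Linux backend with iproute2's `ss` installed; the function name `top_destinations` is illustrative and not part of any Azure tooling.

```shell
#!/usr/bin/env bash
# Count established TCP flows per remote IP:port.
# Reads `ss -Htn state established` output on stdin. Without -p the peer
# address is the last column, so $NF is the destination IP:port.
top_destinations() {
  local limit=${1:-5}
  awk 'NF { count[$NF]++ } END { for (d in count) print count[d], d }' \
    | sort -rn | head -n "$limit"
}

# Usage on a backend VM:
#   ss -Htn state established | top_destinations 5
```

A single destination holding most of the flows is the pattern that exhausts SNAT ports, because every one of those flows needs its own port on the same frontend IP.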

With an outbound rule you allocate ports explicitly, pre-dividing the 64,000-port budget per frontend IP across the pool. The maths is unforgiving:

ports_per_instance = floor( (64,000 x frontend_IP_count) / backend_instance_count )

64,000 ports, 1 frontend IP, 50 instances  -> 1,280 ports each
64,000 ports, 1 frontend IP, 100 instances ->   640 ports each
64,000 ports, 2 frontend IPs, 100 instances -> 1,280 ports each

Set it too high and you cap pool size; too low (the default auto-allocation is famously stingy) and busy instances exhaust ports while the pool looks half-idle. Always allocate manually.
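The allocation formula above can be sketched as a small shell helper. The function name is hypothetical, and the rounding step assumes Azure's requirement that the per-instance value be a multiple of 8; check the current outbound rule limits before relying on it.

```shell
#!/usr/bin/env bash
# ports_per_instance FRONTEND_IP_COUNT BACKEND_INSTANCE_COUNT
# floor((64,000 x frontend IPs) / instances), rounded down to a multiple of 8.
ports_per_instance() {
  local frontend_ips=$1 instances=$2
  local budget=$(( 64000 * frontend_ips ))
  local raw=$(( budget / instances ))   # integer division already floors
  echo $(( raw / 8 * 8 ))
}

ports_per_instance 1 50    # prints 1280
ports_per_instance 1 100   # prints 640
ports_per_instance 2 100   # prints 1280
```

Pool sizes that do not divide evenly lose a few ports to rounding: 30 instances on one frontend IP yields 2,128 rather than 2,133.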

# Dedicated outbound frontend IP — do NOT share the inbound VIP for outbound
# if you can avoid it; a separate IP keeps the SNAT budget clean.
az network public-ip create \
  --resource-group $RG --name pip-lb-outbound \
  --sku Standard --allocation-method Static --zone 1 2 3

az network lb frontend-ip create \
  --resource-group $RG --lb-name lb-app-prod \
  --name fe-outbound --public-ip-address pip-lb-outbound

# Explicit outbound rule: manual port allocation, generous idle timeout,
# and TCP reset on idle so clients learn the flow is gone.
az network lb outbound-rule create \
  --resource-group $RG --lb-name lb-app-prod \
  --name obr-app \
  --frontend-ip-configs fe-outbound \
  --address-pool bep-app \
  --protocol All \
  --idle-timeout 15 \
  --enable-tcp-reset true \
  --outbound-ports 1280

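Three flags that matter: --outbound-ports sets the manual per-instance SNAT allocation, --idle-timeout sets how many minutes an idle outbound flow keeps its port, and --enable-tcp-reset sends a TCP RST on idle timeout so both ends learn the flow is gone.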

Each extra frontend IP (or a public IP prefix) adds another 64,000 ports. If you are fighting this maths at scale, that is the signal to move egress to NAT Gateway, which allocates ports on demand instead of pre-carving them.
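Working the same maths in the other direction tells you how many frontend IPs a target pool needs. This is a sketch with a hypothetical helper name, using only the 64,000-ports-per-frontend-IP figure from the text.

```shell
#!/usr/bin/env bash
# frontend_ips_needed INSTANCES PORTS_PER_INSTANCE
# ceil((instances x ports per instance) / 64,000), using integer arithmetic.
frontend_ips_needed() {
  local instances=$1 ports=$2
  echo $(( (instances * ports + 63999) / 64000 ))
}

frontend_ips_needed 50 1280    # prints 1
frontend_ips_needed 100 1280   # prints 2
frontend_ips_needed 200 1024   # prints 4
```

If the answer keeps growing as the pool scales, that is the same signal as above to look at NAT Gateway instead.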

4. HA Ports for active-active NVAs and firewall sandwiches

HA Ports makes an internal Standard LB load-balance all ports and all protocols with one rule. It exists for the network virtual appliance case: you cannot enumerate every port a firewall must pass, so you balance the whole flow space at once.

An HA Ports rule is just a load-balancing rule with protocol All and both frontendPort and backendPort set to 0. It is available on internal Standard LBs only (not public).

# Internal LB in front of the active-active NVA pool.
az network lb create \
  --resource-group $RG --name lb-nva-internal \
  --sku Standard \
  --vnet-name vnet-hub --subnet snet-nva-frontend \
  --frontend-ip-name fe-nva --private-ip-address 10.0.10.4 \
  --backend-pool-name bep-nva

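# The rule below references probe-nva, so the probe must exist first.
# Port 443 is a placeholder: probe whichever port your NVA answers health checks on.
az network lb probe create \
  --resource-group $RG --lb-name lb-nva-internal \
  --name probe-nva --protocol Tcp --port 443 \
  --interval 5 --probe-threshold 2

# HA Ports: protocol All, ports 0/0 — every port, every protocol.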
az network lb rule create \
  --resource-group $RG --lb-name lb-nva-internal \
  --name rule-haports \
  --protocol All --frontend-port 0 --backend-port 0 \
  --frontend-ip-name fe-nva \
  --backend-pool-name bep-nva \
  --probe-name probe-nva \
  --enable-tcp-reset true \
  --idle-timeout 15

The classic topology is the firewall sandwich: an external/internal LB pair around an active-active NVA pool, HA Ports on the internal side. Two non-negotiable design rules:

HA Ports balances everything, so a misconfigured NSG or UDR on the NVA subnet now affects all protocols at once. There is no per-port blast radius anymore — treat that subnet as production-critical and test failover explicitly (Step 8).

5. Health probe protocols, thresholds, and graceful drain

Probes decide what “healthy” means, and the defaults are rarely what you want for a zero-downtime deploy. Standard LB supports TCP, HTTP, and HTTPS probes.

Probe type | Healthy when                          | Use it for
TCP        | 3-way handshake completes on the port | Non-HTTP backends; cheapest, but only proves the port is open
HTTP       | GET on the path returns HTTP 200      | Web backends; proves the app, not just the socket
HTTPS      | GET over TLS returns HTTP 200         | Web backends requiring encrypted probes

Prefer an HTTP/HTTPS probe against a real /healthz over TCP wherever the backend speaks HTTP. A TCP probe stays “healthy” while the app returns 500s to every user, because the socket is still open. Only an L7 probe catches a wedged-but-listening process.

az network lb probe create \
  --resource-group $RG --lb-name lb-app-prod \
  --name probe-app \
  --protocol Http --port 8080 --path /healthz \
  --interval 5 --probe-threshold 2

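Two knobs govern detection speed vs stability: --interval (seconds between probe attempts) and --probe-threshold (consecutive probe results required before the LB changes an instance's state).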

Detection time is roughly interval x probe_threshold (~10s at 5s/2). Tighter flaps on a merely-slow backend; looser keeps sending traffic to a dead node.
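To compare settings side by side, the approximation can be tabulated with a few lines of shell. The helper name is illustrative, and the result is the rough interval x threshold estimate from the text, not a guaranteed bound.

```shell
#!/usr/bin/env bash
# detection_seconds INTERVAL_SECONDS PROBE_THRESHOLD
detection_seconds() { echo $(( $1 * $2 )); }

for interval in 5 15; do
  for threshold in 1 2 3; do
    printf 'interval=%ss threshold=%s -> ~%ss to detect\n' \
      "$interval" "$threshold" "$(detection_seconds "$interval" "$threshold")"
  done
done
```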

Graceful drain is the other half. When a probe starts failing (or you pull an instance from the pool), Standard LB stops new flows to it but does not kill established TCP connections — existing flows continue until they close or hit the idle timeout. So the clean deploy sequence is:

  1. Flip the instance’s /healthz to non-200 (or stop the app gracefully).
  2. LB marks it unhealthy after interval x threshold and stops new connections.
  3. Wait out in-flight requests (the drain window).
  4. Recycle the instance, bring /healthz back, watch it rejoin.

This is the same drain-then-recycle sequence that VMSS rolling upgrades and App Service slot swaps lean on under the hood. Deploys that black-hole requests almost always skipped the drain wait in step 3 and recycled the instance as soon as the LB marked it unhealthy.
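The four steps can be sketched as a small orchestration function. Everything here is an assumption for illustration: the three hook arguments are names of shell functions you would write yourself (for example, one that flips /healthz on the instance), and `SLEEP_CMD` exists only so the waits can be stubbed in a dry run.

```shell
#!/usr/bin/env bash
# drain_and_recycle INTERVAL THRESHOLD DRAIN_WINDOW FAIL_HOOK RECYCLE_HOOK RESTORE_HOOK
# The hooks are hypothetical function names supplied by the caller.
SLEEP_CMD=${SLEEP_CMD:-sleep}

drain_and_recycle() {
  local interval=$1 threshold=$2 drain_window=$3
  local fail_hook=$4 recycle_hook=$5 restore_hook=$6
  local detect=$(( interval * threshold ))

  "$fail_hook" || return 1        # 1. /healthz starts returning non-200
  "$SLEEP_CMD" "$detect"          # 2. LB marks the instance unhealthy
  "$SLEEP_CMD" "$drain_window"    # 3. in-flight requests finish
  "$recycle_hook" || return 1     # 4. recycle the instance...
  "$restore_hook"                 #    ...and bring /healthz back
}

# Example: drain_and_recycle 5 2 30 fail_healthz restart_app restore_healthz
```

Keeping the detection wait and the drain wait as separate sleeps makes it obvious when someone shortens one of them.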

6. A reference deployment in Terraform

Here is the regional public LB, NIC-style pool, explicit outbound rule, HTTP probe, and load-balancing rule as one coherent Terraform reference — the shape you want in the repo, not a pile of CLI commands.

resource "azurerm_public_ip" "lb_fe" {
  name                = "pip-lb-fe"
  resource_group_name = var.rg
  location            = var.location
  allocation_method   = "Static"
  sku                 = "Standard"
  zones               = ["1", "2", "3"]
}

# Dedicated egress IP, matching pip-lb-outbound from the CLI example in Step 3.
resource "azurerm_public_ip" "lb_outbound" {
  name                = "pip-lb-outbound"
  resource_group_name = var.rg
  location            = var.location
  allocation_method   = "Static"
  sku                 = "Standard"
  zones               = ["1", "2", "3"]
}

resource "azurerm_lb" "app" {
  name                = "lb-app-prod"
  resource_group_name = var.rg
  location            = var.location
  sku                 = "Standard"

  frontend_ip_configuration {
    name                 = "fe-public"
    public_ip_address_id = azurerm_public_ip.lb_fe.id
  }

  # Separate frontend for outbound so the SNAT budget is not shared with the inbound VIP.
  frontend_ip_configuration {
    name                 = "fe-outbound"
    public_ip_address_id = azurerm_public_ip.lb_outbound.id
  }
}

resource "azurerm_lb_backend_address_pool" "app" {
  name            = "bep-app"
  loadbalancer_id = azurerm_lb.app.id
}

resource "azurerm_lb_probe" "app" {
  name                = "probe-app"
  loadbalancer_id     = azurerm_lb.app.id
  protocol            = "Http"
  port                = 8080
  request_path        = "/healthz"
  interval_in_seconds = 5
  probe_threshold     = 2 # matches --probe-threshold in the CLI; needs a recent azurerm provider
}

resource "azurerm_lb_rule" "app" {
  name                           = "rule-https"
  loadbalancer_id                = azurerm_lb.app.id
  protocol                       = "Tcp"
  frontend_port                  = 443
  backend_port                   = 8443
  frontend_ip_configuration_name = "fe-public"
  backend_address_pool_ids       = [azurerm_lb_backend_address_pool.app.id]
  probe_id                       = azurerm_lb_probe.app.id
  idle_timeout_in_minutes        = 15
  enable_tcp_reset               = true
  disable_outbound_snat          = true # outbound handled by the explicit rule below
}

resource "azurerm_lb_outbound_rule" "app" {
  name                     = "obr-app"
  loadbalancer_id          = azurerm_lb.app.id
  protocol                 = "All"
  backend_address_pool_id  = azurerm_lb_backend_address_pool.app.id
  allocated_outbound_ports = 1280
  idle_timeout_in_minutes  = 15
  enable_tcp_reset         = true

  frontend_ip_configuration {
    name = "fe-outbound"
  }
}

The detail that bites people: set disable_outbound_snat = true on the load-balancing rule (disableOutboundSnat in ARM/Bicep) so the inbound rule does not silently provide implicit, unmanaged SNAT alongside your explicit outbound rule. Without it you get two overlapping SNAT behaviors and unpredictable port use.
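One way to audit an existing load balancer for this is to list the rules that have not disabled implicit SNAT. This is a sketch: the function name is hypothetical, and it assumes the `disableOutboundSnat` property appears in the CLI's JSON output under that name, as it does in ARM.

```shell
#!/usr/bin/env bash
# rules_with_implicit_snat RESOURCE_GROUP LB_NAME
# Prints the names of load-balancing rules where disableOutboundSnat is not true.
rules_with_implicit_snat() {
  local rg=$1 lb=$2
  az network lb rule list --resource-group "$rg" --lb-name "$lb" \
    --query '[?disableOutboundSnat!=`true`].name' --output tsv
}

# Example: rules_with_implicit_snat rg-lb-prod lb-app-prod
```

An empty result means every inbound rule defers to the explicit outbound rule.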

7. Cross-region load balancer: global front end, regional pools, failover

The cross-region (Global) LB gives you a single static anycast IP from Microsoft’s edge, with a backend pool of regional Standard load balancers. Traffic enters at the closest edge and steers to the closest healthy region; if a region’s LB goes unhealthy, flows shift to the next automatically — no DNS TTL to wait out, because the IP never changes.

# Global LB lives in a supported "home region" but serves globally.
az network public-ip create \
  --resource-group rg-global --name pip-global \
  --sku Standard --tier Global --allocation-method Static

az network cross-region-lb create \
  --resource-group rg-global --name lb-global \
  --frontend-ip-name fe-global \
  --public-ip-address pip-global \
  --backend-pool-name bep-regions

# Backend members are the *frontend IP configs of regional Standard LBs*.
az network cross-region-lb address-pool address add \
  --resource-group rg-global --lb-name lb-global \
  --pool-name bep-regions --name eastus-lb \
  --frontend-ip-address "$EASTUS_LB_FE_ID"

az network cross-region-lb address-pool address add \
  --resource-group rg-global --lb-name lb-global \
  --pool-name bep-regions --name westeurope-lb \
  --frontend-ip-address "$WESTEUROPE_LB_FE_ID"

What to internalize about the global LB:

This is the cleanest way to give a TCP/UDP service (not just HTTP) one global IP with regional failover, something neither Traffic Manager (DNS/TTL-bound) nor Front Door (HTTP-only) can do on its own.

8. Diagnostics: metrics, SNAT counts, and the queries that matter

Standard LB emits multi-dimensional metrics under Microsoft.Network/loadBalancers. Worth alerting on:

A KQL query to catch SNAT pressure before users do:

AzureMetrics
| where ResourceProvider == "MICROSOFT.NETWORK"
| where ResourceId has "/LOADBALANCERS/LB-APP-PROD"
| where MetricName in ("UsedSnatPorts", "AllocatedSnatPorts", "SnatConnectionCount")
| summarize Used = sumif(Total, MetricName == "UsedSnatPorts"),
            Allocated = sumif(Total, MetricName == "AllocatedSnatPorts")
            by bin(TimeGenerated, 5m)
| extend UtilizationPct = round(100.0 * Used / Allocated, 1)
| order by TimeGenerated desc

Alert when SnatConnectionCount, filtered to ConnectionState = Failed, is greater than 0 over 5 minutes: sustained failed SNAT means you are at the ceiling, and the fix is more frontend IPs, a larger per-instance port allocation, or NAT Gateway. An L4 LB has no access logs like an L7 proxy; flow-level visibility comes from VNet flow logs on the backend subnet, fed into Traffic Analytics for top-talker and drop analysis.
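For a quick check outside Log Analytics, the same failed-SNAT signal can be pulled from the platform metrics with the CLI. Treat this as a sketch: the function name is hypothetical, and the filter assumes the dimension is exposed as `ConnectionState` with a `Failed` value, as described above.

```shell
#!/usr/bin/env bash
# failed_snat_series LB_RESOURCE_ID
# Prints SnatConnectionCount (Failed only) in 5-minute buckets.
failed_snat_series() {
  local lb_id=$1
  az monitor metrics list --resource "$lb_id" \
    --metric SnatConnectionCount \
    --filter "ConnectionState eq 'Failed'" \
    --interval PT5M --aggregation Total \
    --output table
}

# Example:
#   LB_ID=$(az network lb show -g rg-lb-prod -n lb-app-prod --query id -o tsv)
#   failed_snat_series "$LB_ID"
```

Any non-zero bucket is worth investigating before it becomes a sustained trend.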

Verify

Confirm inbound balancing, the egress IP, probe health, and global failover before you call it done.

# 1) Inbound reaches the pool (check backend access logs to confirm the spread).
#    Resolve the frontend IP first; -k is needed because the certificate
#    will not match a bare IP address.
LB_IP=$(az network public-ip show -g $RG -n pip-lb-fe --query ipAddress -o tsv)
for i in $(seq 1 10); do curl -sk "https://$LB_IP/healthz" -o /dev/null -w "%{http_code}\n"; done

# 2) Egress from a backend uses the outbound rule's frontend IP, every time.
#    Run from INSIDE a backend VM:
curl -s https://api.ipify.org; echo   # must be pip-lb-outbound's address

# 3) Probe health and pool membership from the control plane.
az network lb probe show -g $RG --lb-name lb-app-prod -n probe-app \
  --query "{proto:protocol, port:port, path:requestPath, interval:intervalInSeconds, threshold:probeThreshold}" -o jsonc

az network lb address-pool show -g $RG --lb-name lb-app-prod -n bep-app \
  --query "loadBalancerBackendAddresses[].name" -o tsv

For graceful drain, flip one backend’s /healthz to 500 and confirm that new curl loops stop hitting it within interval x threshold seconds while an in-flight long-lived connection survives. For global failover, watch the global IP stay constant while you drop the East US regional LB (e.g. fail its probes) and confirm requests shift to West Europe with no client-side IP change. Zero failed SNAT under a real load test is the final pass condition.

Enterprise scenario

A payments platform ran an active-active NGFW firewall sandwich in their hub VNet: an internal Standard LB in front of three firewall VMs, HA Ports rule, all spoke traffic forced through it via UDRs. It passed lab and functional tests. In production, long-lived database and gRPC connections reset randomly after a few minutes while short HTTP calls were fine, and the firewall logs showed sessions with “no matching state.”

The constraint was classic stateful-inspection asymmetry. HA Ports hashes flows across the three firewalls by 5-tuple, but the return-path UDRs sent reply packets back through a different firewall than the forward path. The second appliance saw a mid-stream packet for a session it never created and dropped it. Short flows finished inside one hash window; long flows lived long enough to hit a state mismatch on a reconvergence or probe-driven rebalance.
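A toy model makes the asymmetry easier to see. The sketch below is not Azure's hash: it uses a CRC from `cksum` purely for illustration, and all function names are hypothetical. It shows that a key built from the tuple in direction order can pick different appliances for the forward and return packets, while a key that sorts the two endpoints first always picks the same one.

```shell
#!/usr/bin/env bash
# pick_backend KEY COUNT -> index in 0..COUNT-1 (toy hash, NOT Azure's algorithm)
pick_backend() {
  local crc
  crc=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
  echo $(( crc % $2 ))
}

# Direction-sensitive key: src and dst stay in packet order.
directional_key() { echo "$1:$2>$3:$4/$5"; }

# Direction-insensitive key: sort the two endpoints before hashing.
symmetric_key() {
  local ends
  ends=$(printf '%s\n' "$1:$2" "$3:$4" | sort | paste -sd'|' -)
  echo "$ends/$5"
}

# Forward and reverse views of one database flow:
#   pick_backend "$(directional_key 10.1.0.4 50000 10.2.0.8 5432 tcp)" 3
#   pick_backend "$(directional_key 10.2.0.8 5432 10.1.0.4 50000 tcp)" 3   # may differ
#   pick_backend "$(symmetric_key   10.2.0.8 5432 10.1.0.4 50000 tcp)" 3   # same either way
```

State sync across the cluster sidesteps the problem from the other side: it no longer matters which appliance sees a given packet.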

The fix had two parts: enable the vendor’s session-state synchronization across the cluster so any appliance can handle any packet of a flow, and enable Floating IP (Direct Server Return) on the HA Ports rule so appliances see the original VIP and routing stays symmetric per the vendor design. They also pointed the probe at a real data-plane liveness URL, not just a listening port.

# HA Ports rule with Floating IP enabled for the stateful NVA sandwich.
az network lb rule create \
  --resource-group rg-hub --lb-name lb-nva-internal \
  --name rule-haports \
  --protocol All --frontend-port 0 --backend-port 0 \
  --frontend-ip-name fe-nva --backend-pool-name bep-nva \
  --probe-name probe-nva-dataplane \
  --floating-ip true \
  --enable-tcp-reset true --idle-timeout 30

The mid-stream resets stopped on the first cutover. The lesson the team wrote into their reference architecture: HA Ports gives you all-port load balancing, but it does not give you flow symmetry — that is your routing’s job, and on a stateful NVA you must engineer for it explicitly or buy it with vendor state sync.

Checklist
