Azure Networking

Azure Load Balancing Deep Dive: Load Balancer, App Gateway, Front Door & Traffic Manager

Almost every production workload on Azure sits behind a load balancer of some kind. The moment you run more than one copy of an app — two VMs, a scale set, a pool of containers — something has to take the incoming connections and spread them across healthy instances, notice when one dies, and stop sending it traffic. On Azure that “something” is most often the Azure Load Balancer: a Layer-4 (TCP/UDP) traffic distributor built into the software-defined network, with no instances for you to patch, no bandwidth ceiling you provision, and a 99.99% SLA on the Standard tier.

But “load balancing on Azure” is bigger than one product. Azure ships four distinct load-balancing services — Load Balancer (L4, regional), Application Gateway (L7, regional, with a WAF), Traffic Manager (DNS-based, global), and Front Door (L7, global, with WAF and CDN) — and the single most common networking question on the AZ-104, AZ-305, and AZ-700 exams is which one do you pick, and why. Get that wrong in an interview and it shows; get it wrong in a design and you either overpay for a global CDN to balance two VMs in one region, or you try to do TLS termination and path-based routing on a service that only understands ports.

This is the exhaustive lesson. We go deep on the Azure Load Balancer first — every SKU, every frontend, backend, probe, and rule type, and the SNAT mechanics that trip up half of all “why are my outbound calls failing under load” tickets — then step back and build the decision framework across all four services with a clear matrix you can reproduce on a whiteboard.

Learning objectives

Prerequisites & where this fits

You should be comfortable with the basics of an Azure Virtual Network — subnets, NICs, NSGs, public vs private IPs, and the 5 reserved addresses per subnet — because a load balancer is glued to those constructs. If any of that is hazy, read the Azure Virtual Networks deep dive first. This lesson is Module 3 (Networking) of the KloudVin Azure Zero-to-Hero course, sitting alongside the NAT Gateway, Application Gateway, and Front Door lessons; it is core material for AZ-104 (configure load balancing) and a building block for the architecture decisions in AZ-305 and the networking depth in AZ-700.

Core concepts

A few mental models make everything else click.

Layer 4 vs Layer 7. The Azure Load Balancer operates at Layer 4 — it sees TCP and UDP packets, source/destination IP and port, nothing more. It does not read URLs, hostnames, cookies, or TLS certificates; it cannot route /images to one pool and /api to another, and it cannot terminate HTTPS. That ignorance is a feature: it is blisteringly fast, protocol-agnostic (it’ll balance a database, a game server, or a custom binary protocol just as happily as a web app), and adds essentially no latency. When you need URL paths, host headers, cookie affinity, TLS offload, or a Web Application Firewall, you reach for a Layer 7 service (Application Gateway or Front Door) instead — covered in the decision framework below.

Regional vs global. Load Balancer and Application Gateway are regional — they live in one region and distribute across instances in that region (across zones, but not across regions). Traffic Manager and Front Door are global — they make a decision before the request reaches a region (Traffic Manager via DNS, Front Door at Microsoft’s edge) and direct the user to the best region. You very often combine a global service with a regional one.

Pass-through, not proxy. The Azure Load Balancer is a pass-through load balancer, not a proxy. It does not terminate connections and open new ones to the backend; it rewrites the destination and (for inbound) forwards the original packet, so the backend sees the client’s source IP. Application Gateway and Front Door, by contrast, are reverse proxies — they terminate the client connection and originate a fresh one to the backend (so the backend sees the proxy’s IP, and the original client IP arrives in an X-Forwarded-For header).

The five building blocks. Every Azure Load Balancer is assembled from the same parts, and the whole rest of this lesson is just these in detail:

Component | What it is
Frontend IP configuration | The IP(s) clients connect to: public (internet-facing) or private (internal).
Backend pool | The set of VMs / NICs / IPs that receive traffic.
Health probe | The check that decides which backend members are healthy.
Load-balancing rule | Maps frontend:port → backend:port, picking which probe and how to distribute.
Inbound NAT rule | Maps a specific frontend:port to one backend instance (e.g. SSH to VM-2).

Two more — outbound rules and the SNAT mechanism behind them — govern how backends reach out to the internet. We’ll cover those too.

Basic vs Standard: the SKU decision

The first choice when you create a Load Balancer is the SKU, and it is effectively permanent — there is no in-place upgrade from Basic to Standard; you migrate (recreate and cut over, or use Microsoft’s migration scripts). It is also urgent: Basic Load Balancer is being retired on 30 September 2025, after which you cannot create new ones and existing ones lose support. For anything new, the answer is Standard. Here is why, axis by axis.

Capability | Basic | Standard
SLA | None | 99.99% (when ≥2 healthy backends)
Backend pool size | Up to 300 instances | Up to 1,000 instances
Backend pool membership | Availability set or single VMSS only | Any VM/NIC/IP in the VNet; NIC-based or IP-based pools
Availability zones | Not zone-aware | Zone-redundant or zonal frontends
Security (NSG) | Open by default (NSG optional) | Closed by default: traffic denied unless an NSG explicitly allows it
Health probes | TCP, HTTP | TCP, HTTP, HTTPS
HA Ports | Not supported | Supported (load-balance all ports)
Outbound rules | Not configurable (implicit SNAT only) | Explicit outbound rules with port allocation
Cross-region (global) LB | Not supported | Supported (Standard is the regional backend)
Diagnostics / metrics | Limited | Full multi-dimensional metrics (SNAT, health, throughput)
Pricing model | Free | Charged per rule + per GB processed (details in Cost)

The two that bite people most:

Gotcha: because there is no upgrade path, choosing Basic to “save money on a quick test” and then needing zones, HA Ports, or the SLA means a full rebuild. Start Standard.

Public vs internal: the frontend type

A Load Balancer is either public (a.k.a. external — internet-facing) or internal (a.k.a. private). The difference is entirely the frontend IP configuration:

Type | Frontend IP | Reached from | Typical use
Public | A public IP (Standard SKU) | The internet | Front a web tier; outbound SNAT for backends
Internal | A private IP from a subnet | Inside the VNet, peered VNets, or on-prem via VPN/ExpressRoute | Front an internal app/database tier; the middle of an n-tier app

A single Load Balancer resource can host multiple frontend IP configurations — for example, several public IPs each fronting a different service, or a mix. (You cannot, however, mix public and private frontends on the same rule; each rule binds one frontend.) An internal LB’s private IP can be static (you pick it from the subnet, recommended for stability) or dynamic (Azure assigns one). A common n-tier pattern is a public LB in front of the web tier and an internal LB in front of the app tier — the internal one never gets a public IP and is only reachable from inside the network.

Frontend zone behaviour (Standard only):

Backend pools: every way to add members

The backend pool is the set of targets. A Standard LB supports two membership models, and you choose one per pool:

Model | How members are added | When to use | Limit / gotcha
NIC-based | Reference a VM’s network interface (ipConfiguration) | The default for VMs and scale sets in the same VNet | Members must be in the same VNet as the LB
IP-based | Reference raw private IP addresses | Mixed/decoupled backends, faster bulk membership, members that aren’t simple NICs | A pool is either NIC-based or IP-based (you can’t mix in one pool); IP-based pools can’t be used for outbound SNAT

Key facts about backend pools:

Gotcha: NIC-based pools require same-VNet membership. If you peer two VNets and expect to add a VM from the peer, you can’t via NIC — use an IP-based pool (which references the private IP and works across peered VNets, with the right routing).

Health probes: TCP, HTTP, HTTPS

A health probe is how the LB decides which backend members are alive. No healthy members → the rule has nowhere to send traffic. This is the single most important thing to get right, because a misconfigured probe either drops a healthy backend (outage) or keeps sending traffic to a dead one (errors).

Probe type | What it checks | “Healthy” means | Available on
TCP | Can it complete a TCP handshake on the port? | SYN/SYN-ACK/ACK succeeds | Basic + Standard
HTTP | Sends GET <path> to the port | Backend returns HTTP 200 within the timeout | Basic + Standard
HTTPS | Same as HTTP but over TLS | Returns HTTP 200 over TLS | Standard only

Probe settings, every field:

Setting | What it does | Default / range | Gotcha
Protocol | TCP / HTTP / HTTPS | | HTTP/HTTPS catch app-level failures TCP misses (a hung app still accepts TCP)
Port | The port probed (can differ from the data port) | | Probe a real health endpoint, not just the app port
Path (HTTP/S) | The URL path requested | / | Use a lightweight /healthz that checks dependencies, returns 200 fast
Interval | Seconds between probes | 5s (min 5) | Lower = faster detection, more probe traffic
Unhealthy threshold (Basic) | Consecutive failures before “down” | 2 | Standard uses a fixed model (see below)

Detection timing. On Standard LB, a backend is marked down after the probe fails and up after it succeeds — the practical detection time is roughly the probe interval. Pick an interval that balances fast failover against probe noise; 5s is the common choice.
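
The up/down bookkeeping behind a probe can be sketched in a few lines of shell. This is an illustration under stated assumptions, not Azure's implementation: probe_state is a hypothetical helper, a probe result counts as a success only when it is HTTP 200, a member goes down after a chosen number of consecutive failures, and a single success brings it back up.

```shell
# probe_state THRESHOLD STATUS...
# Replays a sequence of HTTP probe results and prints the member's final state.
# Assumption: only 200 is healthy; one success resets the failure counter.
probe_state() {
  local threshold="$1"
  shift
  local fails=0 state=up status
  for status in "$@"; do
    if [ "$status" = 200 ]; then
      fails=0
      state=up
    else
      fails=$(( fails + 1 ))
      if [ "$fails" -ge "$threshold" ]; then
        state=down
      fi
    fi
  done
  echo "$state"
}

# Two healthy probes followed by two failures, with a threshold of 2.
probe_state 2 200 200 500 500   # prints "down"
```

With a 5-second interval and a threshold of 2, this model takes about 10 seconds to pull a failing member; a threshold of 1 matches the "roughly one interval" behaviour described above.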

Probe source IP. Probes originate from the special Azure address 168.63.129.16 (the same virtual public IP Azure uses for platform DNS/health). Your NSG and host firewall must allow inbound from this address (use service tag AzureLoadBalancer) or every member shows unhealthy — a classic “everything is down and I can’t see why” cause.

Probe-down behaviour. When a probe fails, the LB stops sending new flows to that member. By default existing TCP connections are allowed to continue until they end or idle out; if you need all traffic to a failed/all-down pool to be reset, you can enable connection reset / the “disable on probe-down” behaviours per rule. If all members are unhealthy, the LB by default stops forwarding (you can opt into “all-probes-down” continued forwarding, but you rarely want it).

Gotcha: a TCP probe says “the port is open,” not “the app works.” An app stuck in a redeploy or GC pause often still answers TCP while returning 500s. For web tiers, always use an HTTP/HTTPS probe against a real health endpoint.

Load-balancing rules: distribution, persistence, floating IP

A load-balancing rule ties the pieces together: it maps a frontend IP + port to a backend pool + port, using a named health probe. It is the object that actually distributes traffic.

Every field on a rule:

Field | What it does | Choices / default | Notes
Frontend IP | Which frontend this rule listens on | one of the LB’s frontends |
Frontend port | Port clients hit | e.g. 80, 443 | With HA Ports, set to 0 (all ports)
Backend port | Port traffic is sent to | e.g. 80 | Can differ from frontend port
Backend pool | Where traffic goes | one pool |
Health probe | Which probe gates membership | one probe |
Protocol | TCP / UDP / All | TCP | “All” = HA Ports
Session persistence | Affinity model (below) | None (5-tuple) |
Idle timeout (TCP) | How long an idle flow is kept | 4 min (4–30) | Raise for long-lived/idle connections; or send TCP keepalives
Floating IP (DSR) | Direct Server Return mode | Disabled | For SQL AlwaysOn, certain clustering
TCP reset on idle | Send TCP RST when idle timeout hits | Disabled | Helps apps detect dropped flows cleanly

Distribution mode / session persistence. This decides which backend a given connection lands on:

Mode | Hash basis | Behaviour | Use when
None (default) | 5-tuple (src IP, src port, dst IP, dst port, protocol) | Each new flow may land on a different backend; best spread | Stateless apps (most web apps behind a 5-tuple-friendly design)
Client IP | 2-tuple (src IP, dst IP) | Same client IP → same backend (across ports/protocols) | App needs the same backend per client (some legacy stateful apps)
Client IP and protocol | 3-tuple (src IP, dst IP, protocol) | Same client IP + protocol → same backend | Affinity that still separates TCP vs UDP

Note: this is connection-level affinity by IP, not cookie-based session affinity. For cookie-based affinity you need a Layer-7 service (Application Gateway), which can pin a browser session with a cookie regardless of source IP changes (NAT, mobile networks).
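
To see why the hash basis matters, here is a minimal shell sketch of tuple hashing. Several things in it are assumptions for illustration: pick_backend is a hypothetical helper, cksum stands in for Azure's hash (which is not published), the mode names borrow the Default / SourceIP / SourceIPProtocol values of the CLI's --load-distribution option, and the addresses come from documentation ranges.

```shell
# Sketch only: cksum stands in for Azure's unpublished hash function.

# pick_backend MODE SRC_IP SRC_PORT DST_IP DST_PORT PROTO POOL_SIZE
# Prints the zero-based index of the backend that the flow maps to.
pick_backend() {
  local mode="$1" src_ip="$2" src_port="$3" dst_ip="$4" dst_port="$5" proto="$6" pool_size="$7"
  local key hash
  case "$mode" in
    Default)          key="$src_ip|$src_port|$dst_ip|$dst_port|$proto" ;;  # 5-tuple
    SourceIP)         key="$src_ip|$dst_ip" ;;                              # 2-tuple
    SourceIPProtocol) key="$src_ip|$dst_ip|$proto" ;;                       # 3-tuple
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  hash=$(printf '%s' "$key" | cksum | cut -d' ' -f1)
  echo $(( hash % pool_size ))
}

# One client opening three connections from different source ports.
for port in 50001 50002 50003; do
  echo "port $port:" \
       "Default -> $(pick_backend Default 203.0.113.7 "$port" 198.51.100.4 80 tcp 3)," \
       "SourceIP -> $(pick_backend SourceIP 203.0.113.7 "$port" 198.51.100.4 80 tcp 3)"
done
```

The SourceIP column prints the same index on every line because the source port is not part of its key, while the Default column is free to vary from one connection to the next.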

Floating IP (Direct Server Return). Normally the LB rewrites the destination IP to the backend’s IP. With Floating IP enabled, the original frontend IP is preserved all the way to the backend, and the backend is configured with that frontend IP on a loopback — so it responds directly to the client, bypassing the LB on the return path (DSR). This is required for specific scenarios — SQL Server Always On availability group listeners, some clustering, and cases where multiple instances must share the same backend port for different frontends. It is off by default and you should only enable it when a workload explicitly calls for it (it requires backend OS configuration).

HA Ports. Setting the rule protocol to All and the frontend port to 0 creates an HA Ports rule: the LB load-balances every TCP/UDP flow on all ports through one rule. This is purpose-built for network virtual appliances (NVAs) — firewalls, IDS/IPS — in an active-active “firewall sandwich” behind an internal Standard LB, where you must forward arbitrary ports, not just 80/443. HA Ports is Standard-only and works only on internal LBs.

Inbound NAT rules: reaching one specific instance

A load-balancing rule spreads traffic across the whole pool. Sometimes you need the opposite — a deterministic path to one backend instance, typically for management (SSH/RDP). That’s an inbound NAT rule: it maps a specific frontend port to a specific backend instance and port.

Two flavours:

Type | Maps | Use | Example
Inbound NAT rule (single) | One frontend port → one backend instance:port | Reach a named VM | frontendIP:50001 → VM-1:22, :50002 → VM-2:22
Inbound NAT rule (pool / port range) (recommended) | A range of frontend ports → a backend pool, auto-assigning one port per instance | Scale sets where instances come and go | :50000-50100 → pool, port 22

The port-range / pool style is the modern, recommended approach: you don’t hand-maintain a rule per VM, and it works cleanly with scale sets as instances are added or removed. Inbound NAT does not use a health probe (it’s a 1:1 mapping, not load distribution).

Use case: you have a public LB in front of a web pool. Rather than give each VM a public IP for SSH, you add an inbound NAT rule mapping publicIP:50001→VM1:22 and :50002→VM2:22, then ssh -p 50001 to reach VM1. No per-VM public IPs, one NSG rule, smaller attack surface.
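
The port convention in that use case (base port plus instance number) is easy to script. The sketch below only builds the ssh command lines; nat_port and ssh_command are hypothetical helpers, 203.0.113.10 is a placeholder for the load balancer's public IP, and azureuser is assumed as the admin account. With pool-style NAT rules Azure assigns the ports itself, so treat this as a model of the hand-maintained single-rule layout only.

```shell
# nat_port BASE_PORT INSTANCE_NUMBER
# Frontend port for an instance under the "base + instance number" convention.
nat_port() {
  local base="$1" instance="$2"
  echo $(( base + instance ))
}

# ssh_command LB_IP BASE_PORT INSTANCE_NUMBER [USER]
# Prints the ssh command that reaches the instance through its NAT rule.
ssh_command() {
  local lb_ip="$1" base="$2" instance="$3" user="${4:-azureuser}"
  echo "ssh -p $(nat_port "$base" "$instance") ${user}@${lb_ip}"
}

for vm in 1 2; do
  ssh_command 203.0.113.10 50000 "$vm"
done
# prints:
#   ssh -p 50001 azureuser@203.0.113.10
#   ssh -p 50002 azureuser@203.0.113.10
```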

Outbound connectivity, SNAT, and port exhaustion

This is the part of Azure Load Balancer that generates the most production incidents, so we’ll take it slowly.

The problem. VMs with only private IPs still often need to reach the internet (call an API, pull packages, hit a webhook). To do that, Azure performs Source NAT (SNAT): it rewrites the VM’s private source IP to a public IP and assigns a SNAT port from that public IP so return traffic can be mapped back. Each public IP provides a finite pool of SNAT ports (64,000 on a Load Balancer frontend, 64,512 on a NAT Gateway), and every concurrent outbound flow to the same destination IP:port consumes its own SNAT port. A port can be reused for flows to different destinations, so it is many simultaneous connections to one destination that drain the pool.

The four ways a VM gets outbound (and the order of precedence):

Method | How | Recommendation
NAT Gateway on the subnet | Dedicated outbound resource, up to 16 public IPs × 64,512 ports each, dynamic on-demand allocation | Best practice: deterministic, massive scale, stable egress IP. See NAT Gateway.
LB outbound rule | Explicit rule on a Standard public LB allocating ports per instance | Fine when you already have a public LB and tune ports
Instance-level public IP | A public IP directly on the VM NIC | Per-VM egress; doesn’t scale, larger surface
Default outbound access | Implicit SNAT Azure gives VMs with no other method | Being retired (Sep 2025); never rely on it

Precedence: if a subnet has a NAT Gateway, it wins for outbound, overriding instance-level public IPs, LB outbound rules, and default outbound. Without a NAT Gateway, an instance-level public IP is used for that VM’s own egress, ahead of any LB outbound rule. Default outbound applies only when nothing else does (and is going away).
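
The precedence can be written as a small decision function. This is a sketch for reasoning about a design, not something Azure exposes: outbound_method is a hypothetical helper, and the ordering (NAT Gateway, then instance-level public IP, then LB outbound rule, then default outbound) is my reading of Microsoft's outbound-connectivity guidance, so check it against the current documentation.

```shell
# outbound_method HAS_NAT_GW HAS_INSTANCE_PIP HAS_LB_OUTBOUND_RULE
# Each argument is "yes" or "no". Prints the method that carries the VM's egress.
outbound_method() {
  local nat_gw="$1" instance_pip="$2" lb_rule="$3"
  if [ "$nat_gw" = yes ]; then
    echo "NAT Gateway"
  elif [ "$instance_pip" = yes ]; then
    echo "Instance-level public IP"
  elif [ "$lb_rule" = yes ]; then
    echo "LB outbound rule"
  else
    echo "Default outbound access"
  fi
}

outbound_method yes no yes   # prints "NAT Gateway"
outbound_method no no no     # prints "Default outbound access"
```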

Outbound rules (Standard LB). On a Standard public LB you can define explicit outbound rules to control SNAT precisely:

Setting | What it controls | Gotcha
Frontend public IP(s) | The egress IP pool | More IPs = more total ports (64,000 each on a Load Balancer)
Backend pool | Which instances get this egress | IP-based pools can’t be used for outbound
Protocol | TCP / UDP / All | Match your outbound traffic
Port allocation | Ports per instance (manual) or “use default number of ports” | Manual is safer (see below)
Idle timeout | Outbound flow idle timeout (4–30 min) | Long-lived idle flows hold ports
TCP reset | Send RST on idle timeout | Cleaner connection teardown

SNAT port exhaustion, the classic incident. Symptoms: outbound connections start failing or timing out intermittently under load, especially in apps that open many connections to a few destinations (microservices hammering one API, a chatty database driver, connection-per-request HTTP clients). Cause: the instance ran out of SNAT ports for that destination. Math: each frontend IP on a Load Balancer contributes 64,000 ports, shared by the whole pool. If you let the LB auto-allocate, the per-instance allocation shrinks in tiers as the pool grows (1,024 ports per instance for up to 50 instances, halving at each tier down to 32 for 801–1,000), so a big pool can leave each VM with very few ports.
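
The port arithmetic is worth running for your own pool size. The sketch below assumes 64,000 usable ports per Load Balancer frontend IP, the tiered default allocation as I understand Microsoft's published table, and the rule that manual allocations are multiples of 8; default_ports and manual_ports are hypothetical helpers, and the per-IP figure can be overridden if you work from a different number.

```shell
# default_ports POOL_SIZE
# Ports per instance under default (automatic) allocation, by pool-size tier.
default_ports() {
  local n="$1"
  if   [ "$n" -le 50 ];  then echo 1024
  elif [ "$n" -le 100 ]; then echo 512
  elif [ "$n" -le 200 ]; then echo 256
  elif [ "$n" -le 400 ]; then echo 128
  elif [ "$n" -le 800 ]; then echo 64
  else                        echo 32
  fi
}

# manual_ports FRONTEND_IPS POOL_SIZE [PORTS_PER_IP]
# Largest even split of the port pool, rounded down to a multiple of 8.
manual_ports() {
  local ips="$1" n="$2" per_ip="${3:-64000}"
  local raw=$(( ips * per_ip / n ))
  echo $(( raw / 8 * 8 ))
}

echo "100 VMs, default allocation: $(default_ports 100) ports each"
echo "100 VMs, 1 frontend IP, manual: $(manual_ports 1 100) ports each"
echo "100 VMs, 2 frontend IPs, manual: $(manual_ports 2 100) ports each"
```

Read the other way round, a manual allocation of 10,000 ports per instance fits at most six instances per frontend IP under the same 64,000-port assumption.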

How to prevent / fix it:

  1. Use a NAT Gateway (best). It allocates ports dynamically on demand across up to 16 IPs (over a million ports), so a single noisy instance can borrow from the whole pool. This is the recommended modern answer and removes the whole class of problem.
  2. If using LB outbound rules, set port allocation manually to a sane per-instance number and add more frontend public IPs to grow the total pool.
  3. Reduce port pressure at the app: connection pooling / keep-alive (reuse connections instead of one-per-request), avoid opening thousands of short-lived flows to the same IP:port.
  4. Lower the idle timeout so abandoned flows release ports sooner, and enable TCP reset on idle.
  5. Use Private Link / service endpoints for Azure PaaS destinations so that traffic doesn’t consume SNAT at all (it stays on the backbone).
  6. Watch the metric SNAT Connection Count and the allocated/used SNAT-port metrics in Azure Monitor; alert before you hit the ceiling.

Gotcha: adding instances to fix throughput can make SNAT exhaustion worse under auto-allocation, because each instance gets a thinner slice of the shared port pool. Fix the egress design (NAT Gateway), don’t just scale out.

Cross-region (global) Load Balancer

A standard Load Balancer is regional. The cross-region (global) Load Balancer is a Standard-tier feature that puts a single, static, geo-anycast public IP in front of regional Standard Load Balancers in multiple regions. It is still Layer 4 — no L7 features — but it adds global reach and failover:

Use it when you need global, low-latency, L4 distribution with instant failover and a stable IP — for example a TCP/UDP service (not HTTP) that must be highly available across regions. If your global service is HTTP(S) and wants WAF/CDN/path routing, Front Door is the L7 answer instead (decision framework below).

Gateway Load Balancer

The Gateway Load Balancer (GWLB) is a specialised SKU for one job: transparently inserting third-party network virtual appliances (firewalls, IDS/IPS, deep-packet-inspection) into the traffic path without re-architecting routing. You “chain” a GWLB to a frontend (a public IP or another LB), and traffic is steered through a bump-in-the-wire pool of NVAs (using VXLAN encapsulation) and back, with source/destination preserved.

After creation: what you can (and can’t) change

Operation | Allowed? | Notes
Add/remove frontend IPs | ✅ | Add more public/private frontends to an existing LB
Add/remove/edit backend pool members | ✅ | Add VMs, attach a VMSS, switch which NICs/IPs are in the pool
Add/edit/delete rules, probes, NAT rules, outbound rules | ✅ | Day-to-day operations
Change a probe’s protocol/port/path/interval | ✅ | Takes effect quickly
Change a rule’s session persistence, idle timeout, floating IP | ✅ | Floating IP also needs backend OS changes
Change the SKU (Basic ↔ Standard) | ❌ | No in-place upgrade; recreate / migrate and cut over
Change a public IP’s SKU (Basic ↔ Standard) | ❌ | Tied to the LB SKU; recreate
Change a public IP from dynamic→static | ✅ | Standard IPs are static anyway
Switch a backend pool NIC-based ↔ IP-based | ❌ (per pool) | A pool’s model is fixed; create a new pool
Move an LB to another region | ❌ | Recreate in the target region (IaC makes this trivial)

The big one to remember for exams and real life: SKU is immutable. Migrating Basic→Standard means new frontend IP(s) (or carefully reusing IPs via Microsoft’s migration tooling), new LB, and a cutover.

Creating a Standard Load Balancer with az CLI

The portal presents these as Basics (subscription/RG, name, region, SKU, type public/internal, tier regional/global), then Frontend IP configuration, Backend pools, Inbound rules (LB rules + NAT rules), Outbound rules, Tags, Review + create. The CLI mirrors that flow:

LOC=eastus
RG=rg-lb-lab

az group create -n $RG -l $LOC

# Standard public IP (zone-redundant by default)
az network public-ip create -g $RG -n pip-lb \
  --sku Standard --allocation-method Static --zone 1 2 3

# Standard public Load Balancer with a frontend + an (empty) backend pool
az network lb create -g $RG -n lb-web \
  --sku Standard --public-ip-address pip-lb \
  --frontend-ip-name fe-web --backend-pool-name be-web

# Health probe (HTTP against /)
az network lb probe create -g $RG --lb-name lb-web \
  -n probe-http --protocol Http --port 80 --path / \
  --interval 5 --threshold 2

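# Load-balancing rule: frontend:80 -> backend:80, with the probe.
# Implicit outbound SNAT is disabled on this rule because the explicit
# outbound rule further down reuses the same frontend and backend pool.
az network lb rule create -g $RG --lb-name lb-web \
  -n rule-http --protocol Tcp \
  --frontend-ip-name fe-web --frontend-port 80 \
  --backend-pool-name be-web --backend-port 80 \
  --probe-name probe-http --idle-timeout 4 --enable-tcp-reset true \
  --disable-outbound-snat true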

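# Inbound NAT rule: SSH to a specific instance via frontend port 50001.
# The rule targets no VM until it is attached to that VM's NIC ip-config
# (az network nic ip-config inbound-nat-rule add).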
az network lb inbound-nat-rule create -g $RG --lb-name lb-web \
  -n ssh-vm1 --protocol Tcp \
  --frontend-ip-name fe-web --frontend-port 50001 --backend-port 22

# Outbound rule (explicit SNAT) — manual port allocation, e.g. 10000 ports/instance
az network lb outbound-rule create -g $RG --lb-name lb-web \
  -n outbound-web --protocol All \
  --frontend-ip-configs fe-web --address-pool be-web \
  --outbound-ports 10000 --idle-timeout 4 --enable-tcp-reset true

For an internal LB, swap the frontend for a subnet + private IP:

az network lb create -g $RG -n lb-internal --sku Standard \
  --vnet-name vnet-app --subnet snet-app \
  --frontend-ip-name fe-int --private-ip-address 10.10.2.10 \
  --backend-pool-name be-app

The same, in Bicep

param location string = resourceGroup().location

resource pip 'Microsoft.Network/publicIPAddresses@2023-09-01' = {
  name: 'pip-lb'
  location: location
  sku: { name: 'Standard' }
  zones: ['1','2','3']
  properties: { publicIPAllocationMethod: 'Static' }
}

resource lb 'Microsoft.Network/loadBalancers@2023-09-01' = {
  name: 'lb-web'
  location: location
  sku: { name: 'Standard' }
  properties: {
    frontendIPConfigurations: [
      {
        name: 'fe-web'
        properties: { publicIPAddress: { id: pip.id } }
      }
    ]
    backendAddressPools: [ { name: 'be-web' } ]
    probes: [
      {
        name: 'probe-http'
        properties: {
          protocol: 'Http'
          port: 80
          requestPath: '/'
          intervalInSeconds: 5
          numberOfProbes: 2
        }
      }
    ]
    loadBalancingRules: [
      {
        name: 'rule-http'
        properties: {
          protocol: 'Tcp'
          frontendPort: 80
          backendPort: 80
          enableTcpReset: true
          idleTimeoutInMinutes: 4
          frontendIPConfiguration: {
            id: resourceId('Microsoft.Network/loadBalancers/frontendIPConfigurations', 'lb-web', 'fe-web')
          }
          backendAddressPool: {
            id: resourceId('Microsoft.Network/loadBalancers/backendAddressPools', 'lb-web', 'be-web')
          }
          probe: {
            id: resourceId('Microsoft.Network/loadBalancers/probes', 'lb-web', 'probe-http')
          }
        }
      }
    ]
  }
}

The decision framework: which load-balancing service when

Here is the question that shows up on every Azure networking exam and in every architecture interview. Azure has four load-balancing services. They split cleanly on two axes: global vs regional, and Layer 4 (transport) vs Layer 7 (application/HTTP).

Scope | Layer 4 (TCP/UDP) | Layer 7 (HTTP/S)
Global | Cross-region Load Balancer (anycast IP, geo-proximity, L4) | Front Door (edge, WAF, CDN, path/host routing, global failover)
Regional | Load Balancer (the subject of this lesson) | Application Gateway (WAF, path/host routing, TLS, cookie affinity)

The four in one comparison:

Feature | Load Balancer | Application Gateway | Traffic Manager | Front Door
OSI layer | L4 (TCP/UDP) | L7 (HTTP/S) | DNS (no data path) | L7 (HTTP/S)
Scope | Regional | Regional | Global | Global
How it routes | Pass-through (rewrites dst, keeps client IP) | Reverse proxy in your VNet | Returns a DNS answer; client connects directly | Reverse proxy at Microsoft edge (anycast)
TLS termination | ❌ | ✅ | ❌ | ✅
URL path / host routing | ❌ | ✅ | ❌ | ✅
WAF | ❌ | ✅ (WAF SKU) | ❌ | ✅
CDN / caching | ❌ | ❌ | ❌ | ✅
Cookie session affinity | ❌ (IP affinity only) | ✅ | n/a | ✅
Health probes | TCP/HTTP/HTTPS | HTTP/HTTPS | HTTP/HTTPS/TCP endpoint checks | HTTP/HTTPS
Backends | VMs/VMSS/IPs in VNet | VMs/VMSS/App Service/IPs | Any endpoint (Azure or external, by DNS) | Azure or any public origin
Non-HTTP protocols | ✅ (any TCP/UDP) | ❌ (HTTP/S, plus WebSocket/HTTP2) | ✅ (it’s just DNS) | ❌ (HTTP/S)
Static anycast IP | Cross-region: ✅ | Regional VIP | No (DNS name) | ✅ (edge anycast)
Typical SLA | 99.99% | 99.95%+ | 99.99% (DNS) | 99.99%

Plain-English “which one when”:

The combinations interviewers love:

Exam tip — the two discriminators that resolve almost every question: (1) global or regional? (worldwide users / multi-region failover ⇒ Traffic Manager or Front Door; one region ⇒ Load Balancer or Application Gateway). (2) HTTP with L7 needs, or raw TCP/UDP? (URL/host routing, TLS, WAF, cookies ⇒ App Gateway or Front Door; anything-on-a-port ⇒ Load Balancer or Traffic Manager). Plot those two answers on the 2×2 grid and the service falls out.
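
The two discriminators reduce to a lookup. As a minimal sketch (pick_service is a hypothetical helper, and the global L4 answer lists both options because the grid names the cross-region Load Balancer while the exam tip names Traffic Manager):

```shell
# pick_service SCOPE LAYER
# SCOPE is "global" or "regional"; LAYER is "l4" (raw TCP/UDP) or "l7" (HTTP needs).
pick_service() {
  case "$1:$2" in
    regional:l4) echo "Load Balancer" ;;
    regional:l7) echo "Application Gateway" ;;
    global:l4)   echo "Traffic Manager or cross-region Load Balancer" ;;
    global:l7)   echo "Front Door" ;;
    *) echo "usage: pick_service global|regional l4|l7" >&2; return 1 ;;
  esac
}

pick_service regional l7   # prints "Application Gateway"
pick_service global l7     # prints "Front Door"
```

Real designs often need two answers at once, for example a global service in front of a regional one, so run the question once per tier rather than once per application.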

Diagram: Azure load balancing

[Diagram: a Standard public Load Balancer (frontend public IP, health probe, load-balancing rule, inbound NAT rules) distributing L4 traffic across a zone-spread backend pool of VMs, with outbound SNAT / NAT Gateway egress, an internal Load Balancer fronting the app tier, and the four-service decision grid.]

The diagram ties the lesson together: trace a client request hitting the global/edge tier, landing on a regional Standard public Load Balancer, passing a health probe and a load-balancing rule to a zone-spread backend pool; an inbound NAT rule carving out a management port to one VM; backends reaching out via SNAT / NAT Gateway; and an internal LB in front of the app tier. Around the edges sits the four-service decision grid — Load Balancer and Application Gateway regional, Traffic Manager and Front Door global — so you can place any scenario at a glance.

Hands-on lab

Build a Standard public Load Balancer that distributes HTTP across two VMs, prove the distribution works, then tear it all down. Everything is az CLI in Cloud Shell. The billable pieces are two tiny B1s VMs, a Standard public IP, and the Standard Load Balancer itself, all deleted at the end.

1. Resource group, VNet, and NSG

LOC=eastus
RG=rg-lb-lab
az group create -n $RG -l $LOC

az network vnet create -g $RG -n vnet-lb -l $LOC \
  --address-prefixes 10.30.0.0/16 \
  --subnet-name snet-web --subnet-prefixes 10.30.1.0/24

# NSG: a Standard LB is CLOSED by default — we MUST allow HTTP and the probe
az network nsg create -g $RG -n nsg-web -l $LOC
az network nsg rule create -g $RG --nsg-name nsg-web -n Allow-HTTP \
  --priority 100 --direction Inbound --access Allow --protocol Tcp \
  --source-address-prefixes Internet --destination-port-ranges 80
az network nsg rule create -g $RG --nsg-name nsg-web -n Allow-LB-Probe \
  --priority 110 --direction Inbound --access Allow --protocol '*' \
  --source-address-prefixes AzureLoadBalancer --destination-port-ranges '*'

az network vnet subnet update -g $RG --vnet-name vnet-lb -n snet-web \
  --network-security-group nsg-web

2. Standard public IP and Load Balancer

az network public-ip create -g $RG -n pip-lb \
  --sku Standard --allocation-method Static --zone 1 2 3

az network lb create -g $RG -n lb-web --sku Standard \
  --public-ip-address pip-lb \
  --frontend-ip-name fe-web --backend-pool-name be-web

az network lb probe create -g $RG --lb-name lb-web -n probe-http \
  --protocol Http --port 80 --path / --interval 5 --threshold 2

az network lb rule create -g $RG --lb-name lb-web -n rule-http \
  --protocol Tcp --frontend-ip-name fe-web --frontend-port 80 \
  --backend-pool-name be-web --backend-port 80 --probe-name probe-http \
  --idle-timeout 4 --enable-tcp-reset true

3. Two VMs, each serving its own hostname, added to the pool

# cloud-init: install nginx and write the VM's hostname to the index page
cat > /tmp/cloud-init-web.txt <<'EOF'
#cloud-config
package_update: true
packages: [nginx]
runcmd:
  - bash -c 'echo "Hello from $(hostname)" > /var/www/html/index.html'
EOF

for i in 1 2; do
  az vm create -g $RG -n web-vm$i -l $LOC \
    --image Ubuntu2204 --size Standard_B1s \
    --vnet-name vnet-lb --subnet snet-web \
    --nsg "" --public-ip-address "" \
    --custom-data /tmp/cloud-init-web.txt \
    --admin-username azureuser --generate-ssh-keys
done

# Add each VM's NIC ipconfig to the backend pool
for i in 1 2; do
  NICID=$(az vm show -g $RG -n web-vm$i --query "networkProfile.networkInterfaces[0].id" -o tsv)
  NIC=$(basename "$NICID")
  IPCFG=$(az network nic show --ids "$NICID" --query "ipConfigurations[0].name" -o tsv)
  az network nic ip-config address-pool add -g $RG \
    --nic-name "$NIC" --ip-config-name "$IPCFG" \
    --lb-name lb-web --address-pool be-web
done

Note: the VMs have no public IP and no per-NIC NSG — they’re reachable only through the LB, and the subnet NSG governs traffic.

4. Test the distribution

LBIP=$(az network public-ip show -g $RG -n pip-lb --query ipAddress -o tsv)
echo "LB public IP: $LBIP"

# Hit it repeatedly; responses should be a mix of web-vm1 and web-vm2 (not strict alternation)
for i in $(seq 1 10); do curl -s http://$LBIP/; done

Expected output: a mix of Hello from web-vm1 and Hello from web-vm2 across the ten requests (5-tuple distribution; exact ratio varies). If you see only one name, give cloud-init a minute to finish installing nginx on the other VM, then retry.
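
Counting the responses makes the split easier to read than eyeballing ten lines. A small sketch, where tally is a hypothetical helper that reads one response per line; the commented-out line shows how it might be wired to the curl loop above, assuming LBIP is still set.

```shell
# tally: read one response per line on stdin, print "COUNT RESPONSE" per distinct line.
tally() {
  sort | uniq -c | awk '{ count = $1; $1 = ""; sub(/^ /, ""); print count, $0 }'
}

# Against the lab (needs the load balancer to be up):
#   for i in $(seq 1 20); do curl -s "http://$LBIP/"; done | tally

# Offline demonstration with canned responses:
printf 'Hello from web-vm1\nHello from web-vm2\nHello from web-vm1\n' | tally
# prints:
#   2 Hello from web-vm1
#   1 Hello from web-vm2
```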

5. Validate health and rules

# Backend pool members
az network lb address-pool show -g $RG --lb-name lb-web -n be-web \
  --query "loadBalancerBackendAddresses[].name" -o tsv

# The rule and its probe
az network lb rule show -g $RG --lb-name lb-web -n rule-http \
  --query "{fePort:frontendPort,bePort:backendPort,protocol:protocol}" -o json

# Health probe status via the LB metric (Dimensions: dip status per backend)
az monitor metrics list --resource \
  $(az network lb show -g $RG -n lb-web --query id -o tsv) \
  --metric DipAvailability --interval PT1M -o table | head

Confirm both VMs appear in the pool, the rule maps 80→80/TCP, and the DipAvailability metric shows healthy backends.

Cleanup

az group delete -n $RG --yes --no-wait

Cost note

The Load Balancer itself in this lab costs a few rupees an hour (one rule + small data volume on Standard). The real cost is the two B1s VMs (~₹3–4/hour combined) and the Standard public IP (billed hourly even idle). Finish the lab in well under an hour, run az group delete immediately, and you’ll spend roughly ₹10–20 total. Leaving the VMs running overnight is what turns a ₹15 lab into a ₹100 surprise, so always delete the resource group.

Common mistakes & troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| Standard LB frontend returns nothing | No NSG allowing the data port (Standard is closed by default) | Add an NSG rule allowing your port (e.g. 80/443) inbound, and allow AzureLoadBalancer. |
| All backends show unhealthy | NSG/host firewall blocks the probe from 168.63.129.16; or probe path/port wrong | Allow service tag AzureLoadBalancer; verify the probe port/path returns 200. |
| Traffic goes to a backend that’s actually broken | TCP probe on an app that hangs but keeps the port open | Switch to an HTTP/HTTPS probe against a real /healthz. |
| Outbound calls fail intermittently under load | SNAT port exhaustion | Attach a NAT Gateway; or set manual outbound ports + add public IPs; pool connections. |
| Can’t add a peered-VNet VM to the pool | NIC-based pool requires same-VNet membership | Use an IP-based backend pool. |
| “Can’t attach Basic public IP to this LB” | SKU mismatch (Standard LB needs Standard IP) | Recreate the public IP as Standard. |
| Sessions break / cart empties for some users | App needs affinity but rule is 5-tuple, or NAT changes client IP | Use Client IP persistence (L4), or move to App Gateway for cookie affinity. |
| Long-lived idle connections drop | TCP idle timeout (default 4 min) elapsed | Raise idle timeout (up to 30 min) and/or send TCP keepalives; enable TCP reset. |
| SQL Always On listener won’t fail over correctly | Floating IP not enabled on the rule (or backend loopback not configured) | Enable Floating IP and configure the listener IP on the nodes. |
| No SLA / can’t use zones or HA Ports | Deployed a Basic LB | Migrate to Standard (no in-place upgrade). |
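The two most common fixes in this table are NSG rules, so here is a hedged sketch of what they might look like. The NSG name nsg-web and the default resource group rg-lb-lab are hypothetical placeholders, and the priorities are arbitrary. The script only prints the commands unless you set DRY_RUN=0, so you can review them before anything changes.

```shell
# Hypothetical names: replace with the resource group and subnet NSG from your lab.
RG="${RG:-rg-lb-lab}"
NSG="${NSG:-nsg-web}"

# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

nsg_fix_commands() {
  # 1. Let client traffic reach the backends on the data port (80 in this lab).
  run az network nsg rule create -g "$RG" --nsg-name "$NSG" -n allow-http-in \
    --priority 100 --direction Inbound --access Allow --protocol Tcp \
    --source-address-prefixes Internet --destination-port-ranges 80

  # 2. Let the health probe (source 168.63.129.16) reach the backends.
  run az network nsg rule create -g "$RG" --nsg-name "$NSG" -n allow-lb-probe \
    --priority 110 --direction Inbound --access Allow --protocol '*' \
    --source-address-prefixes AzureLoadBalancer --destination-port-ranges '*'
}

nsg_fix_commands
```

NSGs already carry a default AllowAzureLoadBalancerInBound rule at priority 65001, so the explicit probe rule mainly matters when a custom deny rule sits above that default.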

Best practices

Security notes

Cost & sizing

The Load Balancer bill has a small number of levers:

| Lever | Cost behaviour |
| --- | --- |
| Basic Load Balancer | Free (but retiring, so don’t build new on it) |
| Standard Load Balancer: rules | Charged per rule (load-balancing + outbound + NAT, banded; the first few rules at a flat hourly rate, then per additional rule) |
| Standard Load Balancer: data processed | Per GB of data processed by the LB |
| Standard public IP | Billed hourly even when idle (one per frontend) |
| NAT Gateway (if used for egress) | Hourly resource charge + per-GB processed + its public IP(s) |
| Cross-region (global) LB | Standard pricing + the regional LBs behind it |
| Data transfer | Standard egress / inter-zone / cross-region data charges apply to the traffic itself |

Sizing rules of thumb:

Interview & exam questions

  1. What’s the practical difference between Basic and Standard Load Balancer, and why does it matter? Standard adds a 99.99% SLA, availability-zone support, up to 1,000 backends, HTTPS probes, HA Ports, configurable outbound rules, cross-region LB, and full metrics — and it is secure by default (closed until an NSG allows traffic). Basic has none of that and is retiring (Sep 2025). There is no in-place upgrade, so you must start with Standard.

  2. A Standard LB returns nothing on its public IP even though backends are running. What’s the most likely cause? No NSG rule allowing the data port. A Standard LB is closed by default; you must add an NSG rule permitting (say) TCP 80/443 inbound and allow the AzureLoadBalancer service tag for the probe.

  3. Explain SNAT and SNAT port exhaustion. How do you fix it? Outbound traffic from private VMs is source-NAT’d to a public IP using one of the 64,512 SNAT ports per IP. Each concurrent flow to the same destination IP:port consumes its own port, while flows to different destinations can reuse a port. Under heavy outbound load to a few destinations, ports run out and connections fail intermittently. Fix: use a NAT Gateway (dynamic ports across up to 16 IPs), or set manual outbound port allocation and add public IPs, plus connection pooling and Private Link for PaaS.

  4. Why can adding more backend VMs make SNAT exhaustion worse? With automatic port allocation, the 64,512 ports are split across instances, so a larger pool means fewer ports per instance. Scaling out shrinks each VM’s share — the answer is to fix egress (NAT Gateway), not add instances.

  5. What is the difference between a load-balancing rule and an inbound NAT rule? A load-balancing rule distributes a frontend:port across the whole backend pool (uses a probe). An inbound NAT rule maps a specific frontend:port to one backend instance (e.g. SSH to VM-2) — no distribution, no probe.

  6. When would you enable Floating IP (DSR) on a rule? For workloads that need the original frontend IP preserved to the backend so the backend responds directly — chiefly SQL Server Always On availability group listeners and some clustering. It requires backend OS loopback configuration and is off by default.

  7. What are HA Ports and when do you use them? An HA Ports rule (protocol All, port 0) load-balances all TCP/UDP ports through a single rule on an internal Standard LB — built for active-active NVAs/firewalls (a “firewall sandwich”) that must forward arbitrary ports.

  8. Load Balancer vs Application Gateway — how do you choose? Load Balancer is L4 (TCP/UDP, no URL/TLS/WAF, pass-through, cheapest). Application Gateway is L7 (HTTP/S, URL/host routing, TLS termination, cookie affinity, WAF), a regional reverse proxy. Use App Gateway when you need any L7 feature; use LB for raw transport distribution.

  9. Traffic Manager vs Front Door — both are “global,” so what’s the difference? Traffic Manager is DNS-based — it returns the best endpoint and the client connects directly; it’s protocol-agnostic (any TCP/UDP, any endpoint incl. on-prem) but failover is bound to DNS TTL and it does no proxying/TLS/caching/WAF. Front Door is an L7 reverse proxy at Microsoft’s edge with WAF, CDN/caching, TLS, path/host routing and near-instant failover — for global HTTP(S) apps.

  10. Design: a web app must serve users on three continents with edge caching, a WAF, and regional private backends. What do you put in front? Front Door (global edge: anycast, WAF, CDN, TLS, latency routing) → optionally Application Gateway WAF regionally for L7 control → private backends. If backends are simple VM pools you can go Front Door → Load Balancer per region.

  11. What is the cross-region Load Balancer and when is it the right choice over Front Door? A Layer-4 global LB with a single anycast IP fronting regional Standard LBs, routing by geo-proximity with automatic failover and client-IP preservation. Choose it over Front Door when the global service is non-HTTP (TCP/UDP) or needs a static IP and pass-through behaviour rather than L7 proxying/WAF/CDN.

  12. What’s the difference between NIC-based and IP-based backend pools? NIC-based pools reference a VM’s NIC ipConfiguration and require same-VNet membership (the default for VMs/VMSS). IP-based pools reference private IPs, allow faster bulk/decoupled membership and cross-peered-VNet backends — but can’t be used for outbound SNAT. A pool is one model or the other, not both.

  13. Your app’s user sessions keep breaking behind the LB. Diagnose. The app is stateful but the rule uses default 5-tuple distribution, so each new flow may hit a different backend. Either enable Client IP session persistence (L4 affinity) — fragile when clients share NAT or change IP — or move to Application Gateway for robust cookie-based affinity. Best long-term fix: externalise session state.
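The SNAT arithmetic behind questions 3 and 4 can be sketched in a few lines. Two assumptions to flag loudly: the default-allocation tiers below are the figures I recall from Microsoft's outbound-connections documentation, not something stated in this lesson, and the manual formula (frontend IPs × 64,512 ÷ instances, rounded down to a multiple of 8) is likewise my reading of that documentation. Verify both against the current docs before sizing anything real. The function names are hypothetical.

```shell
SNAT_PORTS_PER_IP=64512   # usable SNAT ports per frontend public IP

# Assumed default (automatic) allocation: ports per instance, per frontend IP,
# stepping down as the backend pool grows.
default_snat_ports() {
  local n=$1
  if   [ "$n" -le 50 ];  then echo 1024
  elif [ "$n" -le 100 ]; then echo 512
  elif [ "$n" -le 200 ]; then echo 256
  elif [ "$n" -le 400 ]; then echo 128
  elif [ "$n" -le 800 ]; then echo 64
  else echo 32
  fi
}

# Assumed manual allocation: split all ports evenly across instances,
# rounded down to a multiple of 8.
manual_snat_ports() {
  local ips=$1 instances=$2
  local raw=$(( ips * SNAT_PORTS_PER_IP / instances ))
  echo $(( raw - raw % 8 ))
}

# Demo: the per-instance share shrinks as the pool grows.
for size in 2 50 51 200 1000; do
  echo "pool=$size default_ports_per_instance=$(default_snat_ports "$size")"
done
echo "manual, 1 IP, 10 instances: $(manual_snat_ports 1 10)"
```

Under these assumptions, going from 50 to 51 instances halves each VM's default share from 1,024 to 512 ports, which is the "scaling out makes it worse" effect from question 4.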

Quick check

  1. What is the source IP address of Azure Load Balancer health probes, and what must you do about it?
  2. A Standard public LB shows nothing on its frontend though backends are up. What’s the first thing to check?
  3. How many SNAT ports does one public IP provide, and what consumes one?
  4. Which load-balancing service operates at L7 and is global?
  5. True or false: you can upgrade a Basic Load Balancer to Standard in place.

Answers

  1. 168.63.129.16 — allow it inbound on NSGs/host firewalls via the AzureLoadBalancer service tag, or all backends show unhealthy.
  2. The NSG — a Standard LB is closed by default; ensure a rule allows your data port (and the probe).
  3. 64,512 SNAT ports per public IP; each concurrent outbound flow to a given destination IP:port consumes one.
  4. Front Door (L7, global, with WAF/CDN). (Application Gateway is L7 but regional; cross-region LB is global but L4.)
  5. False — SKU is immutable; you must migrate/recreate and cut over.
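Answer 4 relies on the two discriminators used throughout this lesson (global vs regional, L4 vs L7). As a memory aid, that matrix can be written as a small lookup. This is a sketch of the decision rule as the lesson states it, not an official selector, and pick_lb is a hypothetical name.

```shell
# Hypothetical helper: map (scope, layer) to the service this lesson recommends.
# Usage: pick_lb <regional|global> <l4|l7>
pick_lb() {
  case "$1/$2" in
    regional/l4) echo "Azure Load Balancer" ;;
    regional/l7) echo "Application Gateway" ;;
    global/l7)   echo "Front Door" ;;
    global/l4)   echo "Cross-region Load Balancer (or Traffic Manager for DNS-based routing)" ;;
    *) echo "usage: pick_lb <regional|global> <l4|l7>" >&2; return 1 ;;
  esac
}

# Demo: print the whole 2x2.
for scope in regional global; do
  for layer in l4 l7; do
    echo "$scope $layer -> $(pick_lb "$scope" "$layer")"
  done
done
```

Real designs often stack two cells, for example Front Door in front of a regional Load Balancer, so treat the lookup as a starting point for each tier rather than a single answer.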

Exercise

Extend the lab into a small, realistic design and write up your reasoning:

  1. Add an internal tier. Deploy a second, internal Standard LB (fe-int on a new snet-app subnet) in front of two “app” VMs, and have your web VMs call the internal LB’s private IP. You now have the canonical public-LB → internal-LB n-tier pattern.
  2. Add deterministic egress. Attach a NAT Gateway to the web subnet and confirm (via curl ifconfig.me from a VM) that outbound now leaves via the NAT Gateway’s IP, not LB SNAT. Note how this removes SNAT-exhaustion risk.
  3. Add management access without public IPs. Create an inbound NAT rule (:50001→web-vm1:22, :50002→web-vm2:22) and SSH in via ssh -p 50001 azureuser@<LBIP>.
  4. Decide the global layer. In a short paragraph, choose between Front Door and Traffic Manager to make this app multi-region, and justify it using the two discriminators (global-vs-regional, L4-vs-L7). State explicitly what each would and wouldn’t give you.
  5. Clean up with az group delete.

Deliverable: the working two-tier deployment plus a few sentences on (a) why you chose NIC- or IP-based pools, (b) how you’d size SNAT, and (c) your global-layer choice.
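For step 3, once the inbound NAT rules exist it can be convenient to turn the port scheme into SSH client configuration. A minimal sketch, assuming the exercise's mapping (frontend port 50000 + i reaches web-vm i on port 22) and the lab's azureuser account. The helper name lab_ssh_config is hypothetical, and 203.0.113.10 is a documentation-range placeholder rather than a real LB address.

```shell
# Hypothetical helper: print ~/.ssh/config stanzas for N web VMs behind the LB.
# Usage: lab_ssh_config <lb-public-ip> <vm-count>
lab_ssh_config() {
  local lb_ip=$1 count=$2
  local i=1
  while [ "$i" -le "$count" ]; do
    printf 'Host web-vm%s\n  HostName %s\n  Port %s\n  User azureuser\n\n' \
      "$i" "$lb_ip" "$(( 50000 + i ))"
    i=$(( i + 1 ))
  done
}

# Demo with a placeholder IP; substitute the value of $LBIP from the lab.
lab_ssh_config 203.0.113.10 2
```

Appending the output to ~/.ssh/config lets you run ssh web-vm1 instead of remembering which frontend port belongs to which VM.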

Certification mapping

| Exam | Where this lesson applies |
| --- | --- |
| AZ-104 (Administrator) | Configure load balancing: create/configure Azure Load Balancer (SKUs, frontends, backend pools, health probes, LB & NAT rules) and understand Application Gateway at a configuration level; troubleshoot health and connectivity. |
| AZ-305 (Solutions Architect) | Design network solutions: choose the right load-balancing service (the 2×2: global/regional × L4/L7), design for availability zones, multi-region resilience, and combine Front Door/App Gateway/LB. |
| AZ-700 (Network Engineer) | Design and implement load balancing: deep configuration of Load Balancer (outbound rules, SNAT, HA Ports, cross-region), Application Gateway, Front Door, Traffic Manager, and integrating them. |

Glossary

Next steps
