Deploy F5 BIG-IP Virtual Edition on AWS with Active-Active GWLB Inspection

A payments company is consolidating twelve product VPCs behind a shared-services account and the security architect has one hard requirement: every packet leaving a workload VPC for the internet, and every packet crossing between VPCs, must pass through a real F5 BIG-IP for L4-L7 inspection — TLS visibility, an iRule-driven WAF policy, and IPS — with no single point of failure and no re-IP of the application fleet. The previous design pinned all traffic through a single BIG-IP in one Availability Zone; when that AZ blipped during a maintenance window, the entire egress path went dark and the on-call paged at 3 a.m. This guide rebuilds that inspection layer the way it should have been done the first time: a pair (or more) of BIG-IP Virtual Edition appliances behind an AWS Gateway Load Balancer (GWLB), inspecting transparently in active-active across two AZs, so losing an appliance or a whole zone costs you capacity, not connectivity.

GWLB is the piece that makes this clean. It is a bump-in-the-wire L3 load balancer that uses the GENEVE protocol (UDP 6081) to tunnel original, unmodified packets to a pool of appliances and bring them back, while preserving 5-tuple flow stickiness so both directions of a connection always hit the same BIG-IP — which is exactly what a stateful firewall needs. You insert it with a GWLB endpoint (GWLBE) in each spoke and a few route-table edits; the application VPCs never learn that an inspection layer exists. This is the centralized-inspection pattern AWS documents, built with F5 as the virtual appliance.

Prerequisites

Target topology

The design is a centralized inspection VPC fronted by GWLB, with spoke VPCs steering traffic to it through GWLB endpoints.

Everything below provisions this with Terraform, configures the BIG-IPs to inspect GENEVE traffic, validates the path, and wires it into the operating model.

1. Lay down the inspection VPC and BIG-IP subnets

Start with the network. Keep the management and data (GWLB-facing) subnets separate in each AZ: BIG-IP VE is multi-NIC, and you do not want GENEVE traffic and SSH sharing an interface.

# versions.tf
terraform {
  required_version = ">= 1.6"
  required_providers {
    aws = { source = "hashicorp/aws", version = ">= 5.40" }
  }
}

# vpc.tf
locals {
  azs = ["us-east-1a", "us-east-1b"]
}

resource "aws_vpc" "inspection" {
  cidr_block           = "10.100.0.0/16"
  enable_dns_hostnames = true
  tags = { Name = "insp-vpc", Tier = "security" }
}

# Per-AZ: mgmt (.0.x/.1.x), data/GWLB (.10.x/.11.x)
resource "aws_subnet" "mgmt" {
  for_each          = { "a" = ["10.100.0.0/24", local.azs[0]], "b" = ["10.100.1.0/24", local.azs[1]] }
  vpc_id            = aws_vpc.inspection.id
  cidr_block        = each.value[0]
  availability_zone = each.value[1]
  tags = { Name = "insp-mgmt-${each.key}" }
}

resource "aws_subnet" "data" {
  for_each          = { "a" = ["10.100.10.0/24", local.azs[0]], "b" = ["10.100.11.0/24", local.azs[1]] }
  vpc_id            = aws_vpc.inspection.id
  cidr_block        = each.value[0]
  availability_zone = each.value[1]
  tags = { Name = "insp-data-${each.key}" }
}

# Public subnets for the egress NAT/IGW path, one per AZ
resource "aws_subnet" "public" {
  for_each          = { "a" = ["10.100.20.0/24", local.azs[0]], "b" = ["10.100.21.0/24", local.azs[1]] }
  vpc_id            = aws_vpc.inspection.id
  cidr_block        = each.value[0]
  availability_zone = each.value[1]
  map_public_ip_on_launch = true
  tags = { Name = "insp-public-${each.key}" }
}

resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.inspection.id
  tags   = { Name = "insp-igw" }
}
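The public subnets above have no route table or NAT gateway yet, so nothing can actually egress through them. A minimal sketch of that missing piece follows; the resource names (`aws_eip.nat`, `aws_nat_gateway.egress`, `aws_route_table.public`) are illustrative assumptions, not part of the original listings, and you may not need the NAT gateways at all if your egress design differs.

```hcl
# nat.tf (sketch): names below are illustrative assumptions
resource "aws_eip" "nat" {
  for_each = aws_subnet.public
  domain   = "vpc"
  tags     = { Name = "insp-nat-eip-${each.key}" }
}

# One NAT gateway per AZ so a zone failure does not take out all egress
resource "aws_nat_gateway" "egress" {
  for_each      = aws_subnet.public
  allocation_id = aws_eip.nat[each.key].id
  subnet_id     = each.value.id
  depends_on    = [aws_internet_gateway.igw]
  tags          = { Name = "insp-nat-${each.key}" }
}

# Public subnets default-route to the internet gateway
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.inspection.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
  tags = { Name = "insp-public-rt" }
}

resource "aws_route_table_association" "public" {
  for_each       = aws_subnet.public
  subnet_id      = each.value.id
  route_table_id = aws_route_table.public.id
}
```

If you adopt this, add the new resources to the `-target` list when applying the network slice, or simply apply them with the full stack later.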

Apply this slice first so the AZ/subnet IDs exist for the appliance and GWLB resources:

terraform init
terraform apply -target=aws_vpc.inspection \
  -target=aws_subnet.mgmt -target=aws_subnet.data \
  -target=aws_subnet.public -target=aws_internet_gateway.igw

2. Launch two BIG-IP VE appliances, one per AZ

Each BIG-IP gets two NICs: mgmt (eth0, for the GUI/SSH/iControl REST API) and data (eth1, where GWLB delivers GENEVE-tunneled packets). Pull the Marketplace AMI dynamically so the module is region-portable.

# bigip.tf
data "aws_ami" "bigip" {
  most_recent = true
  owners      = ["679593333241"] # AWS Marketplace
  filter {
    name   = "name"
    # PAYG bundle with LTM+AFM+ASM (Good/Better/Best). Match your subscription.
    values = ["F5 BIGIP-17.* PAYG-Best 25Mbps*"]
  }
}

resource "aws_security_group" "bigip_mgmt" {
  name_prefix = "bigip-mgmt-"
  vpc_id      = aws_vpc.inspection.id

  # HCL allows only one argument in a single-line block, so each rule is spelled out
  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/8"]
  }
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/8"]
  }
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "bigip_data" {
  name_prefix = "bigip-data-"
  vpc_id      = aws_vpc.inspection.id

  # GENEVE from GWLB
  ingress {
    from_port   = 6081
    to_port     = 6081
    protocol    = "udp"
    cidr_blocks = ["10.100.0.0/16"]
  }
  # GWLB health check (TCP/8080 to a BIG-IP virtual server we create in step 4)
  ingress {
    from_port   = 8080
    to_port     = 8080
    protocol    = "tcp"
    cidr_blocks = ["10.100.0.0/16"]
  }
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Management ENIs (eth0), one per AZ
resource "aws_network_interface" "mgmt" {
  for_each        = aws_subnet.mgmt
  subnet_id       = each.value.id
  security_groups = [aws_security_group.bigip_mgmt.id]
  tags = { Name = "bigip-mgmt-${each.key}" }
}

# Data-plane ENIs (source/dest check OFF: this box routes through, not to, itself)
resource "aws_network_interface" "data" {
  for_each          = aws_subnet.data
  subnet_id         = each.value.id
  security_groups   = [aws_security_group.bigip_data.id]
  source_dest_check = false
  tags = { Name = "bigip-data-${each.key}" }
}

resource "aws_instance" "bigip" {
  for_each      = aws_subnet.mgmt
  ami           = data.aws_ami.bigip.id
  instance_type = "m5.xlarge"
  key_name      = var.key_pair_name
  # network_interface blocks conflict with subnet_id / vpc_security_group_ids on
  # the instance, so both NICs are attached as pre-built ENIs.
  # eth0 = mgmt
  network_interface {
    network_interface_id = aws_network_interface.mgmt[each.key].id
    device_index         = 0
  }
  # eth1 = data
  network_interface {
    network_interface_id = aws_network_interface.data[each.key].id
    device_index         = 1
  }
  user_data = file("${path.module}/bigip-onboard.sh")  # F5 Declarative Onboarding bootstrap
  tags = { Name = "bigip-ve-${each.key}", Role = "inspection" }
}
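The instance resource reads `var.key_pair_name`, which none of the listings declare. A minimal declaration might look like the sketch below; the file name and the description text are assumptions.

```hcl
# variables.tf (sketch)
variable "key_pair_name" {
  description = "Name of an existing EC2 key pair used for initial SSH access to the BIG-IPs"
  type        = string

  validation {
    condition     = length(var.key_pair_name) > 0
    error_message = "key_pair_name must not be empty."
  }
}
```

Pass the value with `-var key_pair_name=...` or a `.tfvars` file so that `terraform plan` does not prompt for it interactively.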

The source_dest_check = false on the data ENI is mandatory — without it AWS drops the transit packets and you will chase a phantom “GENEVE arrives but nothing returns” bug for hours. Set the management password and licensing via F5 Declarative Onboarding (DO) in bigip-onboard.sh; pull the BIG-IP admin password and (for BYOL) the license key from HashiCorp Vault at boot rather than baking them into user-data:

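#!/bin/bash
# bigip-onboard.sh: runs on first boot via cloud-init
# Assumes the vault CLI is present on the image and that VAULT_ADDR and a token
# (or another auth method) are already available to this script.
set -euo pipefail
ADMIN_PW=$(vault kv get -field=admin_password secret/aws/bigip/inspection)
cat > /config/cloud/do.json <<EOF
{ "schemaVersion": "1.0.0", "class": "Device", "Common": { "class": "Tenant",
  "myProvision": { "class": "Provision", "ltm": "nominal", "afm": "nominal", "asm": "nominal" },
  "admin": { "class": "User", "userType": "regular", "password": "${ADMIN_PW}", "shell": "bash" }
}}
EOF
# Writing the file does not apply it. The declaration takes effect once the DO
# package is installed and the file is POSTed to /mgmt/shared/declarative-onboarding
# (for example by F5 BIG-IP Runtime Init).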

3. Create the Gateway Load Balancer and target group

GWLB lives in the data subnets and load-balances across both BIG-IPs. The target group protocol is GENEVE on port 6081; the health check is a separate, ordinary TCP/HTTP probe to a port the BIG-IP answers.

# gwlb.tf
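resource "aws_lb" "gwlb" {
  name               = "insp-gwlb"
  load_balancer_type = "gateway"
  subnets            = [for s in aws_subnet.data : s.id]
  # Off by default for GWLB. Enable it so the appliance in the other AZ can take
  # over when the local one fails (at the cost of inter-AZ data transfer).
  enable_cross_zone_load_balancing = true
  tags = { Name = "insp-gwlb" }
}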

resource "aws_lb_target_group" "bigip" {
  name        = "insp-bigip-tg"
  protocol    = "GENEVE"
  port        = 6081
  vpc_id      = aws_vpc.inspection.id
  # Instance targets receive traffic on the primary ENI (eth0 = mgmt here),
  # so register the data ENI IPs instead.
  target_type = "ip"

  health_check {
    protocol = "TCP"
    port     = 8080          # answered by the health-check virtual server in step 4
    interval = 10
    healthy_threshold   = 3
    unhealthy_threshold = 3
  }
  # Hash flows on source IP, destination IP, and protocol so both directions
  # of a conversation reach the same appliance
  stickiness {
    type    = "source_ip_dest_ip_proto"
    enabled = true
  }
}

resource "aws_lb_target_group_attachment" "bigip" {
  for_each         = aws_network_interface.data
  target_group_arn = aws_lb_target_group.bigip.arn
  target_id        = each.value.private_ip
  # The ENI only answers health checks once it is attached to a running BIG-IP
  depends_on       = [aws_instance.bigip]
}

resource "aws_lb_listener" "gwlb" {
  load_balancer_arn = aws_lb.gwlb.arn
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.bigip.arn
  }
}

# Publish GWLB as an endpoint service the spokes consume
resource "aws_vpc_endpoint_service" "gwlb" {
  acceptance_required        = false
  gateway_load_balancer_arns = [aws_lb.gwlb.arn]
  tags = { Name = "insp-gwlb-svc" }
}

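The stickiness setting narrows GWLB's default 5-tuple flow hash to a 3-tuple (source IP, destination IP, protocol), so every conversation between the same pair of hosts lands on the same appliance in both directions, which is what a stateful firewall needs in active-active. Stickiness on its own does not decide what happens to existing flows when a target is deregistered or turns unhealthy; that is governed by the target group's target-failover attributes, which default to not rebalancing.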
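How existing flows behave when an appliance leaves the pool is controlled separately from stickiness, through the target group's `target_failover` block. The fragment below is a sketch of opting in to rebalancing; whether you want it depends on how your firewall policy treats mid-stream packets for flows it has no state for, so treat the choice of `rebalance` as an assumption to validate.

```hcl
resource "aws_lb_target_group" "bigip" {
  # ... arguments from gwlb.tf stay as they are ...

  # Move existing flows to a healthy target when their target is deregistered
  # or turns unhealthy. Both values must match; the default is "no_rebalance",
  # which keeps sending existing flows to the failed target until they time out.
  target_failover {
    on_deregistration = "rebalance"
    on_unhealthy      = "rebalance"
  }
}
```

With `no_rebalance`, clients pinned to a dead appliance stall until their flow entry ages out; with `rebalance`, they are moved promptly but arrive at an appliance that has never seen the connection.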

4. Configure BIG-IP to inspect the GENEVE traffic

This is the F5-specific heart of the build, done in-guest (via the GUI, tmsh, or Ansible with f5networks.f5_modules). BIG-IP must (a) terminate the GENEVE tunnel from GWLB, (b) hand decapsulated traffic to a wildcard forwarding virtual server that applies your firewall/WAF policy, and (c) expose a health-check VIP on TCP 8080.

# Run in tmsh on each BIG-IP. Create the GENEVE tunnel that faces GWLB.
# local-address must be this appliance's own data-plane IP (10.100.10.10 is an example).
create net tunnels tunnel gwlb-tunnel { profile geneve local-address 10.100.10.10 }

# A wildcard "L3 forwarding" virtual that inspects everything arriving on the tunnel.
# ip-forward makes it a Forwarding (IP) virtual = transparent in/out; attach AFM + ASM policy here.
create ltm virtual vs_inspect_all {
    destination 0.0.0.0:any
    ip-forward
    ip-protocol any
    profiles add { fastL4 { } }
    vlans add { gwlb-tunnel }
    vlans-enabled
    translate-address disabled
    translate-port disabled
}

# Health-check responder GWLB probes on TCP 8080.
# Create the iRule first so the virtual server can reference it.
create ltm rule hc_200_ok {
    when CLIENT_ACCEPTED { TCP::respond "HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"; TCP::close }
}
create ltm virtual vs_healthcheck {
    destination 0.0.0.0:8080
    ip-protocol tcp
    profiles add { tcp { } }
    rules { hc_200_ok }
}
save sys config

Attach your security policy to vs_inspect_all: an AFM network-firewall policy for the L3-L4 rules and an ASM WAF policy (or an iRule) for L7. Because the virtual is translate-address disabled, BIG-IP inspects the original client IP and destination transparently — the application servers still see the real source, and GWLB returns the packet over the same GENEVE tunnel. This is what “transparent bump-in-the-wire” means in F5 terms.
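The tunnel's `local-address` has to match the data-plane IP that AWS actually assigned to each appliance, and the listings in step 2 do not fix that address. One low-effort way to find it is a Terraform output over the data ENIs, sketched below; the output name `bigip_data_ips` is an assumption.

```hcl
# outputs-bigip.tf (sketch)
output "bigip_data_ips" {
  description = "Primary private IP of each BIG-IP data ENI, keyed by AZ suffix"
  value       = { for k, eni in aws_network_interface.data : k => eni.private_ip }
}
```

Run `terraform output bigip_data_ips` after the appliances are created and substitute each value for the example `10.100.10.10` when creating the tunnel on the matching BIG-IP.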

5. Insert GWLB endpoints in the spokes and fix the route tables

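Now make a spoke VPC actually use the inspection layer. Create a GWLBE per AZ in the spoke, then edit route tables so traffic detours through it. The pattern for egress inspection: the spoke's private subnets default-route to the GWLBE, and the GWLBE's subnet default-routes to the IGW/NAT egress path. In a real multi-VPC build that egress path lives in the inspection VPC and is reached through a Transit Gateway (VPC peering cannot route to a peer's IGW or NAT gateway); the listing here is simplified for clarity. It references aws_vpc.spoke, aws_subnet.spoke_gwlbe, and aws_route_table.spoke_app, which are assumed to be defined elsewhere in your configuration and keyed "a" and "b".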

# spoke-endpoints.tf
resource "aws_vpc_endpoint" "gwlbe" {
  for_each          = toset(["a", "b"])
  service_name      = aws_vpc_endpoint_service.gwlb.service_name
  vpc_endpoint_type = "GatewayLoadBalancer"
  vpc_id            = aws_vpc.spoke.id
  subnet_ids        = [aws_subnet.spoke_gwlbe[each.key].id]
}

# Spoke app subnet default-routes INTO the GWLB endpoint (traffic gets inspected)
resource "aws_route" "app_to_gwlbe" {
  for_each               = aws_route_table.spoke_app
  route_table_id         = each.value.id
  destination_cidr_block = "0.0.0.0/0"
  vpc_endpoint_id        = aws_vpc_endpoint.gwlbe[each.key].id
}

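# Inspection VPC data subnets: default route for traffic the appliances
# originate themselves. Inspected packets do not use this table; they return
# to the GWLBE inside the GENEVE tunnel and follow the GWLBE subnet's routes.
resource "aws_route_table" "data_egress" {
  for_each = aws_subnet.data
  vpc_id   = aws_vpc.inspection.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
}

# Without an association the route table above is never used
resource "aws_route_table_association" "data_egress" {
  for_each       = aws_subnet.data
  subnet_id      = each.value.id
  route_table_id = aws_route_table.data_egress[each.key].id
}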
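The spoke-side listing refers to `aws_vpc.spoke`, `aws_subnet.spoke_gwlbe`, and `aws_route_table.spoke_app` without defining them. If you want a self-contained test spoke, something like the sketch below would satisfy those references. The CIDRs, the `spoke_app` subnet resource, and all tag names are placeholders chosen for illustration; only the three referenced resource names come from the guide.

```hcl
# spoke.tf (sketch): CIDRs and tags are placeholders
resource "aws_vpc" "spoke" {
  cidr_block = "10.1.0.0/16"
  tags       = { Name = "spoke-vpc" }
}

# Application subnets, one per AZ
resource "aws_subnet" "spoke_app" {
  for_each          = { "a" = ["10.1.0.0/24", local.azs[0]], "b" = ["10.1.1.0/24", local.azs[1]] }
  vpc_id            = aws_vpc.spoke.id
  cidr_block        = each.value[0]
  availability_zone = each.value[1]
  tags              = { Name = "spoke-app-${each.key}" }
}

# Dedicated subnets that hold only the GWLB endpoints
resource "aws_subnet" "spoke_gwlbe" {
  for_each          = { "a" = ["10.1.10.0/24", local.azs[0]], "b" = ["10.1.11.0/24", local.azs[1]] }
  vpc_id            = aws_vpc.spoke.id
  cidr_block        = each.value[0]
  availability_zone = each.value[1]
  tags              = { Name = "spoke-gwlbe-${each.key}" }
}

# One route table per app subnet so each AZ can point at its local GWLBE
resource "aws_route_table" "spoke_app" {
  for_each = aws_subnet.spoke_app
  vpc_id   = aws_vpc.spoke.id
  tags     = { Name = "spoke-app-rt-${each.key}" }
}

resource "aws_route_table_association" "spoke_app" {
  for_each       = aws_subnet.spoke_app
  subnet_id      = each.value.id
  route_table_id = aws_route_table.spoke_app[each.key].id
}
```

Keeping the endpoints in their own subnets matters: the GWLBE subnet needs a different default route from the application subnets, otherwise traffic returning from inspection loops straight back into the endpoint.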

Apply the whole stack and capture the GWLB endpoint IDs:

terraform plan -out tf.plan && terraform apply tf.plan
terraform output -json | jq '.gwlbe_ids.value'   # requires an output named gwlbe_ids
aws ec2 describe-vpc-endpoints \
  --filters Name=vpc-endpoint-type,Values=GatewayLoadBalancer \
  --query 'VpcEndpoints[].{id:VpcEndpointId,state:State}' --output table
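The commands in this guide read Terraform outputs named `gwlbe_ids`, `bigip_tg_arn`, and `bigip_a_id`, but none of the listings define them. A minimal `outputs.tf` that would make those commands work could look like this; the file name and descriptions are assumptions, the output names are the ones the commands already use.

```hcl
# outputs.tf (sketch)
output "gwlbe_ids" {
  description = "GWLB endpoint IDs in the spoke, keyed by AZ suffix"
  value       = { for k, ep in aws_vpc_endpoint.gwlbe : k => ep.id }
}

output "bigip_tg_arn" {
  description = "ARN of the GWLB target group that fronts the BIG-IPs"
  value       = aws_lb_target_group.bigip.arn
}

output "bigip_a_id" {
  description = "Instance ID of the BIG-IP in the first AZ, used by the failover drill"
  value       = aws_instance.bigip["a"].id
}
```

Apply once more after adding the file so the outputs are written to state before you run the validation and failover commands.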

6. Wire it into the operating model

Inspection appliances that no one watches rot. Bolt the BIG-IP pair onto the platform you already run:

This is also exactly the kind of shared virtual-appliance inspection layer that fronts other internet-facing estates — e.g. a university’s Moodle LMS or an Akamai-fronted web property where Akamai handles edge CDN/WAF and the BIG-IP pair does the deeper L4-L7 inspection and IPS on traffic reaching origin.

Validation

Prove the path before you trust it.

  1. Targets healthy. Both BIG-IPs should be healthy in the GWLB target group:

    aws elbv2 describe-target-health \
      --target-group-arn "$(terraform output -raw bigip_tg_arn)" \
      --query 'TargetHealthDescriptions[].{id:Target.Id,health:TargetHealth.State}' --output table
    
  2. Traffic is actually inspected. From a spoke instance, generate egress and watch it on the BIG-IP. On each appliance:

    # On the BIG-IP, confirm GENEVE arrives and the wildcard virtual sees connections
    tcpdump -nni 0.0 udp port 6081 -c 20            # GENEVE-tunneled packets from GWLB
    tmsh show ltm virtual vs_inspect_all | grep -E 'Connections|Bits'
    

    From the spoke:

    curl -s https://ifconfig.me ; echo          # should succeed via the inspection path
    
  3. Active-active distribution. Run sustained traffic from several source IPs and confirm connections land on both appliances (tmsh show ltm virtual on each shows non-zero, roughly balanced connection counts) — GWLB hashes by flow, so many flows spread; a single long flow stays pinned.

  4. AFM/ASM enforcement. Send a request that violates the WAF policy (e.g. a crafted SQLi pattern) and confirm BIG-IP blocks it and logs the event to Datadog/the SOC.

Failover test (chaos drill)

Do not wait for a real AZ event. Stop one BIG-IP and prove the survivor carries the load:

aws ec2 stop-instances --instance-ids "$(terraform output -raw bigip_a_id)"
# Within ~30s (3 failed health checks * 10s) the target group marks it unhealthy
watch -n5 'aws elbv2 describe-target-health \
  --target-group-arn "$(terraform output -raw bigip_tg_arn)" \
  --query "TargetHealthDescriptions[].TargetHealth.State"'
# Re-run the curl from the spoke — egress must still succeed, now via bigip-b only

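Once the health checks fail, new flows hash to the healthy appliance, provided cross-zone load balancing is enabled so the GWLB node in the affected AZ can use the target in the other zone. What happens to existing flows depends on the target group's target-failover setting: with the default (no_rebalance), GWLB keeps sending them to the stopped box until they time out; with rebalance, they move to the survivor, which holds no state for them, so they reset once and reconnect. A single reset is acceptable for a stateful firewall failure. Start the instance again and confirm that new flows spread across both appliances.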

Rollback / teardown

Because the spokes only reach the inspection layer through a default route into the GWLBE, you can un-insert the inspection layer instantly without destroying the appliances — restore the spoke’s default route to its own NAT gateway:

# Emergency bypass: point the spoke app subnets back at the NAT gateway
aws ec2 replace-route --route-table-id "$SPOKE_RTB" \
  --destination-cidr-block 0.0.0.0/0 --nat-gateway-id "$SPOKE_NAT"
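A manual `replace-route` leaves Terraform state out of step with reality, so the next apply would quietly re-insert the inspection layer. If you would rather keep the bypass in code, one option is a boolean switch on the spoke default route, sketched here. The variable `inspection_bypass` and the resource `aws_nat_gateway.spoke` are hypothetical names; the route shown is meant to replace the `aws_route.app_to_gwlbe` definition from step 5, not sit alongside it.

```hcl
# bypass.tf (sketch): replaces aws_route.app_to_gwlbe from spoke-endpoints.tf
variable "inspection_bypass" {
  description = "Set to true to send spoke egress straight to the spoke NAT gateway"
  type        = bool
  default     = false
}

resource "aws_route" "app_to_gwlbe" {
  for_each               = aws_route_table.spoke_app
  route_table_id         = each.value.id
  destination_cidr_block = "0.0.0.0/0"

  # Exactly one target is set; the other is null.
  vpc_endpoint_id = var.inspection_bypass ? null : aws_vpc_endpoint.gwlbe[each.key].id
  nat_gateway_id  = var.inspection_bypass ? aws_nat_gateway.spoke[each.key].id : null
}
```

In an incident the CLI command is still the faster lever; follow it with `terraform apply -var inspection_bypass=true` so state matches what is deployed.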

Full teardown is ordinary Terraform, but order matters — GWLB endpoints must go before the endpoint service and the GWLB:

terraform destroy -target=aws_vpc_endpoint.gwlbe      # remove spoke endpoints first
terraform destroy -target=aws_vpc_endpoint_service.gwlb
terraform destroy                                     # then the rest

If a destroy hangs on the endpoint service, confirm no GWLBE connections remain (aws ec2 describe-vpc-endpoint-connections); a lingering accepted connection blocks deletion.

Common pitfalls

Security notes

Run the appliances under least privilege: management plane reachable only from the corporate CIDR or via SSM, SAML SSO through Okta/Entra with MFA for admins, and the local admin account as break-glass only, with its credential in HashiCorp Vault. For TLS inspection (SSL forward proxy), the signing CA key is the crown jewel: store it in Vault, never in the repo or on disk. Let Wiz continuously verify the posture (no public data ENI, no over-broad SG) and Wiz Code gate the Terraform PR; let CrowdStrike Falcon cover the host runtime. AFM enforces the L3-L4 firewall and ASM the L7 WAF on the wildcard virtual, so inspection is a real control, not a passthrough. Pin the BIG-IP software version explicitly (the AMI lookup in step 2 uses most_recent with a 17.* wildcard, which floats) and patch through the ServiceNow change gate.

Cost notes

The recurring spend is two BIG-IP VE instances (m5.xlarge on-demand running 24/7 adds up; buy Savings Plans or Reserved Instances once the design is stable), the F5 license (PAYG hourly bundle versus BYOL, which is usually cheaper at sustained, steady utilization), the GWLB hourly + LCU/GB-processed charge, and the GWLB endpoint hourly + data-processing charge per spoke. Watch cross-AZ data transfer: one appliance per AZ keeps inspection in-zone as long as each GWLB node uses only its local target, whereas cross-zone load balancing (which a surviving appliance needs in order to absorb another zone's traffic) sends part of the traffic across AZs and is billed as inter-AZ transfer. Right-size the instance type to measured throughput in Datadog rather than guessing: BIG-IP VE PAYG tiers are throughput-capped (e.g. 25 Mbps/200 Mbps/1 Gbps), so an over-spec'd tier is pure waste. Scaling out (more appliances in the same target group) is the lever for capacity; the active-active design means each added appliance adds usable throughput, not just standby.
