Deploy F5 BIG-IP Virtual Edition on AWS with Active-Active GWLB Inspection

A payments company is consolidating twelve product VPCs behind a shared-services account and the security architect has one hard requirement: every packet leaving a workload VPC for the internet, and every packet crossing between VPCs, must pass through a real F5 BIG-IP for L4-L7 inspection — TLS visibility, an iRule-driven WAF policy, and IPS — with no single point of failure and no re-IP of the application fleet. The previous design pinned all traffic through a single BIG-IP in one Availability Zone; when that AZ blipped during a maintenance window, the entire egress path went dark and the on-call paged at 3 a.m. This guide rebuilds that inspection layer the way it should have been done the first time: a pair (or more) of BIG-IP Virtual Edition appliances behind an AWS Gateway Load Balancer (GWLB), inspecting transparently in active-active across two AZs, so losing an appliance or a whole zone costs you capacity, not connectivity.

GWLB is the piece that makes this clean. It is a bump-in-the-wire L3 load balancer that uses the GENEVE protocol (UDP 6081) to tunnel original, unmodified packets to a pool of appliances and bring them back, while preserving 5-tuple flow stickiness so both directions of a connection always hit the same BIG-IP — which is exactly what a stateful firewall needs. You insert it with a GWLB endpoint (GWLBE) in each spoke and a few route-table edits; the application VPCs never learn that an inspection layer exists. This is the centralized-inspection pattern AWS documents, built with F5 as the virtual appliance.

Prerequisites

An AWS account (or a shared inspection/security VPC in a multi-account org via AWS Organizations) and Terraform >= 1.6 with the aws provider >= 5.40. Ansible is optional for in-guest BIG-IP config.
A subscribed F5 BIG-IP Virtual Edition AMI from AWS Marketplace — either BYOL or a PAYG / hourly bundle that includes LTM + AFM + ASM (you need the firewall and WAF modules, not just LTM). Note the AMI ID per region.
An EC2 key pair, and an instance type F5 supports for VE — m5.xlarge or larger (BIG-IP VE wants 8 GB+ RAM and at least 4 vCPU for production throughput).
A grasp of GWLB, GENEVE, VPC route tables, and the fact that GWLB targets are registered by instance ID or IP, health-checked on a TCP/HTTP port you choose.
CLI access: aws v2 configured, terraform, jq, and SSH. A bastion or SSM access into the management subnet.
IAM permissions to create VPCs, GWLB, VPC endpoints, route tables, and EC2 instances, plus (for secrets) read access to the relevant HashiCorp Vault path.

Target topology

[Figure: target topology diagram]

The design is a centralized inspection VPC fronted by GWLB, with spoke VPCs steering traffic to it through GWLB endpoints:

A dedicated inspection VPC (10.100.0.0/16) spanning two AZs (a and b). Each AZ holds three subnets: a management subnet, a data/GENEVE subnet where GWLB delivers tunneled traffic to the appliance, and a public subnet for the egress path.
Two BIG-IP VE instances — one per AZ — each registered as a GWLB target. Both are active and inspecting simultaneously; GWLB hashes flows across them. Lose one and its share rebalances to the survivor.
A Gateway Load Balancer with one target group containing both BIG-IPs, health-checked so a failed appliance is pulled in seconds.
A GWLB endpoint service (powered by AWS PrivateLink) that the spokes consume. Each spoke VPC gets a GWLBE in each AZ.
Route tables that force the path: spoke subnet → GWLBE → GWLB → BIG-IP (inspect) → GWLB → GWLBE → out via the Internet/NAT gateway, which lives in the inspection VPC. GWLB flow stickiness keeps both directions of a connection on the same appliance, provided the return traffic is also routed back through the GWLBE.

Everything below provisions this with Terraform, configures the BIG-IPs to inspect GENEVE traffic, validates the path, and wires it into the operating model.

1. Lay down the inspection VPC and BIG-IP subnets

Start with the network. Keep the management, data (GWLB target), and public subnets separate per AZ: BIG-IP VE is multi-NIC and you do not want GENEVE traffic and SSH sharing an interface.

# versions.tf
terraform {
  required_version = ">= 1.6"
  required_providers {
    aws = { source = "hashicorp/aws", version = ">= 5.40" }
  }
}

# vpc.tf
locals {
  azs = ["us-east-1a", "us-east-1b"]
}

resource "aws_vpc" "inspection" {
  cidr_block           = "10.100.0.0/16"
  enable_dns_hostnames = true
  tags = { Name = "insp-vpc", Tier = "security" }
}

# Per-AZ: mgmt (.0.x/.1.x), data/GWLB (.10.x/.11.x)
resource "aws_subnet" "mgmt" {
  for_each          = { "a" = ["10.100.0.0/24", local.azs[0]], "b" = ["10.100.1.0/24", local.azs[1]] }
  vpc_id            = aws_vpc.inspection.id
  cidr_block        = each.value[0]
  availability_zone = each.value[1]
  tags = { Name = "insp-mgmt-${each.key}" }
}

resource "aws_subnet" "data" {
  for_each          = { "a" = ["10.100.10.0/24", local.azs[0]], "b" = ["10.100.11.0/24", local.azs[1]] }
  vpc_id            = aws_vpc.inspection.id
  cidr_block        = each.value[0]
  availability_zone = each.value[1]
  tags = { Name = "insp-data-${each.key}" }
}

# Public subnets for the egress NAT/IGW path, one per AZ
resource "aws_subnet" "public" {
  for_each          = { "a" = ["10.100.20.0/24", local.azs[0]], "b" = ["10.100.21.0/24", local.azs[1]] }
  vpc_id            = aws_vpc.inspection.id
  cidr_block        = each.value[0]
  availability_zone = each.value[1]
  map_public_ip_on_launch = true
  tags = { Name = "insp-public-${each.key}" }
}

resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.inspection.id
  tags   = { Name = "insp-igw" }
}

Apply this slice first so the AZ/subnet IDs exist for the appliance and GWLB resources:

terraform init
terraform apply -target=aws_vpc.inspection \
  -target=aws_subnet.mgmt -target=aws_subnet.data \
  -target=aws_subnet.public -target=aws_internet_gateway.igw
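The public subnets above are created without a route table, so nothing placed in them can reach the IGW yet. The following is a minimal sketch of the missing pieces, assuming one NAT gateway per AZ is wanted for the egress path. The resource names (public, nat) and tags are illustrative assumptions, not part of the original module.

```hcl
# Sketch: route the public subnets to the IGW and add one NAT gateway per AZ.
# Resource names and tags here are illustrative assumptions.
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.inspection.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
  tags = { Name = "insp-public-rt" }
}

resource "aws_route_table_association" "public" {
  for_each       = aws_subnet.public
  subnet_id      = each.value.id
  route_table_id = aws_route_table.public.id
}

# One Elastic IP and NAT gateway per AZ so each zone has its own egress path
resource "aws_eip" "nat" {
  for_each = aws_subnet.public
  domain   = "vpc"
  tags     = { Name = "insp-nat-eip-${each.key}" }
}

resource "aws_nat_gateway" "nat" {
  for_each      = aws_subnet.public
  allocation_id = aws_eip.nat[each.key].id
  subnet_id     = each.value.id
  depends_on    = [aws_internet_gateway.igw]
  tags          = { Name = "insp-nat-${each.key}" }
}
```

These resources are not in the -target list above, so they would be created by a later full apply.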

2. Launch two BIG-IP VE appliances, one per AZ

Each BIG-IP gets two NICs: mgmt (eth0, for the GUI/SSH/iControl REST API) and data (eth1, where GWLB delivers GENEVE-tunneled packets). Pull the Marketplace AMI dynamically so the module is region-portable.

# bigip.tf
data "aws_ami" "bigip" {
  most_recent = true
  owners      = ["679593333241"] # AWS Marketplace
  filter {
    name   = "name"
    # PAYG bundle with LTM+AFM+ASM (Good/Better/Best). Match your subscription.
    values = ["F5 BIGIP-17.* PAYG-Best 25Mbps*"]
  }
}

resource "aws_security_group" "bigip_mgmt" {
  name_prefix = "bigip-mgmt-"
  vpc_id      = aws_vpc.inspection.id
  # HTTPS (GUI / iControl REST) from internal ranges only
  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/8"]
  }
  # SSH from internal ranges only
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/8"]
  }
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "bigip_data" {
  name_prefix = "bigip-data-"
  vpc_id      = aws_vpc.inspection.id
  # GENEVE from GWLB
  ingress {
    from_port   = 6081
    to_port     = 6081
    protocol    = "udp"
    cidr_blocks = ["10.100.0.0/16"]
  }
  # GWLB health check (TCP/8080 to a BIG-IP virtual server we create in step 4)
  ingress {
    from_port   = 8080
    to_port     = 8080
    protocol    = "tcp"
    cidr_blocks = ["10.100.0.0/16"]
  }
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Data-plane ENIs (source/dest check OFF — this box routes through, not to, itself)
resource "aws_network_interface" "data" {
  for_each          = aws_subnet.data
  subnet_id         = each.value.id
  security_groups   = [aws_security_group.bigip_data.id]
  source_dest_check = false
  tags = { Name = "bigip-data-${each.key}" }
}

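# Management ENIs (eth0). Declared explicitly because aws_instance cannot combine
# subnet_id / vpc_security_group_ids with network_interface blocks.
resource "aws_network_interface" "mgmt" {
  for_each        = aws_subnet.mgmt
  subnet_id       = each.value.id
  security_groups = [aws_security_group.bigip_mgmt.id]
  tags = { Name = "bigip-mgmt-${each.key}" }
}

resource "aws_instance" "bigip" {
  for_each      = aws_subnet.mgmt
  ami           = data.aws_ami.bigip.id
  instance_type = "m5.xlarge"
  key_name      = var.key_pair_name
  # eth0 = mgmt
  network_interface {
    network_interface_id = aws_network_interface.mgmt[each.key].id
    device_index         = 0
  }
  # eth1 = data
  network_interface {
    network_interface_id = aws_network_interface.data[each.key].id
    device_index         = 1
  }
  user_data = file("${path.module}/bigip-onboard.sh")  # F5 Declarative Onboarding bootstrap
  tags = { Name = "bigip-ve-${each.key}", Role = "inspection" }
}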

The source_dest_check = false on the data ENI is mandatory — without it AWS drops the transit packets and you will chase a phantom “GENEVE arrives but nothing returns” bug for hours. Set the management password and licensing via F5 Declarative Onboarding (DO) in bigip-onboard.sh; pull the BIG-IP admin password and (for BYOL) the license key from HashiCorp Vault at boot rather than baking them into user-data:

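#!/bin/bash
# bigip-onboard.sh: runs on first boot via cloud-init.
# Assumes the vault CLI is installed and authenticated on the instance
# (VAULT_ADDR plus a token or auth method); a stock VE image does not ship it.
ADMIN_PW=$(vault kv get -field=admin_password secret/aws/bigip/inspection)
mkdir -p /config/cloud
cat > /config/cloud/do.json <<EOF
{ "schemaVersion": "1.0.0", "class": "Device", "Common": { "class": "Tenant",
  "myProvision": { "class": "Provision", "ltm": "nominal", "afm": "nominal", "asm": "nominal" },
  "admin": { "class": "User", "userType": "regular", "password": "${ADMIN_PW}", "shell": "bash" }
}}
EOF
# This only writes the declaration. It still has to be POSTed to the DO endpoint
# (/mgmt/shared/declarative-onboarding) once the DO extension is installed on the VE.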
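The instance resource in bigip.tf reads var.key_pair_name, which none of the snippets declare. A minimal declaration might look like the sketch below; the description text and the validation rule are assumptions added for illustration.

```hcl
# variables.tf (sketch): declares the input referenced as var.key_pair_name in bigip.tf.
variable "key_pair_name" {
  description = "Name of an existing EC2 key pair used for first-boot SSH access to the BIG-IPs"
  type        = string

  # Fail early at plan time rather than with an EC2 API error at apply time
  validation {
    condition     = length(var.key_pair_name) > 0
    error_message = "key_pair_name must not be empty."
  }
}
```

The value can then be supplied with -var or a tfvars file at plan time.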

3. Create the Gateway Load Balancer and target group

GWLB lives in the data subnets and load-balances across both BIG-IPs. The target group protocol is GENEVE on port 6081; the health check is a separate, ordinary TCP/HTTP probe to a port the BIG-IP answers.

# gwlb.tf
resource "aws_lb" "gwlb" {
  name               = "insp-gwlb"
  load_balancer_type = "gateway"
  subnets            = [for s in aws_subnet.data : s.id]
  tags = { Name = "insp-gwlb" }
}

resource "aws_lb_target_group" "bigip" {
  name        = "insp-bigip-tg"
  protocol    = "GENEVE"
  port        = 6081
  vpc_id      = aws_vpc.inspection.id
  # Register by IP: instance targets receive traffic on the primary ENI (mgmt, eth0),
  # but GENEVE and the health check must reach the data ENI (eth1).
  target_type = "ip"

  health_check {
    protocol = "TCP"
    port     = 8080          # answered by the health-check virtual server in step 4
    interval = 10
    healthy_threshold   = 3
    unhealthy_threshold = 3
  }
  # 3-tuple stickiness: all flows for a source IP / destination IP / protocol share one appliance
  stickiness {
    type    = "source_ip_dest_ip_proto"
    enabled = true
  }
}

resource "aws_lb_target_group_attachment" "bigip" {
  for_each         = aws_network_interface.data
  target_group_arn = aws_lb_target_group.bigip.arn
  target_id        = each.value.private_ip
}

resource "aws_lb_listener" "gwlb" {
  load_balancer_arn = aws_lb.gwlb.arn
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.bigip.arn
  }
}

# Publish GWLB as an endpoint service the spokes consume
resource "aws_vpc_endpoint_service" "gwlb" {
  acceptance_required        = false
  gateway_load_balancer_arns = [aws_lb.gwlb.arn]
  tags = { Name = "insp-gwlb-svc" }
}

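source_ip_dest_ip_proto selects 3-tuple stickiness (source IP, destination IP, protocol) in place of GWLB's default 5-tuple hash, so every flow between a given client and server over the same protocol lands on the same appliance. That helps a stateful firewall when one application session spans several connections, at the cost of coarser load distribution. What happens to existing flows when a target is deregistered or turns unhealthy is a separate setting: the target group's target failover behavior, which by default leaves existing flows on their original target rather than rebalancing them.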
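If existing flows should be moved to a healthy appliance when their target fails or is deregistered, the target group's target_failover block is the relevant knob. Below is a hedged sketch of a variant of the target group above. The name bigip_rebalance is hypothetical and chosen only to avoid a clash; in practice the block would be added to the existing resource rather than created alongside it. IP targets are assumed so that GENEVE lands on the data ENI.

```hcl
# Sketch only: a variant of aws_lb_target_group.bigip with explicit failover behavior.
# "bigip_rebalance" is a hypothetical name; do not deploy both target groups.
resource "aws_lb_target_group" "bigip_rebalance" {
  name        = "insp-bigip-tg-rebal"
  protocol    = "GENEVE"
  port        = 6081
  vpc_id      = aws_vpc.inspection.id
  target_type = "ip" # assumption: data-plane ENI IPs are registered as targets

  health_check {
    protocol            = "TCP"
    port                = 8080
    interval            = 10
    healthy_threshold   = 3
    unhealthy_threshold = 3
  }

  stickiness {
    type    = "source_ip_dest_ip_proto"
    enabled = true
  }

  # Both attributes must carry the same value.
  # "rebalance" moves existing flows to a healthy target; "no_rebalance" is the default.
  target_failover {
    on_deregistration = "rebalance"
    on_unhealthy      = "rebalance"
  }
}
```

Whether rebalancing helps depends on the firewall: a stateful appliance that has never seen a flow may still drop it mid-connection, so this trades a hung flow for a fast reset rather than guaranteeing continuity.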

4. Configure BIG-IP to inspect the GENEVE traffic

This is the F5-specific heart of the build, done in-guest (via the GUI, tmsh, or Ansible with f5networks.f5_modules). BIG-IP must (a) terminate the GENEVE tunnel from GWLB, (b) hand decapsulated traffic to a wildcard forwarding virtual server that applies your firewall/WAF policy, and (c) expose a health-check VIP on TCP 8080.

# Run in tmsh on each BIG-IP. Create the GENEVE tunnel that faces GWLB.
# local-address is this appliance's data-plane self IP (example value for AZ a).
create net tunnels tunnel gwlb-tunnel { profile geneve local-address 10.100.10.10 }

# A wildcard "L3 forwarding" virtual that inspects everything arriving on the tunnel.
# ip-forward makes this a Forwarding (IP) virtual: transparent in/out; attach AFM + ASM policy here.
create ltm virtual vs_inspect_all {
    destination 0.0.0.0:any
    ip-forward
    ip-protocol any
    profiles add { fastL4 { } }
    vlans add { gwlb-tunnel }
    vlans-enabled
    translate-address disabled
    translate-port disabled
}

# Health-check responder GWLB probes on TCP 8080.
# The iRule is created first because the virtual server references it.
create ltm rule hc_200_ok {
    when CLIENT_ACCEPTED { TCP::respond "HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"; TCP::close }
}
create ltm virtual vs_healthcheck {
    destination 0.0.0.0:8080
    ip-protocol tcp
    profiles add { tcp { } }
    rules { hc_200_ok }
}
save sys config

Attach your security policy to vs_inspect_all: an AFM network-firewall policy for the L3-L4 rules and an ASM WAF policy (or an iRule) for L7. Because the virtual has translate-address disabled, BIG-IP inspects the original client IP and destination transparently: the application servers still see the real source, and BIG-IP hands the packet back to GWLB over the same GENEVE tunnel. This is what “transparent bump-in-the-wire” means in F5 terms.

5. Insert GWLB endpoints in the spokes and fix the route tables

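Now make a spoke VPC actually use the inspection layer. Create a GWLBE per AZ in the spoke, then edit route tables so traffic detours through it. For egress inspection, the spoke’s private subnets default-route to the GWLBE in their own AZ, and the GWLBE’s subnet default-routes onward to the IGW/NAT path. In a real multi-VPC build that egress path lives in the inspection VPC and is reached via a Transit Gateway (VPC peering cannot route through to another VPC’s IGW or NAT gateway); it is shown in simplified form here for clarity. The snippet assumes the spoke VPC (aws_vpc.spoke), its GWLBE subnets (aws_subnet.spoke_gwlbe), and its app route tables (aws_route_table.spoke_app) are already defined and keyed by the same "a"/"b" AZ labels.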

# spoke-endpoints.tf
resource "aws_vpc_endpoint" "gwlbe" {
  for_each          = toset(["a", "b"])
  service_name      = aws_vpc_endpoint_service.gwlb.service_name
  vpc_endpoint_type = "GatewayLoadBalancer"
  vpc_id            = aws_vpc.spoke.id
  subnet_ids        = [aws_subnet.spoke_gwlbe[each.key].id]
}

# Spoke app subnet default-routes INTO the GWLB endpoint (traffic gets inspected)
resource "aws_route" "app_to_gwlbe" {
  for_each               = aws_route_table.spoke_app
  route_table_id         = each.value.id
  destination_cidr_block = "0.0.0.0/0"
  vpc_endpoint_id        = aws_vpc_endpoint.gwlbe[each.key].id
}

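# Inspection VPC: after BIG-IP, traffic egresses to the internet via IGW
resource "aws_route_table" "data_egress" {
  for_each = aws_subnet.data
  vpc_id   = aws_vpc.inspection.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
}

# Without an association the route table above is never used by the data subnets
resource "aws_route_table_association" "data_egress" {
  for_each       = aws_subnet.data
  subnet_id      = each.value.id
  route_table_id = aws_route_table.data_egress[each.key].id
}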
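The endpoint and route resources above reference a spoke VPC, GWLBE subnets, and app route tables that are not defined anywhere in this guide. A minimal sketch of that scaffolding follows. The CIDRs, tags, and the local.spoke map are assumptions for illustration; the keys "a" and "b" are chosen to line up with the for_each keys used above.

```hcl
# Sketch of the spoke-side resources referenced by spoke-endpoints.tf.
# CIDRs and names are illustrative assumptions.
locals {
  spoke = {
    a = { az = local.azs[0], app = "10.1.0.0/24", gwlbe = "10.1.250.0/28" }
    b = { az = local.azs[1], app = "10.1.1.0/24", gwlbe = "10.1.251.0/28" }
  }
}

resource "aws_vpc" "spoke" {
  cidr_block = "10.1.0.0/16"
  tags       = { Name = "spoke-vpc" }
}

# Workload subnets, one per AZ
resource "aws_subnet" "spoke_app" {
  for_each          = local.spoke
  vpc_id            = aws_vpc.spoke.id
  cidr_block        = each.value.app
  availability_zone = each.value.az
  tags              = { Name = "spoke-app-${each.key}" }
}

# Small dedicated subnets that hold only the GWLB endpoints
resource "aws_subnet" "spoke_gwlbe" {
  for_each          = local.spoke
  vpc_id            = aws_vpc.spoke.id
  cidr_block        = each.value.gwlbe
  availability_zone = each.value.az
  tags              = { Name = "spoke-gwlbe-${each.key}" }
}

# One route table per AZ so each app subnet targets the GWLBE in its own zone
resource "aws_route_table" "spoke_app" {
  for_each = local.spoke
  vpc_id   = aws_vpc.spoke.id
  tags     = { Name = "spoke-app-rt-${each.key}" }
}

resource "aws_route_table_association" "spoke_app" {
  for_each       = local.spoke
  subnet_id      = aws_subnet.spoke_app[each.key].id
  route_table_id = aws_route_table.spoke_app[each.key].id
}
```

Keeping a separate route table per AZ is what lets each app subnet use its local GWLBE and avoid sending traffic across zones just to reach an endpoint.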

Apply the whole stack and capture the GWLB endpoint IDs:

terraform plan -out tf.plan && terraform apply tf.plan
terraform output -json | jq '.gwlbe_ids.value'
aws ec2 describe-vpc-endpoints \
  --filters Name=vpc-endpoint-type,Values=GatewayLoadBalancer \
  --query 'VpcEndpoints[].{id:VpcEndpointId,state:State}' --output table
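The CLI commands in this guide read three Terraform outputs (gwlbe_ids, bigip_tg_arn, bigip_a_id) that the snippets never declare. A sketch of an outputs.tf that would satisfy them, assuming the resource names used in the earlier steps:

```hcl
# outputs.tf (sketch): output names match those used by the CLI commands in this guide.
output "bigip_tg_arn" {
  description = "ARN of the GWLB target group containing the BIG-IPs"
  value       = aws_lb_target_group.bigip.arn
}

output "bigip_a_id" {
  description = "Instance ID of the BIG-IP in AZ a (used by the failover drill)"
  value       = aws_instance.bigip["a"].id
}

output "gwlbe_ids" {
  description = "GWLB endpoint IDs in the spoke, keyed by AZ label"
  value       = { for k, ep in aws_vpc_endpoint.gwlbe : k => ep.id }
}
```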

6. Wire it into the operating model

Inspection appliances that no one watches rot. Bolt the BIG-IP pair onto the platform you already run:

Identity / management access. Front the BIG-IP Configuration Utility and iControl REST API with SSO via Okta (or Microsoft Entra ID) using SAML, so admins authenticate with the corporate IdP and MFA instead of local BIG-IP accounts; map IdP groups to BIG-IP roles (admin vs. auditor). Local admin stays as glass-break only, its password leased from HashiCorp Vault.
Secrets. All sensitive values — the BIG-IP admin password, BYOL license/registration keys, the TLS private keys used for SSL-forward-proxy inspection — live in HashiCorp Vault and are pulled at boot or rotated via the Vault agent, never committed to the Terraform repo.
Cloud posture & IaC scanning. Run Wiz for agentless CSPM across the inspection account — it flags a BIG-IP data ENI that drifts to a public IP, a security group opened too wide, or a missing source_dest_check change — and run Wiz Code in the pull request to scan the Terraform for misconfigurations (public exposure, permissive SGs) before merge.
Runtime security. Endpoint workloads behind the inspection layer run CrowdStrike Falcon sensors for runtime threat detection on the EC2 fleet; Falcon detections and BIG-IP AFM/ASM blocks both feed the SOC, giving correlated network-plus-host visibility.
Observability. Ship BIG-IP telemetry with the F5 Telemetry Streaming (TS) declaration to Datadog (or Dynatrace) — throughput per appliance, CPU/TMM utilization, active connections, and GWLB target health — with a dashboard and a monitor that alerts when a target goes unhealthy or one appliance carries a lopsided share of flows. Pair it with CloudWatch metrics on the GWLB target group (HealthyHostCount, UnHealthyHostCount).
Change management. Every Terraform apply and every BIG-IP policy change flows through a ServiceNow change request; a guardrail breach (an appliance unhealthy for > 5 min, a sustained ASM attack signature) auto-raises a ServiceNow incident so network security gets a ticket, not just a Datadog blip.
Pipeline. The Terraform and the BIG-IP AS3/DO declarations live in Git; GitHub Actions (or Jenkins) runs terraform plan, the Wiz Code scan, and a policy lint on every PR, authenticating to AWS via OIDC with no stored keys. For the Kubernetes-adjacent pieces of the platform, Argo CD reconciles the declared state. Ansible (with f5networks.f5_modules) applies the in-guest tmsh/AS3 config so the appliances are reproducible, not hand-built.
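The CloudWatch side of the observability item can be expressed in the same Terraform. The sketch below raises an alarm when any BIG-IP target has been unhealthy for five consecutive one-minute periods, mirroring the five-minute guardrail mentioned under change management. The alarm name and the alert_sns_topic_arn variable are hypothetical; how the alarm reaches Datadog or ServiceNow is left to whatever integration is already in place.

```hcl
# Sketch: alarm on unhealthy BIG-IP targets behind the GWLB.
# var.alert_sns_topic_arn is a hypothetical input pointing at an existing SNS topic.
variable "alert_sns_topic_arn" {
  description = "SNS topic that receives inspection-layer alarms (assumed to exist)"
  type        = string
}

resource "aws_cloudwatch_metric_alarm" "bigip_unhealthy" {
  alarm_name          = "insp-gwlb-unhealthy-bigip"
  alarm_description   = "At least one BIG-IP target has failed GWLB health checks for 5 minutes"
  namespace           = "AWS/GatewayELB"
  metric_name         = "UnHealthyHostCount"
  statistic           = "Maximum"
  period              = 60
  evaluation_periods  = 5
  comparison_operator = "GreaterThanOrEqualToThreshold"
  threshold           = 1

  dimensions = {
    LoadBalancer = aws_lb.gwlb.arn_suffix
    TargetGroup  = aws_lb_target_group.bigip.arn_suffix
  }

  alarm_actions = [var.alert_sns_topic_arn]
  ok_actions    = [var.alert_sns_topic_arn]
}
```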

This is also exactly the kind of shared virtual-appliance inspection layer that fronts other internet-facing estates — e.g. a university’s Moodle LMS or an Akamai-fronted web property where Akamai handles edge CDN/WAF and the BIG-IP pair does the deeper L4-L7 inspection and IPS on traffic reaching origin.

Validation

Prove the path before you trust it.

Targets healthy. Both BIG-IPs should be healthy in the GWLB target group:

aws elbv2 describe-target-health \
  --target-group-arn "$(terraform output -raw bigip_tg_arn)" \
  --query 'TargetHealthDescriptions[].{id:Target.Id,health:TargetHealth.State}' --output table

Traffic is actually inspected. From a spoke instance, generate egress and watch it on the BIG-IP. On each appliance:

# On the BIG-IP, confirm GENEVE arrives and the wildcard virtual sees connections
tcpdump -nni 0.0 udp port 6081 -c 20            # GENEVE-tunneled packets from GWLB
tmsh show ltm virtual vs_inspect_all | grep -E 'Connections|Bits'

From the spoke:

curl -s https://ifconfig.me ; echo          # should succeed via the inspection path

Active-active distribution. Run sustained traffic from several source IPs and confirm connections land on both appliances (tmsh show ltm virtual on each shows non-zero, roughly balanced connection counts) — GWLB hashes by flow, so many flows spread; a single long flow stays pinned.
AFM/ASM enforcement. Send a request that violates the WAF policy (e.g. a crafted SQLi pattern) and confirm BIG-IP blocks it and logs the event to Datadog/the SOC.

Failover test (chaos drill)

Do not wait for a real AZ event. Stop one BIG-IP and prove the survivor carries the load:

aws ec2 stop-instances --instance-ids "$(terraform output -raw bigip_a_id)"
# Within ~30s (3 failed health checks * 10s) the target group marks it unhealthy
watch -n5 'aws elbv2 describe-target-health \
  --target-group-arn "$(terraform output -raw bigip_tg_arn)" \
  --query "TargetHealthDescriptions[].TargetHealth.State"'
# Re-run the curl from the spoke — egress must still succeed, now via bigip-b only

Once the target is marked unhealthy, new flows hash to the healthy appliance. Flows that were pinned to the stopped box break (acceptable for a stateful firewall failure) and have to be re-established by the client, after which they go through the survivor. Start the instance again and, once it is healthy, confirm that new flows spread across both appliances.

Rollback / teardown

Because the spokes only reach the inspection layer through a default route into the GWLBE, you can un-insert the inspection layer instantly without destroying the appliances — restore the spoke’s default route to its own NAT gateway:

# Emergency bypass: point the spoke app subnets back at the NAT gateway
aws ec2 replace-route --route-table-id "$SPOKE_RTB" \
  --destination-cidr-block 0.0.0.0/0 --nat-gateway-id "$SPOKE_NAT"
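A manual replace-route leaves Terraform state out of step with reality, and the next apply would put the GWLBE route back. One way to keep the bypass in code is a boolean toggle on the default route. This is a sketch under assumptions: inspection_enabled and spoke_nat_gateway_ids are hypothetical variables, and the resource would take the place of aws_route.app_to_gwlbe rather than sit beside it.

```hcl
# Sketch: make the emergency bypass a Terraform toggle instead of a manual route edit.
# Both variables are hypothetical; this resource would replace aws_route.app_to_gwlbe.
variable "inspection_enabled" {
  description = "false = bypass inspection and send spoke egress to the spoke NAT gateways"
  type        = bool
  default     = true
}

variable "spoke_nat_gateway_ids" {
  description = "Existing spoke NAT gateway IDs keyed by AZ label (a, b)"
  type        = map(string)
}

resource "aws_route" "app_default" {
  for_each               = aws_route_table.spoke_app
  route_table_id         = each.value.id
  destination_cidr_block = "0.0.0.0/0"

  # Exactly one target is set; the other is null and therefore omitted
  vpc_endpoint_id = var.inspection_enabled ? aws_vpc_endpoint.gwlbe[each.key].id : null
  nat_gateway_id  = var.inspection_enabled ? null : var.spoke_nat_gateway_ids[each.key]
}
```

Applying with inspection_enabled set to false then performs the bypass, and setting it back to true re-inserts the inspection layer, with both changes visible in the plan and the change record.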

Full teardown is ordinary Terraform, but order matters — GWLB endpoints must go before the endpoint service and the GWLB:

terraform destroy -target=aws_vpc_endpoint.gwlbe      # remove spoke endpoints first
terraform destroy -target=aws_vpc_endpoint_service.gwlb
terraform destroy                                     # then the rest

If a destroy hangs on the endpoint service, confirm no GWLBE connections remain (aws ec2 describe-vpc-endpoint-connections); a lingering accepted connection blocks deletion.

Common pitfalls

source_dest_check left on. The single most common failure: GENEVE packets arrive at the BIG-IP data ENI but never return. Disable source/dest check on the data ENI (step 2).
Health check pointed at GENEVE/6081. GWLB cannot health-check the GENEVE port itself — it needs a real TCP/HTTP responder. Use the dedicated TCP/8080 virtual server (step 4); if you skip it, both targets show unhealthy and all traffic blackholes.
Asymmetric routing. If return traffic does not come back through GWLB (e.g. the inspection VPC’s egress route is wrong), the stateful BIG-IP sees half a conversation and drops it. Keep stickiness on and verify both directions traverse the appliance.
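Cross-AZ data charges and AZ affinity. With cross-zone load balancing disabled (the GWLB default), each GWLB node sends traffic only to targets in its own AZ, so an AZ whose only appliance is down has no healthy local target. With it enabled (enable_cross_zone_load_balancing = true on the aws_lb), flows are spread across the appliances in both AZs, which lets the survivor carry the load but incurs cross-AZ data transfer charges. Run at least one appliance per AZ either way, and decide deliberately which of the two behaviors you want.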
MTU. GENEVE adds encapsulation overhead. Either lower the workload MTU or ensure the path supports the larger frames, or large packets fragment and throughput tanks.
Marketplace AMI not subscribed. Terraform apply fails with an opaque error if the F5 BIG-IP Marketplace listing has not been subscribed in that account/region. Subscribe first.
Wrong module bundle. A PAYG image with only LTM cannot run AFM/ASM. Pick the Best bundle (LTM+AFM+ASM) for full inspection.
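The first two pitfalls can be guarded against in the Terraform itself with check blocks (available since Terraform 1.5), which report a warning during plan and apply when an assertion fails. A sketch, assuming the resource names from steps 2 and 3; the check names are made up for illustration:

```hcl
# Sketch: plan-time guards for two of the pitfalls above. Check names are illustrative.
check "data_eni_source_dest_check_disabled" {
  assert {
    condition = alltrue([
      for eni in aws_network_interface.data : eni.source_dest_check == false
    ])
    error_message = "Every BIG-IP data ENI must have source_dest_check = false."
  }
}

check "health_check_not_on_geneve_port" {
  assert {
    condition     = tostring(aws_lb_target_group.bigip.health_check[0].port) != "6081"
    error_message = "The GWLB health check must target a real TCP/HTTP responder, not GENEVE/6081."
  }
}
```

A failed check does not block the apply, so it complements rather than replaces the Wiz Code scan in the pull request.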

Security notes

Run the appliances under least privilege: management plane reachable only from the corporate CIDR or via SSM, SAML SSO through Okta/Entra with MFA for admins, and the local admin account as glass-break only with its credential in HashiCorp Vault. For TLS inspection (SSL forward proxy), the signing CA key is the crown jewel — store it in Vault, never on disk in the repo. Let Wiz continuously verify the posture (no public data ENI, no over-broad SG) and Wiz Code gate the Terraform PR; let CrowdStrike Falcon cover the host runtime. AFM enforces the L3-L4 firewall and ASM the L7 WAF on the wildcard virtual, so inspection is a real control, not a passthrough. Pin the BIG-IP software version explicitly and patch through the ServiceNow change gate.

Cost notes

The recurring spend is two BIG-IP VE instances (m5.xlarge on-demand is meaningful 24/7 — buy Savings Plans or Reserved Instances once the design is stable), the F5 license (PAYG hourly bundle vs. cheaper-at-scale BYOL — BYOL wins past steady utilization), the GWLB hourly + LCU/GB-processed charge, and the GWLB endpoint hourly + data-processing charge per spoke. Watch cross-AZ data transfer: keeping one appliance per AZ avoids the cross-zone charge that a single-AZ design silently incurs. Right-size the instance type to measured throughput in Datadog rather than guessing — BIG-IP VE PAYG tiers are throughput-capped (e.g. 25 Mbps/200 Mbps/1 Gbps), so an over-spec’d tier is pure waste. Scaling out (more appliances in the same target group) is the lever for capacity; the active-active design means each added appliance adds usable throughput, not just standby.