Terraform Module: AWS App Mesh — a versioned, default-deny service mesh boundary for ECS and EKS workloads

Quick take — A reusable hashicorp/aws ~> 5.0 Terraform module for AWS App Mesh: provisions an aws_appmesh_mesh with a deliberate egress filter and IP preference, plus optional virtual nodes and a virtual service, wired for production. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "aws" {
  region = "us-east-1"
}

module "app_mesh" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-app-mesh?ref=v1.0.0"

  mesh_name = "..."  # Name of the mesh; the boundary all nodes/routers/servic…
}

Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.

What this module is

AWS App Mesh is a managed application-level service mesh built on the Envoy proxy. It does not run your containers — it standardizes how the containers you already run on ECS, Fargate, or EKS talk to each other: client-side load balancing, retries, timeouts, circuit breaking, mutual TLS, and consistent L7 telemetry, all configured as data-plane rules rather than baked into application code. The mesh itself is the outermost object in that model: a logical boundary that every virtual node, virtual service, virtual router, and route lives inside. Get the mesh wrong — most commonly by leaving egress wide open — and every workload you later attach inherits that mistake.

The aws_appmesh_mesh resource looks almost trivial: a name and a small spec. But two spec decisions are quietly load-bearing in production. The egress filter controls whether Envoy sidecars are allowed to reach destinations outside the mesh (ALLOW_ALL) or are restricted to only the virtual services you have explicitly defined (DROP_ALL) — the difference between a permissive default and a default-deny posture where every external call is an audited, declared dependency. The service-discovery IP preference dictates whether Envoy resolves and dials backends over IPv4 or IPv6, which matters the moment you run dual-stack subnets or migrate toward IPv6-only networking.
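To make the default-deny posture concrete: under DROP_ALL, a call to an external host is usually modelled as a virtual service backed by a DNS-discovered virtual node, which callers then list as a backend. The sketch below is one hedged way to express that with plain resources outside the module. The host api.example.com, the resource names, and the mesh_name variable are all hypothetical, and TCP pass-through on port 443 is an assumption about how the application speaks TLS.

```hcl
# Hypothetical input: name of an existing mesh (for example this module's mesh_name output).
variable "mesh_name" {
  type        = string
  description = "Name of the existing App Mesh mesh."
}

# Virtual node that resolves the external host over DNS.
resource "aws_appmesh_virtual_node" "external_api" {
  name      = "external-api-vn" # hypothetical name
  mesh_name = var.mesh_name

  spec {
    listener {
      port_mapping {
        port     = 443
        protocol = "tcp" # assumption: the app's TLS session is passed through, not terminated by Envoy
      }
    }

    service_discovery {
      dns {
        hostname = "api.example.com" # placeholder for the real external host
      }
    }
  }
}

# Virtual service named after the host the application dials.
resource "aws_appmesh_virtual_service" "external_api" {
  name      = "api.example.com"
  mesh_name = var.mesh_name

  spec {
    provider {
      virtual_node {
        virtual_node_name = aws_appmesh_virtual_node.external_api.name
      }
    }
  }
}
```

A calling node would then include "api.example.com" in its backends list, so the external dependency shows up in code review like any other edge.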

This module wraps the mesh plus the pieces you almost never want to omit. It creates the mesh with an explicit egress filter and IP preference (no silent defaults), and optionally provisions a set of virtual nodes (your actual service endpoints, with DNS or AWS Cloud Map discovery, listeners, health checks, and Envoy access logging) and a fronting virtual service, so a consumer can stand up a real, routable mesh segment from one block — and get back the mesh id, arn, and the node/service ARNs needed to render the Envoy APPMESH_RESOURCE_ARN for each task.

When to use it

Reach for something else when the mesh would only ever contain one or two services with no cross-service policy needs (the sidecar overhead is not worth it), when you need features App Mesh does not offer such as traffic mirroring or rich WASM filters (consider a self-managed Envoy/Istio or AWS VPC Lattice), or when your traffic is purely north-south at the edge, which is an ALB/API Gateway concern rather than a mesh one. Note also that AWS has announced end of support for App Mesh, effective September 30, 2026; treat this module as the right tool for existing App Mesh estates and migrations, and evaluate VPC Lattice for greenfield east-west connectivity.

Module structure

terraform-module-aws-app-mesh/
├── versions.tf      # provider + Terraform version pins
├── main.tf          # mesh, optional virtual nodes, optional virtual service
├── variables.tf     # var-driven inputs with validation
└── outputs.tf       # mesh + node/service identifiers and ARNs

# versions.tf
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

# main.tf

locals {
  tags = merge(
    {
      "Name"      = var.mesh_name
      "ManagedBy" = "terraform"
      "Module"    = "terraform-module-aws-app-mesh"
    },
    var.tags,
  )
}

# ---------------------------------------------------------------------------
# The mesh — the outermost boundary. The egress filter and IP preference are
# set explicitly so the posture is reviewed in code, never silently defaulted.
# DROP_ALL means sidecars may only reach declared virtual services.
# ---------------------------------------------------------------------------
resource "aws_appmesh_mesh" "this" {
  name = var.mesh_name

  spec {
    egress_filter {
      type = var.egress_filter_type
    }

    dynamic "service_discovery" {
      for_each = var.ip_preference != null ? [1] : []
      content {
        ip_preference = var.ip_preference
      }
    }
  }

  tags = local.tags
}

# ---------------------------------------------------------------------------
# Virtual nodes — the concrete service endpoints. Each gets a listener with a
# health check, a discovery method (DNS hostname or AWS Cloud Map), optional
# declared backends, and Envoy access logging to stdout for the log pipeline.
# ---------------------------------------------------------------------------
resource "aws_appmesh_virtual_node" "this" {
  for_each = var.virtual_nodes

  name      = each.key
  mesh_name = aws_appmesh_mesh.this.name

  spec {
    listener {
      port_mapping {
        port     = each.value.port
        protocol = each.value.protocol
      }

      health_check {
        protocol            = each.value.protocol
        port                = each.value.port
        path                = contains(["http", "http2"], each.value.protocol) ? each.value.health_check_path : null
        healthy_threshold   = each.value.health_check_healthy_threshold
        unhealthy_threshold = each.value.health_check_unhealthy_threshold
        interval_millis     = each.value.health_check_interval_millis
        timeout_millis      = each.value.health_check_timeout_millis
      }
    }

    # Declared backends — required when the mesh egress filter is DROP_ALL,
    # since only listed virtual services are reachable from this node.
    dynamic "backend" {
      for_each = each.value.backends
      content {
        virtual_service {
          virtual_service_name = backend.value
        }
      }
    }

    service_discovery {
      # AWS Cloud Map discovery when a namespace is supplied; otherwise DNS.
      dynamic "aws_cloud_map" {
        for_each = each.value.cloud_map_namespace_name != null ? [1] : []
        content {
          namespace_name = each.value.cloud_map_namespace_name
          service_name   = coalesce(each.value.cloud_map_service_name, each.key)
        }
      }

      dynamic "dns" {
        for_each = each.value.cloud_map_namespace_name == null ? [1] : []
        content {
          hostname      = each.value.dns_hostname
          ip_preference = var.ip_preference
        }
      }
    }

    dynamic "logging" {
      for_each = var.enable_access_logs ? [1] : []
      content {
        access_log {
          file {
            path = var.access_log_path
          }
        }
      }
    }
  }

  tags = local.tags
}

# ---------------------------------------------------------------------------
# Optional virtual service — the stable name other services target. Here it is
# backed directly by one virtual node (a virtual router can be layered later
# for weighted routing across node versions).
# ---------------------------------------------------------------------------
resource "aws_appmesh_virtual_service" "this" {
  count = var.virtual_service_name != null ? 1 : 0

  name      = var.virtual_service_name
  mesh_name = aws_appmesh_mesh.this.name

  spec {
    provider {
      virtual_node {
        virtual_node_name = aws_appmesh_virtual_node.this[var.virtual_service_backend_node].name
      }
    }
  }

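  tags = local.tags

  # Fail at plan time with a readable message when the backing node key is
  # missing or does not match a key in virtual_nodes.
  lifecycle {
    precondition {
      condition     = can(var.virtual_nodes[var.virtual_service_backend_node])
      error_message = "virtual_service_backend_node must be a key in virtual_nodes when virtual_service_name is set."
    }
  }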
}

# variables.tf

variable "mesh_name" {
  description = "Name of the service mesh. Used as the boundary every node, router, and service is created within."
  type        = string

  validation {
    condition     = can(regex("^[A-Za-z0-9][A-Za-z0-9_-]{0,254}$", var.mesh_name))
    error_message = "mesh_name must be 1-255 chars, start alphanumeric, and contain only letters, numbers, hyphens, or underscores."
  }
}

variable "egress_filter_type" {
  description = "Egress posture for sidecars: DROP_ALL (default-deny; only declared virtual services are reachable) or ALLOW_ALL (any destination)."
  type        = string
  default     = "DROP_ALL"

  validation {
    condition     = contains(["ALLOW_ALL", "DROP_ALL"], var.egress_filter_type)
    error_message = "egress_filter_type must be either DROP_ALL or ALLOW_ALL."
  }
}

variable "ip_preference" {
  description = "Service-discovery IP version preference for the mesh and node DNS resolution. Null lets App Mesh use its default (IPv6_PREFERRED)."
  type        = string
  default     = null

  validation {
    condition     = var.ip_preference == null || contains(["IPv6_PREFERRED", "IPv4_PREFERRED", "IPv6_ONLY", "IPv4_ONLY"], var.ip_preference)
    error_message = "ip_preference must be one of IPv6_PREFERRED, IPv4_PREFERRED, IPv6_ONLY, IPv4_ONLY, or null."
  }
}

variable "virtual_nodes" {
  description = <<-EOT
    Map of virtual node name => endpoint definition. Each node becomes an
    aws_appmesh_virtual_node with a listener, health check, discovery, and
    optional declared backends. Set cloud_map_namespace_name to use AWS Cloud
    Map discovery; otherwise dns_hostname is required.
  EOT
  type = map(object({
    port                             = number
    protocol                         = optional(string, "http")
    dns_hostname                     = optional(string)
    cloud_map_namespace_name         = optional(string)
    cloud_map_service_name           = optional(string)
    backends                         = optional(list(string), [])
    health_check_path                = optional(string, "/")
    health_check_healthy_threshold   = optional(number, 3)
    health_check_unhealthy_threshold = optional(number, 3)
    health_check_interval_millis     = optional(number, 5000)
    health_check_timeout_millis      = optional(number, 2000)
  }))
  default = {}

  validation {
    condition = alltrue([
      for n in values(var.virtual_nodes) :
      contains(["http", "http2", "grpc", "tcp"], n.protocol)
    ])
    error_message = "Each virtual node protocol must be one of http, http2, grpc, tcp."
  }

  validation {
    condition = alltrue([
      for n in values(var.virtual_nodes) :
      n.cloud_map_namespace_name != null || n.dns_hostname != null
    ])
    error_message = "Each virtual node needs either cloud_map_namespace_name (Cloud Map) or dns_hostname (DNS)."
  }

  validation {
    condition = alltrue([
      for n in values(var.virtual_nodes) :
      n.port > 0 && n.port <= 65535
    ])
    error_message = "Each virtual node port must be between 1 and 65535."
  }
}

variable "virtual_service_name" {
  description = "Optional fully qualified virtual service name other services target (e.g. orders.svc.cluster.local). Null skips creating a virtual service."
  type        = string
  default     = null
}

variable "virtual_service_backend_node" {
  description = "Key in virtual_nodes that backs the virtual service. Required when virtual_service_name is set."
  type        = string
  default     = null
}

variable "enable_access_logs" {
  description = "Emit Envoy access logs from each virtual node. The path is interpreted inside the Envoy container; /dev/stdout ships them to the task log driver."
  type        = bool
  default     = true
}

variable "access_log_path" {
  description = "File path Envoy writes access logs to. Use /dev/stdout to forward to the container log driver (awslogs/Fluent Bit)."
  type        = string
  default     = "/dev/stdout"
}

variable "tags" {
  description = "Additional tags merged onto every resource the module creates."
  type        = map(string)
  default     = {}
}

# outputs.tf

output "mesh_id" {
  description = "The App Mesh mesh ID (its name)."
  value       = aws_appmesh_mesh.this.id
}

output "mesh_name" {
  description = "Name of the mesh (the mesh segment of each sidecar's resource ARN)."
  value       = aws_appmesh_mesh.this.name
}

output "mesh_arn" {
  description = "ARN of the mesh (use to grant appmesh:* IAM permissions scoped to this mesh)."
  value       = aws_appmesh_mesh.this.arn
}

output "mesh_owner" {
  description = "AWS account ID that owns the mesh (relevant for shared/multi-account meshes)."
  value       = aws_appmesh_mesh.this.mesh_owner
}

output "egress_filter_type" {
  description = "The effective egress filter applied to the mesh (DROP_ALL or ALLOW_ALL)."
  value       = var.egress_filter_type
}

output "virtual_node_arns" {
  description = "Map of virtual node name => ARN. Feed the ARN into each task's Envoy APPMESH_RESOURCE_ARN."
  value       = { for k, v in aws_appmesh_virtual_node.this : k => v.arn }
}

output "virtual_node_names" {
  description = "Map of virtual node name => resolved node name (matches the map keys)."
  value       = { for k, v in aws_appmesh_virtual_node.this : k => v.name }
}

output "virtual_service_arn" {
  description = "ARN of the fronting virtual service, or null when none was created."
  value       = try(aws_appmesh_virtual_service.this[0].arn, null)
}

output "virtual_service_name" {
  description = "Name of the fronting virtual service, or null when none was created."
  value       = try(aws_appmesh_virtual_service.this[0].name, null)
}

How to use it

module "app_mesh" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-app-mesh?ref=v1.0.0"

  mesh_name = "commerce-prod"

  # Default-deny egress: services may only reach the virtual services we declare.
  egress_filter_type = "DROP_ALL"
  ip_preference      = "IPv4_ONLY"

  virtual_nodes = {
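    # The orders service, discovered via AWS Cloud Map, allowed to call payments.
    # Note: this module call creates only the orders virtual service, so the
    # payments.commerce.local virtual service must be defined elsewhere in the mesh.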
    "orders-vn" = {
      port                     = 8080
      protocol                 = "http2"
      cloud_map_namespace_name = "commerce.local"
      cloud_map_service_name   = "orders"
      backends                 = ["payments.commerce.local"]
      health_check_path        = "/healthz"
    }

    # The payments service, discovered via DNS, no outbound mesh backends.
    "payments-vn" = {
      port              = 8080
      protocol          = "http2"
      dns_hostname      = "payments.commerce.local"
      health_check_path = "/healthz"
    }
  }

  # A stable virtual service name other meshed services target.
  virtual_service_name         = "orders.commerce.local"
  virtual_service_backend_node = "orders-vn"

  enable_access_logs = true
  access_log_path    = "/dev/stdout"

  tags = {
    Team        = "commerce"
    Environment = "prod"
    CostCenter  = "cc-4471"
  }
}

# Downstream: inject the orders virtual-node ARN into the ECS task's Envoy
# sidecar so the proxy registers against the correct mesh resource.
resource "aws_ecs_task_definition" "orders" {
  family       = "orders"
  network_mode = "awsvpc" # the App Mesh proxy configuration expects awsvpc networking

  proxy_configuration {
    type           = "APPMESH"
    container_name = "envoy"
    properties = {
      AppPorts         = "8080"
      ProxyIngressPort = "15000"
      ProxyEgressPort  = "15001"
      IgnoredUID       = "1337"
      EgressIgnoredIPs = "169.254.170.2,169.254.169.254"
    }
  }

  container_definitions = jsonencode([
    {
      name      = "envoy"
      image     = "public.ecr.aws/appmesh/aws-appmesh-envoy:v1.29.5.0-prod"
      essential = true
      user      = "1337" # must match IgnoredUID so Envoy's own traffic is not redirected
      environment = [
        {
          name  = "APPMESH_RESOURCE_ARN"
          value = module.app_mesh.virtual_node_arns["orders-vn"]
        }
      ]
    }
    # ... your application container ...
  ])
}
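The main.tf comments note that a virtual router can be layered in later for weighted routing across node versions. A minimal sketch of that step follows, assuming a second node keyed orders-v2-vn has been added to virtual_nodes and that the module's virtual_service_name is left null so the service can be defined here against the router instead. The resource names, the orders-v2-vn key, and the 90/10 split are hypothetical.

```hcl
# Router that listens on the same port/protocol as the orders nodes.
resource "aws_appmesh_virtual_router" "orders" {
  name      = "orders-vr" # hypothetical name
  mesh_name = module.app_mesh.mesh_name

  spec {
    listener {
      port_mapping {
        port     = 8080
        protocol = "http2"
      }
    }
  }
}

# Route that splits traffic between the current and the new node version.
resource "aws_appmesh_route" "orders" {
  name                = "orders-route" # hypothetical name
  mesh_name           = module.app_mesh.mesh_name
  virtual_router_name = aws_appmesh_virtual_router.orders.name

  spec {
    http2_route {
      match {
        prefix = "/"
      }

      action {
        weighted_target {
          virtual_node = module.app_mesh.virtual_node_names["orders-vn"]
          weight       = 90
        }

        weighted_target {
          virtual_node = module.app_mesh.virtual_node_names["orders-v2-vn"] # assumed second node
          weight       = 10
        }
      }
    }
  }
}

# Virtual service backed by the router rather than a single node.
resource "aws_appmesh_virtual_service" "orders" {
  name      = "orders.commerce.local"
  mesh_name = module.app_mesh.mesh_name

  spec {
    provider {
      virtual_router {
        virtual_router_name = aws_appmesh_virtual_router.orders.name
      }
    }
  }
}
```

Shifting traffic then becomes a reviewed change to the two weight values, and callers keep targeting the same virtual service name throughout.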

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config, live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "s3"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...S3 state bucket + key per path...
  }
}

2. Module config, live/prod/app_mesh/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-app-mesh?ref=v1.0.0"
}

inputs = {
  mesh_name = "..."
}

3. Deploy one environment, or roll out all modules together:

(cd live/prod/app_mesh && terragrunt apply)   # this module only
(cd live/prod && terragrunt run-all apply)    # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
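The root config shown earlier only covers remote state. To keep the provider in one place as well, a Terragrunt generate block can be added to the same root file. This is a hedged sketch: the provider.tf file name and the us-east-1 region are assumptions carried over from the Quickstart, not something the module requires.

```hcl
# live/terragrunt.hcl (addition to the root config)
# Writes a provider.tf into every module's working directory before Terraform runs.
generate "provider" {
  path      = "provider.tf"          # assumed file name
  if_exists = "overwrite_terragrunt" # only overwrite files Terragrunt itself generated
  contents  = <<EOF
provider "aws" {
  region = "us-east-1"
}
EOF
}
```

With this in place, per-environment terragrunt.hcl files carry only the source and inputs, as in step 2.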

Inputs

| Name | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| mesh_name | string | (none) | yes | Name of the mesh; the boundary all nodes/routers/services live in (validated). |
| egress_filter_type | string | "DROP_ALL" | no | DROP_ALL (default-deny) or ALLOW_ALL egress posture for sidecars. |
| ip_preference | string | null | no | IPv6_PREFERRED / IPv4_PREFERRED / IPv6_ONLY / IPv4_ONLY; null = App Mesh default. |
| virtual_nodes | map(object) | {} | no | Map of node name → endpoint def (port, protocol, discovery, backends, health check). |
| virtual_service_name | string | null | no | Stable virtual service name other services target; null skips it. |
| virtual_service_backend_node | string | null | no | Which virtual_nodes key backs the virtual service (required if name set). |
| enable_access_logs | bool | true | no | Emit Envoy access logs from each virtual node. |
| access_log_path | string | "/dev/stdout" | no | Path Envoy writes access logs to (/dev/stdout → container log driver). |
| tags | map(string) | {} | no | Extra tags merged onto every resource. |

Per-node object fields (within virtual_nodes): port (number, required), protocol (http/http2/grpc/tcp, default http), dns_hostname or cloud_map_namespace_name (one required), cloud_map_service_name, backends (list of virtual service names), health_check_path (default /), health_check_healthy_threshold (3), health_check_unhealthy_threshold (3), health_check_interval_millis (5000), health_check_timeout_millis (2000).
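As a rough illustration of how those per-node defaults play out, the sketch below declares two nodes with only the fields that differ from the defaults. The mesh name, node keys, ports, and hostnames are hypothetical; the behaviour described in the comments follows from main.tf as listed above.

```hcl
module "app_mesh" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-app-mesh?ref=v1.0.0"

  mesh_name = "defaults-demo" # hypothetical

  virtual_nodes = {
    # gRPC node via Cloud Map. cloud_map_service_name is omitted, so main.tf
    # falls back to the map key ("inventory-vn") as the Cloud Map service name.
    "inventory-vn" = {
      port                     = 9090
      protocol                 = "grpc"
      cloud_map_namespace_name = "commerce.local"
    }

    # TCP node via DNS. health_check_path keeps its "/" default, but main.tf
    # only passes a path for http/http2 listeners, so it is not used here.
    "cache-vn" = {
      port         = 6379
      protocol     = "tcp"
      dns_hostname = "cache.commerce.local"
    }
  }
}
```

Both nodes inherit the 3/3 thresholds, the 5000 ms interval, and the 2000 ms timeout without restating them.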

Outputs

| Name | Description |
| --- | --- |
| mesh_id | The mesh ID (its name). |
| mesh_name | Name of the mesh (the mesh segment of each sidecar’s resource ARN). |
| mesh_arn | ARN of the mesh (for scoping appmesh:* IAM permissions). |
| mesh_owner | AWS account ID owning the mesh (for shared/multi-account meshes). |
| egress_filter_type | Effective egress filter (DROP_ALL or ALLOW_ALL). |
| virtual_node_arns | Map of node name → ARN; feed into each task’s APPMESH_RESOURCE_ARN. |
| virtual_node_names | Map of node name → resolved node name. |
| virtual_service_arn | ARN of the fronting virtual service, or null. |
| virtual_service_name | Name of the fronting virtual service, or null. |
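The ARN outputs are what make least-privilege IAM practical. As a hedged sketch, the policy below lets a task role's Envoy sidecar stream its configuration for only the virtual nodes this module created. It assumes the module is instantiated as module.app_mesh (as in the usage example) and that the task role already exists; the task_role_name variable and the policy name are hypothetical.

```hcl
# Hypothetical input: name of the existing ECS task role used by the meshed service.
variable "task_role_name" {
  type        = string
  description = "Name of the ECS task IAM role that runs the Envoy sidecar."
}

# Allow Envoy to fetch its configuration, scoped to this module's virtual nodes.
data "aws_iam_policy_document" "envoy_stream" {
  statement {
    sid       = "EnvoyStreamConfig"
    effect    = "Allow"
    actions   = ["appmesh:StreamAggregatedResources"]
    resources = values(module.app_mesh.virtual_node_arns)
  }
}

resource "aws_iam_role_policy" "envoy_stream" {
  name   = "appmesh-envoy-stream" # hypothetical policy name
  role   = var.task_role_name
  policy = data.aws_iam_policy_document.envoy_stream.json
}
```

Narrowing resources further to a single entry, for example module.app_mesh.virtual_node_arns["orders-vn"], would give each service a role that can only act as its own node.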

Enterprise scenario

A commerce platform runs about thirty east-west microservices on ECS Fargate that previously called each other over raw service-discovery DNS with no consistent retries, TLS, or audit of who-calls-what. The platform team adopted this module to define a commerce-prod mesh with egress_filter_type = "DROP_ALL", so each service’s allowed dependencies became an explicit, code-reviewed backends list — when a new service wants to call payments, that edge now shows up as a pull request, not a silent runtime discovery. Every virtual node ships Envoy access logs to /dev/stdout, which the existing Fluent Bit sidecar already forwards to OpenSearch, giving the SRE team uniform L7 latency and 5xx dashboards across all thirty services without touching a single line of application code. As the company plans its move off App Mesh ahead of end-of-support, having the entire mesh topology declared in one versioned module makes the dependency graph it must reproduce in VPC Lattice fully enumerable.

Best practices
