
Terraform Module: AWS API Gateway (REST) — a versioned, deployable REST API edge in one module

Quick take — Build a production-ready AWS API Gateway REST API with Terraform: regional endpoints, OpenAPI body import, stage with X-Ray and access logs, CloudWatch alarms, and clean redeploy triggers. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "aws" {
  region = "us-east-1"
}

module "api_gateway" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-api-gateway?ref=v1.0.0"

  name         = "..."  # Name of the REST API; also used in log group and alarm …
  openapi_body = "..."  # OpenAPI 3.0 JSON document with `x-amazon-apigateway-int…
}

Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.
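If you have no OpenAPI document to hand, a mock integration is one way to fill the openapi_body placeholder and smoke-test the module without a backend. The sketch below is an assumption, not part of the module: the ping_openapi local, the /ping path, and the api_invoke_url output name are all illustrative.

```hcl
# Hypothetical contract: GET /ping answered by API Gateway itself (mock integration).
locals {
  ping_openapi = jsonencode({
    openapi = "3.0.1"
    info    = { title = "ping-api", version = "1.0" }
    paths = {
      "/ping" = {
        get = {
          responses = { "200" = { description = "OK" } }
          x-amazon-apigateway-integration = {
            type = "mock"
            # The mock integration reads the status code from this template.
            requestTemplates = { "application/json" = "{\"statusCode\": 200}" }
            responses        = { default = { statusCode = "200" } }
          }
        }
      }
    }
  })
}

# Surface the module's invoke_url so it is printed after apply.
output "api_invoke_url" {
  value = module.api_gateway.invoke_url
}
```

Set openapi_body = local.ping_openapi in the module block; after apply, a GET on the printed URL plus /ping should answer with HTTP 200 if the import went through.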

What this module is

AWS API Gateway (REST) is the v1, fully-featured HTTP front door for your backends — it terminates TLS, authenticates callers, validates requests, throttles traffic, and proxies to Lambda, HTTP endpoints, or other AWS services. Unlike the cheaper HTTP API (v2), the REST flavour gives you request/response transformations (VTL mapping templates), API keys with usage plans, WAF integration, private (VPC) endpoints, and resource policies — the features regulated enterprises usually need.

The catch is that a working REST API is never one resource. You stand up an aws_api_gateway_rest_api, then a tree of resources, methods, integrations, method/integration responses, a deployment, and a stage — plus the maddening detail that the deployment will not redeploy on config changes unless you give it an explicit trigger. This module wraps all of that into one versioned unit driven by an OpenAPI document, wires up a managed stage with X-Ray tracing and structured access logging, and exposes the invoke_url and execution ARN downstream consumers actually need. You define the API contract once in OpenAPI; the module owns the lifecycle, observability, and safe redeploys.

When to use it

Reach for the HTTP API (aws_apigatewayv2_api) instead if you only need a thin, low-latency Lambda/JWT proxy and none of the REST features above — it is roughly 70% cheaper per million calls.
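For comparison, a thin Lambda proxy on the HTTP API can be as small as the sketch below. It uses the provider's quick-create path (the target argument), which sets up a default route and an auto-deployed stage. The variable and resource names are hypothetical, and the function still needs an aws_lambda_permission for apigateway.amazonaws.com.

```hcl
variable "handler_lambda_arn" {
  description = "Hypothetical: ARN of the Lambda the HTTP API should proxy to."
  type        = string
}

# Quick create: one resource yields the API, a catch-all route, and a $default stage.
resource "aws_apigatewayv2_api" "thin_proxy" {
  name          = "thin-proxy"
  protocol_type = "HTTP"
  target        = var.handler_lambda_arn
}
```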

Module structure

terraform-module-aws-api-gateway/
├── versions.tf      # provider + Terraform version pins
├── main.tf          # rest_api, deployment, stage, method settings, logs, alarm
├── variables.tf     # var-driven inputs with validation
└── outputs.tf       # id, arns, invoke_url, stage name

versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

main.tf

locals {
  # A stable hash of the API contract + key settings. Any change here forces a
  # fresh deployment, which is the only reliable way to redeploy a REST API.
  redeploy_trigger = sha1(jsonencode({
    body     = var.openapi_body
    endpoint = var.endpoint_type
    policy   = var.resource_policy
  }))

  access_log_format = jsonencode({
    requestId          = "$context.requestId"
    ip                 = "$context.identity.sourceIp"
    requestTime        = "$context.requestTime"
    httpMethod         = "$context.httpMethod"
    resourcePath       = "$context.resourcePath"
    status             = "$context.status"
    protocol           = "$context.protocol"
    responseLength     = "$context.responseLength"
    integrationLatency = "$context.integrationLatency"
    error              = "$context.error.message"
  })
}

resource "aws_api_gateway_rest_api" "this" {
  name        = var.name
  description = var.description

  # OpenAPI 3.0 document (with x-amazon-apigateway-integration blocks) defines
  # the full resource/method/integration tree in one import.
  body = var.openapi_body

  # Body import is authoritative for the resource tree; let it overwrite drift.
  put_rest_api_mode = "overwrite"

  endpoint_configuration {
    types            = [var.endpoint_type]
    vpc_endpoint_ids = var.endpoint_type == "PRIVATE" ? var.vpc_endpoint_ids : null
  }

  # Resource policy (required for PRIVATE; optional otherwise).
  policy = var.resource_policy

  # Smallest payload that may be sent compressed (bytes). null disables.
  minimum_compression_size = var.minimum_compression_size

  tags = var.tags
}

resource "aws_api_gateway_deployment" "this" {
  rest_api_id = aws_api_gateway_rest_api.this.id

  triggers = {
    redeployment = local.redeploy_trigger
  }

  # Create the new deployment before destroying the old one so the stage is
  # never pointed at a deleted deployment.
  lifecycle {
    create_before_destroy = true
  }
}

# Access log group, encrypted at rest if a KMS key is supplied.
resource "aws_cloudwatch_log_group" "access" {
  name              = "/aws/apigateway/${var.name}/${var.stage_name}"
  retention_in_days = var.log_retention_days
  kms_key_id        = var.log_kms_key_arn
  tags              = var.tags
}

resource "aws_api_gateway_stage" "this" {
  rest_api_id           = aws_api_gateway_rest_api.this.id
  deployment_id         = aws_api_gateway_deployment.this.id
  stage_name            = var.stage_name
  xray_tracing_enabled  = var.xray_tracing_enabled
  cache_cluster_enabled = var.cache_cluster_enabled
  cache_cluster_size    = var.cache_cluster_enabled ? var.cache_cluster_size : null

  access_log_settings {
    destination_arn = aws_cloudwatch_log_group.access.arn
    format          = local.access_log_format
  }

  variables = var.stage_variables
  tags      = var.tags
}

# Stage-wide method settings (applied to every method via "*/*"): throttling + execution logging.
resource "aws_api_gateway_method_settings" "this" {
  rest_api_id = aws_api_gateway_rest_api.this.id
  stage_name  = aws_api_gateway_stage.this.stage_name
  method_path = "*/*" # all methods on all resources

  settings {
    metrics_enabled        = true
    logging_level          = var.execution_logging_level
    data_trace_enabled     = var.data_trace_enabled
    throttling_burst_limit = var.throttling_burst_limit
    throttling_rate_limit  = var.throttling_rate_limit
  }
}

# Optional WAFv2 web ACL association (regional scope).
resource "aws_wafv2_web_acl_association" "this" {
  count        = var.web_acl_arn != null ? 1 : 0
  resource_arn = aws_api_gateway_stage.this.arn
  web_acl_arn  = var.web_acl_arn
}

# Guardrail alarm on server-side errors for this stage.
resource "aws_cloudwatch_metric_alarm" "five_xx" {
  count               = var.create_5xx_alarm ? 1 : 0
  alarm_name          = "${var.name}-${var.stage_name}-5xx"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "5XXError"
  namespace           = "AWS/ApiGateway"
  period              = 300
  statistic           = "Sum"
  threshold           = var.five_xx_alarm_threshold
  alarm_description   = "API Gateway 5XX errors for ${var.name}/${var.stage_name}"
  treat_missing_data  = "notBreaching"

  dimensions = {
    ApiName = var.name
    Stage   = var.stage_name
  }

  alarm_actions = var.alarm_sns_topic_arns
  ok_actions    = var.alarm_sns_topic_arns
  tags          = var.tags
}
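One prerequisite the listing above does not create: API Gateway needs a CloudWatch Logs role set at the account level (once per region) before stage logging can be enabled, and a stage update typically fails with a "CloudWatch Logs role ARN must be set in account settings" error without it. A minimal sketch of that one-time setup follows, assuming it lives outside this module because the setting is shared by every API in the region; the role name and resource labels are hypothetical.

```hcl
# Trust policy letting the API Gateway service assume the role.
data "aws_iam_policy_document" "apigw_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["apigateway.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "apigw_cloudwatch" {
  name               = "apigw-cloudwatch-logs" # hypothetical name
  assume_role_policy = data.aws_iam_policy_document.apigw_assume.json
}

# AWS-managed policy granting the log-writing permissions API Gateway needs.
resource "aws_iam_role_policy_attachment" "apigw_cloudwatch" {
  role       = aws_iam_role.apigw_cloudwatch.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonAPIGatewayPushToCloudWatchLogs"
}

# Region-wide account setting: define it once, not once per API.
resource "aws_api_gateway_account" "this" {
  cloudwatch_role_arn = aws_iam_role.apigw_cloudwatch.arn
}
```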

variables.tf

variable "name" {
  description = "Name of the REST API (also used in log group + alarm naming)."
  type        = string

  validation {
    condition     = can(regex("^[A-Za-z0-9-_]{1,128}$", var.name))
    error_message = "name must be 1-128 chars: letters, digits, hyphen, underscore."
  }
}

variable "description" {
  description = "Human-readable description of the API."
  type        = string
  default     = "Managed by Terraform"
}

variable "openapi_body" {
  description = "OpenAPI 3.0 document (JSON string) with x-amazon-apigateway-integration blocks defining the full API."
  type        = string
}

variable "endpoint_type" {
  description = "Endpoint type: EDGE, REGIONAL, or PRIVATE."
  type        = string
  default     = "REGIONAL"

  validation {
    condition     = contains(["EDGE", "REGIONAL", "PRIVATE"], var.endpoint_type)
    error_message = "endpoint_type must be one of EDGE, REGIONAL, PRIVATE."
  }
}

variable "vpc_endpoint_ids" {
  description = "VPC endpoint IDs to bind when endpoint_type is PRIVATE."
  type        = list(string)
  default     = []
}

variable "resource_policy" {
  description = "Resource policy JSON. Required for PRIVATE endpoints to scope access to your VPCE(s)."
  type        = string
  default     = null
}

variable "minimum_compression_size" {
  description = "Minimum response size in bytes before gzip is applied (0-10485760). null disables compression."
  type        = number
  default     = null

  validation {
    condition     = var.minimum_compression_size == null || (var.minimum_compression_size >= 0 && var.minimum_compression_size <= 10485760)
    error_message = "minimum_compression_size must be between 0 and 10485760, or null."
  }
}

variable "stage_name" {
  description = "Name of the deployment stage (e.g. prod, staging)."
  type        = string
  default     = "prod"

  validation {
    condition     = can(regex("^[A-Za-z0-9-_]{1,128}$", var.stage_name))
    error_message = "stage_name must be 1-128 chars: letters, digits, hyphen, underscore."
  }
}

variable "stage_variables" {
  description = "Stage variables map (e.g. { lambdaAlias = \"live\" }) for use in integration URIs."
  type        = map(string)
  default     = {}
}

variable "xray_tracing_enabled" {
  description = "Enable AWS X-Ray active tracing on the stage."
  type        = bool
  default     = true
}

variable "cache_cluster_enabled" {
  description = "Enable the stage response cache (billed hourly while on)."
  type        = bool
  default     = false
}

variable "cache_cluster_size" {
  description = "Cache cluster size in GB. Allowed: 0.5, 1.6, 6.1, 13.5, 28.4, 58.2, 118, 237."
  type        = string
  default     = "0.5"

  validation {
    condition     = contains(["0.5", "1.6", "6.1", "13.5", "28.4", "58.2", "118", "237"], var.cache_cluster_size)
    error_message = "cache_cluster_size must be one of the allowed API Gateway cache sizes."
  }
}

variable "execution_logging_level" {
  description = "Execution log level for all methods: OFF, ERROR, or INFO."
  type        = string
  default     = "INFO"

  validation {
    condition     = contains(["OFF", "ERROR", "INFO"], var.execution_logging_level)
    error_message = "execution_logging_level must be OFF, ERROR, or INFO."
  }
}

variable "data_trace_enabled" {
  description = "Log full request/response bodies in execution logs. Keep false in prod (may capture PII)."
  type        = bool
  default     = false
}

variable "throttling_burst_limit" {
  description = "Stage-wide throttling burst limit (concurrent). -1 leaves account default."
  type        = number
  default     = 5000
}

variable "throttling_rate_limit" {
  description = "Stage-wide steady-state throttling rate (requests/sec). -1 leaves account default."
  type        = number
  default     = 10000
}

variable "log_retention_days" {
  description = "Retention for the access log group in days."
  type        = number
  default     = 30

  validation {
    condition     = contains([1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1096, 1827, 2192, 2557, 2922, 3288, 3653], var.log_retention_days)
    error_message = "log_retention_days must be a valid CloudWatch Logs retention value."
  }
}

variable "log_kms_key_arn" {
  description = "KMS key ARN to encrypt the access log group. null uses CloudWatch default encryption."
  type        = string
  default     = null
}

variable "web_acl_arn" {
  description = "Regional WAFv2 web ACL ARN to associate with the stage. null skips association."
  type        = string
  default     = null
}

variable "create_5xx_alarm" {
  description = "Create a CloudWatch alarm on stage 5XXError count."
  type        = bool
  default     = true
}

variable "five_xx_alarm_threshold" {
  description = "5XXError sum (per 5 min) above which the alarm fires."
  type        = number
  default     = 5
}

variable "alarm_sns_topic_arns" {
  description = "SNS topic ARNs to notify on alarm/OK transitions."
  type        = list(string)
  default     = []
}

variable "tags" {
  description = "Tags applied to all taggable resources."
  type        = map(string)
  default     = {}
}
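The resource_policy description says a policy is required for PRIVATE endpoints, but the validations above check each variable in isolation. One way to enforce the cross-variable rule on Terraform 1.5 is a precondition on a terraform_data resource inside the module. This guard is an assumption, not part of the published module, and the resource label is hypothetical.

```hcl
# Fails the plan when a PRIVATE API is requested without a resource policy.
resource "terraform_data" "private_endpoint_guard" {
  lifecycle {
    precondition {
      condition     = var.endpoint_type != "PRIVATE" || var.resource_policy != null
      error_message = "resource_policy must be set when endpoint_type is PRIVATE."
    }
  }
}
```

A check block would also work if a warning is enough; the precondition is used here because it stops the run.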

outputs.tf

output "rest_api_id" {
  description = "ID of the REST API."
  value       = aws_api_gateway_rest_api.this.id
}

output "rest_api_name" {
  description = "Name of the REST API."
  value       = aws_api_gateway_rest_api.this.name
}

output "rest_api_arn" {
  description = "ARN of the REST API."
  value       = aws_api_gateway_rest_api.this.arn
}

output "root_resource_id" {
  description = "Resource ID of the API's root path (useful for attaching extra resources)."
  value       = aws_api_gateway_rest_api.this.root_resource_id
}

output "execution_arn" {
  description = "Execution ARN prefix used in Lambda permission source_arn (e.g. <execution_arn>/*/*/*)."
  value       = aws_api_gateway_rest_api.this.execution_arn
}

output "stage_name" {
  description = "Deployed stage name."
  value       = aws_api_gateway_stage.this.stage_name
}

output "stage_arn" {
  description = "ARN of the stage (used for WAF association / IAM)."
  value       = aws_api_gateway_stage.this.arn
}

output "invoke_url" {
  description = "Base HTTPS URL clients call for the deployed stage."
  value       = aws_api_gateway_stage.this.invoke_url
}

output "access_log_group_name" {
  description = "CloudWatch Logs group receiving stage access logs."
  value       = aws_cloudwatch_log_group.access.name
}

How to use it

# OpenAPI contract: a /orders GET proxied to a Lambda (AWS_PROXY integration).
locals {
  orders_openapi = jsonencode({
    openapi = "3.0.1"
    info    = { title = "orders-api", version = "1.0" }
    paths = {
      "/orders" = {
        get = {
          x-amazon-apigateway-integration = {
            type       = "AWS_PROXY"
            httpMethod = "POST" # always POST for Lambda invocation
            # invoke_arn is the function's full API Gateway invocation URI,
            # so no separate region variable is needed.
            uri                 = aws_lambda_function.orders.invoke_arn
            passthroughBehavior = "when_no_match"
          }
          responses = { "200" = { description = "OK" } }
        }
      }
    }
  })
}

module "api_gateway_rest_orders" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-api-gateway?ref=v1.0.0"

  name          = "orders-api"
  description   = "Public orders REST API"
  endpoint_type = "REGIONAL"
  stage_name    = "prod"

  openapi_body = local.orders_openapi

  # Observability + guardrails
  xray_tracing_enabled = true
  log_retention_days   = 90
  web_acl_arn          = aws_wafv2_web_acl.edge.arn
  create_5xx_alarm     = true
  alarm_sns_topic_arns = [aws_sns_topic.alerts.arn]

  # Throttle defensively for a public API
  throttling_rate_limit  = 2000
  throttling_burst_limit = 1000

  tags = {
    team        = "checkout"
    environment = "prod"
  }
}

# API Gateway must be granted permission to invoke the backing Lambda.
# The execution_arn output scopes the permission to exactly this API.
resource "aws_lambda_permission" "allow_apigw" {
  statement_id  = "AllowAPIGatewayInvoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.orders.function_name
  principal     = "apigateway.amazonaws.com"
  source_arn    = "${module.api_gateway_rest_orders.execution_arn}/*/*/*"
}

# Downstream: publish the live base URL to SSM for clients to discover.
resource "aws_ssm_parameter" "orders_api_url" {
  name  = "/checkout/orders-api/base-url"
  type  = "String"
  value = module.api_gateway_rest_orders.invoke_url
}
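The /*/*/* suffix above allows any stage, method, and path of this API to invoke the function. If you prefer least privilege, the execution ARN can be narrowed to a single stage, method, and path. The sketch below is a hypothetical alternative to the permission above, reusing the example's aws_lambda_function.orders and module outputs.

```hcl
# Execution ARN layout: <execution_arn>/<stage>/<METHOD>/<resource path>
resource "aws_lambda_permission" "allow_apigw_get_orders" {
  statement_id  = "AllowAPIGatewayInvokeGetOrders"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.orders.function_name
  principal     = "apigateway.amazonaws.com"
  source_arn    = "${module.api_gateway_rest_orders.execution_arn}/${module.api_gateway_rest_orders.stage_name}/GET/orders"
}
```

The trade-off is one permission per route, so the wildcard form stays simpler when a single function backs many paths.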

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config (live/terragrunt.hcl), inherited by every module:

remote_state {
  backend = "s3"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...s3 state bucket/container + key per path...
  }
}

2. Module config (live/prod/api_gateway/terragrunt.hcl):

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-api-gateway?ref=v1.0.0"
}

inputs = {
  name = "..."
  openapi_body = "..."
}

3. Deploy one environment, or roll out all modules together:

(cd live/prod/api_gateway && terragrunt apply)   # this module only
(cd live/prod && terragrunt run-all apply)       # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
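To make the per-environment override concrete, a dev copy of the module config might look like the sketch below. The live/dev path, the openapi.json file next to it, and every input value are assumptions chosen for illustration; only the input names come from the module.

```hcl
# live/dev/api_gateway/terragrunt.hcl (hypothetical dev environment)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-api-gateway?ref=v1.0.0"
}

inputs = {
  name       = "orders-api-dev"
  stage_name = "dev"

  # Read the contract from a file kept beside this terragrunt.hcl.
  openapi_body = file("${get_terragrunt_dir()}/openapi.json")

  # Cheaper, quieter settings for a non-production stage.
  log_retention_days = 7
  create_5xx_alarm   = false
}
```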

Inputs

| Name | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| name | string | | yes | Name of the REST API; also used in log group and alarm naming. |
| description | string | "Managed by Terraform" | no | Human-readable API description. |
| openapi_body | string | | yes | OpenAPI 3.0 JSON document with x-amazon-apigateway-integration blocks. |
| endpoint_type | string | "REGIONAL" | no | EDGE, REGIONAL, or PRIVATE. |
| vpc_endpoint_ids | list(string) | [] | no | VPC endpoint IDs bound when endpoint_type = PRIVATE. |
| resource_policy | string | null | no | Resource policy JSON (required for PRIVATE). |
| minimum_compression_size | number | null | no | Min response bytes before gzip (0–10485760); null disables. |
| stage_name | string | "prod" | no | Deployment stage name. |
| stage_variables | map(string) | {} | no | Stage variables referenced in integration URIs. |
| xray_tracing_enabled | bool | true | no | Enable X-Ray active tracing on the stage. |
| cache_cluster_enabled | bool | false | no | Enable the stage response cache (billed hourly). |
| cache_cluster_size | string | "0.5" | no | Cache size in GB (allowed API Gateway sizes only). |
| execution_logging_level | string | "INFO" | no | Method execution log level: OFF, ERROR, INFO. |
| data_trace_enabled | bool | false | no | Log full request/response bodies (keep false in prod). |
| throttling_burst_limit | number | 5000 | no | Stage-wide burst limit; -1 keeps account default. |
| throttling_rate_limit | number | 10000 | no | Stage-wide steady-state rate (req/s); -1 keeps default. |
| log_retention_days | number | 30 | no | Access log group retention (valid CloudWatch value). |
| log_kms_key_arn | string | null | no | KMS key ARN to encrypt the access log group. |
| web_acl_arn | string | null | no | Regional WAFv2 web ACL ARN to associate with the stage. |
| create_5xx_alarm | bool | true | no | Create a CloudWatch alarm on stage 5XXError. |
| five_xx_alarm_threshold | number | 5 | no | 5XXError sum per 5 min that trips the alarm. |
| alarm_sns_topic_arns | list(string) | [] | no | SNS topics notified on alarm/OK. |
| tags | map(string) | {} | no | Tags applied to all taggable resources. |

Outputs

| Name | Description |
| --- | --- |
| rest_api_id | ID of the REST API. |
| rest_api_name | Name of the REST API. |
| rest_api_arn | ARN of the REST API. |
| root_resource_id | Resource ID of the API root path. |
| execution_arn | Execution ARN prefix for Lambda permission source_arn. |
| stage_name | Deployed stage name. |
| stage_arn | ARN of the stage (WAF association / IAM). |
| invoke_url | Base HTTPS URL clients call for the stage. |
| access_log_group_name | CloudWatch Logs group receiving access logs. |

Enterprise scenario

A retail platform team runs a multi-account AWS org where every product squad ships its own REST API. They standardise on this module so all 40+ APIs land with identical JSON access logs, 90-day retention, X-Ray tracing, and a regional WAF web ACL — satisfying the security team’s audit baseline without per-team review. Each squad supplies only its OpenAPI contract and Lambda ARNs; the platform’s CI passes the shared web_acl_arn, alarm_sns_topic_arns, and log_kms_key_arn so encryption, alerting, and edge protection are enforced centrally rather than left to each team to remember.

Best practices
