Quick take — Build a production-ready AWS API Gateway REST API with Terraform: regional endpoints, OpenAPI body import, stage with X-Ray and access logs, CloudWatch alarms, and clean redeploy triggers. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.
Quickstart (copy-paste)
Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):
provider "aws" {
region = "us-east-1"
}
module "api_gateway" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-api-gateway?ref=v1.0.0"
name = "..." # Name of the REST API; also used in log group and alarm …
openapi_body = "..." # OpenAPI 3.0 JSON document with `x-amazon-apigateway-int…
}
Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.
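If you do not have a contract ready yet, a placeholder document is enough to exercise the module end to end. The sketch below is an assumption, not part of the module: it defines a single hypothetical GET /ping route backed by a mock integration, so no Lambda or other backend is needed. The local name ping_openapi and the API name ping-api are illustrative, and the module block is meant to stand in for the placeholder version above.

```hcl
# Hypothetical smoke-test contract: GET /ping is answered by API Gateway itself.
locals {
  ping_openapi = jsonencode({
    openapi = "3.0.1"
    info    = { title = "ping-api", version = "1.0" }
    paths = {
      "/ping" = {
        get = {
          responses = { "200" = { description = "OK" } }
          "x-amazon-apigateway-integration" = {
            type = "mock"
            # A mock integration reads the status code from this request template.
            requestTemplates = { "application/json" = "{\"statusCode\": 200}" }
            responses        = { default = { statusCode = "200" } }
          }
        }
      }
    }
  })
}

module "api_gateway" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-api-gateway?ref=v1.0.0"

  name         = "ping-api" # illustrative name
  openapi_body = local.ping_openapi
}
```

Once applied, a GET on the /ping path under the module's invoke_url output should return an empty 200 response, which is usually enough to confirm that the stage, logging, and alarm wiring are in place before a real contract is swapped in.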
What this module is
AWS API Gateway (REST) is the v1, fully-featured HTTP front door for your backends — it terminates TLS, authenticates callers, validates requests, throttles traffic, and proxies to Lambda, HTTP endpoints, or other AWS services. Unlike the cheaper HTTP API (v2), the REST flavour gives you request/response transformations (VTL mapping templates), API keys with usage plans, WAF integration, private (VPC) endpoints, and resource policies — the features regulated enterprises usually need.
The catch is that a working REST API is never one resource. You stand up an aws_api_gateway_rest_api, then a tree of resources, methods, integrations, method/integration responses, a deployment, and a stage — plus the maddening detail that the deployment will not redeploy on config changes unless you give it an explicit trigger. This module wraps all of that into one versioned unit driven by an OpenAPI document, wires up a managed stage with X-Ray tracing and structured access logging, and exposes the invoke_url and execution ARN downstream consumers actually need. You define the API contract once in OpenAPI; the module owns the lifecycle, observability, and safe redeploys.
When to use it
- You are publishing one or more REST APIs and want the contract defined as OpenAPI 3.0 (body import) rather than dozens of hand-wired aws_api_gateway_method/integration resources.
- You need REST-only features: API keys + usage plans, VTL request/response mapping, request validation, WAFv2 association, resource policies, or PRIVATE VPC endpoints.
- You want every stage to ship with access logs, X-Ray tracing, and method-level throttling by default, plus a 5XX alarm, without copy-pasting it per team.
- You operate a multi-account, multi-team platform and want a single audited module so naming, log retention, and tracing are consistent across every API.
Reach for the HTTP API (aws_apigatewayv2_api) instead if you only need a thin, low-latency Lambda/JWT proxy and none of the REST features above — it is roughly 70% cheaper per million calls.
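For comparison, the HTTP API route can be sketched in a couple of resources. This is a minimal illustration, not part of this module: it assumes an existing aws_lambda_function.orders resource and uses the provider's quick-create target argument, which produces a default route and an auto-deploying default stage. The resource and API names are hypothetical.

```hcl
# Quick-create HTTP API (v2): one resource yields the API, a default stage,
# and a catch-all route that proxies to the Lambda target.
resource "aws_apigatewayv2_api" "orders_http" {
  name          = "orders-http-api" # illustrative name
  protocol_type = "HTTP"
  target        = aws_lambda_function.orders.arn # assumed to exist elsewhere
}

# API Gateway still needs permission to invoke the function.
resource "aws_lambda_permission" "allow_http_api" {
  statement_id  = "AllowHttpApiInvoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.orders.function_name
  principal     = "apigateway.amazonaws.com"
  source_arn    = "${aws_apigatewayv2_api.orders_http.execution_arn}/*/*"
}
```

None of the REST-only features listed above (usage plans, VTL mapping, WAF, resource policies) are available on this path, which is the trade-off for the lower price.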
Module structure
terraform-module-aws-api-gateway/
├── versions.tf # provider + Terraform version pins
├── main.tf # rest_api, deployment, stage, method settings, logs, alarm
├── variables.tf # var-driven inputs with validation
└── outputs.tf # id, arns, invoke_url, stage name
versions.tf
terraform {
required_version = ">= 1.5.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
main.tf
locals {
# A stable hash of the API contract + key settings. Any change here forces a
# fresh deployment, which is the only reliable way to redeploy a REST API.
redeploy_trigger = sha1(jsonencode({
body = var.openapi_body
endpoint = var.endpoint_type
policy = var.resource_policy
}))
access_log_format = jsonencode({
requestId = "$context.requestId"
ip = "$context.identity.sourceIp"
requestTime = "$context.requestTime"
httpMethod = "$context.httpMethod"
resourcePath = "$context.resourcePath"
status = "$context.status"
protocol = "$context.protocol"
responseLength = "$context.responseLength"
integrationLatency = "$context.integrationLatency"
error = "$context.error.message"
})
}
resource "aws_api_gateway_rest_api" "this" {
name = var.name
description = var.description
# OpenAPI 3.0 document (with x-amazon-apigateway-integration blocks) defines
# the full resource/method/integration tree in one import.
body = var.openapi_body
# Body import is authoritative for the resource tree; let it overwrite drift.
put_rest_api_mode = "overwrite"
endpoint_configuration {
types = [var.endpoint_type]
vpc_endpoint_ids = var.endpoint_type == "PRIVATE" ? var.vpc_endpoint_ids : null
}
# Resource policy (required for PRIVATE; optional otherwise).
policy = var.resource_policy
# Smallest payload that may be sent compressed (bytes). null disables.
minimum_compression_size = var.minimum_compression_size
tags = var.tags
}
resource "aws_api_gateway_deployment" "this" {
rest_api_id = aws_api_gateway_rest_api.this.id
triggers = {
redeployment = local.redeploy_trigger
}
# Create the new deployment before destroying the old one so the stage is
# never pointed at a deleted deployment.
lifecycle {
create_before_destroy = true
}
}
# Access log group, encrypted at rest if a KMS key is supplied.
resource "aws_cloudwatch_log_group" "access" {
name = "/aws/apigateway/${var.name}/${var.stage_name}"
retention_in_days = var.log_retention_days
kms_key_id = var.log_kms_key_arn
tags = var.tags
}
resource "aws_api_gateway_stage" "this" {
rest_api_id = aws_api_gateway_rest_api.this.id
deployment_id = aws_api_gateway_deployment.this.id
stage_name = var.stage_name
xray_tracing_enabled = var.xray_tracing_enabled
cache_cluster_enabled = var.cache_cluster_enabled
cache_cluster_size = var.cache_cluster_enabled ? var.cache_cluster_size : null
access_log_settings {
destination_arn = aws_cloudwatch_log_group.access.arn
format = local.access_log_format
}
variables = var.stage_variables
tags = var.tags
}
# Stage-wide method settings ("*/*" matches every method on every resource): throttling + execution logging.
resource "aws_api_gateway_method_settings" "this" {
rest_api_id = aws_api_gateway_rest_api.this.id
stage_name = aws_api_gateway_stage.this.stage_name
method_path = "*/*" # all methods on all resources
settings {
metrics_enabled = true
logging_level = var.execution_logging_level
data_trace_enabled = var.data_trace_enabled
throttling_burst_limit = var.throttling_burst_limit
throttling_rate_limit = var.throttling_rate_limit
}
}
# Optional WAFv2 web ACL association (regional scope).
resource "aws_wafv2_web_acl_association" "this" {
count = var.web_acl_arn != null ? 1 : 0
resource_arn = aws_api_gateway_stage.this.arn
web_acl_arn = var.web_acl_arn
}
# Guardrail alarm on server-side errors for this stage.
resource "aws_cloudwatch_metric_alarm" "five_xx" {
count = var.create_5xx_alarm ? 1 : 0
alarm_name = "${var.name}-${var.stage_name}-5xx"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = 1
metric_name = "5XXError"
namespace = "AWS/ApiGateway"
period = 300
statistic = "Sum"
threshold = var.five_xx_alarm_threshold
alarm_description = "API Gateway 5XX errors for ${var.name}/${var.stage_name}"
treat_missing_data = "notBreaching"
dimensions = {
ApiName = var.name
Stage = var.stage_name
}
alarm_actions = var.alarm_sns_topic_arns
ok_actions = var.alarm_sns_topic_arns
tags = var.tags
}
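One prerequisite the listing above does not create: execution logging (logging_level set to ERROR or INFO) depends on API Gateway having a CloudWatch Logs role configured at the account level for the region, and an apply will typically fail without it. The following is a sketch of that prerequisite, assuming it is managed once per account and region outside this module. The role name and resource labels are illustrative.

```hcl
# Trust policy letting the API Gateway service assume the logging role.
data "aws_iam_policy_document" "apigw_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["apigateway.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "apigw_cloudwatch" {
  name               = "apigw-cloudwatch-logs" # illustrative name
  assume_role_policy = data.aws_iam_policy_document.apigw_assume.json
}

# AWS-managed policy that allows pushing logs to CloudWatch Logs.
resource "aws_iam_role_policy_attachment" "apigw_cloudwatch" {
  role       = aws_iam_role.apigw_cloudwatch.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonAPIGatewayPushToCloudWatchLogs"
}

# Account-level setting, scoped to the provider's region; it is shared by
# every REST API in that account and region, so define it in one place only.
resource "aws_api_gateway_account" "this" {
  cloudwatch_role_arn = aws_iam_role.apigw_cloudwatch.arn

  depends_on = [aws_iam_role_policy_attachment.apigw_cloudwatch]
}
```

Keeping this outside the module is a deliberate choice in this sketch: because the setting is shared, two module instances that each tried to manage it would overwrite one another.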
variables.tf
variable "name" {
description = "Name of the REST API (also used in log group + alarm naming)."
type = string
validation {
condition = can(regex("^[A-Za-z0-9-_]{1,128}$", var.name))
error_message = "name must be 1-128 chars: letters, digits, hyphen, underscore."
}
}
variable "description" {
description = "Human-readable description of the API."
type = string
default = "Managed by Terraform"
}
variable "openapi_body" {
description = "OpenAPI 3.0 document (JSON string) with x-amazon-apigateway-integration blocks defining the full API."
type = string
}
variable "endpoint_type" {
description = "Endpoint type: EDGE, REGIONAL, or PRIVATE."
type = string
default = "REGIONAL"
validation {
condition = contains(["EDGE", "REGIONAL", "PRIVATE"], var.endpoint_type)
error_message = "endpoint_type must be one of EDGE, REGIONAL, PRIVATE."
}
}
variable "vpc_endpoint_ids" {
description = "VPC endpoint IDs to bind when endpoint_type is PRIVATE."
type = list(string)
default = []
}
variable "resource_policy" {
description = "Resource policy JSON. Required for PRIVATE endpoints to scope access to your VPCE(s)."
type = string
default = null
}
variable "minimum_compression_size" {
description = "Minimum response size in bytes before gzip is applied (0-10485760). null disables compression."
type = number
default = null
validation {
condition = var.minimum_compression_size == null || (var.minimum_compression_size >= 0 && var.minimum_compression_size <= 10485760)
error_message = "minimum_compression_size must be between 0 and 10485760, or null."
}
}
variable "stage_name" {
description = "Name of the deployment stage (e.g. prod, staging)."
type = string
default = "prod"
validation {
condition = can(regex("^[A-Za-z0-9-_]{1,128}$", var.stage_name))
error_message = "stage_name must be 1-128 chars: letters, digits, hyphen, underscore."
}
}
variable "stage_variables" {
description = "Stage variables map (e.g. { lambdaAlias = \"live\" }) for use in integration URIs."
type = map(string)
default = {}
}
variable "xray_tracing_enabled" {
description = "Enable AWS X-Ray active tracing on the stage."
type = bool
default = true
}
variable "cache_cluster_enabled" {
description = "Enable the stage response cache (billed hourly while on)."
type = bool
default = false
}
variable "cache_cluster_size" {
description = "Cache cluster size in GB. Allowed: 0.5, 1.6, 6.1, 13.5, 28.4, 58.2, 118, 237."
type = string
default = "0.5"
validation {
condition = contains(["0.5", "1.6", "6.1", "13.5", "28.4", "58.2", "118", "237"], var.cache_cluster_size)
error_message = "cache_cluster_size must be one of the allowed API Gateway cache sizes."
}
}
variable "execution_logging_level" {
description = "Execution log level for all methods: OFF, ERROR, or INFO."
type = string
default = "INFO"
validation {
condition = contains(["OFF", "ERROR", "INFO"], var.execution_logging_level)
error_message = "execution_logging_level must be OFF, ERROR, or INFO."
}
}
variable "data_trace_enabled" {
description = "Log full request/response bodies in execution logs. Keep false in prod (may capture PII)."
type = bool
default = false
}
variable "throttling_burst_limit" {
description = "Stage-wide throttling burst limit (concurrent). -1 leaves account default."
type = number
default = 5000
}
variable "throttling_rate_limit" {
description = "Stage-wide steady-state throttling rate (requests/sec). -1 leaves account default."
type = number
default = 10000
}
variable "log_retention_days" {
description = "Retention for the access log group in days."
type = number
default = 30
validation {
condition = contains([1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1096, 1827, 2192, 2557, 2922, 3288, 3653], var.log_retention_days)
error_message = "log_retention_days must be a valid CloudWatch Logs retention value."
}
}
variable "log_kms_key_arn" {
description = "KMS key ARN to encrypt the access log group. null uses CloudWatch default encryption."
type = string
default = null
}
variable "web_acl_arn" {
description = "Regional WAFv2 web ACL ARN to associate with the stage. null skips association."
type = string
default = null
}
variable "create_5xx_alarm" {
description = "Create a CloudWatch alarm on stage 5XXError count."
type = bool
default = true
}
variable "five_xx_alarm_threshold" {
description = "5XXError sum (per 5 min) above which the alarm fires."
type = number
default = 5
}
variable "alarm_sns_topic_arns" {
description = "SNS topic ARNs to notify on alarm/OK transitions."
type = list(string)
default = []
}
variable "tags" {
description = "Tags applied to all taggable resources."
type = map(string)
default = {}
}
outputs.tf
output "rest_api_id" {
description = "ID of the REST API."
value = aws_api_gateway_rest_api.this.id
}
output "rest_api_name" {
description = "Name of the REST API."
value = aws_api_gateway_rest_api.this.name
}
output "rest_api_arn" {
description = "ARN of the REST API."
value = aws_api_gateway_rest_api.this.arn
}
output "root_resource_id" {
description = "Resource ID of the API's root path (useful for attaching extra resources)."
value = aws_api_gateway_rest_api.this.root_resource_id
}
output "execution_arn" {
description = "Execution ARN prefix used in Lambda permission source_arn (e.g. <execution_arn>/*/*/*)."
value = aws_api_gateway_rest_api.this.execution_arn
}
output "stage_name" {
description = "Deployed stage name."
value = aws_api_gateway_stage.this.stage_name
}
output "stage_arn" {
description = "ARN of the stage (used for WAF association / IAM)."
value = aws_api_gateway_stage.this.arn
}
output "invoke_url" {
description = "Base HTTPS URL clients call for the deployed stage."
value = aws_api_gateway_stage.this.invoke_url
}
output "access_log_group_name" {
description = "CloudWatch Logs group receiving stage access logs."
value = aws_cloudwatch_log_group.access.name
}
How to use it
# OpenAPI contract: a /orders GET proxied to a Lambda (AWS_PROXY integration).
locals {
orders_openapi = jsonencode({
openapi = "3.0.1"
info = { title = "orders-api", version = "1.0" }
paths = {
"/orders" = {
get = {
x-amazon-apigateway-integration = {
type = "AWS_PROXY"
httpMethod = "POST" # always POST for Lambda invocation
uri = aws_lambda_function.orders.invoke_arn # resolves to arn:aws:apigateway:<region>:lambda:path/2015-03-31/functions/<function-arn>/invocations
passthroughBehavior = "when_no_match"
payloadFormatVersion = "1.0"
}
responses = { "200" = { description = "OK" } }
}
}
}
})
}
module "api_gateway_rest_orders" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-api-gateway?ref=v1.0.0"
name = "orders-api"
description = "Public orders REST API"
endpoint_type = "REGIONAL"
stage_name = "prod"
openapi_body = local.orders_openapi
# Observability + guardrails
xray_tracing_enabled = true
log_retention_days = 90
web_acl_arn = aws_wafv2_web_acl.edge.arn
create_5xx_alarm = true
alarm_sns_topic_arns = [aws_sns_topic.alerts.arn]
# Throttle defensively for a public API
throttling_rate_limit = 2000
throttling_burst_limit = 1000
tags = {
team = "checkout"
environment = "prod"
}
}
# API Gateway must be granted permission to invoke the backing Lambda.
# The execution_arn output scopes the permission to exactly this API.
resource "aws_lambda_permission" "allow_apigw" {
statement_id = "AllowAPIGatewayInvoke"
action = "lambda:InvokeFunction"
function_name = aws_lambda_function.orders.function_name
principal = "apigateway.amazonaws.com"
source_arn = "${module.api_gateway_rest_orders.execution_arn}/*/*/*"
}
# Downstream: publish the live base URL to SSM for clients to discover.
resource "aws_ssm_parameter" "orders_api_url" {
name = "/checkout/orders-api/base-url"
type = "String"
value = module.api_gateway_rest_orders.invoke_url
}
With Terragrunt
Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.
1. Root config — live/terragrunt.hcl (inherited by every module):
remote_state {
backend = "s3"
generate = { path = "backend.tf", if_exists = "overwrite" }
config = {
# ...S3 state bucket + key per path...
}
}
2. Module config — live/prod/api_gateway/terragrunt.hcl:
include "root" {
path = find_in_parent_folders()
}
terraform {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-api-gateway?ref=v1.0.0"
}
inputs = {
name = "..."
openapi_body = "..."
}
3. Deploy one environment, or roll out all modules together:
cd live/prod/api_gateway && terragrunt apply # this module
terragrunt run-all apply # run from live/prod: every module beneath it
Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
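The root config shown earlier only covers the backend. A provider block can be generated from the same root file in a similar way. The sketch below assumes a single fixed region and a generated file named provider.tf; both are illustrative choices rather than something the module requires.

```hcl
# live/terragrunt.hcl (root): write provider.tf into every module directory
# before Terraform runs, so no module carries its own provider block.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "us-east-1"
}
EOF
}
```

With this in place, a hypothetical live/dev/api_gateway/terragrunt.hcl would differ from the prod file only in its inputs map, for example a different stage_name or a shorter log_retention_days.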
Inputs
| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| name | string | n/a | yes | Name of the REST API; also used in log group and alarm naming. |
| description | string | "Managed by Terraform" | no | Human-readable API description. |
| openapi_body | string | n/a | yes | OpenAPI 3.0 JSON document with x-amazon-apigateway-integration blocks. |
| endpoint_type | string | "REGIONAL" | no | EDGE, REGIONAL, or PRIVATE. |
| vpc_endpoint_ids | list(string) | [] | no | VPC endpoint IDs bound when endpoint_type = PRIVATE. |
| resource_policy | string | null | no | Resource policy JSON (required for PRIVATE). |
| minimum_compression_size | number | null | no | Min response bytes before gzip (0–10485760); null disables. |
| stage_name | string | "prod" | no | Deployment stage name. |
| stage_variables | map(string) | {} | no | Stage variables referenced in integration URIs. |
| xray_tracing_enabled | bool | true | no | Enable X-Ray active tracing on the stage. |
| cache_cluster_enabled | bool | false | no | Enable the stage response cache (billed hourly). |
| cache_cluster_size | string | "0.5" | no | Cache size in GB (allowed API Gateway sizes only). |
| execution_logging_level | string | "INFO" | no | Method execution log level: OFF, ERROR, INFO. |
| data_trace_enabled | bool | false | no | Log full request/response bodies (keep false in prod). |
| throttling_burst_limit | number | 5000 | no | Stage-wide burst limit; -1 keeps account default. |
| throttling_rate_limit | number | 10000 | no | Stage-wide steady-state rate (req/s); -1 keeps default. |
| log_retention_days | number | 30 | no | Access log group retention (valid CloudWatch value). |
| log_kms_key_arn | string | null | no | KMS key ARN to encrypt the access log group. |
| web_acl_arn | string | null | no | Regional WAFv2 web ACL ARN to associate with the stage. |
| create_5xx_alarm | bool | true | no | Create a CloudWatch alarm on stage 5XXError. |
| five_xx_alarm_threshold | number | 5 | no | 5XXError sum per 5 min that trips the alarm. |
| alarm_sns_topic_arns | list(string) | [] | no | SNS topics notified on alarm/OK. |
| tags | map(string) | {} | no | Tags applied to all taggable resources. |
Outputs
| Name | Description |
|---|---|
| rest_api_id | ID of the REST API. |
| rest_api_name | Name of the REST API. |
| rest_api_arn | ARN of the REST API. |
| root_resource_id | Resource ID of the API root path. |
| execution_arn | Execution ARN prefix for Lambda permission source_arn. |
| stage_name | Deployed stage name. |
| stage_arn | ARN of the stage (WAF association / IAM). |
| invoke_url | Base HTTPS URL clients call for the stage. |
| access_log_group_name | CloudWatch Logs group receiving access logs. |
Enterprise scenario
A retail platform team runs a multi-account AWS org where every product squad ships its own REST API. They standardise on this module so all 40+ APIs land with identical JSON access logs, 90-day retention, X-Ray tracing, and a regional WAF web ACL — satisfying the security team’s audit baseline without per-team review. Each squad supplies only its OpenAPI contract and Lambda ARNs; the platform’s CI passes the shared web_acl_arn, alarm_sns_topic_arns, and log_kms_key_arn so encryption, alerting, and edge protection are enforced centrally rather than left to each team to remember.
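A squad's call into the module under this model might look roughly like the following. This is a sketch under assumptions: the platform_* variable names, the inventory-api name, and the openapi.json.tftpl template path are all hypothetical, and the CI pipeline is assumed to supply the platform values.

```hcl
# Values injected by the platform pipeline (hypothetical variable names).
variable "platform_web_acl_arn" { type = string }
variable "platform_alarm_topic_arns" { type = list(string) }
variable "platform_log_kms_key_arn" { type = string }

# Squad-owned value: invoke ARN of the Lambda behind the API (hypothetical).
variable "inventory_lambda_invoke_arn" { type = string }

module "inventory_api" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-api-gateway?ref=v1.0.0"

  # Squad-owned inputs: the contract and nothing else.
  name = "inventory-api"
  openapi_body = templatefile("${path.module}/openapi.json.tftpl", {
    lambda_invoke_arn = var.inventory_lambda_invoke_arn
  })

  # Platform baseline, identical for every API.
  xray_tracing_enabled = true
  log_retention_days   = 90
  web_acl_arn          = var.platform_web_acl_arn
  alarm_sns_topic_arns = var.platform_alarm_topic_arns
  log_kms_key_arn      = var.platform_log_kms_key_arn

  tags = {
    team = "inventory"
  }
}
```

Splitting the inputs this way keeps the audit surface small: a reviewer only needs to confirm that the baseline block is untouched, since everything squad-specific lives in the contract template.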
Best practices
- Make redeploys explicit. The triggers hash over openapi_body plus create_before_destroy is what forces a clean redeployment on contract changes; never rely on API Gateway auto-detecting body edits, and never share a deployment across stages.
- Lock down PRIVATE APIs with a resource policy. A PRIVATE endpoint without a policy that pins your aws:sourceVpce is effectively open to any VPCE in the account; always pass both vpc_endpoint_ids and a scoped resource_policy.
- Throttle every public API and front it with WAF. Set sane throttling_rate_limit / throttling_burst_limit per stage and associate a regional web_acl_arn; API Gateway bills per request, so throttling is also a direct cost control against abuse.
- Keep data_trace_enabled = false in production. Full request/response logging is great for debugging but can write PII and secrets to CloudWatch; combine INFO execution logs with structured access logs instead, and encrypt the log group with log_kms_key_arn.
- Treat the cache as a deliberate cost decision. cache_cluster_enabled bills hourly per GB whether or not it is hit; enable it only for read-heavy, cacheable endpoints and right-size cache_cluster_size rather than defaulting it on everywhere.
- Name and tag consistently. Drive name, stage_name, and tags from your platform convention so the auto-generated log group (/aws/apigateway/<name>/<stage>) and 5xx alarm are discoverable and cost-attributable across accounts.
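The PRIVATE lock-down can be sketched as an allow statement paired with a deny on any request that does not arrive through the expected VPC endpoint. This is an illustration under assumptions: vpce_id and internal_openapi_body are hypothetical variables, and the policy uses the execute-api:/* resource shorthand that API Gateway resource policies accept.

```hcl
variable "vpce_id" {
  description = "Hypothetical: ID of the execute-api interface VPC endpoint."
  type        = string
}

variable "internal_openapi_body" {
  description = "Hypothetical: OpenAPI document for the internal API."
  type        = string
}

locals {
  private_api_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect    = "Allow"
        Principal = "*"
        Action    = "execute-api:Invoke"
        Resource  = "execute-api:/*"
      },
      {
        # Deny anything that does not arrive through the expected endpoint.
        Effect    = "Deny"
        Principal = "*"
        Action    = "execute-api:Invoke"
        Resource  = "execute-api:/*"
        Condition = {
          StringNotEquals = { "aws:SourceVpce" = var.vpce_id }
        }
      }
    ]
  })
}

module "internal_api" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-api-gateway?ref=v1.0.0"

  name             = "internal-api" # illustrative name
  openapi_body     = var.internal_openapi_body
  endpoint_type    = "PRIVATE"
  vpc_endpoint_ids = [var.vpce_id]
  resource_policy  = local.private_api_policy
}
```

The explicit deny is the part that does the pinning: an allow-only policy would still admit traffic from other endpoints that can reach the API.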