Terraform Module: AWS Aurora Cluster: production-ready provisioned or Serverless v2 clusters in one block

Quick take — A reusable hashicorp/aws ~> 5.0 Terraform module for Amazon Aurora (PostgreSQL/MySQL): provisioned or Serverless v2 clusters with KMS encryption, IAM auth, automated backups and cluster instances. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "aws" {
  region = "us-east-1"
}

module "aurora" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-aurora?ref=v1.0.0"

  name_prefix = "..."           # Prefix for resource names; becomes `<prefix>-aurora`.
  vpc_id      = "..."           # VPC for the cluster security group.
  subnet_ids  = ["...", "..."]  # Subnets (>= 2 AZs) for the DB subnet group.
}

Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.
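Once the apply finishes, you will usually want the connection details. A minimal sketch, assuming the module block is named aurora as in the Quickstart above; the root-level output names (aurora_writer_endpoint and so on) are illustrative, while the module outputs on the right-hand side are the ones the module defines in outputs.tf:

```hcl
# Root-level outputs that surface the module's connection details.
# The output names here are illustrative; rename them to suit your stack.
output "aurora_writer_endpoint" {
  description = "Read/write endpoint of the Quickstart cluster."
  value       = module.aurora.writer_endpoint
}

output "aurora_reader_endpoint" {
  description = "Read-only endpoint of the Quickstart cluster."
  value       = module.aurora.reader_endpoint
}

output "aurora_master_secret_arn" {
  description = "Secrets Manager ARN holding the generated master password."
  value       = module.aurora.master_user_secret_arn
}
```

After the apply, terraform output aurora_writer_endpoint prints the hostname to connect to.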

What this module is

Amazon Aurora is AWS’s MySQL- and PostgreSQL-compatible relational engine that decouples compute from a distributed, self-healing storage layer replicated six ways across three Availability Zones. Unlike plain RDS, an Aurora cluster is the top-level object: you create one aws_rds_cluster that owns the shared storage volume, the writer endpoint, and the reader endpoint, and then attach one or more aws_rds_cluster_instance compute nodes to it. The cluster also governs backups, encryption, the engine version, and (for Serverless v2) the autoscaling capacity range.

That two-tier shape — one cluster, N instances — is exactly why a raw Aurora setup is verbose and error-prone to hand-write. You need a subnet group, a security group, a parameter group at both the cluster and instance level, a KMS key for storage encryption, an enhanced-monitoring IAM role, and careful sequencing so instances don’t get created before the cluster exists. Copy-pasting that across a dozen services drifts fast. This module collapses the whole stack into a single, var-driven block: pick the engine, the instance class, the number of instances (or a Serverless v2 capacity range), and it wires up encryption, IAM database authentication, deletion protection, and a sane backup window with correct dependency ordering — emitting the endpoints and security-group id you need downstream.

When to use it

Module structure

terraform-module-aws-aurora/
├── versions.tf      # provider + Terraform version pins
├── main.tf          # cluster, instances, subnet group, SG, KMS, monitoring role
├── variables.tf     # all tunables, with validation
└── outputs.tf       # endpoints, ids, port, SG id, KMS key arn

versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

main.tf

locals {
  name = "${var.name_prefix}-aurora"

  # Aurora Serverless v2 is modelled as a normal provisioned cluster whose
  # instances use the special "db.serverless" class plus a capacity range.
  is_serverless  = var.serverless_v2 != null
  instance_class = local.is_serverless ? "db.serverless" : var.instance_class

  # Engine-aware default port: PostgreSQL 5432, MySQL 3306.
  default_port = startswith(var.engine, "aurora-postgresql") ? 5432 : 3306
  port         = coalesce(var.port, local.default_port)

  tags = merge(var.tags, {
    "ManagedBy" = "terraform"
    "Module"    = "terraform-module-aws-aurora"
  })
}

# --- KMS key for storage encryption (created only if no key is supplied) ---
resource "aws_kms_key" "this" {
  count = var.storage_encrypted && var.kms_key_id == null ? 1 : 0

  description             = "Storage encryption key for ${local.name}"
  deletion_window_in_days = 14
  enable_key_rotation     = true
  tags                    = local.tags
}

resource "aws_kms_alias" "this" {
  count = var.storage_encrypted && var.kms_key_id == null ? 1 : 0

  name          = "alias/${local.name}"
  target_key_id = aws_kms_key.this[0].key_id
}

# --- Networking: subnet group + cluster security group ---
resource "aws_db_subnet_group" "this" {
  name       = local.name
  subnet_ids = var.subnet_ids
  tags       = local.tags
}

resource "aws_security_group" "this" {
  name_prefix = "${local.name}-"
  description = "Aurora cluster ${local.name} access"
  vpc_id      = var.vpc_id
  tags        = local.tags

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_vpc_security_group_ingress_rule" "this" {
  for_each = toset(var.allowed_cidr_blocks)

  security_group_id = aws_security_group.this.id
  description       = "DB access from ${each.value}"
  cidr_ipv4         = each.value
  from_port         = local.port
  to_port           = local.port
  ip_protocol       = "tcp"
}

resource "aws_vpc_security_group_egress_rule" "this" {
  security_group_id = aws_security_group.this.id
  description       = "Allow all egress"
  cidr_ipv4         = "0.0.0.0/0"
  ip_protocol       = "-1"
}

# --- Enhanced monitoring role (only when interval > 0) ---
data "aws_iam_policy_document" "monitoring_assume" {
  count = var.monitoring_interval > 0 ? 1 : 0

  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["monitoring.rds.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "monitoring" {
  count = var.monitoring_interval > 0 ? 1 : 0

  name_prefix        = "${var.name_prefix}-rds-mon-"
  assume_role_policy = data.aws_iam_policy_document.monitoring_assume[0].json
  tags               = local.tags
}

resource "aws_iam_role_policy_attachment" "monitoring" {
  count = var.monitoring_interval > 0 ? 1 : 0

  role       = aws_iam_role.monitoring[0].name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonRDSEnhancedMonitoringRole"
}

# --- The Aurora cluster (shared storage, endpoints, backups) ---
resource "aws_rds_cluster" "this" {
  cluster_identifier = local.name
  engine             = var.engine
  engine_mode        = "provisioned" # Serverless v2 uses provisioned mode + db.serverless
  engine_version     = var.engine_version
  database_name      = var.database_name
  port               = local.port

  master_username             = var.master_username
  manage_master_user_password = true # store/rotate the master secret in Secrets Manager

  db_subnet_group_name   = aws_db_subnet_group.this.name
  vpc_security_group_ids = [aws_security_group.this.id]

  storage_encrypted = var.storage_encrypted
  kms_key_id        = var.storage_encrypted ? coalesce(var.kms_key_id, try(aws_kms_key.this[0].arn, null)) : null

  iam_database_authentication_enabled = var.iam_database_authentication_enabled

  backup_retention_period      = var.backup_retention_period
  preferred_backup_window      = var.preferred_backup_window
  preferred_maintenance_window = var.preferred_maintenance_window
  copy_tags_to_snapshot        = true

  deletion_protection       = var.deletion_protection
  skip_final_snapshot       = var.skip_final_snapshot
  final_snapshot_identifier = var.skip_final_snapshot ? null : "${local.name}-final-${formatdate("YYYYMMDDhhmmss", timestamp())}"

  enabled_cloudwatch_logs_exports = var.enabled_cloudwatch_logs_exports
  apply_immediately               = var.apply_immediately

  dynamic "serverlessv2_scaling_configuration" {
    for_each = local.is_serverless ? [var.serverless_v2] : []
    content {
      min_capacity = serverlessv2_scaling_configuration.value.min_capacity
      max_capacity = serverlessv2_scaling_configuration.value.max_capacity
    }
  }

  tags = local.tags

  lifecycle {
    # final_snapshot_identifier embeds a timestamp; ignore so plans stay clean.
    ignore_changes = [final_snapshot_identifier]
  }
}

# --- Cluster instances (writer + readers) ---
resource "aws_rds_cluster_instance" "this" {
  count = var.instance_count

  identifier         = "${local.name}-${count.index}"
  cluster_identifier = aws_rds_cluster.this.id
  engine             = aws_rds_cluster.this.engine
  engine_version     = aws_rds_cluster.this.engine_version
  instance_class     = local.instance_class

  db_subnet_group_name = aws_db_subnet_group.this.name
  publicly_accessible  = false

  monitoring_interval = var.monitoring_interval
  monitoring_role_arn = var.monitoring_interval > 0 ? aws_iam_role.monitoring[0].arn : null

  performance_insights_enabled          = var.performance_insights_enabled
  performance_insights_kms_key_id       = var.performance_insights_enabled && var.storage_encrypted ? aws_rds_cluster.this.kms_key_id : null
  performance_insights_retention_period = var.performance_insights_enabled ? var.performance_insights_retention_period : null

  auto_minor_version_upgrade = var.auto_minor_version_upgrade
  apply_immediately          = var.apply_immediately

  tags = local.tags
}

variables.tf

variable "name_prefix" {
  type        = string
  description = "Prefix for all cluster resource names (e.g. \"orders-prod\"). Becomes \"<prefix>-aurora\"."

  validation {
    condition     = can(regex("^[a-z][a-z0-9-]{1,30}$", var.name_prefix))
    error_message = "name_prefix must be 2-31 chars, lowercase alphanumeric or hyphen, starting with a letter."
  }
}

variable "engine" {
  type        = string
  description = "Aurora engine: aurora-postgresql or aurora-mysql."
  default     = "aurora-postgresql"

  validation {
    condition     = contains(["aurora-postgresql", "aurora-mysql"], var.engine)
    error_message = "engine must be aurora-postgresql or aurora-mysql."
  }
}

variable "engine_version" {
  type        = string
  description = "Engine version, e.g. \"16.4\" for PostgreSQL or \"8.0.mysql_aurora.3.07.1\" for MySQL. The default only suits aurora-postgresql; set it explicitly for aurora-mysql."
  default     = "16.4"
}

variable "database_name" {
  type        = string
  description = "Name of the initial database created in the cluster."
  default     = "appdb"
}

variable "master_username" {
  type        = string
  description = "Master (admin) username. The password is generated and stored in Secrets Manager via manage_master_user_password."
  default     = "dbadmin"
}

variable "vpc_id" {
  type        = string
  description = "VPC in which the cluster security group is created."
}

variable "subnet_ids" {
  type        = list(string)
  description = "Subnet IDs (ideally private, in >= 2 AZs) for the DB subnet group."

  validation {
    condition     = length(var.subnet_ids) >= 2
    error_message = "Aurora requires subnets in at least two Availability Zones."
  }
}

variable "allowed_cidr_blocks" {
  type        = list(string)
  description = "CIDR blocks allowed to connect on the DB port. Keep this tight (app subnets only)."
  default     = []
}

variable "port" {
  type        = number
  description = "Override the DB port. Null = engine default (5432 PostgreSQL / 3306 MySQL)."
  default     = null
}

variable "instance_class" {
  type        = string
  description = "Instance class for provisioned mode (e.g. db.r6g.large). Ignored when serverless_v2 is set."
  default     = "db.r6g.large"
}

variable "instance_count" {
  type        = number
  description = "Number of cluster instances. 1 = writer only; >= 2 adds reader replica(s) for HA."
  default     = 2

  validation {
    condition     = var.instance_count >= 1 && var.instance_count <= 15
    error_message = "instance_count must be between 1 and 15 (a cap set by this module; Aurora itself allows one writer plus up to 15 readers)."
  }
}

variable "serverless_v2" {
  type = object({
    min_capacity = number
    max_capacity = number
  })
  description = "Set to enable Aurora Serverless v2 (instances use db.serverless). Capacity in ACUs, e.g. { min_capacity = 0.5, max_capacity = 8 }. Null = provisioned."
  default     = null

  validation {
    condition = var.serverless_v2 == null || (
      try(var.serverless_v2.min_capacity, 0) >= 0 &&
      try(var.serverless_v2.max_capacity, 0) <= 256 &&
      try(var.serverless_v2.min_capacity, 1) <= try(var.serverless_v2.max_capacity, 0)
    )
    error_message = "Serverless v2 capacity must be 0-256 ACUs and min_capacity <= max_capacity."
  }
}

variable "storage_encrypted" {
  type        = bool
  description = "Encrypt the shared cluster storage with KMS. Strongly recommended; cannot be changed after creation."
  default     = true
}

variable "kms_key_id" {
  type        = string
  description = "ARN of an existing KMS key for storage encryption. Null = the module creates a dedicated key."
  default     = null
}

variable "iam_database_authentication_enabled" {
  type        = bool
  description = "Allow IAM-token authentication to the database (in addition to native auth)."
  default     = true
}

variable "backup_retention_period" {
  type        = number
  description = "Days to retain automated backups (1-35)."
  default     = 7

  validation {
    condition     = var.backup_retention_period >= 1 && var.backup_retention_period <= 35
    error_message = "backup_retention_period must be between 1 and 35 days."
  }
}

variable "preferred_backup_window" {
  type        = string
  description = "Daily backup window in UTC, format hh24:mi-hh24:mi."
  default     = "03:00-04:00"
}

variable "preferred_maintenance_window" {
  type        = string
  description = "Weekly maintenance window in UTC, format ddd:hh24:mi-ddd:hh24:mi."
  default     = "sun:04:30-sun:05:30"
}

variable "enabled_cloudwatch_logs_exports" {
  type        = list(string)
  description = "Log types to export to CloudWatch. PostgreSQL: [\"postgresql\"]; MySQL: [\"audit\",\"error\",\"slowquery\"]. The default only suits aurora-postgresql; override it for aurora-mysql."
  default     = ["postgresql"]
}

variable "monitoring_interval" {
  type        = number
  description = "Enhanced Monitoring granularity in seconds (0 disables; valid: 0,1,5,10,15,30,60)."
  default     = 60

  validation {
    condition     = contains([0, 1, 5, 10, 15, 30, 60], var.monitoring_interval)
    error_message = "monitoring_interval must be one of 0, 1, 5, 10, 15, 30, 60."
  }
}

variable "performance_insights_enabled" {
  type        = bool
  description = "Enable Performance Insights on each instance."
  default     = true
}

variable "performance_insights_retention_period" {
  type        = number
  description = "Performance Insights retention in days: 7 (free tier), a multiple of 31 for 1-23 months (31, 62, ... 713), or 731."
  default     = 7
}

variable "auto_minor_version_upgrade" {
  type        = bool
  description = "Apply minor engine upgrades automatically during the maintenance window."
  default     = true
}

variable "deletion_protection" {
  type        = bool
  description = "Block accidental deletion of the cluster. Keep true in production."
  default     = true
}

variable "skip_final_snapshot" {
  type        = bool
  description = "Skip the final snapshot on destroy. Keep false in production."
  default     = false
}

variable "apply_immediately" {
  type        = bool
  description = "Apply modifications immediately instead of during the maintenance window (may cause downtime)."
  default     = false
}

variable "tags" {
  type        = map(string)
  description = "Additional tags applied to every resource."
  default     = {}
}

outputs.tf

output "cluster_id" {
  description = "The RDS cluster identifier."
  value       = aws_rds_cluster.this.id
}

output "cluster_arn" {
  description = "ARN of the Aurora cluster."
  value       = aws_rds_cluster.this.arn
}

output "cluster_resource_id" {
  description = "Immutable cluster resource id (use in IAM rds-db:connect policies)."
  value       = aws_rds_cluster.this.cluster_resource_id
}

output "writer_endpoint" {
  description = "Cluster (writer) endpoint — use for read/write connections."
  value       = aws_rds_cluster.this.endpoint
}

output "reader_endpoint" {
  description = "Load-balanced reader endpoint — use for read-only connections."
  value       = aws_rds_cluster.this.reader_endpoint
}

output "port" {
  description = "Port the database listens on."
  value       = aws_rds_cluster.this.port
}

output "database_name" {
  description = "Name of the initial database."
  value       = aws_rds_cluster.this.database_name
}

output "master_user_secret_arn" {
  description = "ARN of the Secrets Manager secret holding the master password."
  value       = try(aws_rds_cluster.this.master_user_secret[0].secret_arn, null)
}

output "security_group_id" {
  description = "ID of the cluster security group (attach app rules or reference downstream)."
  value       = aws_security_group.this.id
}

output "kms_key_arn" {
  description = "ARN of the KMS key used for storage encryption (module-created or supplied)."
  value       = aws_rds_cluster.this.kms_key_id
}

output "instance_identifiers" {
  description = "Identifiers of all cluster instances (writer + readers)."
  value       = aws_rds_cluster_instance.this[*].identifier
}
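The security_group_id output is meant to be referenced downstream. As a sketch of one way to do that, the rule below admits traffic from an application security group rather than a CIDR range. It assumes the module block is named aurora_cluster (as in the usage example in this article) and that app_security_group_id is a hypothetical variable you define yourself; neither the variable nor the rule is part of the module:

```hcl
# Hypothetical input: the security group attached to the application tasks.
variable "app_security_group_id" {
  type        = string
  description = "Security group attached to the application tasks."
}

# Allow the application security group to reach the cluster on the DB port.
resource "aws_vpc_security_group_ingress_rule" "app_to_aurora" {
  security_group_id            = module.aurora_cluster.security_group_id
  referenced_security_group_id = var.app_security_group_id
  description                  = "DB access from the application security group"
  from_port                    = module.aurora_cluster.port
  to_port                      = module.aurora_cluster.port
  ip_protocol                  = "tcp"
}
```

Because the module manages its own ingress with standalone aws_vpc_security_group_ingress_rule resources rather than inline rules, an extra rule like this one should coexist with the module's rules without the two fighting over the group.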

How to use it

module "aurora_cluster" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-aurora?ref=v1.0.0"

  name_prefix    = "orders-prod"
  engine         = "aurora-postgresql"
  engine_version = "16.4"
  database_name  = "orders"

  vpc_id     = module.network.vpc_id
  subnet_ids = module.network.private_subnet_ids

  # Only the application tier may reach the database.
  allowed_cidr_blocks = [module.network.app_subnet_cidr]

  # Provisioned writer + one reader on Graviton.
  instance_class = "db.r6g.large"
  instance_count = 2

  backup_retention_period         = 14
  enabled_cloudwatch_logs_exports = ["postgresql"]
  deletion_protection             = true

  tags = {
    Team        = "commerce"
    Environment = "prod"
    CostCenter  = "CC-4412"
  }
}

# Downstream: hand the writer endpoint + master secret to the app task definition.
resource "aws_ssm_parameter" "db_endpoint" {
  name  = "/orders/prod/db/writer_endpoint"
  type  = "String"
  value = module.aurora_cluster.writer_endpoint
}

# Grant the ECS task role permission to fetch the rotated master credentials.
data "aws_iam_policy_document" "read_db_secret" {
  statement {
    actions   = ["secretsmanager:GetSecretValue"]
    resources = [module.aurora_cluster.master_user_secret_arn]
  }
}
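Since iam_database_authentication_enabled defaults to true, the same task role can instead connect with short-lived IAM tokens. The following is a sketch of the matching policy document built from the cluster_resource_id output. The database user orders_app is hypothetical: it has to exist inside the database and, on PostgreSQL, be granted the rds_iam role. The ARN hard-codes the standard aws partition, which is an assumption that does not hold in GovCloud or China regions:

```hcl
data "aws_region" "current" {}
data "aws_caller_identity" "current" {}

# Lets the attached principal request IAM auth tokens for one database user.
data "aws_iam_policy_document" "db_connect" {
  statement {
    actions = ["rds-db:connect"]
    resources = [
      # Format: arn:aws:rds-db:<region>:<account>:dbuser:<cluster-resource-id>/<db-user>
      # "orders_app" is a hypothetical database user name.
      "arn:aws:rds-db:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:dbuser:${module.aurora_cluster.cluster_resource_id}/orders_app",
    ]
  }
}
```

Attach the resulting JSON to the task role the same way as the secret-reading policy above.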

For a spiky internal service, replace instance_class with a Serverless v2 capacity range and keep instance_count. Everything else stays identical:

  serverless_v2 = {
    min_capacity = 0.5
    max_capacity = 8
  }
  instance_count = 2 # one writer + one reader, both db.serverless
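Switching engines takes a little more care, because three defaults are PostgreSQL-specific: engine, engine_version, and enabled_cloudwatch_logs_exports. A sketch of an Aurora MySQL call follows, using the example version string and log types from the variable descriptions. The module name aurora_mysql and the prefix catalog-dev are illustrative, and the version string should be checked against what your region currently offers:

```hcl
module "aurora_mysql" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-aurora?ref=v1.0.0"

  name_prefix = "catalog-dev" # illustrative prefix
  vpc_id      = module.network.vpc_id
  subnet_ids  = module.network.private_subnet_ids

  # These three inputs must change together when leaving PostgreSQL.
  engine                          = "aurora-mysql"
  engine_version                  = "8.0.mysql_aurora.3.07.1"
  enabled_cloudwatch_logs_exports = ["audit", "error", "slowquery"]
}
```

The port needs no override: with port left null, the module's default_port local resolves to 3306 for any engine that is not aurora-postgresql.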

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config: live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "s3"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...s3 state bucket/container + key per path...
  }
}

2. Module config: live/prod/aurora/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-aurora?ref=v1.0.0"
}

inputs = {
  name_prefix = "..."
  vpc_id      = "..."
  subnet_ids  = ["...", "..."]
}

3. Deploy one environment, or roll out all modules together:

(cd live/prod/aurora && terragrunt apply)   # this module only
(cd live/prod && terragrunt run-all apply)  # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
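To make the per-environment override and the cross-module dependency concrete, here is a sketch of what a dev unit might look like. The path live/dev/aurora/terragrunt.hcl, the sibling ../network unit, and its vpc_id and private_subnet_ids outputs are all assumptions for illustration; only the Aurora input names come from this module:

```hcl
# live/dev/aurora/terragrunt.hcl (hypothetical path)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-aurora?ref=v1.0.0"
}

# Assumes a sibling unit at live/dev/network that exposes these outputs.
dependency "network" {
  config_path = "../network"
}

inputs = {
  name_prefix = "orders-dev"
  vpc_id      = dependency.network.outputs.vpc_id
  subnet_ids  = dependency.network.outputs.private_subnet_ids

  # Dev-only overrides; everything else keeps the module defaults.
  instance_count      = 1
  deletion_protection = false
  skip_final_snapshot = true
}
```

With the dependency block in place, run-all applies the network unit before this one, and the inputs block is the only thing that differs from the prod unit.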

Inputs

| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| name_prefix | string | (none) | Yes | Prefix for resource names; becomes <prefix>-aurora. |
| engine | string | "aurora-postgresql" | No | aurora-postgresql or aurora-mysql. |
| engine_version | string | "16.4" | No | Engine version string. |
| database_name | string | "appdb" | No | Initial database name. |
| master_username | string | "dbadmin" | No | Admin user; password managed in Secrets Manager. |
| vpc_id | string | (none) | Yes | VPC for the cluster security group. |
| subnet_ids | list(string) | (none) | Yes | Subnets (>= 2 AZs) for the DB subnet group. |
| allowed_cidr_blocks | list(string) | [] | No | CIDRs allowed on the DB port. |
| port | number | null | No | Override port; null = engine default. |
| instance_class | string | "db.r6g.large" | No | Provisioned instance class (ignored for Serverless v2). |
| instance_count | number | 2 | No | Number of cluster instances (1–15). |
| serverless_v2 | object({min_capacity, max_capacity}) | null | No | Enable Serverless v2 with an ACU range. |
| storage_encrypted | bool | true | No | KMS-encrypt cluster storage. |
| kms_key_id | string | null | No | Existing KMS key ARN; null = module creates one. |
| iam_database_authentication_enabled | bool | true | No | Allow IAM-token DB auth. |
| backup_retention_period | number | 7 | No | Automated backup retention (1–35 days). |
| preferred_backup_window | string | "03:00-04:00" | No | UTC backup window. |
| preferred_maintenance_window | string | "sun:04:30-sun:05:30" | No | UTC maintenance window. |
| enabled_cloudwatch_logs_exports | list(string) | ["postgresql"] | No | Log types exported to CloudWatch. |
| monitoring_interval | number | 60 | No | Enhanced Monitoring seconds (0 disables). |
| performance_insights_enabled | bool | true | No | Enable Performance Insights per instance. |
| performance_insights_retention_period | number | 7 | No | PI retention in days. |
| auto_minor_version_upgrade | bool | true | No | Auto-apply minor upgrades in maintenance window. |
| deletion_protection | bool | true | No | Block accidental cluster deletion. |
| skip_final_snapshot | bool | false | No | Skip final snapshot on destroy. |
| apply_immediately | bool | false | No | Apply changes now vs. maintenance window. |
| tags | map(string) | {} | No | Extra tags for all resources. |

Outputs

| Name | Description |
|---|---|
| cluster_id | The RDS cluster identifier. |
| cluster_arn | ARN of the Aurora cluster. |
| cluster_resource_id | Immutable resource id for rds-db:connect IAM policies. |
| writer_endpoint | Cluster (writer) endpoint for read/write traffic. |
| reader_endpoint | Load-balanced reader endpoint for read-only traffic. |
| port | Database listening port. |
| database_name | Name of the initial database. |
| master_user_secret_arn | Secrets Manager ARN holding the master password. |
| security_group_id | ID of the cluster security group. |
| kms_key_arn | KMS key ARN used for storage encryption. |
| instance_identifiers | Identifiers of writer + reader instances. |

Enterprise scenario

A commerce platform team runs the orders service on Aurora PostgreSQL across three AZs. They consume this module with instance_count = 3 (one writer, two readers) so the checkout API can route read-heavy catalog and order-history queries to the reader_endpoint while writes go to the writer_endpoint. Storage encryption, deletion_protection, and IAM database authentication are on by default, and the team raises backup_retention_period to 14, so an audit never finds an unencrypted or unprotected production database. Their non-prod stacks reuse the same module call but pass serverless_v2 = { min_capacity = 0.5, max_capacity = 4 }, so QA and staging clusters scale down to a fraction of an ACU overnight, which in this scenario cuts the lower-environment database bill by roughly 60% without any code divergence from production.
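In plain Terraform, the prod versus non-prod split in this scenario could be expressed as a single module call driven by one switch. This is only a sketch: the environment variable and the is_prod local are hypothetical names, and module.network stands in for whatever supplies the VPC and subnets:

```hcl
# Hypothetical switch; not an input of the Aurora module itself.
variable "environment" {
  type        = string
  description = "Environment name: dev, stage, or prod."
}

locals {
  is_prod = var.environment == "prod"
}

module "aurora_cluster" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-aurora?ref=v1.0.0"

  name_prefix = "orders-${var.environment}"
  vpc_id      = module.network.vpc_id
  subnet_ids  = module.network.private_subnet_ids

  # Prod: one writer plus two provisioned readers, longer backup retention.
  instance_count          = local.is_prod ? 3 : 1
  backup_retention_period = local.is_prod ? 14 : 7

  # Non-prod: Serverless v2 range; null keeps prod on provisioned instances.
  serverless_v2 = local.is_prod ? null : {
    min_capacity = 0.5
    max_capacity = 4
  }
}
```

Leaving serverless_v2 as null in prod means instance_class falls back to its default of db.r6g.large, so the production shape stays unchanged.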

Best practices

