
Terraform Module: DynamoDB Accelerator (DAX) — Encrypted, Multi-Node Microsecond Caching Without the Footguns

Quick take — A reusable hashicorp/aws ~> 5.0 Terraform module for aws_dax_cluster covering encryption at rest, TLS endpoint encryption, a multi-node replication factor, an IAM role for DynamoDB access, and a private subnet group — production defaults baked in. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "aws" {
  region = "us-east-1"
}

module "dax" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-dax?ref=v1.0.0"

  cluster_name        = "..."           # DAX cluster name (lowercased by AWS).
  subnet_ids          = ["...", "..."]  # >= 2 private subnets across >= 2 AZs.
  security_group_ids  = ["..."]         # Security groups allowing port 9111 (the TLS port) from clients.
  dynamodb_table_arns = ["..."]         # Tables DAX is allowed to read/write on your behalf.
}

Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.
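
The security_group_ids input expects a group that already admits your clients. Because this module fixes cluster_endpoint_encryption_type to "TLS", clients connect on port 9111 (8111 is the unencrypted port). A minimal sketch of such a group follows. The variable names vpc_id and client_security_group_id and both resource names are hypothetical and are not part of the module.

```hcl
# Hypothetical inputs for this sketch; the module itself does not define them.
variable "vpc_id" {
  type = string
}

variable "client_security_group_id" {
  type = string
}

# Security group attached to the DAX nodes.
resource "aws_security_group" "dax" {
  name_prefix = "dax-cluster-"
  description = "Client access to the DAX cluster"
  vpc_id      = var.vpc_id
}

# Allow only the application's security group to reach the TLS endpoint.
resource "aws_vpc_security_group_ingress_rule" "dax_tls" {
  security_group_id            = aws_security_group.dax.id
  referenced_security_group_id = var.client_security_group_id
  ip_protocol                  = "tcp"
  from_port                    = 9111
  to_port                      = 9111
  description                  = "DAX TLS endpoint from application clients"
}
```

Pass aws_security_group.dax.id in security_group_ids. Referencing the client security group rather than a CIDR range keeps the rule valid when client IPs change.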

What this module is

DynamoDB Accelerator (DAX) (aws_dax_cluster) is a fully managed, in-memory write-through cache that sits in front of DynamoDB and turns single-digit-millisecond reads into microsecond reads for cache hits. Applications talk to DAX using a drop-in DAX client instead of the DynamoDB SDK, and DAX transparently serves cached item and query results while forwarding writes through to the underlying tables. It runs as a cluster of nodes spread across Availability Zones, and at runtime it assumes an IAM role to access DynamoDB on your behalf — so the cache, not your application, holds the DynamoDB permissions.

The resource is small but its defaults are the wrong ones for production. replication_factor can be set to 1, which gives you a single-node cluster with no failover — a node loss means a cold cache and an instant load spike on DynamoDB. server_side_encryption defaults to off, so cached items sit unencrypted at rest, and cluster_endpoint_encryption_type defaults to NONE, so the client→DAX connection is plaintext. The IAM role is mandatory and easy to over-scope: a lazy dynamodb:* on * hands the cache far more than it needs. Finally, DAX must live in a subnet group of private subnets, which is a separate resource people forget, and tuning the cache TTLs requires a parameter group, a third resource.

Wrapping all of this in a module encodes the correct posture once: a multi-node replication_factor (defaulting to 3 for AZ-resilient production clusters), server_side_encryption { enabled = true }, cluster_endpoint_encryption_type = "TLS", a private subnet group, a tunable parameter group for item/query TTLs, and a least-privilege IAM role scoped to exactly the DynamoDB tables you name. App teams hand the module a name, subnets, security groups, and a list of table ARNs, and they get an encrypted, highly available cache that passes a security review.

When to use it

DAX shines specifically for repeated, eventually consistent reads of a hot working set. Skip it and reach for DynamoDB on-demand with no cache when your read pattern is unpredictable and uncacheable, or when every read must be strongly consistent. DAX serves eventually consistent reads from cache, while strongly consistent reads pass through to DynamoDB and bypass the cache entirely, so a workload that is overwhelmingly strongly consistent gets little benefit.

Module structure

terraform-module-aws-dax/
├── versions.tf      # provider + Terraform version pins
├── main.tf          # IAM role, subnet group, parameter group, cluster
├── variables.tf     # var-driven inputs with validations
└── outputs.tf       # cluster ARN, endpoints, role ARN, and key attributes

versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

main.tf

locals {
  tags = merge(
    {
      "Name"      = var.cluster_name
      "ManagedBy" = "terraform"
      "Module"    = "terraform-module-aws-dax"
    },
    var.tags,
  )

  # Use the caller-supplied role if given, otherwise create a least-privilege
  # role scoped to exactly the named DynamoDB table ARNs.
  create_role = var.iam_role_arn == null
  role_arn    = local.create_role ? aws_iam_role.this[0].arn : var.iam_role_arn
}

# Trust policy: only the DAX service may assume this role.
data "aws_iam_policy_document" "assume" {
  count = local.create_role ? 1 : 0

  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["dax.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "this" {
  count = local.create_role ? 1 : 0

  name_prefix        = "${var.cluster_name}-dax-"
  assume_role_policy = data.aws_iam_policy_document.assume[0].json
  tags               = local.tags
}

# Least-privilege DynamoDB access, scoped to the named tables and their indexes.
data "aws_iam_policy_document" "dynamodb" {
  count = local.create_role ? 1 : 0

  statement {
    sid    = "DaxDynamoDBAccess"
    effect = "Allow"
    actions = [
      "dynamodb:GetItem",
      "dynamodb:BatchGetItem",
      "dynamodb:Query",
      "dynamodb:Scan",
      "dynamodb:PutItem",
      "dynamodb:UpdateItem",
      "dynamodb:DeleteItem",
      "dynamodb:BatchWriteItem",
      "dynamodb:ConditionCheckItem",
      "dynamodb:DescribeTable",
    ]
    resources = concat(
      var.dynamodb_table_arns,
      [for arn in var.dynamodb_table_arns : "${arn}/index/*"],
    )
  }
}

resource "aws_iam_role_policy" "dynamodb" {
  count = local.create_role ? 1 : 0

  name   = "${var.cluster_name}-dynamodb-access"
  role   = aws_iam_role.this[0].id
  policy = data.aws_iam_policy_document.dynamodb[0].json
}

# DAX must live in private subnets across multiple AZs.
resource "aws_dax_subnet_group" "this" {
  name        = "${var.cluster_name}-subnet-group"
  description = "Subnet group for ${var.cluster_name} managed by Terraform"
  subnet_ids  = var.subnet_ids
}

# Parameter group controls item/query TTLs for the cache.
resource "aws_dax_parameter_group" "this" {
  name        = "${var.cluster_name}-params"
  description = "Parameter group for ${var.cluster_name} managed by Terraform"

  dynamic "parameters" {
    for_each = var.parameters
    content {
      name  = parameters.value.name
      value = parameters.value.value
    }
  }
}

resource "aws_dax_cluster" "this" {
  cluster_name       = var.cluster_name
  node_type          = var.node_type
  replication_factor = var.replication_factor

  iam_role_arn = local.role_arn

  subnet_group_name    = aws_dax_subnet_group.this.name
  security_group_ids   = var.security_group_ids
  parameter_group_name = aws_dax_parameter_group.this.name
  availability_zones   = var.availability_zones

  # Encryption is always on: at rest and in transit (TLS) to the endpoint.
  server_side_encryption {
    enabled = true
  }
  cluster_endpoint_encryption_type = "TLS"

  maintenance_window     = var.maintenance_window
  notification_topic_arn = var.notification_topic_arn
  description            = var.description

  tags = local.tags
}

variables.tf

variable "cluster_name" {
  description = "DAX cluster name (AWS lowercases it). Letters, digits, hyphens; starts with a letter."
  type        = string

  validation {
    condition     = can(regex("^[a-z][a-z0-9-]{0,19}$", var.cluster_name)) && !can(regex("--|-$", var.cluster_name))
    error_message = "cluster_name must start with a lowercase letter, be 1-20 lowercase alphanumeric or hyphen characters, and must not end with a hyphen or contain consecutive hyphens."
  }
}

variable "node_type" {
  description = "Compute/memory capacity per node, e.g. dax.t3.small, dax.r5.large."
  type        = string
  default     = "dax.t3.small"

  validation {
    condition     = can(regex("^dax\\.", var.node_type))
    error_message = "node_type must start with 'dax.' (e.g. dax.t3.small)."
  }
}

variable "replication_factor" {
  description = "Number of nodes in the cluster. Use >= 3 in production for AZ-resilient failover."
  type        = number
  default     = 3

  validation {
    condition     = var.replication_factor >= 1 && var.replication_factor <= 10
    error_message = "replication_factor must be between 1 and 10; use >= 3 for production."
  }
}

variable "subnet_ids" {
  description = "At least two private subnet IDs across two AZs for the DAX subnet group."
  type        = list(string)

  validation {
    condition     = length(var.subnet_ids) >= 2
    error_message = "subnet_ids must contain at least two subnets across two availability zones."
  }
}

variable "security_group_ids" {
  description = "Security group IDs for the cluster (allow client ingress on 8111 plaintext / 9111 TLS)."
  type        = list(string)

  validation {
    condition     = length(var.security_group_ids) > 0
    error_message = "At least one security group ID is required."
  }
}

variable "dynamodb_table_arns" {
  description = "DynamoDB table ARNs DAX is allowed to access. Used to build the least-privilege role."
  type        = list(string)

  validation {
    condition     = length(var.dynamodb_table_arns) > 0
    error_message = "At least one DynamoDB table ARN is required to scope the DAX role."
  }
}

variable "iam_role_arn" {
  description = "Existing IAM role ARN for DynamoDB access. Null lets the module create a least-privilege role."
  type        = string
  default     = null
}

variable "availability_zones" {
  description = "AZs to place nodes in. Empty lets DAX spread nodes automatically."
  type        = list(string)
  default     = []
}

variable "parameters" {
  description = "DAX parameter group parameters, e.g. record-ttl-millis and query-ttl-millis."
  type = list(object({
    name  = string
    value = string
  }))
  default = [
    { name = "record-ttl-millis", value = "300000" },
    { name = "query-ttl-millis", value = "300000" },
  ]
}

variable "maintenance_window" {
  description = "Weekly UTC maintenance window, format ddd:hh24:mi-ddd:hh24:mi (>= 60 min)."
  type        = string
  default     = "sun:05:00-sun:06:00"
}

variable "notification_topic_arn" {
  description = "Optional SNS topic ARN for DAX cluster notifications."
  type        = string
  default     = null
}

variable "description" {
  description = "Free-text description for the DAX cluster."
  type        = string
  default     = "Managed by Terraform (kloudvin terraform-module-aws-dax)."
}

variable "tags" {
  description = "Additional tags merged onto the cluster."
  type        = map(string)
  default     = {}
}
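
The maintenance_window variable above documents its format but does not validate it. One possible hardening, sketched here as a drop-in replacement for that variable block, is a regex check. It is an assumption rather than part of the published module, and it checks only the shape of the string, not the 60-minute minimum.

```hcl
variable "maintenance_window" {
  description = "Weekly UTC maintenance window, format ddd:hh24:mi-ddd:hh24:mi (>= 60 min)."
  type        = string
  default     = "sun:05:00-sun:06:00"

  validation {
    # Shape check only: lowercase three-letter day, 24-hour time, on both sides of the dash.
    condition = can(regex(
      "^(mon|tue|wed|thu|fri|sat|sun):([01][0-9]|2[0-3]):[0-5][0-9]-(mon|tue|wed|thu|fri|sat|sun):([01][0-9]|2[0-3]):[0-5][0-9]$",
      var.maintenance_window
    ))
    error_message = "maintenance_window must look like ddd:hh24:mi-ddd:hh24:mi, e.g. sun:05:00-sun:06:00."
  }
}
```

With this in place a typo such as "sunday:05:00-sun:06:00" should fail at plan time instead of surfacing as an API error during apply.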

outputs.tf

output "cluster_arn" {
  description = "ARN of the DAX cluster."
  value       = aws_dax_cluster.this.arn
}

output "cluster_name" {
  description = "Name of the DAX cluster."
  value       = aws_dax_cluster.this.cluster_name
}

output "configuration_endpoint" {
  description = "Configuration endpoint (DNS:port) the DAX client connects to."
  value       = aws_dax_cluster.this.configuration_endpoint
}

output "cluster_address" {
  description = "DNS name of the cluster without the port appended."
  value       = aws_dax_cluster.this.cluster_address
}

output "port" {
  description = "Port used by the configuration endpoint."
  value       = aws_dax_cluster.this.port
}

output "nodes" {
  description = "List of node objects (id, address, port, availability_zone)."
  value       = aws_dax_cluster.this.nodes
}

output "iam_role_arn" {
  description = "ARN of the IAM role DAX assumes to access DynamoDB."
  value       = local.role_arn
}

output "subnet_group_name" {
  description = "Name of the DAX subnet group."
  value       = aws_dax_subnet_group.this.name
}

output "parameter_group_name" {
  description = "Name of the associated DAX parameter group."
  value       = aws_dax_parameter_group.this.name
}

How to use it

module "dax" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-dax?ref=v1.0.0"

  cluster_name       = "catalog-prod"
  node_type          = "dax.r5.large"
  replication_factor = 3 # one node per AZ for failover

  subnet_ids         = aws_db_subnet_group.private.subnet_ids
  security_group_ids = [aws_security_group.dax.id]

  # The module builds a least-privilege role scoped to exactly these tables.
  dynamodb_table_arns = [
    aws_dynamodb_table.products.arn,
    aws_dynamodb_table.categories.arn,
  ]

  # Cache hot catalog items for five minutes and query results for one minute.
  parameters = [
    { name = "record-ttl-millis", value = "300000" },
    { name = "query-ttl-millis", value = "60000" },
  ]

  maintenance_window     = "sun:05:00-sun:06:00"
  notification_topic_arn = aws_sns_topic.dax_alerts.arn

  tags = {
    Environment = "prod"
    Team        = "catalog"
    CostCenter  = "CAT-3310"
  }
}

# Downstream: hand the configuration endpoint to an ECS service so the DAX
# client connects to the cache instead of DynamoDB directly.
resource "aws_ssm_parameter" "dax_endpoint" {
  name  = "/catalog/prod/dax/endpoint"
  type  = "String"
  value = module.dax.configuration_endpoint
}

# Application tasks need their OWN permission to talk to the DAX cluster,
# separate from the role DAX uses to reach DynamoDB.
resource "aws_iam_role_policy" "app_dax_access" {
  name = "catalog-app-dax-access"
  role = aws_iam_role.catalog_task.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "dax:GetItem",
        "dax:BatchGetItem",
        "dax:Query",
        "dax:Scan",
        "dax:PutItem",
        "dax:UpdateItem",
      ]
      Resource = module.dax.cluster_arn
    }]
  })
}

Pin the module with ?ref=<tag> so a cluster never silently picks up a breaking module change: changing replication_factor adds or removes nodes, and changing the encryption settings forces Terraform to replace the cluster.
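
The module also accepts an existing role through iam_role_arn, which the example above does not exercise. A hedged sketch of that path follows. Every variable name and the cluster name "catalog-stage" are hypothetical, and the role is assumed to already trust dax.amazonaws.com and to allow the DynamoDB actions the cache needs.

```hcl
# Hypothetical inputs for this sketch; none of these names come from the module.
variable "private_subnet_ids" {
  type = list(string)
}

variable "dax_security_group_id" {
  type = string
}

variable "existing_dax_role_arn" {
  type = string
}

variable "table_arns" {
  type = list(string)
}

module "dax_stage" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-dax?ref=v1.0.0"

  cluster_name       = "catalog-stage"
  replication_factor = 1 # single node, no failover: non-production only

  subnet_ids         = var.private_subnet_ids
  security_group_ids = [var.dax_security_group_id]

  # Supplying a role makes the module skip creating its own role and policy.
  iam_role_arn = var.existing_dax_role_arn

  # Still required by the variable validation, although no policy is generated from it here.
  dynamodb_table_arns = var.table_arns
}
```

Because the module derives count from whether iam_role_arn is null, an ARN that is only known after apply may fail at plan time on some Terraform versions. Passing a literal or a value from a data source avoids that.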

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config, live/terragrunt.hcl (inherited by every module):

remote_state {
  backend  = "s3"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...s3 state bucket/container + key per path...
  }
}

2. Module config, live/prod/dax/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-dax?ref=v1.0.0"
}

inputs = {
  cluster_name        = "..."
  subnet_ids          = ["...", "..."]
  security_group_ids  = ["..."]
  dynamodb_table_arns = ["..."]
}

3. Deploy one environment, or roll out all modules together:

cd live/prod/dax && terragrunt apply        # this module only
cd live/prod && terragrunt run-all apply    # alternatively: every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
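
The root config in step 1 covers only remote state. To keep the provider in one place as well, Terragrunt's generate block can write a provider.tf into each module's working directory. A minimal sketch follows; the region is a placeholder assumption that matches the Quickstart.

```hcl
# live/terragrunt.hcl (addition): write provider.tf next to every module at run time.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt" # only overwrite files Terragrunt itself generated
  contents  = <<EOF
provider "aws" {
  region = "us-east-1"
}
EOF
}
```

The module declares no provider block of its own, so the generated file should not collide with anything in the module source.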

Inputs

| Name | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| cluster_name | string | (none) | Yes | DAX cluster name (lowercased by AWS). |
| subnet_ids | list(string) | (none) | Yes | >= 2 private subnets across >= 2 AZs. |
| security_group_ids | list(string) | (none) | Yes | Security groups for the cluster nodes. |
| dynamodb_table_arns | list(string) | (none) | Yes | Table ARNs the DAX role may access. |
| node_type | string | dax.t3.small | No | Compute/memory per node (dax.*). |
| replication_factor | number | 3 | No | Number of nodes; use >= 3 in prod. |
| iam_role_arn | string | null | No | Existing role; null creates a least-privilege role. |
| availability_zones | list(string) | [] | No | AZs for nodes; empty spreads automatically. |
| parameters | list(object) | record/query TTL 300000 ms | No | Parameter group entries (name/value). |
| maintenance_window | string | sun:05:00-sun:06:00 | No | Weekly UTC maintenance window. |
| notification_topic_arn | string | null | No | SNS topic for cluster notifications. |
| description | string | “Managed by Terraform…” | No | Free-text cluster description. |
| tags | map(string) | {} | No | Additional tags merged onto the cluster. |

Outputs

| Name | Description |
| --- | --- |
| cluster_arn | ARN of the DAX cluster. |
| cluster_name | Name of the DAX cluster. |
| configuration_endpoint | Configuration endpoint (DNS:port) for the DAX client. |
| cluster_address | DNS name of the cluster without the port. |
| port | Port used by the configuration endpoint. |
| nodes | List of node objects (id, address, port, AZ). |
| iam_role_arn | ARN of the role DAX assumes to access DynamoDB. |
| subnet_group_name | Name of the DAX subnet group. |
| parameter_group_name | Name of the associated parameter group. |

Enterprise scenario

A retail platform serves a product catalog whose items are read tens of thousands of times per second during peak sale events but updated only a few times an hour. Hot keys on a handful of best-sellers kept pushing DynamoDB into throttling and inflating on-demand read costs, so the catalog team published this module at v1.0.0 and put a three-node dax.r5.large cluster, spread across three AZs, in front of the products and categories tables. The module enforces server_side_encryption and cluster_endpoint_encryption_type = "TLS", so cached items are encrypted at rest and every client connection is TLS. The generated role can touch only those two tables and their indexes, while the application’s task role separately holds only the dax: actions it needs to reach the cluster. With a five-minute record TTL, cache hits return in microseconds, DynamoDB read costs dropped sharply, and a security review confirmed no single-node or unencrypted DAX clusters and no over-scoped DynamoDB roles across the estate.

Best practices
