Quick take — Provision AWS MemoryDB for Redis with Terraform: a reusable module wiring up the cluster, subnet group, parameter group, encryption, ACL auth, and Multi-AZ sharding for hashicorp/aws ~> 5.0. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.
Quickstart (copy-paste)
Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):
provider "aws" {
region = "us-east-1"
}
module "memorydb" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-memorydb?ref=v1.0.0"
name = "..." # Cluster name; prefix for subnet/parameter group and ACL.
subnet_ids = ["...", "..."] # Private subnet IDs across >=2 AZs.
security_group_ids = ["...", "..."] # Security groups controlling cluster access.
}
Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.
What this module is
Amazon MemoryDB for Redis is a Redis-compatible, durable in-memory database. Unlike ElastiCache for Redis, which is fundamentally a cache that can lose data on failover, MemoryDB persists every write to a Multi-AZ transactional log before acknowledging it, giving you microsecond reads, single-digit-millisecond writes, and data that survives the loss of a node or an Availability Zone. That makes it a primary database for session stores, leaderboards, feature flags, and microservice state, not just a cache in front of one.
The catch is that a production MemoryDB deployment is never a single resource. You need an aws_memorydb_subnet_group pinned to private subnets, an aws_memorydb_parameter_group to tune eviction and keyspace notifications, an aws_memorydb_acl plus aws_memorydb_user for Redis RBAC auth, encryption at rest with KMS, TLS in transit, and the right shard/replica topology for your durability and throughput targets. Wrapping all of that in one module means every cluster across your estate is encrypted, ACL-authenticated, Multi-AZ, and tagged the same way, and a team consumes it with a dozen lines of HCL instead of re-deriving the wiring each time.
When to use it
- You need a durable Redis (data survives node and AZ failure) as a system of record — session state, shopping carts, rate-limit counters, real-time leaderboards — not a throwaway cache.
- You want Redis Cluster-mode horizontal scaling (sharding across `num_shards`) with in-shard read replicas for HA, managed for you.
- You require encryption at rest (KMS), TLS in transit (on by default in MemoryDB and pinned on by this module), and Redis ACL/RBAC authentication to satisfy compliance.
- You are standardizing many clusters across dev/stage/prod and want one audited, version-pinned module rather than copy-pasted resources.
Reach for ElastiCache instead if you genuinely need a disposable cache and the lower cost that comes with not paying for the durable transaction log, or if you need Memcached, which MemoryDB does not offer.
Module structure
terraform-module-aws-memorydb/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf
versions.tf
terraform {
required_version = ">= 1.5.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
main.tf
locals {
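# MemoryDB requires the engine ACL to exist before the cluster references it.
# "open-access" is the AWS-managed default ACL; we only create a custom ACL/user
# when an auth password is supplied.
# The null check is wrapped in nonsensitive() because a comparison against a
# sensitive variable is itself marked sensitive, and that mark would otherwise
# flow into local.acl_name and the (non-sensitive) acl_name output.
create_user = nonsensitive(var.user_password != null)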
acl_name = local.create_user ? aws_memorydb_acl.this[0].name : "open-access"
}
resource "aws_memorydb_subnet_group" "this" {
name = "${var.name}-subnets"
description = "Subnet group for the ${var.name} MemoryDB cluster"
subnet_ids = var.subnet_ids
tags = var.tags
}
resource "aws_memorydb_parameter_group" "this" {
name = "${var.name}-params"
description = "Parameter group for the ${var.name} MemoryDB cluster"
family = var.parameter_group_family
dynamic "parameter" {
for_each = var.parameters
content {
name = parameter.key
value = parameter.value
}
}
tags = var.tags
lifecycle {
create_before_destroy = true
}
}
# Redis RBAC: an access string scoped user, gated behind a custom ACL.
resource "aws_memorydb_user" "this" {
count = local.create_user ? 1 : 0
user_name = "${var.name}-app"
access_string = var.user_access_string
authentication_mode {
type = "password"
passwords = [var.user_password]
}
tags = var.tags
}
resource "aws_memorydb_acl" "this" {
count = local.create_user ? 1 : 0
name = "${var.name}-acl"
user_names = [aws_memorydb_user.this[0].user_name]
tags = var.tags
lifecycle {
create_before_destroy = true
}
}
resource "aws_memorydb_cluster" "this" {
name = var.name
description = var.description
node_type = var.node_type
num_shards = var.num_shards
num_replicas_per_shard = var.num_replicas_per_shard
engine_version = var.engine_version
subnet_group_name = aws_memorydb_subnet_group.this.name
parameter_group_name = aws_memorydb_parameter_group.this.name
security_group_ids = var.security_group_ids
acl_name = local.acl_name
port = var.port
tls_enabled = true # In-transit encryption is the MemoryDB default; this module pins it on.
kms_key_arn = var.kms_key_arn
maintenance_window = var.maintenance_window
snapshot_window = var.snapshot_window
snapshot_retention_limit = var.snapshot_retention_limit
final_snapshot_name = var.final_snapshot_name
auto_minor_version_upgrade = var.auto_minor_version_upgrade
tags = var.tags
depends_on = [aws_memorydb_acl.this]
}
variables.tf
variable "name" {
description = "Name of the MemoryDB cluster. Used as the prefix for the subnet/parameter group and ACL."
type = string
validation {
condition = can(regex("^[a-z][a-z0-9-]{0,38}$", var.name))
error_message = "name must be 1-39 chars, lowercase, start with a letter, and contain only a-z, 0-9 and hyphens."
}
}
variable "description" {
description = "Human-readable description of the cluster."
type = string
default = "Managed by Terraform"
}
variable "node_type" {
description = "Compute/memory node type, e.g. db.r7g.large or db.t4g.small."
type = string
default = "db.t4g.small"
}
variable "num_shards" {
description = "Number of shards (partitions). Increase to scale data and write throughput horizontally."
type = number
default = 1
validation {
condition = var.num_shards >= 1 && var.num_shards <= 500
error_message = "num_shards must be between 1 and 500."
}
}
variable "num_replicas_per_shard" {
description = "Read replicas per shard (0-5). At least 1 is required for Multi-AZ failover."
type = number
default = 1
validation {
condition = var.num_replicas_per_shard >= 0 && var.num_replicas_per_shard <= 5
error_message = "num_replicas_per_shard must be between 0 and 5."
}
}
variable "engine_version" {
description = "Redis engine version for MemoryDB (e.g. 7.1)."
type = string
default = "7.1"
}
variable "port" {
description = "TCP port the cluster listens on."
type = number
default = 6379
}
variable "subnet_ids" {
description = "Private subnet IDs (spanning >= 2 AZs for HA) for the subnet group."
type = list(string)
validation {
condition = length(var.subnet_ids) >= 2
error_message = "Provide at least two subnet IDs in different AZs for a Multi-AZ cluster."
}
}
variable "security_group_ids" {
description = "Security group IDs controlling network access to the cluster."
type = list(string)
}
variable "kms_key_arn" {
description = "ARN of a customer-managed KMS key for encryption at rest. Null uses the AWS-owned key."
type = string
default = null
}
variable "parameter_group_family" {
description = "Parameter group family, must match the engine version (e.g. memorydb_redis7)."
type = string
default = "memorydb_redis7"
}
variable "parameters" {
description = "Map of MemoryDB parameters to set, e.g. { maxmemory-policy = \"allkeys-lru\" }."
type = map(string)
default = {}
}
variable "user_access_string" {
description = "Redis ACL access string for the app user (only used when user_password is set)."
type = string
default = "on ~* &* +@all"
}
variable "user_password" {
description = "Password for the app ACL user. When set, a custom user + ACL are created; when null, the open-access ACL is used."
type = string
default = null
sensitive = true
validation {
condition = var.user_password == null || length(var.user_password) >= 16
error_message = "user_password must be at least 16 characters when provided."
}
}
variable "maintenance_window" {
description = "Weekly maintenance window, e.g. sun:05:00-sun:06:00."
type = string
default = "sun:05:00-sun:06:00"
}
variable "snapshot_window" {
description = "Daily window for automatic snapshots, e.g. 03:00-04:00."
type = string
default = "03:00-04:00"
}
variable "snapshot_retention_limit" {
description = "Days to retain automatic snapshots (0 disables them)."
type = number
default = 7
validation {
condition = var.snapshot_retention_limit >= 0 && var.snapshot_retention_limit <= 35
error_message = "snapshot_retention_limit must be between 0 and 35 days."
}
}
variable "final_snapshot_name" {
description = "Name of the snapshot taken when the cluster is destroyed. Null skips it."
type = string
default = null
}
variable "auto_minor_version_upgrade" {
description = "Whether to apply minor engine upgrades automatically during maintenance."
type = bool
default = true
}
variable "tags" {
description = "Tags applied to all created resources."
type = map(string)
default = {}
}
outputs.tf
output "cluster_id" {
description = "The name/ID of the MemoryDB cluster."
value = aws_memorydb_cluster.this.id
}
output "cluster_arn" {
description = "ARN of the MemoryDB cluster."
value = aws_memorydb_cluster.this.arn
}
output "cluster_endpoint_address" {
description = "Cluster (configuration) endpoint hostname for client connections."
value = aws_memorydb_cluster.this.cluster_endpoint[0].address
}
output "cluster_endpoint_port" {
description = "Port of the cluster endpoint."
value = aws_memorydb_cluster.this.cluster_endpoint[0].port
}
output "shards" {
description = "Details of the shards (name, slots, node membership) in the cluster."
value = aws_memorydb_cluster.this.shards
}
output "subnet_group_name" {
description = "Name of the created subnet group."
value = aws_memorydb_subnet_group.this.name
}
output "parameter_group_name" {
description = "Name of the created parameter group."
value = aws_memorydb_parameter_group.this.name
}
output "acl_name" {
description = "Name of the ACL attached to the cluster (custom or open-access)."
value = local.acl_name
}
How to use it
module "memorydb_for_redis" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-memorydb?ref=v1.0.0"
name = "orders-sessions"
description = "Durable session + cart store for the orders service"
node_type = "db.r7g.large"
num_shards = 3
num_replicas_per_shard = 2 # primary + 2 replicas per shard => Multi-AZ
engine_version = "7.1"
subnet_ids = module.network.private_subnet_ids
security_group_ids = [aws_security_group.memorydb.id]
kms_key_arn = aws_kms_key.memorydb.arn
parameter_group_family = "memorydb_redis7"
parameters = {
"maxmemory-policy" = "volatile-lru"
"notify-keyspace-events" = "Ex" # expired-key events for TTL-driven workflows
}
user_access_string = "on ~* &* +@all -@dangerous"
user_password = var.memorydb_app_password # 16+ chars, from a secret store
snapshot_retention_limit = 14
final_snapshot_name = "orders-sessions-final"
tags = {
Environment = "prod"
Service = "orders"
ManagedBy = "terraform"
}
}
# Downstream: hand the cluster endpoint to the app as a Secrets Manager secret.
resource "aws_secretsmanager_secret_version" "redis_url" {
secret_id = aws_secretsmanager_secret.redis_url.id
secret_string = jsonencode({
host = module.memorydb_for_redis.cluster_endpoint_address
port = module.memorydb_for_redis.cluster_endpoint_port
tls = true
})
}
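The example above references a security group, a KMS key, a Secrets Manager secret, and an application password that it does not define. The following is a minimal sketch of that supporting wiring, not part of the module itself. The variables `vpc_id` and `app_security_group_id`, the secret name, and the use of `random_password` (from the hashicorp/random provider) are assumptions made for illustration; a real deployment may source the password from an existing secret store instead.

```hcl
# Hypothetical inputs for this sketch; none of these names come from the module.
variable "vpc_id" {
  type = string
}

variable "app_security_group_id" {
  description = "Security group of the application tier that talks to Redis."
  type        = string
}

# Customer-managed key passed to the module as kms_key_arn.
resource "aws_kms_key" "memorydb" {
  description         = "At-rest encryption for the orders-sessions MemoryDB cluster"
  enable_key_rotation = true
}

# Security group passed to the module as security_group_ids.
resource "aws_security_group" "memorydb" {
  name_prefix = "orders-sessions-memorydb-"
  description = "Access to the orders-sessions MemoryDB cluster"
  vpc_id      = var.vpc_id
}

# Allow only the application tier to reach the Redis port (module default 6379).
resource "aws_vpc_security_group_ingress_rule" "app_to_memorydb" {
  security_group_id            = aws_security_group.memorydb.id
  referenced_security_group_id = var.app_security_group_id
  ip_protocol                  = "tcp"
  from_port                    = 6379
  to_port                      = 6379
  description                  = "Redis traffic from the application tier"
}

# One option for the app password: generate it in Terraform.
# It satisfies the module's 16-character minimum; special characters are
# disabled here as a conservative assumption about allowed password characters.
resource "random_password" "memorydb_app" {
  length  = 32
  special = false
}

# Container for the connection details written by aws_secretsmanager_secret_version.redis_url.
resource "aws_secretsmanager_secret" "redis_url" {
  name = "orders/redis-url" # hypothetical secret name
}
```

With this in place, `user_password = random_password.memorydb_app.result` could replace `var.memorydb_app_password` in the module call. Either way the password ends up in the Terraform state, so the state backend should be encrypted and access-controlled.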
With Terragrunt
Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.
1. Root config — live/terragrunt.hcl (inherited by every module):
remote_state {
backend = "s3"
generate = { path = "backend.tf", if_exists = "overwrite" }
config = {
# ...s3 state bucket/container + key per path...
}
}
2. Module config — live/prod/memorydb/terragrunt.hcl:
include "root" {
path = find_in_parent_folders()
}
terraform {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-memorydb?ref=v1.0.0"
}
inputs = {
name = "..."
subnet_ids = ["...", "..."]
security_group_ids = ["...", "..."]
}
3. Deploy one environment, or roll out all modules together:
(cd live/prod/memorydb && terragrunt apply) # this module only
(cd live/prod && terragrunt run-all apply) # every module under live/prod
Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
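As a sketch of how cross-module dependencies can look in practice, the module config can read its subnet and security group IDs from a sibling stack through a Terragrunt `dependency` block instead of hard-coding them. The `../network` path and the output names `private_subnet_ids` and `memorydb_security_group_id` are hypothetical; they stand in for whatever your own network stack exposes.

```hcl
# live/prod/memorydb/terragrunt.hcl (variant that reads inputs from a sibling stack)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-memorydb?ref=v1.0.0"
}

# Hypothetical sibling stack at live/prod/network.
dependency "network" {
  config_path = "../network"

  # Placeholder values so validate/plan can run before the network stack is applied.
  mock_outputs = {
    private_subnet_ids         = ["subnet-00000000", "subnet-11111111"]
    memorydb_security_group_id = "sg-00000000"
  }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  name               = "orders-sessions"
  subnet_ids         = dependency.network.outputs.private_subnet_ids
  security_group_ids = [dependency.network.outputs.memorydb_security_group_id]
}
```

Because the dependency is declared, `run-all` applies the network stack before this one without any manual ordering.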
Inputs
| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| `name` | `string` | n/a | yes | Cluster name; prefix for subnet/parameter group and ACL. |
| `description` | `string` | `"Managed by Terraform"` | no | Human-readable cluster description. |
| `node_type` | `string` | `"db.t4g.small"` | no | Node compute/memory size (e.g. `db.r7g.large`). |
| `num_shards` | `number` | `1` | no | Number of shards (1-500); scales data and write throughput. |
| `num_replicas_per_shard` | `number` | `1` | no | Read replicas per shard (0-5); >=1 enables Multi-AZ failover. |
| `engine_version` | `string` | `"7.1"` | no | MemoryDB Redis engine version. |
| `port` | `number` | `6379` | no | TCP listener port. |
| `subnet_ids` | `list(string)` | n/a | yes | Private subnet IDs across >=2 AZs. |
| `security_group_ids` | `list(string)` | n/a | yes | Security groups controlling cluster access. |
| `kms_key_arn` | `string` | `null` | no | Customer-managed KMS key ARN for at-rest encryption. |
| `parameter_group_family` | `string` | `"memorydb_redis7"` | no | Parameter group family matching the engine version. |
| `parameters` | `map(string)` | `{}` | no | MemoryDB parameter overrides (e.g. `maxmemory-policy`). |
| `user_access_string` | `string` | `"on ~* &* +@all"` | no | Redis ACL access string for the app user. |
| `user_password` | `string` | `null` | no | App ACL user password (16+ chars); set to create a custom ACL/user. |
| `maintenance_window` | `string` | `"sun:05:00-sun:06:00"` | no | Weekly maintenance window. |
| `snapshot_window` | `string` | `"03:00-04:00"` | no | Daily automatic snapshot window. |
| `snapshot_retention_limit` | `number` | `7` | no | Snapshot retention in days (0-35; 0 disables). |
| `final_snapshot_name` | `string` | `null` | no | Snapshot name taken on cluster destroy. |
| `auto_minor_version_upgrade` | `bool` | `true` | no | Auto-apply minor engine upgrades in maintenance. |
| `tags` | `map(string)` | `{}` | no | Tags applied to all resources. |
Outputs
| Name | Description |
|---|---|
| `cluster_id` | The name/ID of the MemoryDB cluster. |
| `cluster_arn` | ARN of the MemoryDB cluster. |
| `cluster_endpoint_address` | Cluster (configuration) endpoint hostname for clients. |
| `cluster_endpoint_port` | Port of the cluster endpoint. |
| `shards` | Shard details (name, slots, node membership). |
| `subnet_group_name` | Name of the created subnet group. |
| `parameter_group_name` | Name of the created parameter group. |
| `acl_name` | Name of the ACL attached to the cluster (custom or open-access). |
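A short sketch of how a calling configuration might consume these outputs. It assumes the `module "memorydb"` block from the Quickstart is declared in the same configuration; the local and output names are illustrative only. The `rediss://` scheme is the TLS form of the Redis URL that most clients accept, which fits here because the module always enables TLS.

```hcl
# Assumes module "memorydb" from the Quickstart exists in this configuration.
locals {
  # TLS connection URL built from the endpoint outputs.
  redis_url = "rediss://${module.memorydb.cluster_endpoint_address}:${module.memorydb.cluster_endpoint_port}"

  # Flatten shard -> node membership to list the AZs the nodes landed in.
  memorydb_node_azs = distinct(flatten([
    for shard in module.memorydb.shards : [
      for node in shard.nodes : node.availability_zone
    ]
  ]))
}

output "redis_url" {
  description = "TLS connection URL for the cluster endpoint (no credentials included)."
  value       = local.redis_url
}

output "memorydb_node_azs" {
  description = "Availability Zones that host at least one cluster node."
  value       = local.memorydb_node_azs
}
```

Checking `memorydb_node_azs` after an apply is a quick way to confirm the cluster really spans more than one AZ.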
Enterprise scenario
A retail platform moves its checkout session and shopping-cart state off a self-managed Redis cluster that lost carts during every AZ event. They deploy this module with num_shards = 3 and num_replicas_per_shard = 2 on db.r7g.large nodes, a customer-managed KMS key for PCI scope, and a scoped ACL user (-@dangerous) so the app can never run FLUSHALL. Because MemoryDB durably logs every write across multiple AZs, an AZ failure now triggers an automatic failover to a replica within seconds and with zero lost carts, and the 14-day snapshot retention gives the auditors the daily restore points they require. Note that snapshots are daily recovery points, not continuous point-in-time recovery.
Best practices
- Always run with replicas for HA. Set `num_replicas_per_shard >= 1` so each shard spans multiple AZs; a 0-replica cluster has no automatic failover and defeats the point of choosing a durable store.
- Use customer-managed KMS keys and ACL auth in production. Pass a `kms_key_arn` for at-rest encryption you control, and supply a `user_password` so the cluster uses a scoped RBAC user instead of the `open-access` ACL. Never expose a MemoryDB cluster with open access on a routable network.
- Tune `maxmemory-policy` to the workload. Use `noeviction` when MemoryDB is your system of record (you want writes to fail rather than silently drop data), or `volatile-lru` / `allkeys-lru` only when it behaves like a cache and keys carry TTLs.
- Right-size with Graviton and scale by sharding. Prefer `r7g` / `t4g` (Graviton) node types for better price-performance, and scale write throughput and dataset size by increasing `num_shards` rather than jumping to an oversized single node.
- Keep snapshots and a final snapshot. A `snapshot_retention_limit` of 7-35 days plus a `final_snapshot_name` protects against an accidental `terraform destroy` and gives you daily restore points without manual backup jobs.
- Name and tag consistently. Drive `name` from `service-purpose` (e.g. `orders-sessions`) so the cluster, subnet group, parameter group, and ACL are instantly traceable, and tag every cluster with `Environment` / `Service` / `ManagedBy` for cost allocation and ownership.
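Taken together, the practices above might translate into a module call like the following sketch for a system-of-record cluster. The cluster name, the shard count, and the four input variables are hypothetical and only there to keep the example self-contained.

```hcl
# Hypothetical inputs for this sketch.
variable "private_subnet_ids" {
  type = list(string)
}

variable "memorydb_security_group_id" {
  type = string
}

variable "memorydb_kms_key_arn" {
  type = string
}

variable "memorydb_app_password" {
  type      = string
  sensitive = true
}

module "orders_ledger" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-memorydb?ref=v1.0.0"

  name      = "orders-ledger" # service-purpose naming
  node_type = "db.r7g.large"  # Graviton node type

  num_shards             = 2 # scale out by adding shards, not by oversizing nodes
  num_replicas_per_shard = 1 # at least one replica so failover is automatic

  subnet_ids         = var.private_subnet_ids
  security_group_ids = [var.memorydb_security_group_id]

  kms_key_arn   = var.memorydb_kms_key_arn  # customer-managed key
  user_password = var.memorydb_app_password # creates the custom ACL and user

  parameters = {
    "maxmemory-policy" = "noeviction" # fail writes rather than evict records
  }

  snapshot_retention_limit = 14
  final_snapshot_name      = "orders-ledger-final"

  tags = {
    Environment = "prod"
    Service     = "orders"
    ManagedBy   = "terraform"
  }
}
```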