Quick take — A reusable hashicorp/google ~> 5.0 module for google_alloydb_cluster and google_alloydb_instance: PSA-only networking, a regional HA primary, continuous backup for PITR, an optional read-pool, and CMEK. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.
Quickstart (copy-paste)
Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):
provider "google" {
project = "my-project"
region = "us-central1"
}
module "alloydb" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-alloydb?ref=v1.0.0"
project_id = "..." # Project ID that hosts the AlloyDB cluster.
region = "..." # Region for the cluster and its backups.
cluster_id = "..." # Cluster ID; also the prefix for instance IDs (2–63 char…
network = "..." # VPC self-link for Private Service Access.
initial_user_password = "..." # Superuser password (sensitive; source from Secret Manag…
}
Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.
What this module is
AlloyDB is GCP’s fully managed, PostgreSQL-compatible database built for demanding transactional and hybrid (HTAP) workloads. Unlike Cloud SQL, AlloyDB splits the control surface into two distinct resources: a cluster (google_alloydb_cluster) that owns the storage layer, the network attachment, backups, encryption, and the database version, and one or more instances (google_alloydb_instance) that provide the compute that serves queries. A cluster is useless without at least one PRIMARY instance; you then add READ_POOL instances for horizontal read scaling against the same underlying storage, with node counts of 1–20.
The architectural details that make AlloyDB different from a stock Postgres also make it easy to misconfigure. AlloyDB instances are private by default: they are reached over a VPC through Private Service Access (Service Networking peering) or a Private Service Connect endpoint, and a public IP is an explicit per-instance opt-in that this module never sets. That makes network_config.network (or psc_config) mandatory, not optional. A production cluster wants a regional HA primary (availability_type = "REGIONAL") for an automatic standby, continuous backup enabled so you get point-in-time recovery down to the second within the recovery window (this is separate from scheduled automated_backup_policy snapshots), deletion protection, and CMEK if your compliance baseline forbids Google-managed keys. The initial superuser password is set on the cluster, so it belongs in Secret Manager, never as a literal in HCL.
This module wraps the cluster, its primary, and an optional read pool into one opinionated, variable-driven block. It defaults to PSA-only networking with a REGIONAL primary, continuous backup with a configurable recovery window, scheduled automated backups, and deletion protection on. It optionally enables CMEK and creates a read-pool instance with a chosen node count, then emits the cluster name, primary connection IP, and read-pool IP as outputs so a GKE workload, a Cloud Run service, or a Secret Manager secret can consume them.
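The module's main.tf does not create the Private Service Access peering itself; it expects the VPC passed as network to already have one. A minimal sketch of that prerequisite follows. The project, VPC path, resource names, and the /16 range size are all assumptions for illustration; adjust them to your own network plan.

```hcl
# Hypothetical values throughout: project, VPC path, range name, and size.
resource "google_compute_global_address" "psa_range" {
  project       = "my-project"
  name          = "alloydb-psa-range"
  purpose       = "VPC_PEERING"
  address_type  = "INTERNAL"
  prefix_length = 16 # assumed size; pick one that fits your IP plan
  network       = "projects/my-project/global/networks/core-vpc"
}

# Peers the VPC with Google's service producer network using the range above.
resource "google_service_networking_connection" "psa" {
  network                 = "projects/my-project/global/networks/core-vpc"
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.psa_range.name]
}
```

If this lives in the same configuration as the module call, adding depends_on = [google_service_networking_connection.psa] to the module block is one way to make sure the peering exists before the cluster is created.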
When to use it
- You need a managed, PostgreSQL-compatible database with better price-performance than stock Postgres for transactional or mixed HTAP workloads, and you want one reviewed, hardened shape instead of bespoke cluster/instance blocks per team.
- Security baselines require no public surface: AlloyDB is private by default, reachable over a VPC via Private Service Access or PSC, and this module never enables the opt-in public IP, so private-only is the enforced default.
- You want regional high availability (an automatic standby in a second zone) and continuous backup with point-in-time recovery to be the default for production, alongside scheduled snapshot backups, rather than toggles someone remembers later.
- You need to scale reads horizontally with a read pool (1–20 nodes) that shares the primary’s storage with low replication lag, provisioned from the same module call.
- Compliance requires customer-managed encryption keys (CMEK) on both the cluster storage and its backups, wired to a Cloud KMS key.
Reach for Cloud SQL instead when you want SQL Server/MySQL or the lowest-cost small Postgres instance, Spanner when you need horizontal write scaling beyond a single primary and global consistency, or BigQuery when the workload is purely analytical rather than transactional.
Module structure
terraform-module-gcp-alloydb/
├── versions.tf # provider + required_version pins
├── main.tf # cluster, primary instance, optional read pool
├── variables.tf # var-driven inputs with validation
└── outputs.tf # cluster/instance ids, names, connection IPs
versions.tf
terraform {
required_version = ">= 1.5.0"
required_providers {
google = {
source = "hashicorp/google"
version = "~> 5.0"
}
}
}
main.tf
locals {
# Use a CMEK block only when a KMS key is supplied.
cmek = var.kms_key_name == null ? [] : [1]
}
resource "google_alloydb_cluster" "this" {
project = var.project_id
cluster_id = var.cluster_id
location = var.region
# This module keeps the cluster private: it is attached to a VPC via
# Private Service Access (Service Networking peering).
network_config {
network = var.network
}
database_version = var.database_version
# Initial superuser. The password should come from Secret Manager /
# a sensitive variable — never a literal committed to HCL.
initial_user {
user = var.initial_user
password = var.initial_user_password
}
# Continuous backup gives point-in-time recovery to the second within
# the recovery window. This is distinct from scheduled snapshots below.
continuous_backup_config {
enabled = true
recovery_window_days = var.continuous_backup_recovery_window_days
dynamic "encryption_config" {
for_each = local.cmek
content {
kms_key_name = var.kms_key_name
}
}
}
# Scheduled snapshot backups, retained by count.
automated_backup_policy {
location = var.region
backup_window = "3600s"
enabled = var.automated_backup_enabled
weekly_schedule {
days_of_week = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY", "FRIDAY", "SATURDAY", "SUNDAY"]
start_times {
hours = var.backup_start_hour
minutes = 0
seconds = 0
nanos = 0
}
}
quantity_based_retention {
count = var.automated_backup_retention_count
}
dynamic "encryption_config" {
for_each = local.cmek
content {
kms_key_name = var.kms_key_name
}
}
}
# CMEK for the cluster's primary storage.
dynamic "encryption_config" {
for_each = local.cmek
content {
kms_key_name = var.kms_key_name
}
}
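# deletion_policy controls force-deletion, not a hard lock: "FORCE" deletes the
# cluster together with any instances still in it, "DEFAULT" refuses while
# instances remain.
deletion_policy = var.deletion_protection ? "DEFAULT" : "FORCE"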
labels = var.labels
lifecycle {
# Ignore later changes to the password input so a rotation done outside
# Terraform does not show up as a diff (Terraform will not push new values either).
ignore_changes = [initial_user[0].password]
}
}
# The PRIMARY instance: the compute that serves reads and writes.
resource "google_alloydb_instance" "primary" {
cluster = google_alloydb_cluster.this.name
instance_id = "${var.cluster_id}-primary"
instance_type = "PRIMARY"
# REGIONAL gives an automatic standby in a second zone (HA);
# ZONAL is single-zone and cheaper for non-prod.
availability_type = var.availability_type
machine_config {
cpu_count = var.primary_cpu_count
}
database_flags = var.database_flags
labels = var.labels
}
# Optional READ_POOL for horizontal read scaling against the same storage.
resource "google_alloydb_instance" "read_pool" {
count = var.read_pool_node_count > 0 ? 1 : 0
cluster = google_alloydb_cluster.this.name
instance_id = "${var.cluster_id}-read-pool"
instance_type = "READ_POOL"
read_pool_config {
node_count = var.read_pool_node_count
}
machine_config {
cpu_count = var.read_pool_cpu_count
}
labels = var.labels
# The primary must exist first so the cluster is fully initialised.
depends_on = [google_alloydb_instance.primary]
}
variables.tf
variable "project_id" {
description = "Project ID that hosts the AlloyDB cluster."
type = string
}
variable "region" {
description = "Region for the cluster and its backups (e.g. asia-south1)."
type = string
}
variable "cluster_id" {
description = "Cluster ID. Also used as the prefix for instance IDs."
type = string
validation {
condition = can(regex("^[a-z][a-z0-9-]{0,61}[a-z0-9]$", var.cluster_id))
error_message = "cluster_id must be 2-63 chars, lowercase letters, digits, or hyphens, and start with a letter."
}
}
variable "network" {
description = "Self-link of the VPC for Private Service Access, e.g. projects/PROJECT/global/networks/NETWORK."
type = string
}
variable "database_version" {
description = "PostgreSQL major version for the cluster."
type = string
default = "POSTGRES_15"
validation {
condition = contains(["POSTGRES_14", "POSTGRES_15", "POSTGRES_16"], var.database_version)
error_message = "database_version must be one of POSTGRES_14, POSTGRES_15, or POSTGRES_16."
}
}
variable "initial_user" {
description = "Name of the initial superuser created on the cluster."
type = string
default = "postgres"
}
variable "initial_user_password" {
description = "Password for the initial superuser. Source from Secret Manager / a sensitive var, not a literal."
type = string
sensitive = true
}
variable "availability_type" {
description = "Primary availability: REGIONAL (HA, automatic standby) or ZONAL (single zone)."
type = string
default = "REGIONAL"
validation {
condition = contains(["REGIONAL", "ZONAL"], var.availability_type)
error_message = "availability_type must be REGIONAL or ZONAL."
}
}
variable "primary_cpu_count" {
description = "vCPU count for the primary instance. AlloyDB requires 2, 4, 8, 16, 32, 64, or 96."
type = number
default = 4
validation {
condition = contains([2, 4, 8, 16, 32, 64, 96], var.primary_cpu_count)
error_message = "primary_cpu_count must be one of 2, 4, 8, 16, 32, 64, or 96."
}
}
variable "read_pool_node_count" {
description = "Number of nodes in the read pool (1-20). Set to 0 to create no read pool."
type = number
default = 0
validation {
condition = var.read_pool_node_count >= 0 && var.read_pool_node_count <= 20
error_message = "read_pool_node_count must be between 0 and 20."
}
}
variable "read_pool_cpu_count" {
description = "vCPU count per read-pool node."
type = number
default = 4
validation {
condition = contains([2, 4, 8, 16, 32, 64, 96], var.read_pool_cpu_count)
error_message = "read_pool_cpu_count must be one of 2, 4, 8, 16, 32, 64, or 96."
}
}
variable "continuous_backup_recovery_window_days" {
description = "Days of continuous backup retained for point-in-time recovery (1-35)."
type = number
default = 14
validation {
condition = var.continuous_backup_recovery_window_days >= 1 && var.continuous_backup_recovery_window_days <= 35
error_message = "continuous_backup_recovery_window_days must be between 1 and 35."
}
}
variable "automated_backup_enabled" {
description = "Whether scheduled (snapshot) automated backups are enabled."
type = bool
default = true
}
variable "backup_start_hour" {
description = "Hour of day (UTC, 0-23) for the scheduled backup window to start."
type = number
default = 18
validation {
condition = var.backup_start_hour >= 0 && var.backup_start_hour <= 23
error_message = "backup_start_hour must be between 0 and 23."
}
}
variable "automated_backup_retention_count" {
description = "Number of scheduled automated backups to retain."
type = number
default = 14
}
variable "database_flags" {
description = "Map of PostgreSQL flags applied to the primary, e.g. { \"max_connections\" = \"200\" }."
type = map(string)
default = {}
}
variable "kms_key_name" {
description = "Cloud KMS key for CMEK on cluster storage and backups. Null uses Google-managed keys."
type = string
default = null
}
variable "deletion_protection" {
description = "When true, deletion_policy is DEFAULT and the API refuses to delete a cluster that still contains instances; when false it is FORCE."
type = bool
default = true
}
variable "labels" {
description = "Labels applied to the cluster and instances."
type = map(string)
default = {}
}
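Two inputs above, automated_backup_retention_count and kms_key_name, have no validation block. If you want them to fail at plan time like the other inputs, a sketch of stricter versions could look like the following. These would replace the corresponding blocks rather than sit alongside them. The lower bound of 1 and the key-path pattern are assumptions, not limits taken from the module.

```hcl
variable "automated_backup_retention_count" {
  description = "Number of scheduled automated backups to retain."
  type        = number
  default     = 14

  validation {
    # Assumed rule: a whole number of at least 1.
    condition     = var.automated_backup_retention_count >= 1 && floor(var.automated_backup_retention_count) == var.automated_backup_retention_count
    error_message = "automated_backup_retention_count must be a whole number of at least 1."
  }
}

variable "kms_key_name" {
  description = "Cloud KMS key for CMEK on cluster storage and backups. Null uses Google-managed keys."
  type        = string
  default     = null

  validation {
    # Null is allowed; otherwise expect a full Cloud KMS crypto key path.
    condition     = var.kms_key_name == null ? true : can(regex("^projects/[^/]+/locations/[^/]+/keyRings/[^/]+/cryptoKeys/[^/]+$", var.kms_key_name))
    error_message = "kms_key_name must look like projects/PROJECT/locations/LOCATION/keyRings/RING/cryptoKeys/KEY."
  }
}
```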
outputs.tf
output "cluster_id" {
description = "Fully qualified AlloyDB cluster resource ID."
value = google_alloydb_cluster.this.id
}
output "cluster_name" {
description = "Cluster name (projects/.../locations/.../clusters/...)."
value = google_alloydb_cluster.this.name
}
output "primary_instance_id" {
description = "Fully qualified ID of the primary instance."
value = google_alloydb_instance.primary.id
}
output "primary_ip_address" {
description = "Private IP address clients use to connect to the primary."
value = google_alloydb_instance.primary.ip_address
}
output "read_pool_ip_address" {
description = "Private IP address of the read pool, or null when no read pool is created."
value = try(google_alloydb_instance.read_pool[0].ip_address, null)
}
output "database_version" {
description = "PostgreSQL version actually running on the cluster."
value = google_alloydb_cluster.this.database_version
}
How to use it
module "alloydb" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-alloydb?ref=v1.0.0"
project_id = "kv-payments-prod"
region = "asia-south1"
cluster_id = "payments-core"
# VPC already configured for Private Service Access (Service Networking peering).
network = "projects/kv-payments-prod/global/networks/core-vpc"
database_version = "POSTGRES_16"
# Superuser password pulled from Secret Manager, never inlined.
initial_user = "postgres"
initial_user_password = data.google_secret_manager_secret_version.alloydb_pw.secret_data
# HA primary plus a 3-node read pool for reporting/read traffic.
availability_type = "REGIONAL"
primary_cpu_count = 8
read_pool_node_count = 3
read_pool_cpu_count = 4
# 30-day PITR window and CMEK for a regulated workload.
continuous_backup_recovery_window_days = 30
kms_key_name = "projects/kv-payments-prod/locations/asia-south1/keyRings/db/cryptoKeys/alloydb"
database_flags = {
"max_connections" = "400"
"alloydb.enable_pg_cron" = "on"
}
deletion_protection = true
labels = {
team = "payments"
environment = "prod"
}
}
data "google_secret_manager_secret_version" "alloydb_pw" {
secret = "alloydb-payments-core-superuser"
}
# Downstream: publish the primary's private IP to the app's runtime config
# so a GKE workload connects over the VPC with no public exposure.
resource "google_secret_manager_secret_version" "db_host" {
secret = "payments-core-db-host"
secret_data = module.alloydb.primary_ip_address
}
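The example above reads the superuser password from a secret that already exists. If nothing has created that secret yet, one possible bootstrap is sketched below, ideally in a separate configuration. It assumes the hashicorp/random provider is available, and the resource names are illustrative. Note that the generated value is written to that configuration's Terraform state.

```hcl
# Assumes the hashicorp/random provider is configured in this root module.
resource "random_password" "alloydb_superuser" {
  length  = 32
  special = true
}

resource "google_secret_manager_secret" "alloydb_pw" {
  project   = "kv-payments-prod"
  secret_id = "alloydb-payments-core-superuser"

  replication {
    auto {}
  }
}

# The secret value ends up in state, so keep this state bucket locked down.
resource "google_secret_manager_secret_version" "alloydb_pw" {
  secret      = google_secret_manager_secret.alloydb_pw.id
  secret_data = random_password.alloydb_superuser.result
}
```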
With Terragrunt
Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.
1. Root config — live/terragrunt.hcl (inherited by every module):
remote_state {
backend = "gcs"
generate = { path = "backend.tf", if_exists = "overwrite" }
config = {
# ...GCS state bucket + prefix per path...
}
}
2. Module config — live/prod/alloydb/terragrunt.hcl:
include "root" {
path = find_in_parent_folders()
}
terraform {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-alloydb?ref=v1.0.0"
}
inputs = {
project_id = "..."
region = "..."
cluster_id = "..."
network = "..."
initial_user_password = "..."
}
3. Deploy one environment, or roll out all modules together:
(cd live/prod/alloydb && terragrunt apply) # this module
(cd live/prod && terragrunt run-all apply) # every module under live/prod
Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
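To make the per-environment override concrete, here is a sketch of what a dev counterpart to the prod config might look like. The path live/dev/alloydb/terragrunt.hcl and the specific values are assumptions; the input names all come from the module's variables.tf.

```hcl
# live/dev/alloydb/terragrunt.hcl (hypothetical path)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-alloydb?ref=v1.0.0"
}

inputs = {
  project_id            = "..."
  region                = "..."
  cluster_id            = "..."
  network               = "..."
  initial_user_password = "..."

  # Non-prod overrides: single zone, smaller primary, shorter PITR window.
  availability_type                      = "ZONAL"
  primary_cpu_count                      = 2
  continuous_backup_recovery_window_days = 7
  deletion_protection                    = false
}
```

Only the last four inputs differ from the module defaults, which is the point: the module source and required inputs stay identical across environments.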
Inputs
| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| project_id | string | — | Yes | Project ID that hosts the AlloyDB cluster. |
| region | string | — | Yes | Region for the cluster and its backups. |
| cluster_id | string | — | Yes | Cluster ID; also the prefix for instance IDs (2–63 chars, validated). |
| network | string | — | Yes | VPC self-link for Private Service Access. |
| database_version | string | POSTGRES_15 | No | PostgreSQL major version (14/15/16, validated). |
| initial_user | string | postgres | No | Name of the initial superuser. |
| initial_user_password | string | — | Yes | Superuser password (sensitive; source from Secret Manager). |
| availability_type | string | REGIONAL | No | REGIONAL (HA) or ZONAL for the primary. |
| primary_cpu_count | number | 4 | No | vCPUs for the primary (2/4/8/16/32/64/96, validated). |
| read_pool_node_count | number | 0 | No | Read-pool node count (0–20); 0 creates no pool. |
| read_pool_cpu_count | number | 4 | No | vCPUs per read-pool node (validated). |
| continuous_backup_recovery_window_days | number | 14 | No | PITR window in days (1–35). |
| automated_backup_enabled | bool | true | No | Enable scheduled snapshot backups. |
| backup_start_hour | number | 18 | No | UTC hour (0–23) the backup window starts. |
| automated_backup_retention_count | number | 14 | No | Number of scheduled backups to retain. |
| database_flags | map(string) | {} | No | PostgreSQL flags applied to the primary. |
| kms_key_name | string | null | No | Cloud KMS key for CMEK; null uses Google-managed keys. |
| deletion_protection | bool | true | No | true sets deletion_policy to DEFAULT (no force-delete while instances remain); false sets FORCE. |
| labels | map(string) | {} | No | Labels applied to the cluster and instances. |
Outputs
| Name | Description |
|---|---|
| cluster_id | Fully qualified AlloyDB cluster resource ID. |
| cluster_name | Cluster name (projects/.../locations/.../clusters/...). |
| primary_instance_id | Fully qualified ID of the primary instance. |
| primary_ip_address | Private IP clients use to connect to the primary. |
| read_pool_ip_address | Private IP of the read pool, or null when none is created. |
| database_version | PostgreSQL version actually running on the cluster. |
Enterprise scenario
A payments platform runs its core ledger on a single REGIONAL AlloyDB primary in asia-south1 for low-latency, strongly consistent writes, while the finance and analytics teams hammer a 3-node read pool for end-of-day reconciliation reports without ever touching the write path. Continuous backup is set to a 30-day recovery window so the platform can satisfy an auditor’s “restore the ledger to 14:32 on the 3rd” request to the second, and CMEK ties both cluster storage and every backup to a Cloud KMS key the security team controls. Because the module attaches the cluster over PSA only and never enables a public IP, the database exposes no public endpoint to scan, and the DEFAULT deletion policy stops the cluster from being force-deleted while it still holds instances.
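The module itself does not perform restores, but the auditor request above can be served with a separate cluster restored from continuous backup. The following is a rough sketch, assuming the google provider's restore_continuous_backup_source block. The restored cluster ID and the timestamp (including its year, month, and time zone) are hypothetical.

```hcl
# A new cluster restored to a point in time from the module's cluster.
resource "google_alloydb_cluster" "ledger_restore" {
  project    = "kv-payments-prod"
  cluster_id = "payments-core-restore" # hypothetical name
  location   = "asia-south1"

  network_config {
    network = "projects/kv-payments-prod/global/networks/core-vpc"
  }

  restore_continuous_backup_source {
    cluster       = module.alloydb.cluster_name
    point_in_time = "2024-06-03T14:32:00Z" # hypothetical RFC 3339 timestamp
  }
}
```

A restored cluster still needs its own PRIMARY instance before anyone can query it, and a CMEK-protected source may need a matching encryption_config on the restored cluster. Treat this as a starting point, not a tested runbook.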
Best practices
- Keep the superuser password out of HCL, and treat state as sensitive. Source initial_user_password from Secret Manager and keep the ignore_changes = [initial_user[0].password] lifecycle rule so a rotated password doesn’t surface as a perpetual diff. Terraform still records the value it applied in state, so encrypt and restrict access to the state bucket. Prefer AlloyDB IAM database authentication for application identities over the static superuser.
- Run REGIONAL for prod, ZONAL for everything else. The automatic standby is what gives you the HA SLA; downgrading non-prod clusters to ZONAL and a smaller primary_cpu_count is the single biggest AlloyDB cost lever.
- Treat continuous backup and scheduled backups as different tools. Continuous backup (recovery_window_days) is your PITR safety net; the automated_backup_policy snapshots are your retained restore points. Size the recovery window to your real RPO/audit requirement, because every extra day costs storage.
- Right-size the read pool, don’t over-provision the primary. Offload reporting and read-heavy traffic to READ_POOL nodes (scale 1–20) rather than buying a bigger primary; the pool shares the primary’s storage with low lag and is cheaper to grow and shrink.
- Enforce CMEK and private networking as policy. Set kms_key_name for any regulated workload so storage and backups use a key you control, and rely on the module’s PSA-only network_config: with no public IP enabled, the VPC peering and firewall posture are your entire perimeter.
- Name and label consistently. Derive instance IDs from cluster_id (the module does this) and apply team/environment labels so cost, backups, and IAM all line up per workload across projects.
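As a pointer for the IAM database authentication recommendation above, here is a minimal sketch of an IAM-backed database user created next to the module. It assumes the provider's google_alloydb_user resource. The service account name is hypothetical, and the user_id format for service accounts (the email without the .gserviceaccount.com suffix) is worth verifying against the provider docs for your version.

```hcl
# IAM-authenticated database user for an application service account.
resource "google_alloydb_user" "payments_app" {
  cluster        = module.alloydb.cluster_name
  user_id        = "payments-app@kv-payments-prod.iam" # hypothetical service account
  user_type      = "ALLOYDB_IAM_USER"
  database_roles = ["alloydbiamuser"]

  # Users can only be created once the primary instance is up.
  depends_on = [module.alloydb]
}
```

For this to work end to end, the primary typically also needs the alloydb.iam_authentication flag set to "on" through database_flags, and the service account needs the relevant AlloyDB IAM roles. Neither is configured by the module as written.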