Terraform Module: GCP Cloud Spanner — Globally Consistent SQL with Autoscaling in One Module

Quick take — A reusable hashicorp/google Terraform module for Cloud Spanner: provision regional or multi-region instances, processing-unit autoscaling, databases with version retention, and deletion protection from typed variables. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "google" {
  project = "my-project"
  region  = "us-central1"
}

module "spanner" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-spanner?ref=v1.0.0"

  project_id    = "..."  # GCP project ID that owns the instance.
  instance_name = "..."  # Instance ID, 6–30 chars, lowercase alphanumeric/hyphen,…
  config        = "..."  # Region (auto-prefixed `regional-`) or full/multi-region…
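
  # Capacity: the provider expects exactly one of processing_units, num_nodes, or autoscaling.
  processing_units = 100
}

Then run terraform init && terraform apply. Capacity has no default, so keep one capacity input set; every other input has a sensible default (see Inputs below to override behaviour).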

What this module is

Cloud Spanner is Google Cloud’s fully managed, horizontally scalable relational database. It is the rare system that gives you externally consistent (TrueTime-backed) ACID transactions and ANSI SQL while sharding and replicating your data across zones, or across continents in a multi-region configuration. The published availability SLA is 99.999% for multi-region configurations and 99.99% for regional ones. You pay for compute in processing units (1000 PU = 1 node) plus storage rather than for a fixed VM size, so capacity planning works differently from Cloud SQL or AlloyDB.

That billing and capacity model is exactly why you want a module. A raw google_spanner_instance forces every team to make the same easily-botched decisions: pick a config (a regional-* versus nam-eur-asia1 multi-region name — get the prefix wrong and the apply fails), choose between fixed num_nodes and processing_units (mixing the two is a conflict error), decide whether to attach an autoscaler, set per-database version_retention_period and deletion_protection, and wire the database ddl and database_dialect. This module wraps google_spanner_instance, an optional google_spanner_instance_iam_member grant, and a set of google_spanner_database resources behind typed, validated variables so a consuming team passes intent — “multi-region, autoscale 1–10 nodes of headroom, two GoogleSQL databases, protected” — and gets a correct instance every time.

When to use it

You need a relational database that stays strongly consistent while scaling writes beyond what a single primary (Cloud SQL / AlloyDB primary) can serve.
You are building globally-distributed services (payments ledger, inventory, gaming, ad-serving) that require multi-region reads/writes with bounded staleness or strong reads.
You want capacity to track load automatically via the managed autoscaler instead of paging an engineer to add nodes.
You are standardizing Spanner across many teams and need consistent deletion protection, retention windows, and IAM baked in.
Reach for Cloud SQL or AlloyDB instead when a single regional Postgres/MySQL primary is sufficient and Spanner’s minimum spend (a 100-PU instance runs continuously) is not justified.

Module structure

terraform-module-gcp-spanner/
├── versions.tf      # provider + Terraform version pins
├── main.tf          # spanner_instance + autoscaler + databases + IAM
├── variables.tf     # typed, validated inputs
└── outputs.tf       # instance id/name + database ids

versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
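      # main.tf sets edition and default_backup_schedule_type, which appear to
      # require a 6.x provider release; confirm the minimum in the provider changelog.
      version = "~> 6.0"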
    }
  }
}

main.tf

locals {
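  # Spanner config names are like "regional-us-central1" or "nam-eur-asia1".
  # If the caller passes a bare region ("us-central1") for a regional instance,
  # normalise it to the "regional-" form. Multi-region names end the continent
  # token with a digit (nam3, eur5, asia1), so requiring that digit keeps bare
  # regions such as "europe-west1" or "asia-east1" from being mistaken for them.
  spanner_config = (
    length(regexall("^(regional-|nam-eur-asia[0-9]|(nam|eur|asia)[0-9])", var.config)) > 0
    ? var.config
    : "regional-${var.config}"
  )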

  use_autoscaling = var.autoscaling != null
}

resource "google_spanner_instance" "this" {
  project      = var.project_id
  name         = var.instance_name
  display_name = coalesce(var.display_name, var.instance_name)
  config       = local.spanner_config

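  # The provider expects exactly one of these three capacity modes:
  #  - autoscaling_config (managed autoscaler), or
  #  - processing_units (fixed PU), or
  #  - num_nodes (fixed nodes).
  # When autoscaling is set, both fixed values are nulled out below; callers
  # must still avoid setting num_nodes and processing_units together.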
  num_nodes        = local.use_autoscaling ? null : var.num_nodes
  processing_units = local.use_autoscaling ? null : var.processing_units

  edition                      = var.edition
  default_backup_schedule_type = var.default_backup_schedule_type

  force_destroy = var.force_destroy
  labels        = var.labels

  dynamic "autoscaling_config" {
    for_each = local.use_autoscaling ? [var.autoscaling] : []
    content {
      autoscaling_limits {
        min_processing_units = autoscaling_config.value.min_processing_units
        max_processing_units = autoscaling_config.value.max_processing_units
      }
      autoscaling_targets {
        high_priority_cpu_utilization_percent = autoscaling_config.value.target_high_priority_cpu_utilization_percent
        storage_utilization_percent           = autoscaling_config.value.target_storage_utilization_percent
      }
    }
  }
}

resource "google_spanner_database" "this" {
  for_each = var.databases

  project  = var.project_id
  instance = google_spanner_instance.this.name
  name     = each.key

  database_dialect         = each.value.database_dialect
  version_retention_period = each.value.version_retention_period
  ddl                      = each.value.ddl

  # Per-database deletion protection: while true, Terraform refuses to destroy
  # the database. Set per database so apps can be protected even in dev instances.
  deletion_protection = each.value.deletion_protection

  dynamic "encryption_config" {
    for_each = each.value.kms_key_name == null ? [] : [each.value.kms_key_name]
    content {
      kms_key_name = encryption_config.value
    }
  }
}

# Optional instance-level IAM grant (e.g. give a service account
# databaseUser on every database in this instance).
resource "google_spanner_instance_iam_member" "this" {
  for_each = var.iam_members

  project  = var.project_id
  instance = google_spanner_instance.this.name
  role     = each.value.role
  member   = each.value.member
}
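
The exactly-one-capacity-mode rule described in the main.tf comments could also be enforced inside the module, so a bad combination fails at plan time with a readable message. The following is a minimal sketch, assuming Terraform 1.2 or later (where precondition blocks are available). The local name capacity_modes_set is hypothetical and not part of the module as published. Note that this is stricter than the current behaviour, where fixed values are silently ignored once autoscaling is set.

```hcl
locals {
  # Count how many of the three capacity inputs the caller actually set.
  # capacity_modes_set is an illustrative name, not an existing local.
  capacity_modes_set = (
    (var.num_nodes != null ? 1 : 0) +
    (var.processing_units != null ? 1 : 0) +
    (var.autoscaling != null ? 1 : 0)
  )
}

resource "google_spanner_instance" "this" {
  # ... existing arguments from main.tf stay unchanged ...

  lifecycle {
    precondition {
      condition     = local.capacity_modes_set == 1
      error_message = "Set exactly one of num_nodes, processing_units, or autoscaling."
    }
  }
}
```

With this in place, a caller who passes both num_nodes and processing_units, or none of the three, gets the module's own error message rather than a provider-level conflict.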

variables.tf

variable "project_id" {
  type        = string
  description = "GCP project ID that owns the Spanner instance."
}

variable "instance_name" {
  type        = string
  description = "Instance ID (lowercase letters, numbers, hyphens; 6-30 chars)."

  validation {
    condition     = can(regex("^[a-z][a-z0-9-]{4,28}[a-z0-9]$", var.instance_name))
    error_message = "instance_name must be 6-30 chars, lowercase alphanumeric or hyphen, starting with a letter."
  }
}

variable "display_name" {
  type        = string
  default     = null
  description = "Human-readable name shown in the console. Defaults to instance_name."
}

variable "config" {
  type        = string
  description = "Instance config. A region like 'us-central1' (auto-prefixed to 'regional-us-central1') or a full config name such as 'regional-europe-west1' or a multi-region name like 'nam-eur-asia1'."
}

variable "edition" {
  type        = string
  default     = "STANDARD"
  description = "Spanner edition: STANDARD, ENTERPRISE, or ENTERPRISE_PLUS."

  validation {
    condition     = contains(["STANDARD", "ENTERPRISE", "ENTERPRISE_PLUS"], var.edition)
    error_message = "edition must be STANDARD, ENTERPRISE, or ENTERPRISE_PLUS."
  }
}

variable "num_nodes" {
  type        = number
  default     = null
  description = "Fixed node count. Set this OR processing_units OR autoscaling (mutually exclusive). 1 node = 1000 PU."
}

variable "processing_units" {
  type        = number
  default     = null
  description = "Fixed processing units (multiples of 100 below 1000, then multiples of 1000). Set this OR num_nodes OR autoscaling."

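  validation {
    # Conditional rather than ||: older Terraform releases evaluate both sides
    # of ||, so comparing a null value would raise an error instead of passing.
    condition = var.processing_units == null ? true : (
      var.processing_units >= 100 &&
      (var.processing_units < 1000 ? var.processing_units % 100 == 0 : var.processing_units % 1000 == 0)
    )
    error_message = "processing_units must be >=100, in steps of 100 below 1000, and multiples of 1000 at or above 1000."
  }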
}

variable "autoscaling" {
  description = "Enable the managed autoscaler. When set, num_nodes/processing_units are ignored."
  default     = null
  type = object({
    min_processing_units                         = number
    max_processing_units                         = number
    target_high_priority_cpu_utilization_percent = optional(number, 65)
    target_storage_utilization_percent           = optional(number, 95)
  })

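  # Both checks use a conditional rather than ||: older Terraform releases
  # evaluate both sides of ||, so a null autoscaling object would raise an
  # attribute-access error instead of passing.
  validation {
    condition = var.autoscaling == null ? true : (
      var.autoscaling.min_processing_units >= 1000 &&
      var.autoscaling.min_processing_units % 1000 == 0 &&
      var.autoscaling.max_processing_units % 1000 == 0 &&
      var.autoscaling.max_processing_units >= var.autoscaling.min_processing_units
    )
    error_message = "autoscaling min/max processing units must be multiples of 1000, with min >= 1000 and max >= min."
  }

  validation {
    condition = var.autoscaling == null ? true : (
      var.autoscaling.target_high_priority_cpu_utilization_percent >= 10 &&
      var.autoscaling.target_high_priority_cpu_utilization_percent <= 90
    )
    error_message = "autoscaling target CPU utilization must be between 10 and 90 percent."
  }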
}

variable "default_backup_schedule_type" {
  type        = string
  default     = "AUTOMATIC"
  description = "Default backup schedule for new databases: NONE or AUTOMATIC."

  validation {
    condition     = contains(["NONE", "AUTOMATIC"], var.default_backup_schedule_type)
    error_message = "default_backup_schedule_type must be NONE or AUTOMATIC."
  }
}

variable "force_destroy" {
  type        = bool
  default     = false
  description = "Allow Terraform to delete the instance even when it still contains backups. Keep false in production."
}

variable "labels" {
  type        = map(string)
  default     = {}
  description = "Labels applied to the instance for cost allocation and ownership."
}

variable "databases" {
  description = "Map of database name => settings. Each database is created inside the instance."
  default     = {}
  type = map(object({
    database_dialect         = optional(string, "GOOGLE_STANDARD_SQL")
    version_retention_period = optional(string, "1h")
    ddl                      = optional(list(string), [])
    deletion_protection      = optional(bool, true)
    kms_key_name             = optional(string)
  }))

  validation {
    condition = alltrue([
      for db in values(var.databases) :
      contains(["GOOGLE_STANDARD_SQL", "POSTGRESQL"], db.database_dialect)
    ])
    error_message = "Each database_dialect must be GOOGLE_STANDARD_SQL or POSTGRESQL."
  }
}

variable "iam_members" {
  description = "Optional instance-level IAM bindings, keyed by an arbitrary label."
  default     = {}
  type = map(object({
    role   = string
    member = string
  }))
}
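
The databases variable validates the dialect but not the retention string. Below is a hedged sketch of one more validation block that could sit inside variable "databases", assuming retention values are written as a positive integer followed by s, m, h, or d (for example 86400s, 1440m, 24h, 1d). It checks the format only; the 1-hour-to-7-day range is still left to the API.

```hcl
# Fragment: add inside the existing variable "databases" block.
validation {
  condition = alltrue([
    for db in values(var.databases) :
    # Accept a positive integer with a single unit suffix (s, m, h, or d).
    can(regex("^[1-9][0-9]*[smhd]$", db.version_retention_period))
  ])
  error_message = "Each version_retention_period must look like 3600s, 60m, 1h, or 7d."
}
```

Catching a typo such as "7 days" at plan time is cheaper than discovering it when the database create call is rejected.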

outputs.tf

output "instance_id" {
  description = "Provider-assigned resource ID of the instance (the google provider formats it as <project>/<name>)."
  value       = google_spanner_instance.this.id
}

output "instance_name" {
  description = "Short Spanner instance name (the instance ID segment)."
  value       = google_spanner_instance.this.name
}

output "instance_config" {
  description = "Resolved instance config name actually applied (e.g. regional-us-central1)."
  value       = google_spanner_instance.this.config
}

output "instance_state" {
  description = "Current state of the instance (CREATING / READY)."
  value       = google_spanner_instance.this.state
}

output "database_ids" {
  description = "Map of database name => provider-assigned database ID (the google provider formats it as <instance>/<database>)."
  value       = { for k, db in google_spanner_database.this : k => db.id }
}

output "database_states" {
  description = "Map of database name => state (CREATING / READY)."
  value       = { for k, db in google_spanner_database.this : k => db.state }
}
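
Client libraries and connection strings usually want the full resource path rather than the provider's resource ID. Here is a minimal sketch of an extra output that builds it from values the module already has. The output name database_paths is hypothetical and not part of the module as published.

```hcl
# Hypothetical additional output: full resource path per database, in the
# projects/<project>/instances/<instance>/databases/<database> form.
output "database_paths" {
  description = "Map of database name => full Spanner resource path."
  value = {
    for k, db in google_spanner_database.this :
    k => "projects/${var.project_id}/instances/${google_spanner_instance.this.name}/databases/${db.name}"
  }
}
```

Because the value references google_spanner_database.this, anything that consumes it also waits for the database to exist.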

How to use it

module "cloud_spanner" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-spanner?ref=v1.0.0"

  project_id    = "kloudvin-ledger-prod"
  instance_name = "ledger-prod"
  display_name  = "Ledger (Production)"

  # Multi-region North America + Europe + Asia for global low-latency reads.
  config = "nam-eur-asia1"

  edition = "ENTERPRISE_PLUS" # multi-region configs require the Enterprise Plus edition

  # Let Spanner size itself between 2 and 10 nodes of capacity.
  autoscaling = {
    min_processing_units                         = 2000
    max_processing_units                         = 10000
    target_high_priority_cpu_utilization_percent = 60
  }

  databases = {
    ledger = {
      database_dialect         = "GOOGLE_STANDARD_SQL"
      version_retention_period = "168h" # 7 days of point-in-time recovery
      deletion_protection      = true
      kms_key_name             = "projects/kloudvin-ledger-prod/locations/us/keyRings/spanner/cryptoKeys/ledger"
      ddl = [
        "CREATE TABLE accounts (account_id STRING(36) NOT NULL, balance_micros INT64 NOT NULL, updated_at TIMESTAMP NOT NULL OPTIONS (allow_commit_timestamp = true)) PRIMARY KEY (account_id)",
        "CREATE TABLE entries (entry_id STRING(36) NOT NULL, account_id STRING(36) NOT NULL, amount_micros INT64 NOT NULL) PRIMARY KEY (account_id, entry_id), INTERLEAVE IN PARENT accounts ON DELETE CASCADE",
      ]
    }
  }

  iam_members = {
    app_sa = {
      role   = "roles/spanner.databaseUser"
      member = "serviceAccount:ledger-api@kloudvin-ledger-prod.iam.gserviceaccount.com"
    }
  }

  labels = {
    team        = "payments"
    environment = "production"
    cost-center = "ledger"
  }
}

# Downstream: feed the database ID into a Cloud Run service as an env var.
resource "google_cloud_run_v2_service" "ledger_api" {
  name     = "ledger-api"
  location = "us-central1"
  project  = "kloudvin-ledger-prod"

  template {
    containers {
      image = "us-docker.pkg.dev/kloudvin-ledger-prod/api/ledger:latest"

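      env {
        name = "SPANNER_DATABASE"
        # Built as a full resource path: database_ids holds the provider's
        # shorter <instance>/<database> ID, which most clients cannot open directly.
        value = "projects/kloudvin-ledger-prod/instances/${module.cloud_spanner.instance_name}/databases/ledger"
      }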
    }
  }
}

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config — live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "gcs"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...GCS state bucket + prefix per path...
  }
}
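
The root config above covers the backend; the shared provider block can be generated the same way. This is a minimal sketch using Terragrunt's generate block. The project and region values are placeholders copied from the Quickstart, not settings taken from a real environment.

```hcl
# Root-level fragment for live/terragrunt.hcl: writes provider.tf into every
# module's working directory so the provider is declared once.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "google" {
  project = "my-project"
  region  = "us-central1"
}
EOF
}
```

The overwrite_terragrunt setting replaces only files that Terragrunt itself generated, so a hand-written provider.tf in a module is not clobbered silently.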

2. Module config — live/prod/spanner/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-spanner?ref=v1.0.0"
}

inputs = {
  project_id       = "..."
  instance_name    = "..."
  config           = "..."
  processing_units = 100 # capacity: set exactly one of processing_units, num_nodes, or autoscaling
}

3. Deploy one environment, or roll out all modules together:

(cd live/prod/spanner && terragrunt apply)   # this module only
(cd live/prod && terragrunt run-all apply)   # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
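
To make the per-environment override concrete, here is a sketch of what a dev counterpart to the prod config might look like. The path live/dev/spanner/terragrunt.hcl and every value in inputs are hypothetical; only the module source and the input names come from this article.

```hcl
# live/dev/spanner/terragrunt.hcl (hypothetical path and values)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-spanner?ref=v1.0.0"
}

inputs = {
  project_id       = "my-project-dev"
  instance_name    = "ledger-dev"
  config           = "us-central1" # bare region, prefixed to regional-us-central1
  processing_units = 100           # smallest fixed capacity instead of autoscaling

  databases = {
    ledger = {
      deletion_protection = false # dev data is disposable
    }
  }
}
```

Only the inputs differ from prod; the module source and the inherited root config stay identical, which is the point of the layout.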

Inputs

Name	Type	Default	Required	Description
`project_id`	`string`	—	yes	GCP project ID that owns the instance.
`instance_name`	`string`	—	yes	Instance ID, 6–30 chars, lowercase alphanumeric/hyphen, leading letter.
`display_name`	`string`	`null`	no	Console display name; defaults to `instance_name`.
`config`	`string`	—	yes	Region (auto-prefixed `regional-`) or full/multi-region config name.
`edition`	`string`	`"STANDARD"`	no	`STANDARD`, `ENTERPRISE`, or `ENTERPRISE_PLUS`.
`num_nodes`	`number`	`null`	no	Fixed node count (mutually exclusive with `processing_units`/`autoscaling`).
`processing_units`	`number`	`null`	no	Fixed PU; steps of 100 (<1000) then 1000.
`autoscaling`	`object`	`null`	no	Managed autoscaler min/max PU and utilization targets.
`default_backup_schedule_type`	`string`	`"AUTOMATIC"`	no	Default backup schedule for new databases: `NONE` or `AUTOMATIC`.
`force_destroy`	`bool`	`false`	no	Permit deleting the instance while backups exist.
`labels`	`map(string)`	`{}`	no	Instance labels for cost allocation.
`databases`	`map(object)`	`{}`	no	Databases to create with dialect, retention, DDL, deletion protection, CMEK.
`iam_members`	`map(object)`	`{}`	no	Optional instance-level IAM role bindings.

Outputs

Name	Description
`instance_id`	Provider-assigned instance resource ID (the google provider formats it as `<project>/<name>`).
`instance_name`	Short instance name segment.
`instance_config`	Resolved config name applied (e.g. `regional-us-central1`).
`instance_state`	Instance state (`CREATING` / `READY`).
`database_ids`	Map of database name => provider-assigned database ID (formatted as `<instance>/<database>`).
`database_states`	Map of database name => state (`CREATING` / `READY`).

Enterprise scenario

A fintech running a global double-entry ledger needs sub-second reads on three continents and zero tolerance for split-brain on balances. They deploy this module once per environment with config = "nam-eur-asia1", the ENTERPRISE_PLUS edition (the tier that supports multi-region configurations), and autoscaling from 2000 to 10000 processing units so capacity follows their month-end settlement spikes without an on-call engineer adding nodes. The accounts/entries tables are interleaved for locality, every database carries deletion_protection = true plus a 7-day version_retention_period for point-in-time recovery, and a customer-managed encryption key (CMEK) satisfies their auditors' key-control requirements. All of it is expressed as variables, reviewed in one pull request, and reproduced identically in staging.

Best practices

Prefer processing units and autoscaling over num_nodes. PUs give finer granularity (100-PU steps below one node) and the managed autoscaler keeps you off the pager. Never set num_nodes and processing_units together: the provider rejects the combination at plan time, and this module only nulls the fixed values for you when autoscaling is set.
Keep deletion_protection on and force_destroy off in production. Spanner data is your system of record; the per-database guard plus instance-level force_destroy = false stops a stray terraform destroy or a removed map key from wiping a ledger.
Tune version_retention_period deliberately. Longer retention (up to 7 days, 168h) widens your point-in-time recovery window but increases storage cost and can slow some operations — set it per database based on RPO, not a blanket default.
Choose the config for latency and cost, not habit. Multi-region (nam-eur-asia1) buys cross-continent replicas and the higher availability SLA at a real premium; a single regional-* config is far cheaper and is still strongly consistent, so only pay for multi-region where global reads or tolerance of a regional outage genuinely require it.
Grant least privilege via iam_members. Give application service accounts roles/spanner.databaseUser (read/write data) rather than databaseAdmin; reserve admin roles for CI/CD and break-glass, and attach CMEK keys where compliance demands customer-managed encryption.
Label every instance for cost allocation. Spanner bills continuously for compute plus storage, so team/environment/cost-center labels are how finance attributes a five-figure monthly line item back to the owning service.
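
As an illustration of the least-privilege guidance above, the iam_members input can separate read/write and read-only callers. This is a sketch only: the map keys and service account addresses are hypothetical, while both role names are predefined Spanner IAM roles.

```hcl
# Fragment: goes inside the module block. Keys and members are illustrative.
iam_members = {
  app_rw = {
    role   = "roles/spanner.databaseUser" # read/write data, no schema or instance admin
    member = "serviceAccount:ledger-api@my-project.iam.gserviceaccount.com"
  }
  reporting_ro = {
    role   = "roles/spanner.databaseReader" # read-only access for reporting jobs
    member = "serviceAccount:reporting@my-project.iam.gserviceaccount.com"
  }
}
```

Because these grants are instance-level, each role applies to every database in the instance; a workload that should see only one database needs a database-level binding outside this module.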