Terraform Module: GCP Bigtable — production-grade wide-column store in one block

Quick take: a reusable Terraform module for GCP Bigtable, built on the hashicorp/google provider. It covers autoscaling SSD/HDD clusters, multi-cluster replication, deletion protection, and tables with column families and GC policies. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "google" {
  project = "my-project"
  region  = "us-central1"
}

module "bigtable" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-bigtable?ref=v1.0.0"

  project_id     = "..."           # GCP project ID hosting the Bigtable instance.
  app            = "..."           # Workload short name used in the instance name (validate…
  environment    = "..."           # One of `dev`, `staging`, `prod`, `sandbox`.
  location_short = "..."           # Cosmetic region/zone token for naming (e.g. `euw1`).
  clusters = [                     # 1–8 clusters; each with zone, `storage_type`, and fixed…
    { cluster_id = "...", zone = "..." },
  ]
}

Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.

What this module is

Cloud Bigtable is GCP’s fully managed, petabyte-scale, wide-column NoSQL database, the same engine that backs Search, Maps, and Gmail. It is built for workloads that need single-digit-millisecond reads/writes at very high throughput: time-series telemetry, IoT ingestion, ad-tech and fraud-scoring feature stores, financial tick data, and any HBase-compatible application. Unlike Spanner, Bigtable is not relational: there are no joins, no multi-row transactions, and no secondary indexes. You design a single row key and read by key or key range, and you pay for provisioned nodes (fixed or autoscaled) plus storage.

The raw resource graph is fiddly. A real Bigtable deployment is never just an instance: it is a google_bigtable_instance with one or more cluster blocks (each pinned to a zone, with a storage type and either fixed num_nodes or an autoscaling_config), plus google_bigtable_table resources that declare column families, plus a google_bigtable_gc_policy per family so old cell versions actually get garbage-collected. Get the cluster/zone/replication wiring wrong and you either over-provision (Bigtable’s biggest cost trap) or accidentally tear down a stateful cluster on the next apply.

This module wraps google_bigtable_instance and its companions behind clean, validated variables. You declare clusters as a list of objects, tables as a map with their column families and GC rules, and the module handles lifecycle guards, deletion protection, and consistent app-env-region naming so every team ships Bigtable the same safe way.

When to use it

You run high-throughput, low-latency key/value or time-series workloads (IoT, observability metrics, clickstream, feature stores) that have outgrown Firestore/Memorystore but don’t need SQL or joins.
You need an HBase-compatible backend and want to lift Apache HBase workloads onto a managed service. Cassandra workloads can also move, though typically with more migration work than HBase.
You want multi-cluster replication across zones or regions for HA and read locality, with eventual consistency and automatic failover via app profiles.
You want autoscaling Bigtable nodes tied to CPU/storage utilization so you stop paying for idle capacity overnight.
You are standardizing a platform and want every Bigtable instance to carry the same labels, deletion protection, and table/GC conventions.

Reach for Spanner instead if you need strong consistency with SQL and transactions, or Firestore for document/mobile-sync workloads. Bigtable shines when the access pattern is “give me this row key (range) as fast as possible, at scale.”

Module structure

terraform-module-gcp-bigtable/
├── versions.tf      # provider + Terraform version pins
├── main.tf          # google_bigtable_instance + tables + GC policies + app profile
├── variables.tf     # var-driven inputs with validation
└── outputs.tf       # instance id/name, cluster ids, table ids

versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

main.tf

locals {
  # Consistent app-env-region naming, e.g. "telemetry-prod-euw1".
  instance_name = "${var.app}-${var.environment}-${var.location_short}"

  # Bigtable autoscaling and fixed sizing are mutually exclusive per cluster.
  # We normalize the cluster list once so the resource block stays readable.
  clusters = {
    for c in var.clusters : c.cluster_id => c
  }
}

resource "google_bigtable_instance" "this" {
  project      = var.project_id
  name         = local.instance_name
  display_name = coalesce(var.display_name, local.instance_name)

  # DEVELOPMENT instances are single-node and cheaper but cannot replicate
  # or autoscale. PRODUCTION is the default for anything real.
  instance_type = var.instance_type

  # Guards against `terraform destroy` / accidental recreation of a stateful DB.
  deletion_protection = var.deletion_protection

  dynamic "cluster" {
    for_each = local.clusters
    content {
      cluster_id   = cluster.value.cluster_id
      zone         = cluster.value.zone
      storage_type = cluster.value.storage_type

      # Fixed sizing: only set when autoscaling is NOT configured.
      num_nodes = cluster.value.autoscaling == null ? cluster.value.num_nodes : null

      # Customer-managed encryption key (optional, per cluster). kms_key_name
      # is a plain argument on the cluster block, so null simply leaves it unset.
      kms_key_name = cluster.value.kms_key_name

      dynamic "autoscaling_config" {
        for_each = cluster.value.autoscaling == null ? [] : [cluster.value.autoscaling]
        content {
          min_nodes      = autoscaling_config.value.min_nodes
          max_nodes      = autoscaling_config.value.max_nodes
          cpu_target     = autoscaling_config.value.cpu_target
          storage_target = autoscaling_config.value.storage_target
        }
      }
    }
  }

  labels = var.labels

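  lifecycle {
    # num_nodes drifts when autoscaling is active; ignore it so apply is a no-op.
    # ignore_changes takes static references only, so this covers just the first
    # cluster block (blocks are emitted in cluster_id order from local.clusters).
    ignore_changes = [cluster[0].num_nodes]
  }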
}

# CMEK has to be passed via the cluster block in google ~> 5.0; the
# kms_key_name argument lives directly on the cluster, so we set it inline.
# (Handled above through cluster.value.kms_key_name when present.)

resource "google_bigtable_table" "this" {
  for_each = var.tables

  project       = var.project_id
  instance_name = google_bigtable_instance.this.name
  name          = each.key

  # Optional initial row-key splits for pre-warming / avoiding hotspotting.
  # split_keys is a list argument, not a nested block; null leaves it unset.
  split_keys = length(each.value.split_keys) > 0 ? each.value.split_keys : null

  dynamic "column_family" {
    for_each = each.value.column_families
    content {
      family = column_family.value
    }
  }

  # Tables are stateful; do not let a config tweak silently drop one.
  deletion_protection = var.table_deletion_protection ? "PROTECTED" : "UNPROTECTED"

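  lifecycle {
    # prevent_destroy must be a literal, so it cannot follow a variable.
    # Set it to true here if you want a hard, plan-time guard as well.
    prevent_destroy = false
  }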
}

# One GC policy per (table, column family). Bigtable will not garbage-collect
# old cell versions unless a policy says so — critical for cost on time-series.
resource "google_bigtable_gc_policy" "this" {
  for_each = {
    for gc in flatten([
      for table_name, table in var.tables : [
        for family, rule in table.gc_rules : {
          key          = "${table_name}.${family}"
          table_name   = table_name
          family       = family
          max_age_days = rule.max_age_days
          max_versions = rule.max_versions
        }
      ]
    ]) : gc.key => gc
  }

  project         = var.project_id
  instance_name   = google_bigtable_instance.this.name
  table           = each.value.table_name
  column_family   = each.value.family
  deletion_policy = "ABANDON"

  dynamic "max_age" {
    for_each = each.value.max_age_days == null ? [] : [each.value.max_age_days]
    content {
      duration = "${max_age.value * 24}h"
    }
  }

  dynamic "max_version" {
    for_each = each.value.max_versions == null ? [] : [each.value.max_versions]
    content {
      number = max_version.value
    }
  }

  depends_on = [google_bigtable_table.this]
}

# App profile controls routing for multi-cluster reads/writes. Single-cluster
# routing pins traffic to one cluster (lower latency, no replication conflicts);
# multi-cluster routing load-balances and fails over automatically.
resource "google_bigtable_app_profile" "this" {
  count = var.app_profile == null ? 0 : 1

  project        = var.project_id
  instance       = google_bigtable_instance.this.name
  app_profile_id = var.app_profile.id
  description    = var.app_profile.description

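  # Set only for multi-cluster routing; null (rather than false) avoids
  # clashing with the single_cluster_routing block below.
  multi_cluster_routing_use_any = var.app_profile.routing == "multi_cluster" ? true : null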

  dynamic "single_cluster_routing" {
    for_each = var.app_profile.routing == "single_cluster" ? [var.app_profile] : []
    content {
      cluster_id                 = single_cluster_routing.value.cluster_id
      allow_transactional_writes = single_cluster_routing.value.allow_transactional_writes
    }
  }

  ignore_warnings = true
}
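A note on combined GC rules: the tables variable allows a family to set both max_age_days and max_versions, but as far as I know the google provider expects google_bigtable_gc_policy to carry a mode (UNION or INTERSECTION) whenever more than one rule is present, and the resource above does not set one. The standalone sketch below shows the shape; the instance, table, and family names are hypothetical, and the choice of UNION is an assumption, not something the module prescribes.

```hcl
# Sketch only: instance, table, and family names are hypothetical.
resource "google_bigtable_gc_policy" "raw_combined" {
  instance_name = "telemetry-prod-euw1"
  table         = "device_events"
  column_family = "raw"

  # UNION: a cell is collected when EITHER rule matches.
  # INTERSECTION: a cell is collected only when BOTH rules match.
  mode = "UNION"

  max_age {
    duration = "720h" # 30 days
  }

  max_version {
    number = 3
  }
}
```

Inside the module, one way to wire this in would be a line such as mode = each.value.max_age_days != null && each.value.max_versions != null ? "UNION" : null on the GC policy resource, so single-rule families are unaffected.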

variables.tf

variable "project_id" {
  description = "GCP project ID that will host the Bigtable instance."
  type        = string
}

variable "app" {
  description = "Application/workload short name, used in the instance name (e.g. \"telemetry\")."
  type        = string

  validation {
    condition     = can(regex("^[a-z][a-z0-9-]{1,20}$", var.app))
    error_message = "app must be lowercase alphanumeric/hyphen, 2-21 chars, starting with a letter."
  }
}

variable "environment" {
  description = "Deployment environment (dev, staging, prod, ...)."
  type        = string

  validation {
    condition     = contains(["dev", "staging", "prod", "sandbox"], var.environment)
    error_message = "environment must be one of: dev, staging, prod, sandbox."
  }
}

variable "location_short" {
  description = "Short region/zone token for naming, e.g. \"euw1\", \"use4\". Cosmetic only."
  type        = string
}

variable "display_name" {
  description = "Human-friendly instance display name. Defaults to the generated instance name."
  type        = string
  default     = null
}

variable "instance_type" {
  description = "Bigtable instance type: PRODUCTION (replicable, autoscalable) or DEVELOPMENT (single node, cheap)."
  type        = string
  default     = "PRODUCTION"

  validation {
    condition     = contains(["PRODUCTION", "DEVELOPMENT"], var.instance_type)
    error_message = "instance_type must be PRODUCTION or DEVELOPMENT."
  }
}

variable "deletion_protection" {
  description = "Prevent the instance from being destroyed by Terraform. Keep true in prod."
  type        = bool
  default     = true
}

variable "clusters" {
  description = <<-EOT
    List of Bigtable clusters. Each cluster lives in one zone. Provide either a
    fixed num_nodes OR an autoscaling block (not both). Multiple clusters enable
    replication across zones/regions.
  EOT
  type = list(object({
    cluster_id   = string
    zone         = string
    storage_type = optional(string, "SSD")
    num_nodes    = optional(number, 1)
    kms_key_name = optional(string)
    autoscaling = optional(object({
      min_nodes      = number
      max_nodes      = number
      cpu_target     = number
      storage_target = optional(number, 2560)
    }))
  }))

  validation {
    condition     = length(var.clusters) >= 1 && length(var.clusters) <= 8
    error_message = "Provide between 1 and 8 clusters (Bigtable's replication limit)."
  }

  validation {
    condition     = alltrue([for c in var.clusters : contains(["SSD", "HDD"], c.storage_type)])
    error_message = "Each cluster storage_type must be SSD or HDD."
  }

  validation {
    condition = alltrue([
      for c in var.clusters : c.autoscaling == null ? true :
      (c.autoscaling.cpu_target >= 10 && c.autoscaling.cpu_target <= 80)
    ])
    error_message = "autoscaling.cpu_target must be between 10 and 80 percent."
  }
}

variable "tables" {
  description = <<-EOT
    Map of tables to create, keyed by table name. Each table declares its
    column_families, optional row-key split_keys for pre-splitting, and gc_rules
    (max_age_days and/or max_versions) per family.
  EOT
  type = map(object({
    column_families = optional(list(string), [])
    split_keys      = optional(list(string), [])
    gc_rules = optional(map(object({
      max_age_days = optional(number)
      max_versions = optional(number)
    })), {})
  }))
  default = {}
}

variable "table_deletion_protection" {
  description = "Mark created tables PROTECTED so a config change cannot drop them."
  type        = bool
  default     = true
}

variable "app_profile" {
  description = <<-EOT
    Optional custom app profile controlling read/write routing. routing is
    "multi_cluster" (auto failover/load-balance) or "single_cluster" (pin to one
    cluster, required for transactional single-row writes).
  EOT
  type = object({
    id                         = string
    description                = optional(string, "Managed by Terraform")
    routing                    = string
    cluster_id                 = optional(string)
    allow_transactional_writes = optional(bool, false)
  })
  default = null

  validation {
    condition     = var.app_profile == null ? true : contains(["multi_cluster", "single_cluster"], var.app_profile.routing)
    error_message = "app_profile.routing must be multi_cluster or single_cluster."
  }

  validation {
    condition     = var.app_profile == null ? true : (var.app_profile.routing != "single_cluster" || var.app_profile.cluster_id != null)
    error_message = "single_cluster routing requires app_profile.cluster_id to be set."
  }
}

variable "labels" {
  description = "Labels applied to the Bigtable instance."
  type        = map(string)
  default     = {}
}
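The clusters validations above stop at count, storage type, and cpu_target. A few more checks could catch mistakes before the API does. The sketch below is a trimmed copy of the clusters variable (num_nodes, kms_key_name, and the existing validations are omitted for brevity) with three extra validation blocks. The storage_target ranges (2560 to 5120 for SSD, 8192 to 16384 for HDD) are taken from my reading of the provider documentation and should be treated as an assumption to verify. Note that the default of 2560 falls outside the HDD range.

```hcl
# Sketch: extra validation blocks for variable "clusters" (trimmed type).
variable "clusters" {
  type = list(object({
    cluster_id   = string
    zone         = string
    storage_type = optional(string, "SSD")
    autoscaling = optional(object({
      min_nodes      = number
      max_nodes      = number
      cpu_target     = number
      storage_target = optional(number, 2560)
    }))
  }))

  # Duplicate IDs would otherwise fail later, when local.clusters is built.
  validation {
    condition     = length(distinct([for c in var.clusters : c.cluster_id])) == length(var.clusters)
    error_message = "Each cluster_id must be unique within the instance."
  }

  validation {
    condition = alltrue([
      for c in var.clusters : c.autoscaling == null ? true :
      (c.autoscaling.min_nodes <= c.autoscaling.max_nodes)
    ])
    error_message = "autoscaling.min_nodes must not exceed autoscaling.max_nodes."
  }

  # Assumed ranges: SSD 2560-5120, HDD 8192-16384 (storage per node).
  validation {
    condition = alltrue([
      for c in var.clusters : c.autoscaling == null ? true : (
        c.storage_type == "SSD"
        ? (c.autoscaling.storage_target >= 2560 && c.autoscaling.storage_target <= 5120)
        : (c.autoscaling.storage_target >= 8192 && c.autoscaling.storage_target <= 16384)
      )
    ])
    error_message = "autoscaling.storage_target is outside the range for the cluster's storage_type."
  }
}
```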

outputs.tf

output "instance_id" {
  description = "Fully qualified Bigtable instance ID (projects/<project>/instances/<name>)."
  value       = google_bigtable_instance.this.id
}

output "instance_name" {
  description = "Bigtable instance name (used in client connection strings and CLI)."
  value       = google_bigtable_instance.this.name
}

output "cluster_ids" {
  description = "Map of cluster_id => zone for every cluster in the instance."
  value       = { for c in var.clusters : c.cluster_id => c.zone }
}

output "table_ids" {
  description = "Map of table name => fully qualified table ID."
  value       = { for name, t in google_bigtable_table.this : name => t.id }
}

output "table_names" {
  description = "List of created table names."
  value       = keys(google_bigtable_table.this)
}

output "app_profile_id" {
  description = "App profile ID, or null if no custom profile was created."
  value       = try(google_bigtable_app_profile.this[0].app_profile_id, null)
}

How to use it

The example below provisions a replicated, autoscaling telemetry instance with one device_events table, then grants an ingestion service account access to it, reading the instance name from the module output instead of hardcoding it.

module "bigtable" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-bigtable?ref=v1.0.0"

  project_id     = "kv-data-prod"
  app            = "telemetry"
  environment    = "prod"
  location_short = "euw1"

  instance_type       = "PRODUCTION"
  deletion_protection = true

  clusters = [
    {
      cluster_id   = "telemetry-prod-euw1-c1"
      zone         = "europe-west1-b"
      storage_type = "SSD"
      autoscaling = {
        min_nodes  = 3
        max_nodes  = 20
        cpu_target = 60
      }
    },
    {
      # Second cluster in another zone = replication + read locality.
      cluster_id   = "telemetry-prod-euw1-c2"
      zone         = "europe-west1-c"
      storage_type = "SSD"
      autoscaling = {
        min_nodes  = 3
        max_nodes  = 20
        cpu_target = 60
      }
    },
  ]

  tables = {
    "device_events" = {
      column_families = ["raw", "agg"]
      # Pre-split on a reversed-device-id prefix to avoid write hotspotting.
      split_keys = ["1", "3", "5", "7", "9", "b", "d", "f"]
      gc_rules = {
        raw = { max_age_days = 30 }            # drop raw cells after 30 days
        agg = { max_versions = 1 }             # keep only the latest aggregate
      }
    }
  }

  app_profile = {
    id      = "ingest"
    routing = "multi_cluster"
  }

  labels = {
    team        = "data-platform"
    cost-center = "kv-1042"
    workload    = "telemetry"
  }
}

# Downstream: grant the ingestion SA write access on this exact instance,
# using the module output rather than a hardcoded name.
resource "google_bigtable_instance_iam_member" "ingest_writer" {
  project  = "kv-data-prod"
  instance = module.bigtable.instance_name
  role     = "roles/bigtable.user"
  member   = "serviceAccount:telemetry-ingest@kv-data-prod.iam.gserviceaccount.com"
}
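A read-only consumer can be wired the same way. The sketch below reuses the google_bigtable_instance_iam_member resource from the example with the roles/bigtable.reader role; the resource label and the dashboards service account are hypothetical.

```hcl
# Hypothetical read-only consumer; the service account name is an assumption.
resource "google_bigtable_instance_iam_member" "dashboard_reader" {
  project  = "kv-data-prod"
  instance = module.bigtable.instance_name
  role     = "roles/bigtable.reader"
  member   = "serviceAccount:dashboards@kv-data-prod.iam.gserviceaccount.com"
}
```

Because both bindings reference module.bigtable.instance_name, Terraform orders them after the instance without an explicit depends_on.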

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config — live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "gcs"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...GCS state bucket + prefix per path...
  }
}

2. Module config — live/prod/bigtable/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-bigtable?ref=v1.0.0"
}

inputs = {
  project_id     = "..."
  app            = "..."
  environment    = "..."
  location_short = "..."
  clusters       = [{ cluster_id = "...", zone = "..." }]
}

3. Deploy one environment, or roll out all modules together:

(cd live/prod/bigtable && terragrunt apply)   # this module
(cd live/prod && terragrunt run-all apply)    # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; the inputs block is overridden per environment (dev / staging / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules; for a single stack, the plain Quickstart above is enough.
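The root config shown earlier only covers the backend. To keep the provider in one place as well, a generate block in the same root terragrunt.hcl is one option. This is a minimal sketch: the project and region values are the placeholders from the Quickstart, not real settings.

```hcl
# live/terragrunt.hcl (sketch; project and region are placeholder values)
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "google" {
  project = "my-project"
  region  = "us-central1"
}
EOF
}
```

Terragrunt writes provider.tf into each module's working directory before running Terraform, so the module source itself stays free of provider configuration.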

Inputs

Name	Type	Default	Required	Description
`project_id`	`string`	—	Yes	GCP project ID hosting the Bigtable instance.
`app`	`string`	—	Yes	Workload short name used in the instance name (validated lowercase).
`environment`	`string`	—	Yes	One of `dev`, `staging`, `prod`, `sandbox`.
`location_short`	`string`	—	Yes	Cosmetic region/zone token for naming (e.g. `euw1`).
`display_name`	`string`	`null`	No	Human-friendly display name; defaults to the generated instance name.
`instance_type`	`string`	`"PRODUCTION"`	No	`PRODUCTION` or `DEVELOPMENT`.
`deletion_protection`	`bool`	`true`	No	Block `terraform destroy` on the instance.
`clusters`	`list(object)`	—	Yes	1–8 clusters; each with zone, `storage_type`, and fixed `num_nodes` or an `autoscaling` block.
`tables`	`map(object)`	`{}`	No	Tables keyed by name, with `column_families`, `split_keys`, and per-family `gc_rules`.
`table_deletion_protection`	`bool`	`true`	No	Create tables as `PROTECTED`.
`app_profile`	`object`	`null`	No	Optional app profile: `multi_cluster` or `single_cluster` routing.
`labels`	`map(string)`	`{}`	No	Labels applied to the instance.

Outputs

Name	Description
`instance_id`	Fully qualified instance ID (`projects/<project>/instances/<name>`).
`instance_name`	Instance name used in client connections and `cbt` CLI.
`cluster_ids`	Map of `cluster_id => zone` for every cluster.
`table_ids`	Map of table name => fully qualified table ID.
`table_names`	List of created table names.
`app_profile_id`	Custom app profile ID, or `null` if none was created.

Enterprise scenario

A connected-vehicle platform ingests ~400k telemetry messages/second from a global fleet. They deploy this module per region with a two-cluster, autoscaling (min_nodes = 6, max_nodes = 40, cpu_target = 60) PRODUCTION instance and a device_events table pre-split on a reversed-VIN row key to spread writes evenly. A 30-day max_age GC policy on the raw family keeps storage flat while a daily aggregation job writes a max_versions = 1 rollup family for the dashboards. A multi_cluster app profile gives them automatic zone failover during maintenance, and instance_name feeds the Dataflow pipeline and per-team IAM bindings — so onboarding a new region is a copy-paste of one module block plus a cluster_id.
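The scenario describes copy-pasting one module block per region. An alternative, swapped in here purely for illustration, is a single block with for_each over a region map. The sketch assumes the module inputs shown earlier; the module label, the region map, its zones, and the cluster_id pattern are all hypothetical.

```hcl
locals {
  # Hypothetical map: naming token => zones for that region's two clusters.
  regions = {
    euw1 = ["europe-west1-b", "europe-west1-c"]
    use4 = ["us-east4-a", "us-east4-b"]
  }
}

module "bigtable_regional" {
  for_each = local.regions
  source   = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-bigtable?ref=v1.0.0"

  project_id     = "kv-data-prod"
  app            = "telemetry"
  environment    = "prod"
  location_short = each.key

  # One autoscaling cluster per zone, sized as in the scenario.
  clusters = [
    for i, zone in each.value : {
      cluster_id  = "telemetry-prod-${each.key}-c${i + 1}"
      zone        = zone
      autoscaling = { min_nodes = 6, max_nodes = 40, cpu_target = 60 }
    }
  ]

  tables = {
    "device_events" = {
      column_families = ["raw", "agg"]
      gc_rules = {
        raw = { max_age_days = 30 }
        agg = { max_versions = 1 }
      }
    }
  }

  app_profile = {
    id      = "ingest"
    routing = "multi_cluster"
  }
}
```

With this shape, onboarding a region is one new entry in local.regions, and outputs are addressed per region, for example module.bigtable_regional["euw1"].instance_name.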

Best practices

Design the row key first, then pre-split. Bigtable performance lives and dies on key design — avoid sequential/timestamp prefixes that hotspot a single node. Use split_keys (and field-promotion/salting/reversed-ID patterns) to distribute writes across tablets from day one.
Always attach a GC policy per column family. Without max_age or max_versions, old cell versions accumulate forever and storage cost grows unbounded — set gc_rules on every family, especially time-series tables.
Prefer autoscaling and right-size cpu_target. Nodes are the dominant cost. Set min_nodes to your floor, a generous max_nodes ceiling, and cpu_target around 60 (keep below ~70 to leave headroom for replication and tail latency). Keep ignore_changes on num_nodes so autoscaling drift doesn’t churn the plan.
Use replication for HA, and choose routing deliberately. Two or more clusters across zones give failover and read locality, but use a single_cluster app profile for workloads that need transactional single-row writes or must avoid cross-cluster read-your-writes surprises.
Lock down statefulness and encryption. Keep deletion_protection = true and table_deletion_protection = true in prod, and pass a CMEK kms_key_name per cluster where compliance requires customer-managed keys.
Standardize naming and labels. The app-env-region instance name plus team/cost-center labels make Bigtable spend attributable and instances greppable across projects — non-negotiable once you run more than a couple of clusters.
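Two of the practices above, per-cluster CMEK and a single_cluster app profile, map directly onto module inputs. The sketch below defines them as locals that could be passed in as clusters = local.cmek_clusters and app_profile = local.single_cluster_profile. The KMS key path and the profile ID are hypothetical.

```hcl
locals {
  # Hypothetical key path; use a real Cloud KMS key in the cluster's region.
  bigtable_kms_key = "projects/kv-data-prod/locations/europe-west1/keyRings/bigtable/cryptoKeys/telemetry"

  # One CMEK-encrypted, fixed-size cluster for the module's clusters input.
  cmek_clusters = [
    {
      cluster_id   = "telemetry-prod-euw1-c1"
      zone         = "europe-west1-b"
      num_nodes    = 3
      kms_key_name = local.bigtable_kms_key
    },
  ]

  # Pins traffic to one cluster so single-row transactional writes are allowed.
  single_cluster_profile = {
    id                         = "txn-writer"
    routing                    = "single_cluster"
    cluster_id                 = "telemetry-prod-euw1-c1"
    allow_transactional_writes = true
  }
}
```

The module creates at most one app profile, so this would replace the multi_cluster profile from the earlier example rather than sit beside it. For CMEK, the Bigtable service agent also needs encrypt/decrypt permission on the key, which this module does not manage.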