Terraform Module: GCP Bigtable — production-grade wide-column store in one block

Quick take — A reusable hashicorp/google Terraform module for GCP Bigtable: autoscaling SSD/HDD clusters, multi-cluster replication, deletion protection, and tables with column families and GC policies. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "google" {
  project = "my-project"
  region  = "us-central1"
}

module "bigtable" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-bigtable?ref=v1.0.0"

  project_id     = "..."           # GCP project ID hosting the Bigtable instance.
  app            = "..."           # Workload short name used in the instance name (validate…
  environment    = "..."           # One of `dev`, `staging`, `prod`, `sandbox`.
  location_short = "..."           # Cosmetic region/zone token for naming (e.g. `euw1`).
  clusters = [                     # 1–8 clusters; each with zone, `storage_type`, and fixed…
    { cluster_id = "...", zone = "..." },
  ]
}

Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.

What this module is

Cloud Bigtable is GCP’s fully managed, petabyte-scale, wide-column NoSQL database — the same engine that backs Search, Maps, and Gmail. It is built for workloads that need single-digit-millisecond reads/writes at very high throughput: time-series telemetry, IoT ingestion, ad-tech and fraud-scoring feature stores, financial tick data, and any HBase-compatible application. Unlike Firestore or Spanner, Bigtable has no SQL layer and no secondary indexes; you design a single row key and read by key or key-range, and you pay for provisioned nodes (or autoscaling) plus storage.

The raw resource graph is fiddly. A real Bigtable deployment is never just an instance — it is an instance plus one or more google_bigtable_instance.cluster blocks (each pinned to a zone, with a storage type and either fixed num_nodes or an autoscaling_config), plus google_bigtable_table resources that declare column families, plus a google_bigtable_gc_policy per family so old cell versions actually get garbage-collected. Get the cluster/zone/replication wiring wrong and you either over-provision (Bigtable’s biggest cost trap) or accidentally tear down a stateful cluster on the next apply.

This module wraps google_bigtable_instance and its companions behind clean, validated variables. You declare clusters as a list of objects, tables as a map with their column families and GC rules, and the module handles lifecycle guards, deletion protection, and consistent app-env-region naming so every team ships Bigtable the same safe way.

When to use it

Reach for Spanner instead if you need strong consistency with SQL and transactions, or Firestore for document/mobile-sync workloads. Bigtable shines when the access pattern is “give me this row key (range) as fast as possible, at scale.”

Module structure

terraform-module-gcp-bigtable/
├── versions.tf      # provider + Terraform version pins
├── main.tf          # google_bigtable_instance + tables + GC policies + app profile
├── variables.tf     # var-driven inputs with validation
└── outputs.tf       # instance id/name, cluster ids, table ids

versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

main.tf

locals {
  # Consistent app-env-region naming, e.g. "telemetry-prod-euw1".
  instance_name = "${var.app}-${var.environment}-${var.location_short}"

  # Bigtable autoscaling and fixed sizing are mutually exclusive per cluster.
  # We normalize the cluster list once so the resource block stays readable.
  clusters = {
    for c in var.clusters : c.cluster_id => c
  }
}

resource "google_bigtable_instance" "this" {
  project      = var.project_id
  name         = local.instance_name
  display_name = coalesce(var.display_name, local.instance_name)

  # DEVELOPMENT instances are single-node and cheaper but cannot replicate
  # or autoscale. PRODUCTION is the default for anything real. Note that the
  # google provider marks instance_type as deprecated, so prefer the default.
  instance_type = var.instance_type

  # Guards against `terraform destroy` / accidental recreation of a stateful DB.
  deletion_protection = var.deletion_protection

  dynamic "cluster" {
    for_each = local.clusters
    content {
      cluster_id   = cluster.value.cluster_id
      zone         = cluster.value.zone
      storage_type = cluster.value.storage_type

      # Fixed sizing: only set when autoscaling is NOT configured.
      num_nodes = cluster.value.autoscaling == null ? cluster.value.num_nodes : null

      # Customer-managed encryption key (optional, per cluster). kms_key_name is
      # a plain argument on the cluster block, so null simply leaves it unset.
      kms_key_name = cluster.value.kms_key_name

      dynamic "autoscaling_config" {
        for_each = cluster.value.autoscaling == null ? [] : [cluster.value.autoscaling]
        content {
          min_nodes      = autoscaling_config.value.min_nodes
          max_nodes      = autoscaling_config.value.max_nodes
          cpu_target     = autoscaling_config.value.cpu_target
          storage_target = autoscaling_config.value.storage_target
        }
      }
    }
  }

  labels = var.labels

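  lifecycle {
    # num_nodes drifts when autoscaling is active; ignore it so apply is a no-op.
    # ignore_changes only accepts static addresses, so this covers the first
    # cluster block only; list cluster[1].num_nodes, etc. for further clusters.
    ignore_changes = [cluster[0].num_nodes]
  }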
}

# CMEK has to be passed via the cluster block in google ~> 5.0; the
# kms_key_name argument lives directly on the cluster, so we set it inline.
# (Handled above through cluster.value.kms_key_name when present.)

resource "google_bigtable_table" "this" {
  for_each = var.tables

  project       = var.project_id
  instance_name = google_bigtable_instance.this.name
  name          = each.key

  # Optional initial row-key splits for pre-warming / avoiding hotspotting.
  # split_keys is a list argument, not a nested block; null leaves it unset.
  split_keys = length(each.value.split_keys) > 0 ? each.value.split_keys : null

  dynamic "column_family" {
    for_each = each.value.column_families
    content {
      family = column_family.value
    }
  }

  # Tables are stateful; do not let a config tweak silently drop one.
  deletion_protection = var.table_deletion_protection ? "PROTECTED" : "UNPROTECTED"

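  lifecycle {
    # prevent_destroy must be a literal, so it cannot follow
    # var.table_deletion_protection; deletion_protection above is the real guard.
    prevent_destroy = false
  }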
}

# One GC policy per (table, column family). Bigtable will not garbage-collect
# old cell versions unless a policy says so — critical for cost on time-series.
resource "google_bigtable_gc_policy" "this" {
  for_each = {
    for gc in flatten([
      for table_name, table in var.tables : [
        for family, rule in table.gc_rules : {
          key          = "${table_name}.${family}"
          table_name   = table_name
          family       = family
          max_age_days = rule.max_age_days
          max_versions = rule.max_versions
        }
      ]
    ]) : gc.key => gc
  }

  project         = var.project_id
  instance_name   = google_bigtable_instance.this.name
  table           = each.value.table_name
  column_family   = each.value.family
  deletion_policy = "ABANDON"

  dynamic "max_age" {
    for_each = each.value.max_age_days == null ? [] : [each.value.max_age_days]
    content {
      duration = "${max_age.value * 24}h"
    }
  }

  dynamic "max_version" {
    for_each = each.value.max_versions == null ? [] : [each.value.max_versions]
    content {
      number = max_version.value
    }
  }

  depends_on = [google_bigtable_table.this]
}

# App profile controls routing for multi-cluster reads/writes. Single-cluster
# routing pins traffic to one cluster (lower latency, no replication conflicts);
# multi-cluster routing load-balances and fails over automatically.
resource "google_bigtable_app_profile" "this" {
  count = var.app_profile == null ? 0 : 1

  project        = var.project_id
  instance       = google_bigtable_instance.this.name
  app_profile_id = var.app_profile.id
  description    = var.app_profile.description

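  # Set only for multi-cluster routing; null leaves the argument unset so it
  # cannot clash with single_cluster_routing below.
  multi_cluster_routing_use_any = var.app_profile.routing == "multi_cluster" ? true : null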

  dynamic "single_cluster_routing" {
    for_each = var.app_profile.routing == "single_cluster" ? [var.app_profile] : []
    content {
      cluster_id                 = single_cluster_routing.value.cluster_id
      allow_transactional_writes = single_cluster_routing.value.allow_transactional_writes
    }
  }

  ignore_warnings = true
}
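
The gc_rules type lets a family set both max_age_days and max_versions, and the provider documentation says a mode should be chosen when more than one rule is set on a policy. Below is a minimal standalone sketch of such a combined policy, assuming UNION semantics are wanted (a cell is collected when either rule matches). The resource label and the literal project, instance, and table names are illustrative only and are not part of the module.

```hcl
# Hypothetical standalone policy; all names are placeholders for illustration.
resource "google_bigtable_gc_policy" "raw_combined" {
  project       = "kv-data-prod"
  instance_name = "telemetry-prod-euw1"
  table         = "device_events"
  column_family = "raw"

  # UNION collects a cell when ANY rule matches; INTERSECTION requires ALL.
  mode = "UNION"

  max_age {
    duration = "720h" # 30 days
  }

  max_version {
    number = 3
  }
}
```

Inside the module, the same idea could be wired by setting mode only when both fields of a rule are non-null.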

variables.tf

variable "project_id" {
  description = "GCP project ID that will host the Bigtable instance."
  type        = string
}

variable "app" {
  description = "Application/workload short name, used in the instance name (e.g. \"telemetry\")."
  type        = string

  validation {
    condition     = can(regex("^[a-z][a-z0-9-]{1,20}$", var.app))
    error_message = "app must be lowercase alphanumeric/hyphen, 2-21 chars, starting with a letter."
  }
}

variable "environment" {
  description = "Deployment environment: dev, staging, prod, or sandbox."
  type        = string

  validation {
    condition     = contains(["dev", "staging", "prod", "sandbox"], var.environment)
    error_message = "environment must be one of: dev, staging, prod, sandbox."
  }
}

variable "location_short" {
  description = "Short region/zone token for naming, e.g. \"euw1\", \"use4\". Cosmetic only."
  type        = string
}

variable "display_name" {
  description = "Human-friendly instance display name. Defaults to the generated instance name."
  type        = string
  default     = null
}

variable "instance_type" {
  description = "Bigtable instance type: PRODUCTION (replicable, autoscalable) or DEVELOPMENT (single node, cheap)."
  type        = string
  default     = "PRODUCTION"

  validation {
    condition     = contains(["PRODUCTION", "DEVELOPMENT"], var.instance_type)
    error_message = "instance_type must be PRODUCTION or DEVELOPMENT."
  }
}

variable "deletion_protection" {
  description = "Prevent the instance from being destroyed by Terraform. Keep true in prod."
  type        = bool
  default     = true
}

variable "clusters" {
  description = <<-EOT
    List of Bigtable clusters. Each cluster lives in one zone. Provide either a
    fixed num_nodes OR an autoscaling block (not both). Multiple clusters enable
    replication across zones/regions.
  EOT
  type = list(object({
    cluster_id   = string
    zone         = string
    storage_type = optional(string, "SSD")
    num_nodes    = optional(number, 1)
    kms_key_name = optional(string)
    autoscaling = optional(object({
      min_nodes      = number
      max_nodes      = number
      cpu_target     = number
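      # GB per node. Left null so the provider default applies (2560 for SSD,
      # 8192 for HDD); a fixed 2560 would be out of range on HDD clusters.
      storage_target = optional(number)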
    }))
  }))

  validation {
    condition     = length(var.clusters) >= 1 && length(var.clusters) <= 8
    error_message = "Provide between 1 and 8 clusters (Bigtable's replication limit)."
  }

  validation {
    condition     = alltrue([for c in var.clusters : contains(["SSD", "HDD"], c.storage_type)])
    error_message = "Each cluster storage_type must be SSD or HDD."
  }

  validation {
    condition = alltrue([
      for c in var.clusters : c.autoscaling == null ? true :
      (c.autoscaling.cpu_target >= 10 && c.autoscaling.cpu_target <= 80)
    ])
    error_message = "autoscaling.cpu_target must be between 10 and 80 percent."
  }
}

variable "tables" {
  description = <<-EOT
    Map of tables to create, keyed by table name. Each table declares its
    column_families, optional row-key split_keys for pre-splitting, and gc_rules
    (max_age_days and/or max_versions) per family.
  EOT
  type = map(object({
    column_families = optional(list(string), [])
    split_keys      = optional(list(string), [])
    gc_rules = optional(map(object({
      max_age_days = optional(number)
      max_versions = optional(number)
    })), {})
  }))
  default = {}
}

variable "table_deletion_protection" {
  description = "Mark created tables PROTECTED so a config change cannot drop them."
  type        = bool
  default     = true
}

variable "app_profile" {
  description = <<-EOT
    Optional custom app profile controlling read/write routing. routing is
    "multi_cluster" (auto failover/load-balance) or "single_cluster" (pin to one
    cluster, required for transactional single-row writes).
  EOT
  type = object({
    id                         = string
    description                = optional(string, "Managed by Terraform")
    routing                    = string
    cluster_id                 = optional(string)
    allow_transactional_writes = optional(bool, false)
  })
  default = null

  validation {
    condition     = var.app_profile == null ? true : contains(["multi_cluster", "single_cluster"], var.app_profile.routing)
    error_message = "app_profile.routing must be multi_cluster or single_cluster."
  }

  validation {
    condition     = var.app_profile == null ? true : (var.app_profile.routing != "single_cluster" || var.app_profile.cluster_id != null)
    error_message = "single_cluster routing requires app_profile.cluster_id to be set."
  }
}

variable "labels" {
  description = "Labels applied to the Bigtable instance."
  type        = map(string)
  default     = {}
}
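
Nothing in the variables above ties the keys of gc_rules to the families listed in column_families, so a typo in a family name would only surface at apply time. One possible guard is a check block, which is available from Terraform 1.5 (the minimum version pinned in versions.tf). This is a sketch, not part of the published module; the check name is made up, and check blocks report warnings rather than blocking the plan.

```hcl
# Hypothetical addition to main.tf: warn when a GC rule targets a family
# that the same table does not declare.
check "gc_rules_reference_declared_families" {
  assert {
    condition = alltrue(flatten([
      for table_name, table in var.tables : [
        for family in keys(table.gc_rules) :
        contains(table.column_families, family)
      ]
    ]))
    error_message = "Every gc_rules key must match an entry in that table's column_families."
  }
}
```

If a hard failure is preferred, the same condition could be moved into a validation block on the tables variable instead.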

outputs.tf

output "instance_id" {
  description = "Fully qualified Bigtable instance ID (projects/<project>/instances/<name>)."
  value       = google_bigtable_instance.this.id
}

output "instance_name" {
  description = "Bigtable instance name (used in client connection strings and CLI)."
  value       = google_bigtable_instance.this.name
}

output "cluster_ids" {
  description = "Map of cluster_id => zone for every cluster in the instance."
  value       = { for c in var.clusters : c.cluster_id => c.zone }
}

output "table_ids" {
  description = "Map of table name => fully qualified table ID."
  value       = { for name, t in google_bigtable_table.this : name => t.id }
}

output "table_names" {
  description = "List of created table names."
  value       = keys(google_bigtable_table.this)
}

output "app_profile_id" {
  description = "App profile ID, or null if no custom profile was created."
  value       = try(google_bigtable_app_profile.this[0].app_profile_id, null)
}

How to use it

The example below provisions a replicated, autoscaling telemetry instance with one events table, then grants an ingestion service account access to it, reading the instance name from the module output rather than hardcoding it.

module "bigtable" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-bigtable?ref=v1.0.0"

  project_id     = "kv-data-prod"
  app            = "telemetry"
  environment    = "prod"
  location_short = "euw1"

  instance_type       = "PRODUCTION"
  deletion_protection = true

  clusters = [
    {
      cluster_id   = "telemetry-prod-euw1-c1"
      zone         = "europe-west1-b"
      storage_type = "SSD"
      autoscaling = {
        min_nodes  = 3
        max_nodes  = 20
        cpu_target = 60
      }
    },
    {
      # Second cluster in another zone = replication + read locality.
      cluster_id   = "telemetry-prod-euw1-c2"
      zone         = "europe-west1-c"
      storage_type = "SSD"
      autoscaling = {
        min_nodes  = 3
        max_nodes  = 20
        cpu_target = 60
      }
    },
  ]

  tables = {
    "device_events" = {
      column_families = ["raw", "agg"]
      # Pre-split on a reversed-device-id prefix to avoid write hotspotting.
      split_keys = ["1", "3", "5", "7", "9", "b", "d", "f"]
      gc_rules = {
        raw = { max_age_days = 30 }            # drop raw cells after 30 days
        agg = { max_versions = 1 }             # keep only the latest aggregate
      }
    }
  }

  app_profile = {
    id      = "ingest"
    routing = "multi_cluster"
  }

  labels = {
    team        = "data-platform"
    cost-center = "kv-1042"
    workload    = "telemetry"
  }
}

# Downstream: grant the ingestion SA write access on this exact instance,
# using the module output rather than a hardcoded name.
resource "google_bigtable_instance_iam_member" "ingest_writer" {
  project  = "kv-data-prod"
  instance = module.bigtable.instance_name
  role     = "roles/bigtable.user"
  member   = "serviceAccount:telemetry-ingest@kv-data-prod.iam.gserviceaccount.com"
}
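
For workloads that need transactional single-row writes, the app_profile input also accepts single-cluster routing. Here is a sketch of that variant, assuming traffic should be pinned to the first cluster from the example above. The local name and the profile id are invented for illustration.

```hcl
locals {
  # Hypothetical profile pinned to one cluster so that transactional
  # single-row writes are permitted.
  txn_app_profile = {
    id                         = "ingest-txn"
    description                = "Single-cluster routing for transactional writes"
    routing                    = "single_cluster"
    cluster_id                 = "telemetry-prod-euw1-c1"
    allow_transactional_writes = true
  }
}
```

It would be passed to the module as app_profile = local.txn_app_profile. The variable's second validation rule rejects single_cluster routing without a cluster_id, so that field cannot be left out.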

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config, live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "gcs"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...GCS state bucket + prefix per path...
  }
}

2. Module config, live/prod/bigtable/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-bigtable?ref=v1.0.0"
}

inputs = {
  project_id = "..."
  app = "..."
  environment = "..."
  location_short = "..."
  clusters = [{ cluster_id = "...", zone = "..." }]
}

3. Deploy one environment, or roll out all modules together:

cd live/prod/bigtable && terragrunt apply        # this module
terragrunt run-all apply                      # run from live/prod: every module under it

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; the inputs block is overridden per environment (dev / staging / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules. For a single stack, the plain Quickstart above is enough.
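
To make the per-environment override concrete, here is a sketch of what a dev counterpart to the prod config might look like. The path live/dev/bigtable/terragrunt.hcl, the project ID, and the cluster values are assumptions chosen for illustration. Only the inputs differ from prod.

```hcl
# live/dev/bigtable/terragrunt.hcl (hypothetical path)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-bigtable?ref=v1.0.0"
}

inputs = {
  project_id     = "kv-data-dev" # assumed project ID
  app            = "telemetry"
  environment    = "dev"
  location_short = "euw1"

  # Throwaway environment: allow destroy and skip autoscaling.
  deletion_protection       = false
  table_deletion_protection = false

  clusters = [
    {
      cluster_id = "telemetry-dev-euw1-c1"
      zone       = "europe-west1-b"
      num_nodes  = 1
    },
  ]
}
```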

Inputs

| Name | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| `project_id` | `string` | | Yes | GCP project ID hosting the Bigtable instance. |
| `app` | `string` | | Yes | Workload short name used in the instance name (validated lowercase). |
| `environment` | `string` | | Yes | One of `dev`, `staging`, `prod`, `sandbox`. |
| `location_short` | `string` | | Yes | Cosmetic region/zone token for naming (e.g. `euw1`). |
| `display_name` | `string` | `null` | No | Human-friendly display name; defaults to the generated instance name. |
| `instance_type` | `string` | `"PRODUCTION"` | No | `PRODUCTION` or `DEVELOPMENT`. |
| `deletion_protection` | `bool` | `true` | No | Block `terraform destroy` on the instance. |
| `clusters` | `list(object)` | | Yes | 1–8 clusters; each with zone, `storage_type`, and fixed `num_nodes` or an `autoscaling` block. |
| `tables` | `map(object)` | `{}` | No | Tables keyed by name, with `column_families`, `split_keys`, and per-family `gc_rules`. |
| `table_deletion_protection` | `bool` | `true` | No | Create tables as `PROTECTED`. |
| `app_profile` | `object` | `null` | No | Optional app profile: `multi_cluster` or `single_cluster` routing. |
| `labels` | `map(string)` | `{}` | No | Labels applied to the instance. |

Outputs

| Name | Description |
| --- | --- |
| `instance_id` | Fully qualified instance ID (`projects/<project>/instances/<name>`). |
| `instance_name` | Instance name used in client connections and `cbt` CLI. |
| `cluster_ids` | Map of `cluster_id` => zone for every cluster. |
| `table_ids` | Map of table name => fully qualified table ID. |
| `table_names` | List of created table names. |
| `app_profile_id` | Custom app profile ID, or `null` if none was created. |

Enterprise scenario

A connected-vehicle platform ingests ~400k telemetry messages/second from a global fleet. They deploy this module per region with a two-cluster, autoscaling (min_nodes = 6, max_nodes = 40, cpu_target = 60) PRODUCTION instance and a device_events table pre-split on a reversed-VIN row key to spread writes evenly. A 30-day max_age GC policy on the raw family keeps storage flat while a daily aggregation job writes a max_versions = 1 rollup family for the dashboards. A multi_cluster app profile gives them automatic zone failover during maintenance, and instance_name feeds the Dataflow pipeline and per-team IAM bindings — so onboarding a new region is a copy-paste of one module block plus a cluster_id.
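
As a rough sketch of what one regional block in this scenario could look like, the call below uses the sizing quoted above. Everything else is a placeholder assumed for illustration: the module label, the project ID, the us-central1 zones, the cluster IDs, and the rollup family name.

```hcl
# Hypothetical per-region instance for the scenario above.
module "bigtable_usc1" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-bigtable?ref=v1.0.0"

  project_id     = "vehicle-data-prod" # assumed project ID
  app            = "telemetry"
  environment    = "prod"
  location_short = "usc1"

  clusters = [
    {
      cluster_id  = "telemetry-prod-usc1-c1"
      zone        = "us-central1-a"
      autoscaling = { min_nodes = 6, max_nodes = 40, cpu_target = 60 }
    },
    {
      cluster_id  = "telemetry-prod-usc1-c2"
      zone        = "us-central1-b"
      autoscaling = { min_nodes = 6, max_nodes = 40, cpu_target = 60 }
    },
  ]

  tables = {
    device_events = {
      column_families = ["raw", "rollup"]
      gc_rules = {
        raw    = { max_age_days = 30 } # keeps raw storage flat
        rollup = { max_versions = 1 }  # latest daily aggregate only
      }
    }
  }

  app_profile = {
    id      = "ingest"
    routing = "multi_cluster"
  }
}
```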

Best practices

