Terraform Module: GCP GKE Node Pool — Decoupled, Auto-Repairing Worker Capacity for Your Clusters

Quick take — A production-ready Terraform module for google_container_node_pool on hashicorp/google ~> 5.0: autoscaling, surge upgrades, Workload Identity, Spot nodes, taints, and least-privilege node service accounts. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "google" {
  project = "my-project"
  region  = "us-central1"
}

module "gke_node_pool" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-gke-node-pool?ref=v1.0.0"

  name         = "..."  # Node pool name; lowercase, DNS-compatible, <= 40 chars.
  project_id   = "..."  # GCP project ID owning the cluster.
  location     = "..."  # Region (regional pool) or zone (zonal pool).
  cluster_name = "..."  # Existing GKE cluster to attach the pool to.
}

Then run terraform init && terraform apply. Every other input has a sensible default; see Inputs below to override behaviour.

What this module is

A GKE node pool is a group of nodes within a Google Kubernetes Engine cluster that all share the same configuration: machine type, disk, OAuth scopes, labels, taints, and lifecycle behaviour. The cluster control plane is managed by Google, but the worker capacity that actually runs your Pods lives in node pools — and almost every real workload needs more than one. You typically run a small general-purpose pool for system add-ons, a larger pool for stateless apps, and perhaps a Spot or GPU pool for batch and ML.

The reason to wrap google_container_node_pool in a reusable module, rather than defining it inline next to the cluster, is decoupling and repetition. Node pools are the part of GKE you change most often: you resize them, swap machine families during cost reviews, add Spot capacity, roll new node images, and occasionally recreate them entirely. Keeping the pool in its own module lets you add, scale, or destroy worker capacity without touching the google_container_cluster resource (and risking control-plane churn). It also forces consistency: every pool created through this module gets autoscaling bounds, surge-upgrade settings, auto-repair/auto-upgrade, Workload Identity metadata, and shielded-node defaults, plus a single node_service_account input for attaching a least-privilege service account, instead of each team hand-rolling a slightly different node_config.
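
To make the decoupling concrete, here is a minimal sketch of the cluster side. It is not part of the module: the resource label, cluster name, project, and region are placeholders. It assumes the cluster drops its default pool and enables Workload Identity, which the module's GKE_METADATA mode relies on.

```hcl
# Hypothetical cluster definition; every name and value here is a placeholder.
resource "google_container_cluster" "this" {
  name     = "example-gke"
  project  = "my-project"
  location = "us-central1"

  # GKE always creates a default pool with the cluster. Create the smallest
  # possible one and let the provider delete it once the cluster is up.
  remove_default_node_pool = true
  initial_node_count       = 1

  # Cluster-level Workload Identity switch; node pools in GKE_METADATA mode
  # depend on it.
  workload_identity_config {
    workload_pool = "my-project.svc.id.goog"
  }
}

# All worker capacity then comes from separately managed pools.
module "gke_node_pool" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-gke-node-pool?ref=v1.0.0"

  name         = "apps"
  project_id   = google_container_cluster.this.project
  location     = google_container_cluster.this.location
  cluster_name = google_container_cluster.this.name
}
```

Referencing the cluster's attributes rather than repeating literals also gives Terraform the dependency ordering: the pool is created after the cluster and destroyed before it.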

When to use it

You manage GKE clusters with Terraform and want node pools as independent, composable units you can scale or replace in isolation.
You need several pools per cluster with different shapes — e.g. an on-demand apps pool plus a Spot batch pool, or a tainted GPU pool — all from one consistent definition.
You want autoscaling, surge upgrades, and auto-repair/auto-upgrade enforced as defaults rather than per-team choices.
You are standardising on Workload Identity and least-privilege node service accounts and want every pool to opt in automatically.
Do not use it for the cluster itself, for a default node pool inlined in the cluster resource, or for Autopilot clusters: Autopilot manages nodes for you, so there are no google_container_node_pool resources to author.

Module structure

terraform-module-gcp-gke-node-pool/
├── versions.tf      # provider + Terraform version pins
├── main.tf          # google_container_node_pool resource
├── variables.tf     # all input variables + validations
└── outputs.tf       # id, name, instance group URLs, version

versions.tf

terraform {
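  # >= 1.9 is needed because variables.tf validates max_node_count against
  # min_node_count, and validation blocks could not reference other variables
  # before Terraform 1.9.
  required_version = ">= 1.9.0"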

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

main.tf

locals {
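  # GKE_METADATA only works when Workload Identity is enabled on the cluster
  # (a workload pool is configured there); otherwise fall back to GCE_METADATA.
  workload_metadata_mode = var.enable_workload_identity ? "GKE_METADATA" : "GCE_METADATA"

  # cloud-platform is the broad scope recommended alongside a dedicated node
  # service account: access is then limited by the SA's IAM roles rather than
  # by scopes. Any caller-supplied extras are merged on top and de-duplicated.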
  oauth_scopes = distinct(concat(
    ["https://www.googleapis.com/auth/cloud-platform"],
    var.additional_oauth_scopes,
  ))
}

resource "google_container_node_pool" "this" {
  provider = google

  name     = var.name
  project  = var.project_id
  location = var.location
  cluster  = var.cluster_name

  # Either a fixed size OR autoscaling — never both. node_count is omitted
  # when autoscaling is on so GKE can own the replica count.
  node_count = var.enable_autoscaling ? null : var.node_count

  # Spread nodes across the given zones for a regional pool; null lets GKE
  # use every zone in the region.
  node_locations = length(var.node_locations) > 0 ? var.node_locations : null

  max_pods_per_node = var.max_pods_per_node

  dynamic "autoscaling" {
    for_each = var.enable_autoscaling ? [1] : []
    content {
      min_node_count  = var.min_node_count
      max_node_count  = var.max_node_count
      location_policy = var.autoscaling_location_policy
    }
  }

  management {
    auto_repair  = var.auto_repair
    auto_upgrade = var.auto_upgrade
  }

  # Surge upgrades keep capacity available while nodes roll. max_surge adds
  # temporary nodes; max_unavailable bounds disruption.
  upgrade_settings {
    strategy        = var.upgrade_strategy
    max_surge       = var.upgrade_strategy == "SURGE" ? var.max_surge : null
    max_unavailable = var.upgrade_strategy == "SURGE" ? var.max_unavailable : null
  }

  node_config {
    machine_type    = var.machine_type
    image_type      = var.image_type
    disk_size_gb    = var.disk_size_gb
    disk_type       = var.disk_type
    spot            = var.spot
    service_account = var.node_service_account
    oauth_scopes    = local.oauth_scopes

    labels = var.node_labels
    tags   = var.network_tags

    # Shielded nodes: verified boot + integrity monitoring against rootkits.
    shielded_instance_config {
      enable_secure_boot          = var.enable_secure_boot
      enable_integrity_monitoring = var.enable_integrity_monitoring
    }

    # Bind the GCE metadata server to Workload Identity so Pods get GCP IAM
    # via KSA->GSA federation instead of node-level credentials.
    workload_metadata_config {
      mode = local.workload_metadata_mode
    }

    dynamic "taint" {
      for_each = var.node_taints
      content {
        key    = taint.value.key
        value  = taint.value.value
        effect = taint.value.effect
      }
    }

    metadata = merge(
      { "disable-legacy-endpoints" = "true" },
      var.node_metadata,
    )

    resource_labels = var.resource_labels
  }

  lifecycle {
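    # node_count is passed as null when autoscaling is on, so the autoscaler
    # (not Terraform) owns the live count and no ignore rule is needed for it.
    # initial_node_count is ignored because it only matters at creation time
    # and any diff on it would force the pool to be replaced. Node labels are
    # deliberately not ignored, so changes to var.node_labels are applied.
    ignore_changes = [initial_node_count]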
  }

  timeouts {
    create = "30m"
    update = "30m"
    delete = "30m"
  }
}
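
The workload_metadata_config block above only points nodes at the GKE metadata server. The KSA->GSA link itself is an IAM binding made outside this module. A minimal sketch of that binding follows; the project, the Google service account, and the Kubernetes namespace/service account (payments/api) are all hypothetical placeholders.

```hcl
# Hypothetical application identity; not created by the node pool module.
resource "google_service_account" "app" {
  project      = "my-project"
  account_id   = "payments-api"
  display_name = "Payments API workload identity"
}

# Let the Kubernetes service account "api" in namespace "payments"
# impersonate the Google service account above.
resource "google_service_account_iam_member" "workload_identity" {
  service_account_id = google_service_account.app.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:my-project.svc.id.goog[payments/api]"
}
```

The Kubernetes service account additionally needs the iam.gke.io/gcp-service-account annotation set to the Google service account's email before Pods pick up the identity.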

variables.tf

variable "name" {
  description = "Name of the node pool. Should be short and DNS-compatible."
  type        = string

  validation {
    condition     = can(regex("^[a-z][a-z0-9-]{0,38}[a-z0-9]$", var.name))
    error_message = "name must be 2-40 characters of lowercase letters, digits, or hyphens, start with a letter, and end with a letter or digit."
  }
}

variable "project_id" {
  description = "GCP project ID that owns the cluster."
  type        = string
}

variable "location" {
  description = "Cluster location: a region (e.g. asia-south1) for regional pools or a zone for zonal."
  type        = string
}

variable "cluster_name" {
  description = "Name of the existing GKE cluster to attach this node pool to."
  type        = string
}

variable "node_locations" {
  description = "Optional explicit list of zones to spread nodes across. Empty uses all zones in the region."
  type        = list(string)
  default     = []
}

variable "machine_type" {
  description = "Compute Engine machine type for nodes (e.g. e2-standard-4, n2-standard-8)."
  type        = string
  default     = "e2-standard-4"
}

variable "image_type" {
  description = "Node image type. COS_CONTAINERD is the supported default for GKE."
  type        = string
  default     = "COS_CONTAINERD"

  validation {
    condition     = contains(["COS_CONTAINERD", "UBUNTU_CONTAINERD", "COS"], var.image_type)
    error_message = "image_type must be one of COS_CONTAINERD, UBUNTU_CONTAINERD, or COS."
  }
}

variable "disk_size_gb" {
  description = "Boot disk size per node in GB."
  type        = number
  default     = 100

  validation {
    condition     = var.disk_size_gb >= 30
    error_message = "disk_size_gb must be at least 30 GB for GKE nodes."
  }
}

variable "disk_type" {
  description = "Boot disk type (pd-standard, pd-balanced, pd-ssd)."
  type        = string
  default     = "pd-balanced"

  validation {
    condition     = contains(["pd-standard", "pd-balanced", "pd-ssd"], var.disk_type)
    error_message = "disk_type must be pd-standard, pd-balanced, or pd-ssd."
  }
}

variable "spot" {
  description = "Run nodes as Spot VMs (cheaper, preemptible). Use only for fault-tolerant workloads."
  type        = bool
  default     = false
}

variable "node_count" {
  description = "Fixed node count per zone when autoscaling is disabled."
  type        = number
  default     = 1
}

variable "enable_autoscaling" {
  description = "Enable the cluster autoscaler for this pool. When true, node_count is ignored."
  type        = bool
  default     = true
}

variable "min_node_count" {
  description = "Minimum nodes per zone when autoscaling is enabled."
  type        = number
  default     = 1
}

variable "max_node_count" {
  description = "Maximum nodes per zone when autoscaling is enabled."
  type        = number
  default     = 5

  validation {
    condition     = var.max_node_count >= var.min_node_count
    error_message = "max_node_count must be greater than or equal to min_node_count."
  }
}

variable "autoscaling_location_policy" {
  description = "Autoscaler placement policy: BALANCED (spread) or ANY (best-effort, Spot-friendly)."
  type        = string
  default     = "BALANCED"

  validation {
    condition     = contains(["BALANCED", "ANY"], var.autoscaling_location_policy)
    error_message = "autoscaling_location_policy must be BALANCED or ANY."
  }
}

variable "max_pods_per_node" {
  description = "Maximum Pods per node. Lower values conserve the cluster's IP range."
  type        = number
  default     = 110
}

variable "auto_repair" {
  description = "Automatically repair unhealthy nodes."
  type        = bool
  default     = true
}

variable "auto_upgrade" {
  description = "Automatically upgrade nodes to match the control plane version."
  type        = bool
  default     = true
}

variable "upgrade_strategy" {
  description = "Node upgrade strategy: SURGE (rolling with extra capacity) or BLUE_GREEN."
  type        = string
  default     = "SURGE"

  validation {
    condition     = contains(["SURGE", "BLUE_GREEN"], var.upgrade_strategy)
    error_message = "upgrade_strategy must be SURGE or BLUE_GREEN."
  }
}

variable "max_surge" {
  description = "Extra nodes added during a SURGE upgrade."
  type        = number
  default     = 1
}

variable "max_unavailable" {
  description = "Nodes that may be unavailable during a SURGE upgrade."
  type        = number
  default     = 0
}

variable "node_service_account" {
  description = "Email of the least-privilege IAM service account the nodes run as. Required for prod."
  type        = string
  default     = null
}

variable "additional_oauth_scopes" {
  description = "Extra OAuth scopes beyond cloud-platform. Usually empty when using Workload Identity."
  type        = list(string)
  default     = []
}

variable "enable_workload_identity" {
  description = "Bind nodes to the GKE metadata server for Workload Identity (KSA->GSA federation)."
  type        = bool
  default     = true
}

variable "enable_secure_boot" {
  description = "Enable Shielded VM Secure Boot on nodes."
  type        = bool
  default     = true
}

variable "enable_integrity_monitoring" {
  description = "Enable Shielded VM integrity monitoring on nodes."
  type        = bool
  default     = true
}

variable "node_labels" {
  description = "Kubernetes labels applied to nodes (for nodeSelector/affinity)."
  type        = map(string)
  default     = {}
}

variable "node_taints" {
  description = "Kubernetes taints to keep general workloads off specialised pools."
  type = list(object({
    key    = string
    value  = string
    effect = string
  }))
  default = []

  validation {
    condition = alltrue([
      for t in var.node_taints : contains(["NO_SCHEDULE", "PREFER_NO_SCHEDULE", "NO_EXECUTE"], t.effect)
    ])
    error_message = "Each taint effect must be NO_SCHEDULE, PREFER_NO_SCHEDULE, or NO_EXECUTE."
  }
}

variable "network_tags" {
  description = "GCE network tags on node VMs (for firewall rule targeting)."
  type        = list(string)
  default     = []
}

variable "node_metadata" {
  description = "Additional GCE instance metadata key/value pairs for nodes."
  type        = map(string)
  default     = {}
}

variable "resource_labels" {
  description = "GCE resource labels applied to node VMs for billing/cost allocation."
  type        = map(string)
  default     = {}
}
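
The max_node_count validation above references var.min_node_count, which Terraform only allows from version 1.9 onward. If older Terraform releases must be supported, one possible workaround (a sketch, not part of the published module) is to drop that validation block and express the same rule as a precondition in main.tf. The resource label autoscaling_bounds is a made-up name.

```hcl
# Sketch for main.tf, assuming the validation block on max_node_count has
# been removed from variables.tf. terraform_data is built into Terraform 1.4+.
resource "terraform_data" "autoscaling_bounds" {
  lifecycle {
    precondition {
      # Only enforce the bound when the autoscaler is actually in use.
      condition     = !var.enable_autoscaling || var.max_node_count >= var.min_node_count
      error_message = "max_node_count must be greater than or equal to min_node_count when autoscaling is enabled."
    }
  }
}
```

The precondition is evaluated at plan time, so a bad pair of bounds still fails before anything is created.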

outputs.tf

output "id" {
  description = "Fully qualified node pool ID (projects/.../nodePools/...)."
  value       = google_container_node_pool.this.id
}

output "name" {
  description = "Name of the node pool."
  value       = google_container_node_pool.this.name
}

output "version" {
  description = "Kubernetes version currently running on the node pool."
  value       = google_container_node_pool.this.version
}

output "instance_group_urls" {
  description = "Managed instance group URLs backing the node pool (one per zone)."
  value       = google_container_node_pool.this.instance_group_urls
}

output "managed_instance_group_urls" {
  description = "Managed instance group manager URLs for the node pool."
  value       = google_container_node_pool.this.managed_instance_group_urls
}

output "service_account" {
  description = "Service account email the nodes run as (resolved or default)."
  value       = try(google_container_node_pool.this.node_config[0].service_account, null)
}

How to use it

A typical cluster consumes this module twice: an on-demand apps pool for steady traffic and a tainted Spot batch pool for fault-tolerant jobs.

module "gke_apps_pool" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-gke-node-pool?ref=v1.0.0"

  name         = "apps"
  project_id   = "kloudvin-prod"
  location     = "asia-south1"
  cluster_name = "kloudvin-prod-gke"

  machine_type       = "n2-standard-8"
  enable_autoscaling = true
  min_node_count     = 2
  max_node_count     = 12
  disk_type          = "pd-ssd"

  node_service_account     = google_service_account.gke_nodes.email
  enable_workload_identity = true

  node_labels     = { team = "platform", tier = "general" }
  resource_labels = { cost-center = "platform", env = "prod" }
}

module "gke_batch_pool" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-gke-node-pool?ref=v1.0.0"

  name         = "batch-spot"
  project_id   = "kloudvin-prod"
  location     = "asia-south1"
  cluster_name = "kloudvin-prod-gke"

  machine_type                = "c2-standard-16"
  spot                        = true
  enable_autoscaling          = true
  min_node_count              = 0
  max_node_count              = 20
  autoscaling_location_policy = "ANY"

  node_service_account     = google_service_account.gke_nodes.email
  enable_workload_identity = true

  node_taints = [{
    key    = "workload-type"
    value  = "batch"
    effect = "NO_SCHEDULE"
  }]

  resource_labels = { cost-center = "data-eng", env = "prod" }
}

# Least-privilege node identity shared by both pools.
resource "google_service_account" "gke_nodes" {
  account_id   = "gke-prod-nodes"
  display_name = "GKE prod node pool service account"
  project      = "kloudvin-prod"
}

# Downstream references: expose this pool's instance groups (for example as
# load-balancer backends) and surface the node SA so IAM bindings elsewhere
# can grant it access. Firewall rules target network_tags or the SA instead.
output "apps_pool_migs" {
  value = module.gke_apps_pool.instance_group_urls
}

output "node_runtime_sa" {
  value = module.gke_apps_pool.service_account
}
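
The google_service_account.gke_nodes resource above starts with no roles, and nodes need a few to ship logs and metrics and to pull images. The following is a sketch of a baseline set, assuming images live in Artifact Registry in the same project; the local name node_sa_roles is made up, and the role list should be trimmed or extended to fit your setup.

```hcl
locals {
  # Baseline roles for a GKE node identity: logging, monitoring, and image
  # pulls. The Artifact Registry role is an assumption about where images live.
  node_sa_roles = [
    "roles/logging.logWriter",
    "roles/monitoring.metricWriter",
    "roles/monitoring.viewer",
    "roles/stackdriver.resourceMetadata.writer",
    "roles/artifactregistry.reader",
  ]
}

# One additive project-level binding per role for the shared node SA.
resource "google_project_iam_member" "gke_nodes" {
  for_each = toset(local.node_sa_roles)

  project = "kloudvin-prod"
  role    = each.value
  member  = "serviceAccount:${google_service_account.gke_nodes.email}"
}
```

google_project_iam_member is additive, so it does not disturb other members that already hold these roles.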

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config — live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "gcs"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...GCS state bucket + prefix per path...
  }
}
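
The root config above covers only the backend. A provider block can be generated from the same file in a similar way; the sketch below is one possible shape, with the project and region as placeholders.

```hcl
# Hypothetical addition to live/terragrunt.hcl: writes provider.tf into every
# module's working directory before Terraform runs.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "google" {
  project = "my-project"
  region  = "us-central1"
}
EOF
}
```

With overwrite_terragrunt, Terragrunt only replaces files it generated itself and fails if a hand-written provider.tf already exists.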

2. Module config — live/prod/gke_node_pool/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-gke-node-pool?ref=v1.0.0"
}

inputs = {
  name         = "..."
  project_id   = "..."
  location     = "..."
  cluster_name = "..."
}

3. From the repository root, deploy this module on its own, or roll out every module in the environment together:

cd live/prod/gke_node_pool && terragrunt apply   # this module only
cd live/prod && terragrunt run-all apply         # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
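
As an illustration of the dependency handling mentioned above, the node pool config can read the cluster name from a sibling unit instead of hard-coding it. This is a sketch: the ../gke_cluster unit and its name and location outputs are hypothetical and not part of this module.

```hcl
# live/prod/gke_node_pool/terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-gke-node-pool?ref=v1.0.0"
}

# Hypothetical sibling unit that creates the cluster and exports its
# name and location as outputs.
dependency "cluster" {
  config_path = "../gke_cluster"

  # Placeholder values so validate/plan work before the cluster exists.
  mock_outputs = {
    name     = "mock-cluster"
    location = "asia-south1"
  }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  name         = "apps"
  project_id   = "kloudvin-prod"
  location     = dependency.cluster.outputs.location
  cluster_name = dependency.cluster.outputs.name
}
```

Because of the dependency block, run-all applies the cluster unit before this one and destroys them in the reverse order.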

Inputs

Name	Type	Default	Required	Description
name	string	—	Yes	Node pool name; lowercase, DNS-compatible, <= 40 chars.
project_id	string	—	Yes	GCP project ID owning the cluster.
location	string	—	Yes	Region (regional pool) or zone (zonal pool).
cluster_name	string	—	Yes	Existing GKE cluster to attach the pool to.
node_locations	list(string)	`[]`	No	Explicit zones to spread nodes across; empty = all region zones.
machine_type	string	`e2-standard-4`	No	Compute Engine machine type for nodes.
image_type	string	`COS_CONTAINERD`	No	Node image type (COS_CONTAINERD/UBUNTU_CONTAINERD/COS).
disk_size_gb	number	`100`	No	Boot disk size per node in GB (>= 30).
disk_type	string	`pd-balanced`	No	Boot disk type (pd-standard/pd-balanced/pd-ssd).
spot	bool	`false`	No	Run nodes as Spot VMs for fault-tolerant workloads.
node_count	number	`1`	No	Fixed node count per zone when autoscaling is off.
enable_autoscaling	bool	`true`	No	Enable the cluster autoscaler for this pool.
min_node_count	number	`1`	No	Minimum nodes per zone when autoscaling.
max_node_count	number	`5`	No	Maximum nodes per zone when autoscaling (>= min).
autoscaling_location_policy	string	`BALANCED`	No	Autoscaler placement: BALANCED or ANY.
max_pods_per_node	number	`110`	No	Maximum Pods per node; lower conserves IP range.
auto_repair	bool	`true`	No	Automatically repair unhealthy nodes.
auto_upgrade	bool	`true`	No	Auto-upgrade nodes to match the control plane.
upgrade_strategy	string	`SURGE`	No	Upgrade strategy: SURGE or BLUE_GREEN.
max_surge	number	`1`	No	Extra nodes added during a SURGE upgrade.
max_unavailable	number	`0`	No	Nodes allowed unavailable during a SURGE upgrade.
node_service_account	string	`null`	No	Least-privilege IAM SA email for nodes (set in prod).
additional_oauth_scopes	list(string)	`[]`	No	Extra OAuth scopes beyond cloud-platform.
enable_workload_identity	bool	`true`	No	Bind nodes to GKE metadata server for Workload Identity.
enable_secure_boot	bool	`true`	No	Enable Shielded VM Secure Boot.
enable_integrity_monitoring	bool	`true`	No	Enable Shielded VM integrity monitoring.
node_labels	map(string)	`{}`	No	Kubernetes labels on nodes for nodeSelector/affinity.
node_taints	list(object)	`[]`	No	Taints to keep general workloads off the pool.
network_tags	list(string)	`[]`	No	GCE network tags for firewall targeting.
node_metadata	map(string)	`{}`	No	Additional GCE instance metadata for nodes.
resource_labels	map(string)	`{}`	No	GCE resource labels for billing/cost allocation.

Outputs

Name	Description
id	Fully qualified node pool ID (projects/…/nodePools/…).
name	Name of the node pool.
version	Kubernetes version currently running on the pool.
instance_group_urls	Managed instance group URLs backing the pool (one per zone).
managed_instance_group_urls	Managed instance group manager URLs for the pool.
service_account	Service account email the nodes run as.

Enterprise scenario

A fintech running a regional GKE cluster in asia-south1 uses this module to split capacity by risk and cost. Real-time payment APIs land on an on-demand n2-standard-8 apps pool with min_node_count = 3 (per zone) and BLUE_GREEN upgrades, so a bad node image can be rolled back quickly without surge churn during settlement windows. Overnight reconciliation and fraud-model retraining run on a tainted c2-standard-16 Spot pool that scales from 0 to 20 nodes per zone with autoscaling_location_policy = "ANY", cutting batch compute spend by roughly 70 percent, while the NO_SCHEDULE taint keeps any Pod without a matching toleration, including the latency-sensitive ones, off those nodes. Because the pools are separate module instances, the platform team resizes or recreates the Spot pool during a machine-family migration without re-planning the payment-critical pool or the cluster itself.
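
The payment-facing pool from this scenario might be declared roughly as follows. It uses only inputs this module defines; the project, cluster name, and max_node_count are assumptions, since the scenario does not state them, and google_service_account.gke_nodes refers to the shared node identity from the How to use it section.

```hcl
module "gke_apps_pool" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-gke-node-pool?ref=v1.0.0"

  name         = "apps"
  project_id   = "kloudvin-prod"     # placeholder project
  location     = "asia-south1"       # regional pool
  cluster_name = "kloudvin-prod-gke" # placeholder cluster

  machine_type       = "n2-standard-8"
  enable_autoscaling = true
  min_node_count     = 3  # per zone
  max_node_count     = 12 # assumption: upper bound not given in the scenario

  # Blue/green keeps the old nodes around during the soak period, so a bad
  # image can be rolled back. max_surge/max_unavailable are not sent in this mode.
  upgrade_strategy = "BLUE_GREEN"

  node_service_account = google_service_account.gke_nodes.email
}
```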

Best practices

Always pass a dedicated node_service_account. Left at its null default, nodes run as the Compute Engine default service account, which typically holds the Editor role on the whole project unless automatic role grants have been disabled; a purpose-built SA with only the roles the nodes need (plus Workload Identity for app-level IAM) is the single biggest security win here.
Keep autoscaling and node_count mutually exclusive. This module nulls node_count when enable_autoscaling is true so the autoscaler owns the live count. Do not set a fixed count and expect autoscaling to honour it: node_count only takes effect when enable_autoscaling is false.
Use Spot pools with taints and min_node_count = 0 for batch. Spot nodes can be reclaimed at any time with only a short termination notice; isolate fault-tolerant workloads onto them with a NO_SCHEDULE taint and autoscaling_location_policy = "ANY", and let the pool scale to zero so you pay nothing for nodes when idle.
Tune surge upgrades for your disruption budget. max_surge = 1 / max_unavailable = 0 keeps full capacity during rolls (good for stateless apps); for latency-critical pools prefer BLUE_GREEN so you can soak and roll back. Pair both with PodDisruptionBudgets.
Right-size max_pods_per_node to protect your IP space. GKE allocates a /24-equivalent per node from the Pod CIDR by default (110 Pods); lowering it on pools that run few large Pods can be the difference between a cluster that scales and one that exhausts its alias IP range.
Name pools by role, label by cost. Short role-based names (apps, batch-spot, gpu-ml) keep nodeSelectors readable, while resource_labels such as cost-center and env give Finance clean per-pool billing attribution without a separate tagging pass.
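
To put numbers on the max_pods_per_node point above, the sketch below derives the per-node Pod range and the resulting node ceiling. The tiers follow GKE's published sizing (each node gets at least twice as many addresses as its maximum Pods). The variable pod_range_prefix and the local names are hypothetical, and the /14 default is only an example Pod range.

```hcl
variable "max_pods_per_node" {
  type    = number
  default = 110
}

variable "pod_range_prefix" {
  description = "Hypothetical: prefix length of the cluster's Pod secondary range."
  type        = number
  default     = 14
}

locals {
  # Highest max_pods_per_node covered by each per-node range size.
  pod_range_tiers = [
    { max_pods = 8, prefix = 28 },
    { max_pods = 16, prefix = 27 },
    { max_pods = 32, prefix = 26 },
    { max_pods = 64, prefix = 25 },
    { max_pods = 128, prefix = 24 },
    { max_pods = 256, prefix = 23 },
  ]

  # First tier whose ceiling covers the requested value (fails above 256).
  per_node_prefix = [
    for t in local.pod_range_tiers : t.prefix if var.max_pods_per_node <= t.max_pods
  ][0]

  # Number of per-node ranges that fit inside the cluster's Pod range.
  max_nodes = pow(2, local.per_node_prefix - var.pod_range_prefix)
}

output "per_node_pod_range" {
  value = "/${local.per_node_prefix}"
}

output "max_nodes_for_pod_range" {
  value = local.max_nodes
}
```

With the defaults this yields /24 per node and 1024 nodes; setting max_pods_per_node = 32 shrinks the per-node range to /26 and raises the ceiling to 4096 nodes in the same /14.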