Quick take — A production-ready Terraform module for google_container_node_pool on hashicorp/google ~> 5.0: autoscaling, surge upgrades, Workload Identity, Spot nodes, taints, and least-privilege node service accounts. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.
Quickstart (copy-paste)
Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):
provider "google" {
project = "my-project"
region = "us-central1"
}
module "gke_node_pool" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-gke-node-pool?ref=v1.0.0"
name = "..." # Node pool name; lowercase, DNS-compatible, <= 40 chars.
project_id = "..." # GCP project ID owning the cluster.
location = "..." # Region (regional pool) or zone (zonal pool).
cluster_name = "..." # Existing GKE cluster to attach the pool to.
}
Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.
What this module is
A GKE node pool is a group of nodes within a Google Kubernetes Engine cluster that all share the same configuration: machine type, disk, OAuth scopes, labels, taints, and lifecycle behaviour. The cluster control plane is managed by Google, but the worker capacity that actually runs your Pods lives in node pools — and almost every real workload needs more than one. You typically run a small general-purpose pool for system add-ons, a larger pool for stateless apps, and perhaps a Spot or GPU pool for batch and ML.
The reason to wrap google_container_node_pool in a reusable module — rather than defining it inline next to the cluster — is decoupling and repetition. Node pools are the part of GKE you change most often: you resize them, swap machine families during cost reviews, add Spot capacity, roll new node images, and occasionally recreate them entirely. Keeping the pool in its own module lets you add, scale, or destroy worker capacity without ever touching the google_container_cluster resource (and risking control-plane churn). It also forces consistency: every pool created through this module gets autoscaling bounds, surge-upgrade settings, auto-repair/auto-upgrade, Workload Identity metadata, a least-privilege service account, and shielded-node defaults — instead of each team hand-rolling a slightly different node_config.
When to use it
- You manage GKE clusters with Terraform and want node pools as independent, composable units you can scale or replace in isolation.
- You need several pools per cluster with different shapes, e.g. an on-demand apps pool plus a Spot batch pool, or a tainted GPU pool, all from one consistent definition.
- You want autoscaling, surge upgrades, and auto-repair/auto-upgrade enforced as defaults rather than per-team choices.
- You are standardising on Workload Identity and least-privilege node service accounts and want every pool to opt in automatically.
- Do not use it for the cluster itself, for a default node pool defined inline in the cluster, or for Autopilot clusters (Autopilot manages nodes for you, so there are no google_container_node_pool resources to author).
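One way to get several differently shaped pools from a single definition is to drive the module with for_each. The sketch below is illustrative only: the local map, its keys, and the project and cluster names are assumptions, while the inputs it sets are the ones this module declares.

```hcl
locals {
  # Hypothetical pool catalogue; each key becomes a node pool name.
  node_pools = {
    "apps" = {
      machine_type = "n2-standard-8"
      spot         = false
      min          = 2
      max          = 12
      taints       = []
    }
    "batch-spot" = {
      machine_type = "c2-standard-16"
      spot         = true
      min          = 0
      max          = 20
      taints = [{
        key    = "workload-type"
        value  = "batch"
        effect = "NO_SCHEDULE"
      }]
    }
  }
}

module "gke_node_pools" {
  source   = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-gke-node-pool?ref=v1.0.0"
  for_each = local.node_pools

  name         = each.key
  project_id   = "my-project"  # placeholder
  location     = "us-central1" # placeholder
  cluster_name = "my-cluster"  # placeholder

  machine_type   = each.value.machine_type
  spot           = each.value.spot
  min_node_count = each.value.min
  max_node_count = each.value.max
  node_taints    = each.value.taints
}
```

Each pool is then addressable as module.gke_node_pools["apps"], module.gke_node_pools["batch-spot"], and so on, so removing a key from the map destroys only that pool.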
Module structure
terraform-module-gcp-gke-node-pool/
├── versions.tf # provider + Terraform version pins
├── main.tf # google_container_node_pool resource
├── variables.tf # all input variables + validations
└── outputs.tf # id, name, instance group URLs, version
versions.tf
terraform {
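# The max_node_count validation in variables.tf references another variable,
# which Terraform only allows from 1.9 onwards.
required_version = ">= 1.9.0"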
required_providers {
google = {
source = "hashicorp/google"
version = "~> 5.0"
}
}
}
main.tf
locals {
# GKE_METADATA runs the GKE metadata server that Workload Identity needs; it is
# only valid when the cluster has a workload pool. GCE_METADATA falls back to
# the plain Compute Engine metadata server.
workload_metadata_mode = var.enable_workload_identity ? "GKE_METADATA" : "GCE_METADATA"
# Always request the broad cloud-platform scope and let the IAM roles on the
# node service account restrict access; the caller's extras are merged on top.
oauth_scopes = distinct(concat(
["https://www.googleapis.com/auth/cloud-platform"],
var.additional_oauth_scopes,
))
}
resource "google_container_node_pool" "this" {
provider = google
name = var.name
project = var.project_id
location = var.location
cluster = var.cluster_name
# Either a fixed size OR autoscaling — never both. node_count is omitted
# when autoscaling is on so GKE can own the replica count.
node_count = var.enable_autoscaling ? null : var.node_count
# Spread nodes across the given zones for a regional pool; null lets GKE
# use every zone in the region.
node_locations = length(var.node_locations) > 0 ? var.node_locations : null
max_pods_per_node = var.max_pods_per_node
dynamic "autoscaling" {
for_each = var.enable_autoscaling ? [1] : []
content {
min_node_count = var.min_node_count
max_node_count = var.max_node_count
location_policy = var.autoscaling_location_policy
}
}
management {
auto_repair = var.auto_repair
auto_upgrade = var.auto_upgrade
}
# Surge upgrades keep capacity available while nodes roll. max_surge adds
# temporary nodes; max_unavailable bounds disruption.
upgrade_settings {
strategy = var.upgrade_strategy
max_surge = var.upgrade_strategy == "SURGE" ? var.max_surge : null
max_unavailable = var.upgrade_strategy == "SURGE" ? var.max_unavailable : null
}
node_config {
machine_type = var.machine_type
image_type = var.image_type
disk_size_gb = var.disk_size_gb
disk_type = var.disk_type
spot = var.spot
service_account = var.node_service_account
oauth_scopes = local.oauth_scopes
labels = var.node_labels
tags = var.network_tags
# Shielded nodes: verified boot + integrity monitoring against rootkits.
shielded_instance_config {
enable_secure_boot = var.enable_secure_boot
enable_integrity_monitoring = var.enable_integrity_monitoring
}
# Bind the GCE metadata server to Workload Identity so Pods get GCP IAM
# via KSA->GSA federation instead of node-level credentials.
workload_metadata_config {
mode = local.workload_metadata_mode
}
dynamic "taint" {
for_each = var.node_taints
content {
key = taint.value.key
value = taint.value.value
effect = taint.value.effect
}
}
metadata = merge(
{ "disable-legacy-endpoints" = "true" },
var.node_metadata,
)
resource_labels = var.resource_labels
}
lifecycle {
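# initial_node_count only matters at creation, so ignore it to avoid spurious
# diffs; node_count is already null when autoscaling is on, which leaves the
# live count to the autoscaler. Node labels are ignored as well, so later
# changes to var.node_labels are not applied to an existing pool.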
ignore_changes = [node_config[0].labels, initial_node_count]
}
timeouts {
create = "30m"
update = "30m"
delete = "30m"
}
}
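When upgrade_strategy is BLUE_GREEN, the upgrade_settings block above sets only strategy and leaves rollout and soak timings to GKE's defaults. If you need to tune them, the provider exposes a blue_green_settings block. The following standalone sketch is not part of the module; the resource name, project, cluster, and all durations are placeholder assumptions.

```hcl
# Standalone illustration: every name and duration here is a placeholder.
resource "google_container_node_pool" "blue_green_example" {
  name       = "payments"
  project    = "my-project"
  location   = "us-central1"
  cluster    = "my-cluster"
  node_count = 3

  upgrade_settings {
    strategy = "BLUE_GREEN"

    blue_green_settings {
      # Drain the old ("blue") nodes one at a time, pausing between batches.
      standard_rollout_policy {
        batch_node_count    = 1
        batch_soak_duration = "300s"
      }

      # Keep the blue pool around after the last batch so a rollback is
      # still possible before it is cleaned up.
      node_pool_soak_duration = "1800s"
    }
  }
}
```

To support this in the module itself, one option would be to wrap blue_green_settings in a dynamic block that is only emitted when the strategy is BLUE_GREEN, fed by new input variables.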
variables.tf
variable "name" {
description = "Name of the node pool. Should be short and DNS-compatible."
type = string
validation {
condition = can(regex("^[a-z][a-z0-9-]{0,38}[a-z0-9]$", var.name))
error_message = "name must be 2-40 characters of lowercase letters, digits, or hyphens, start with a letter, and end with a letter or digit."
}
}
variable "project_id" {
description = "GCP project ID that owns the cluster."
type = string
}
variable "location" {
description = "Cluster location: a region (e.g. asia-south1) for regional pools or a zone for zonal."
type = string
}
variable "cluster_name" {
description = "Name of the existing GKE cluster to attach this node pool to."
type = string
}
variable "node_locations" {
description = "Optional explicit list of zones to spread nodes across. Empty uses all zones in the region."
type = list(string)
default = []
}
variable "machine_type" {
description = "Compute Engine machine type for nodes (e.g. e2-standard-4, n2-standard-8)."
type = string
default = "e2-standard-4"
}
variable "image_type" {
description = "Node image type. COS_CONTAINERD is the supported default for GKE."
type = string
default = "COS_CONTAINERD"
validation {
condition = contains(["COS_CONTAINERD", "UBUNTU_CONTAINERD", "COS"], var.image_type)
error_message = "image_type must be one of COS_CONTAINERD, UBUNTU_CONTAINERD, or COS."
}
}
variable "disk_size_gb" {
description = "Boot disk size per node in GB."
type = number
default = 100
validation {
condition = var.disk_size_gb >= 30
error_message = "disk_size_gb must be at least 30 GB for GKE nodes."
}
}
variable "disk_type" {
description = "Boot disk type (pd-standard, pd-balanced, pd-ssd)."
type = string
default = "pd-balanced"
validation {
condition = contains(["pd-standard", "pd-balanced", "pd-ssd"], var.disk_type)
error_message = "disk_type must be pd-standard, pd-balanced, or pd-ssd."
}
}
variable "spot" {
description = "Run nodes as Spot VMs (cheaper, preemptible). Use only for fault-tolerant workloads."
type = bool
default = false
}
variable "node_count" {
description = "Fixed node count per zone when autoscaling is disabled."
type = number
default = 1
}
variable "enable_autoscaling" {
description = "Enable the cluster autoscaler for this pool. When true, node_count is ignored."
type = bool
default = true
}
variable "min_node_count" {
description = "Minimum nodes per zone when autoscaling is enabled."
type = number
default = 1
}
variable "max_node_count" {
description = "Maximum nodes per zone when autoscaling is enabled."
type = number
default = 5
validation {
condition = var.max_node_count >= var.min_node_count
error_message = "max_node_count must be greater than or equal to min_node_count."
}
}
variable "autoscaling_location_policy" {
description = "Autoscaler placement policy: BALANCED (spread) or ANY (best-effort, Spot-friendly)."
type = string
default = "BALANCED"
validation {
condition = contains(["BALANCED", "ANY"], var.autoscaling_location_policy)
error_message = "autoscaling_location_policy must be BALANCED or ANY."
}
}
variable "max_pods_per_node" {
description = "Maximum Pods per node. Lower values conserve the cluster's IP range."
type = number
default = 110
}
variable "auto_repair" {
description = "Automatically repair unhealthy nodes."
type = bool
default = true
}
variable "auto_upgrade" {
description = "Automatically upgrade nodes to match the control plane version."
type = bool
default = true
}
variable "upgrade_strategy" {
description = "Node upgrade strategy: SURGE (rolling with extra capacity) or BLUE_GREEN."
type = string
default = "SURGE"
validation {
condition = contains(["SURGE", "BLUE_GREEN"], var.upgrade_strategy)
error_message = "upgrade_strategy must be SURGE or BLUE_GREEN."
}
}
variable "max_surge" {
description = "Extra nodes added during a SURGE upgrade."
type = number
default = 1
}
variable "max_unavailable" {
description = "Nodes that may be unavailable during a SURGE upgrade."
type = number
default = 0
}
variable "node_service_account" {
description = "Email of the least-privilege IAM service account the nodes run as. Required for prod."
type = string
default = null
}
variable "additional_oauth_scopes" {
description = "Extra OAuth scopes beyond cloud-platform. Usually empty when using Workload Identity."
type = list(string)
default = []
}
variable "enable_workload_identity" {
description = "Bind nodes to the GKE metadata server for Workload Identity (KSA->GSA federation)."
type = bool
default = true
}
variable "enable_secure_boot" {
description = "Enable Shielded VM Secure Boot on nodes."
type = bool
default = true
}
variable "enable_integrity_monitoring" {
description = "Enable Shielded VM integrity monitoring on nodes."
type = bool
default = true
}
variable "node_labels" {
description = "Kubernetes labels applied to nodes (for nodeSelector/affinity)."
type = map(string)
default = {}
}
variable "node_taints" {
description = "Kubernetes taints to keep general workloads off specialised pools."
type = list(object({
key = string
value = string
effect = string
}))
default = []
validation {
condition = alltrue([
for t in var.node_taints : contains(["NO_SCHEDULE", "PREFER_NO_SCHEDULE", "NO_EXECUTE"], t.effect)
])
error_message = "Each taint effect must be NO_SCHEDULE, PREFER_NO_SCHEDULE, or NO_EXECUTE."
}
}
variable "network_tags" {
description = "GCE network tags on node VMs (for firewall rule targeting)."
type = list(string)
default = []
}
variable "node_metadata" {
description = "Additional GCE instance metadata key/value pairs for nodes."
type = map(string)
default = {}
}
variable "resource_labels" {
description = "GCE resource labels applied to node VMs for billing/cost allocation."
type = map(string)
default = {}
}
outputs.tf
output "id" {
description = "Fully qualified node pool ID (projects/.../nodePools/...)."
value = google_container_node_pool.this.id
}
output "name" {
description = "Name of the node pool."
value = google_container_node_pool.this.name
}
output "version" {
description = "Kubernetes version currently running on the node pool."
value = google_container_node_pool.this.version
}
output "instance_group_urls" {
description = "Managed instance group URLs backing the node pool (one per zone)."
value = google_container_node_pool.this.instance_group_urls
}
output "managed_instance_group_urls" {
description = "Managed instance group manager URLs for the node pool."
value = google_container_node_pool.this.managed_instance_group_urls
}
output "service_account" {
description = "Service account email the nodes run as (resolved or default)."
value = try(google_container_node_pool.this.node_config[0].service_account, null)
}
How to use it
A typical cluster consumes this module twice: an on-demand apps pool for steady traffic and a tainted Spot batch pool for fault-tolerant jobs.
module "gke_apps_pool" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-gke-node-pool?ref=v1.0.0"
name = "apps"
project_id = "kloudvin-prod"
location = "asia-south1"
cluster_name = "kloudvin-prod-gke"
machine_type = "n2-standard-8"
enable_autoscaling = true
min_node_count = 2
max_node_count = 12
disk_type = "pd-ssd"
node_service_account = google_service_account.gke_nodes.email
enable_workload_identity = true
node_labels = { team = "platform", tier = "general" }
resource_labels = { cost-center = "platform", env = "prod" }
}
module "gke_batch_pool" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-gke-node-pool?ref=v1.0.0"
name = "batch-spot"
project_id = "kloudvin-prod"
location = "asia-south1"
cluster_name = "kloudvin-prod-gke"
machine_type = "c2-standard-16"
spot = true
enable_autoscaling = true
min_node_count = 0
max_node_count = 20
autoscaling_location_policy = "ANY"
node_service_account = google_service_account.gke_nodes.email
enable_workload_identity = true
node_taints = [{
key = "workload-type"
value = "batch"
effect = "NO_SCHEDULE"
}]
resource_labels = { cost-center = "data-eng", env = "prod" }
}
# Least-privilege node identity shared by both pools.
resource "google_service_account" "gke_nodes" {
account_id = "gke-prod-nodes"
display_name = "GKE prod node pool service account"
project = "kloudvin-prod"
}
# Downstream references: expose this pool's instance group URLs (e.g. for load
# balancer backends), and surface the node SA so IAM bindings elsewhere can
# grant it access. Firewall rules target network tags or the node SA instead.
output "apps_pool_migs" {
value = module.gke_apps_pool.instance_group_urls
}
output "node_runtime_sa" {
value = module.gke_apps_pool.service_account
}
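The google_service_account above is created without any roles, so nodes running as it could not ship logs or metrics. A minimal sketch of the bindings follows; the role list reflects what GKE nodes commonly need and is an assumption to adapt. In particular, roles/artifactregistry.reader only applies if your images live in Artifact Registry.

```hcl
locals {
  # Assumed baseline roles for a GKE node service account; trim or extend.
  gke_node_roles = [
    "roles/logging.logWriter",
    "roles/monitoring.metricWriter",
    "roles/monitoring.viewer",
    "roles/stackdriver.resourceMetadata.writer",
    "roles/artifactregistry.reader",
  ]
}

# One non-authoritative binding per role, so other members are left untouched.
resource "google_project_iam_member" "gke_nodes" {
  for_each = toset(local.gke_node_roles)

  project = "kloudvin-prod"
  role    = each.value
  member  = "serviceAccount:${google_service_account.gke_nodes.email}"
}
```

Application-level permissions (buckets, Pub/Sub, databases) are better granted to separate service accounts reached through Workload Identity than added to this list.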
With Terragrunt
Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.
1. Root config — live/terragrunt.hcl (inherited by every module):
remote_state {
backend = "gcs"
generate = { path = "backend.tf", if_exists = "overwrite" }
config = {
# ...GCS state bucket + prefix per path...
}
}
2. Module config — live/prod/gke_node_pool/terragrunt.hcl:
include "root" {
path = find_in_parent_folders()
}
terraform {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-gke-node-pool?ref=v1.0.0"
}
inputs = {
name = "..."
project_id = "..."
location = "..."
cluster_name = "..."
}
3. Deploy one environment, or roll out all modules together:
cd live/prod/gke_node_pool && terragrunt apply # this module
cd live/prod && terragrunt run-all apply # every module under live/prod (run from the repository root)
Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
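To illustrate the dependency orchestration mentioned above, here is a sketch of the node pool unit reading its cluster details from a sibling unit. The ../gke_cluster path and its name and location outputs are hypothetical; substitute whatever your cluster module actually exports.

```hcl
# live/prod/gke_node_pool/terragrunt.hcl (sketch)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-gke-node-pool?ref=v1.0.0"
}

# Hypothetical sibling unit that creates the cluster.
dependency "cluster" {
  config_path = "../gke_cluster"

  # Placeholder values so validate/plan work before the cluster exists.
  mock_outputs = {
    name     = "mock-cluster"
    location = "us-central1"
  }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  name         = "apps"
  project_id   = "my-project" # placeholder
  location     = dependency.cluster.outputs.location
  cluster_name = dependency.cluster.outputs.name
}
```

With the dependency declared, run-all applies the cluster unit before the node pool unit and destroys them in the reverse order.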
Inputs
| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| name | string | — | Yes | Node pool name; lowercase, DNS-compatible, <= 40 chars. |
| project_id | string | — | Yes | GCP project ID owning the cluster. |
| location | string | — | Yes | Region (regional pool) or zone (zonal pool). |
| cluster_name | string | — | Yes | Existing GKE cluster to attach the pool to. |
| node_locations | list(string) | [] | No | Explicit zones to spread nodes across; empty = all region zones. |
| machine_type | string | e2-standard-4 | No | Compute Engine machine type for nodes. |
| image_type | string | COS_CONTAINERD | No | Node image type (COS_CONTAINERD/UBUNTU_CONTAINERD/COS). |
| disk_size_gb | number | 100 | No | Boot disk size per node in GB (>= 30). |
| disk_type | string | pd-balanced | No | Boot disk type (pd-standard/pd-balanced/pd-ssd). |
| spot | bool | false | No | Run nodes as Spot VMs for fault-tolerant workloads. |
| node_count | number | 1 | No | Fixed node count per zone when autoscaling is off. |
| enable_autoscaling | bool | true | No | Enable the cluster autoscaler for this pool. |
| min_node_count | number | 1 | No | Minimum nodes per zone when autoscaling. |
| max_node_count | number | 5 | No | Maximum nodes per zone when autoscaling (>= min). |
| autoscaling_location_policy | string | BALANCED | No | Autoscaler placement: BALANCED or ANY. |
| max_pods_per_node | number | 110 | No | Maximum Pods per node; lower conserves IP range. |
| auto_repair | bool | true | No | Automatically repair unhealthy nodes. |
| auto_upgrade | bool | true | No | Auto-upgrade nodes to match the control plane. |
| upgrade_strategy | string | SURGE | No | Upgrade strategy: SURGE or BLUE_GREEN. |
| max_surge | number | 1 | No | Extra nodes added during a SURGE upgrade. |
| max_unavailable | number | 0 | No | Nodes allowed unavailable during a SURGE upgrade. |
| node_service_account | string | null | No | Least-privilege IAM SA email for nodes (set in prod). |
| additional_oauth_scopes | list(string) | [] | No | Extra OAuth scopes beyond cloud-platform. |
| enable_workload_identity | bool | true | No | Bind nodes to GKE metadata server for Workload Identity. |
| enable_secure_boot | bool | true | No | Enable Shielded VM Secure Boot. |
| enable_integrity_monitoring | bool | true | No | Enable Shielded VM integrity monitoring. |
| node_labels | map(string) | {} | No | Kubernetes labels on nodes for nodeSelector/affinity. |
| node_taints | list(object) | [] | No | Taints to keep general workloads off the pool. |
| network_tags | list(string) | [] | No | GCE network tags for firewall targeting. |
| node_metadata | map(string) | {} | No | Additional GCE instance metadata for nodes. |
| resource_labels | map(string) | {} | No | GCE resource labels for billing/cost allocation. |
Outputs
| Name | Description |
|---|---|
| id | Fully qualified node pool ID (projects/…/nodePools/…). |
| name | Name of the node pool. |
| version | Kubernetes version currently running on the pool. |
| instance_group_urls | Managed instance group URLs backing the pool (one per zone). |
| managed_instance_group_urls | Managed instance group manager URLs for the pool. |
| service_account | Service account email the nodes run as. |
Enterprise scenario
A fintech running a regional GKE cluster in asia-south1 uses this module to split capacity by risk and cost. Real-time payment APIs land on an on-demand n2-standard-8 apps pool with min_node_count = 3 and BLUE_GREEN upgrades so a bad node image can be rolled back instantly without surge churn during settlement windows. Overnight reconciliation and fraud-model retraining run on a tainted c2-standard-16 Spot pool that scales from 0 to 20 with location_policy = ANY, cutting batch compute spend by roughly 70 percent while the NO_SCHEDULE taint guarantees latency-sensitive Pods never land there. Because the pools are separate module instances, the platform team resizes or recreates the Spot pool during a machine-family migration without ever re-planning the payment-critical pool or the cluster itself.
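As a rough sketch, the payment-critical pool from this scenario might be declared as follows. The project, cluster, and service account names and the max_node_count ceiling are assumptions made for illustration; the other values come from the scenario itself.

```hcl
module "gke_apps_pool" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-gke-node-pool?ref=v1.0.0"

  name         = "apps"
  project_id   = "fintech-prod"     # hypothetical
  location     = "asia-south1"
  cluster_name = "fintech-prod-gke" # hypothetical

  machine_type       = "n2-standard-8"
  enable_autoscaling = true
  min_node_count     = 3
  max_node_count     = 9 # assumed ceiling, not stated in the scenario

  # Blue/green keeps the old nodes around while the new ones soak.
  upgrade_strategy = "BLUE_GREEN"

  node_service_account = "gke-nodes@fintech-prod.iam.gserviceaccount.com" # hypothetical
}
```

Because min_node_count applies per zone, a regional pool spread over three zones would keep at least nine nodes running with these settings.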
Best practices
- Always pass a dedicated node_service_account. Letting nodes use the Compute Engine default SA typically grants Editor on the whole project; a purpose-built SA with only the roles its Pods need (plus Workload Identity for app-level IAM) is the single biggest security win here.
- Keep autoscaling and node_count mutually exclusive. This module nulls node_count when enable_autoscaling is true so the autoscaler owns the live count. Never set a fixed count and expect autoscaling, and let the ignore_changes lifecycle keep plans clean.
- Use Spot pools with taints and min_node_count = 0 for batch. Spot nodes can be reclaimed at any time with only seconds of notice; isolate fault-tolerant workloads onto them with a NO_SCHEDULE taint and location_policy = ANY, and let the pool scale to zero so you pay nothing when idle.
- Tune surge upgrades for your disruption budget. max_surge = 1 / max_unavailable = 0 keeps full capacity during rolls (good for stateless apps); for latency-critical pools prefer BLUE_GREEN so you can soak and roll back. Pair both with PodDisruptionBudgets.
- Right-size max_pods_per_node to protect your IP space. GKE allocates a /24 per node from the Pod CIDR at the default of 110 Pods; lowering it on pools that run few large Pods can be the difference between a cluster that scales and one that exhausts its alias IP range.
- Name pools by role, label by cost. Short role-based names (apps, batch-spot, gpu-ml) keep nodeSelectors readable, while resource_labels such as cost-center and env give Finance clean per-pool billing attribution without a separate tagging pass.
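For the app-level IAM that the first practice refers to, a Workload Identity binding could look like the sketch below. The orders-app account, the shop namespace, and the orders Kubernetes service account are hypothetical names, and the cluster is assumed to have a workload pool of the form PROJECT_ID.svc.id.goog.

```hcl
# Hypothetical application identity; all names are placeholders.
resource "google_service_account" "orders_app" {
  project      = "kloudvin-prod"
  account_id   = "orders-app"
  display_name = "Orders application (Workload Identity)"
}

# Let the Kubernetes service account "orders" in namespace "shop"
# impersonate the Google service account above.
resource "google_service_account_iam_member" "orders_workload_identity" {
  service_account_id = google_service_account.orders_app.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:kloudvin-prod.svc.id.goog[shop/orders]"
}
```

The Kubernetes service account also needs the iam.gke.io/gcp-service-account annotation set to the Google service account's email before Pods on a GKE_METADATA pool pick up the identity.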