Quick take — A reusable Terraform module for GCP Cloud Deploy: define a google_clouddeploy_delivery_pipeline with ordered dev/staging/prod targets, canary strategies, and required approvals as code. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.
Quickstart (copy-paste)
Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):
provider "google" {
project = "my-project"
region = "us-central1"
}
module "cloud_deploy" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-deploy?ref=v1.0.0"
project_id = "..." # GCP project ID that owns the pipeline and targets.
region = "..." # Region for the pipeline and targets (e.g. `asia-south1`…
pipeline_name = "..." # Pipeline name; prefixes each target name. Validated to …
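stages = [ # Ordered promotion stages (first = dev, last = prod). Ea…
{ target_id = "...", gke_cluster = "..." }, # each stage is an object; set gke_cluster or run_location
{ target_id = "...", gke_cluster = "..." },
]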
}
Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.
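For reference, a filled-in Quickstart might look like the sketch below. It assumes two Cloud Run stages in separate projects; every project ID, name, and region here is hypothetical. The `run_location` values follow the `projects/{project}/locations/{location}` format that `google_clouddeploy_target` expects for its `run` block.

```hcl
# Hypothetical values throughout: swap in your own projects, region, and names.
module "cloud_deploy" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-deploy?ref=v1.0.0"

  project_id    = "my-project"
  region        = "us-central1"
  pipeline_name = "hello-run"

  # Each stage is an object; only target_id plus one runtime field is required.
  stages = [
    {
      target_id    = "dev"
      run_location = "projects/my-project-dev/locations/us-central1"
    },
    {
      target_id        = "prod"
      run_location     = "projects/my-project-prod/locations/us-central1"
      require_approval = true # human gate before the last stage
    },
  ]
}
```

Because no `execution_service_account` is set in this sketch, the targets fall back to Cloud Deploy's default execution identity; that is acceptable for a trial run but probably not for production.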
What this module is
Google Cloud Deploy is GCP’s managed, opinionated continuous-delivery service. Instead of hand-rolling promotion logic in your CI runner, you declare a delivery pipeline — an ordered series of targets (dev → staging → prod) — and Cloud Deploy renders your Skaffold manifests per target, creates immutable releases, and rolls them out to GKE, Cloud Run, or Anthos with optional canary strategies, approval gates, and automatic rollback. The render and deploy work runs server-side under a service account, so your CI system only ever needs to call gcloud deploy releases create.
The pieces that matter — the pipeline, each google_clouddeploy_target, the promotion ordering, who must approve a prod rollout, and the canary percentages — are exactly the things teams copy-paste and get subtly wrong. This module wraps google_clouddeploy_delivery_pipeline plus a variable-driven set of targets so every service in your estate ships through an identical, reviewed pipeline shape. You pass a list of stages; the module wires the serial_pipeline ordering, attaches per-stage deploy parameters, and emits the pipeline name your CI needs.
When to use it
- You deploy to GKE or Cloud Run and want managed progressive delivery (canary, phased rollout, verify) without scripting it in GitHub Actions or Cloud Build yourself.
- You need an auditable promotion path — dev to staging to prod — where production rollouts require a human approval recorded in GCP, not a Slack thumbs-up.
- You run many microservices that should all share one delivery contract (same stage names, same approval rules, same canary shape) generated from one module.
- You want release artifacts to be immutable and re-deployable — the same release object can be rolled out to the next target unchanged, giving you provable parity between environments.
Reach for something else if you only deploy a single static site (a plain Cloud Storage upload pipeline is enough) or if your rollout is a one-shot kubectl apply with no promotion concept; Cloud Deploy’s machinery is overhead there.
Module structure
terraform-module-gcp-cloud-deploy/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf
# versions.tf
terraform {
required_version = ">= 1.5.0"
required_providers {
google = {
source = "hashicorp/google"
version = "~> 5.0"
}
}
}
# main.tf
locals {
# Cloud Deploy requires at least one stage; the LAST stage is treated as prod
# by convention. Approval itself is driven by each stage's require_approval
# flag, so this local is only a convenience handle for the prod target_id.
prod_stage_id = var.stages[length(var.stages) - 1].target_id
common_labels = merge(
{
managed-by = "terraform"
module = "gcp-cloud-deploy"
},
var.labels,
)
}
# One target per stage (dev, staging, prod, ...). require_approval is driven
# per-stage so you can force a human gate before production.
resource "google_clouddeploy_target" "this" {
for_each = { for s in var.stages : s.target_id => s }
project = var.project_id
location = var.region
name = "${var.pipeline_name}-${each.value.target_id}"
description = "Cloud Deploy target ${each.value.target_id} for pipeline ${var.pipeline_name}"
require_approval = each.value.require_approval
# Exactly one runtime block is set per target depending on the deploy engine.
dynamic "gke" {
for_each = each.value.gke_cluster == null ? [] : [each.value.gke_cluster]
content {
cluster = gke.value
internal_ip = each.value.gke_internal_ip
}
}
dynamic "run" {
for_each = each.value.run_location == null ? [] : [each.value.run_location]
content {
location = run.value
}
}
# Per-target execution: which SA renders/deploys, and where state lives.
execution_configs {
usages = ["RENDER", "DEPLOY", "VERIFY"]
service_account = each.value.execution_service_account
artifact_storage = each.value.artifact_storage_bucket
execution_timeout = each.value.execution_timeout
}
labels = local.common_labels
}
resource "google_clouddeploy_delivery_pipeline" "this" {
project = var.project_id
location = var.region
name = var.pipeline_name
description = var.description
serial_pipeline {
dynamic "stages" {
for_each = var.stages
content {
target_id = google_clouddeploy_target.this[stages.value.target_id].name
profiles = stages.value.profiles
# Optional canary deployment strategy for this stage.
dynamic "strategy" {
for_each = stages.value.canary_percentages == null ? [] : [stages.value]
content {
canary {
runtime_config {
kubernetes {
gateway_service_mesh {
http_route = "${var.pipeline_name}-route"
service = stages.value.canary_service
deployment = stages.value.canary_deployment
pod_selector_label = "app"
}
}
}
canary_deployment {
percentages = stages.value.canary_percentages
verify = stages.value.canary_verify
}
}
}
}
# Deploy parameters injected into Skaffold render for this stage.
dynamic "deploy_parameters" {
for_each = length(stages.value.deploy_parameters) == 0 ? [] : [stages.value.deploy_parameters]
content {
values = deploy_parameters.value
}
}
}
}
}
labels = local.common_labels
# Suspend halts new rollouts without destroying the pipeline (incident freeze).
suspended = var.suspended
}
# variables.tf
variable "project_id" {
type = string
description = "GCP project ID that owns the delivery pipeline and targets."
}
variable "region" {
type = string
description = "Region for the Cloud Deploy pipeline and its targets (e.g. asia-south1)."
validation {
condition = can(regex("^[a-z]+-[a-z]+[0-9]+$", var.region)) # [0-9]+ also accepts two-digit regions such as europe-west12
error_message = "region must be a valid GCP region such as asia-south1 or us-central1."
}
}
variable "pipeline_name" {
type = string
description = "Name of the delivery pipeline. Used as a prefix for each target name."
validation {
condition = can(regex("^[a-z]([-a-z0-9]{0,61}[a-z0-9])?$", var.pipeline_name))
error_message = "pipeline_name must be 1-63 chars, lowercase letters, digits or hyphens, starting with a letter."
}
}
variable "description" {
type = string
description = "Human-readable description of the delivery pipeline."
default = "Managed by Terraform."
}
variable "stages" {
description = <<-EOT
Ordered list of promotion stages. Order defines the promotion sequence
(first = dev, last = prod). Set exactly one of gke_cluster or run_location
per stage to select the deploy engine.
EOT
type = list(object({
target_id = string
profiles = optional(list(string), [])
require_approval = optional(bool, false)
execution_service_account = optional(string)
artifact_storage_bucket = optional(string)
execution_timeout = optional(string, "3600s")
# GKE target (mutually exclusive with run_location).
gke_cluster = optional(string)
gke_internal_ip = optional(bool, false)
# Cloud Run target (mutually exclusive with gke_cluster).
run_location = optional(string)
# Optional canary strategy (GKE gateway service mesh).
canary_percentages = optional(list(number))
canary_verify = optional(bool, false)
canary_service = optional(string)
canary_deployment = optional(string)
deploy_parameters = optional(map(string), {})
}))
validation {
condition = length(var.stages) >= 1
error_message = "At least one stage is required."
}
validation {
condition = alltrue([
for s in var.stages : (s.gke_cluster != null) != (s.run_location != null)
])
error_message = "Each stage must set exactly one of gke_cluster or run_location."
}
validation {
condition = alltrue([
for s in var.stages :
s.canary_percentages == null ? true : alltrue([for p in s.canary_percentages : p > 0 && p < 100])
])
error_message = "canary_percentages must be values strictly between 0 and 100 (100 is implied as the final phase)."
}
}
variable "suspended" {
type = bool
description = "When true, the pipeline accepts no new rollouts. Use as an incident freeze switch."
default = false
}
variable "labels" {
type = map(string)
description = "Additional labels applied to the pipeline and all targets."
default = {}
}
# outputs.tf
output "pipeline_id" {
description = "Fully qualified resource ID of the delivery pipeline."
value = google_clouddeploy_delivery_pipeline.this.id
}
output "pipeline_name" {
description = "Name of the delivery pipeline (pass this to `gcloud deploy releases create`)."
value = google_clouddeploy_delivery_pipeline.this.name
}
output "pipeline_uid" {
description = "Server-generated unique identifier of the pipeline."
value = google_clouddeploy_delivery_pipeline.this.uid
}
output "target_ids" {
description = "Map of stage target_id => fully qualified Cloud Deploy target resource ID."
value = { for k, t in google_clouddeploy_target.this : k => t.id }
}
output "target_names" {
description = "Map of stage target_id => target resource name."
value = { for k, t in google_clouddeploy_target.this : k => t.name }
}
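The `stages` validations in variables.tf check the range of each canary percentage but not their order, and nothing stops a canary from being declared on a Cloud Run stage even though main.tf only renders a Kubernetes gateway-service-mesh canary. A minimal sketch of extra validations follows. These are hypothetical additions, not part of v1.0.0, and the object type is trimmed to the attributes the checks touch; in the real module the `validation` blocks would sit beside the existing ones. The ascending-order rule reflects how the Cloud Deploy API describes the `percentages` list, and the approval rule is a policy choice rather than an API requirement.

```hcl
variable "stages" {
  # Trimmed type for illustration; the module's full type has more attributes.
  type = list(object({
    target_id          = string
    require_approval   = optional(bool, false)
    gke_cluster        = optional(string)
    canary_percentages = optional(list(number))
    canary_service     = optional(string)
    canary_deployment  = optional(string)
  }))

  # Percentages must be strictly ascending, e.g. [25, 50] but not [50, 25].
  # max(1, ...) keeps range() empty for lists with zero or one element.
  validation {
    condition = alltrue([
      for s in var.stages :
      s.canary_percentages == null ? true : alltrue([
        for i in range(1, max(1, length(s.canary_percentages))) :
        s.canary_percentages[i] > s.canary_percentages[i - 1]
      ])
    ])
    error_message = "canary_percentages must be listed in strictly ascending order."
  }

  # A canary stage needs a GKE cluster plus the Service and Deployment names.
  validation {
    condition = alltrue([
      for s in var.stages :
      s.canary_percentages == null ? true : (
        s.gke_cluster != null && s.canary_service != null && s.canary_deployment != null
      )
    ])
    error_message = "Stages with canary_percentages must also set gke_cluster, canary_service and canary_deployment."
  }

  # Policy choice (assumption): the final stage must carry an approval gate.
  validation {
    condition     = try(var.stages[length(var.stages) - 1].require_approval, false)
    error_message = "The final (prod) stage must set require_approval = true."
  }
}
```

Failing at plan time is cheaper than discovering a rejected pipeline during `terraform apply`, which is the main reason to push these checks into the variable.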
How to use it
module "cloud_deploy" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-deploy?ref=v1.0.0"
project_id = "kloudvin-prod-1a2b"
region = "asia-south1"
pipeline_name = "checkout-api"
description = "Progressive delivery for the checkout API onto GKE."
stages = [
{
target_id = "dev"
gke_cluster = "projects/kloudvin-prod-1a2b/locations/asia-south1/clusters/dev-gke"
execution_service_account = google_service_account.deployer.email
artifact_storage_bucket = "gs://kloudvin-clouddeploy-artifacts"
profiles = ["dev"]
},
{
target_id = "staging"
gke_cluster = "projects/kloudvin-prod-1a2b/locations/asia-south1/clusters/staging-gke"
execution_service_account = google_service_account.deployer.email
artifact_storage_bucket = "gs://kloudvin-clouddeploy-artifacts"
profiles = ["staging"]
},
{
target_id = "prod"
gke_cluster = "projects/kloudvin-prod-1a2b/locations/asia-south1/clusters/prod-gke"
execution_service_account = google_service_account.deployer.email
artifact_storage_bucket = "gs://kloudvin-clouddeploy-artifacts"
profiles = ["prod"]
require_approval = true # human gate before prod
canary_percentages = [25, 50] # 25% -> 50% -> 100%
canary_verify = true
canary_service = "checkout-api"
canary_deployment = "checkout-api"
},
]
labels = {
team = "payments"
service = "checkout"
}
}
# Downstream: grant the GitHub Actions release-creator SA the
# clouddeploy.releaser role scoped to THIS pipeline, using an output.
resource "google_clouddeploy_delivery_pipeline_iam_member" "ci_releaser" {
project = "kloudvin-prod-1a2b"
location = "asia-south1"
name = module.cloud_deploy.pipeline_name
role = "roles/clouddeploy.releaser"
member = "serviceAccount:gh-actions@kloudvin-prod-1a2b.iam.gserviceaccount.com"
}
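The example above references `google_service_account.deployer` without defining it. One possible definition is sketched below, with one execution service account per stage so that the dev identity is separate from the prod one. The account IDs and the project-level role grants are assumptions; with this `for_each` form, each stage would reference its own account, for example `google_service_account.deployer["dev"].email`.

```hcl
locals {
  project_id = "kloudvin-prod-1a2b"
  stage_ids  = toset(["dev", "staging", "prod"])
  ci_member  = "serviceAccount:gh-actions@kloudvin-prod-1a2b.iam.gserviceaccount.com"
}

# One execution service account per stage (account IDs are hypothetical).
resource "google_service_account" "deployer" {
  for_each     = local.stage_ids
  project      = local.project_id
  account_id   = "clouddeploy-${each.key}"
  display_name = "Cloud Deploy execution SA (${each.key})"
}

# Allows each account to run Cloud Deploy render, deploy, and verify jobs.
resource "google_project_iam_member" "job_runner" {
  for_each = google_service_account.deployer
  project  = local.project_id
  role     = "roles/clouddeploy.jobRunner"
  member   = "serviceAccount:${each.value.email}"
}

# Allows each account to apply manifests to GKE. Granted project-wide here for
# brevity; with clusters in separate projects, grant it only in the matching one.
resource "google_project_iam_member" "gke_developer" {
  for_each = google_service_account.deployer
  project  = local.project_id
  role     = "roles/container.developer"
  member   = "serviceAccount:${each.value.email}"
}

# The CI identity that creates releases must be able to act as each execution SA.
resource "google_service_account_iam_member" "ci_act_as" {
  for_each           = google_service_account.deployer
  service_account_id = each.value.name
  role               = "roles/iam.serviceAccountUser"
  member             = local.ci_member
}
```

Since all three clusters in the example live in one project, true per-cluster isolation would need separate projects or IAM conditions; this sketch only separates the identities.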
With Terragrunt
Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.
1. Root config — live/terragrunt.hcl (inherited by every module):
remote_state {
backend = "gcs"
generate = { path = "backend.tf", if_exists = "overwrite" }
config = {
# ...GCS state bucket + prefix per path...
}
}
2. Module config — live/prod/cloud_deploy/terragrunt.hcl:
include "root" {
path = find_in_parent_folders()
}
terraform {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-deploy?ref=v1.0.0"
}
inputs = {
project_id = "..."
region = "..."
pipeline_name = "..."
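stages = [
{ target_id = "...", gke_cluster = "..." }, # each stage is an object; set gke_cluster or run_location
{ target_id = "...", gke_cluster = "..." },
]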
}
3. Deploy one environment, or roll out all modules together:
cd live/prod/cloud_deploy && terragrunt apply # this module only
cd live/prod && terragrunt run-all apply # every module under live/prod (run from the repo root)
Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
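The root config shown in step 1 leaves the backend details elided and does not show the provider. A fuller root file could look like the sketch below. The bucket name, project, and region are hypothetical placeholders, and the `generate "provider"` block is one common way to write a shared provider file, not necessarily how this repository does it.

```hcl
# live/terragrunt.hcl: hypothetical root config inherited by every module.
locals {
  project_id = "my-project"  # placeholder
  region     = "us-central1" # placeholder
}

remote_state {
  backend  = "gcs"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    bucket   = "my-terraform-state"       # hypothetical state bucket
    prefix   = path_relative_to_include() # one state prefix per module path
    project  = local.project_id
    location = local.region
  }
}

# Writes provider.tf into each module's working directory at run time.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite"
  contents  = <<-EOF
    provider "google" {
      project = "${local.project_id}"
      region  = "${local.region}"
    }
  EOF
}
```

With `path_relative_to_include()` as the prefix, `live/prod/cloud_deploy` stores its state under `prod/cloud_deploy` in the bucket, so no module needs its own backend block.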
Inputs
| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| `project_id` | `string` | none | Yes | GCP project ID that owns the pipeline and targets. |
| `region` | `string` | none | Yes | Region for the pipeline and targets (e.g. asia-south1). Validated against GCP region pattern. |
| `pipeline_name` | `string` | none | Yes | Pipeline name; prefixes each target name. Validated to 1–63 lowercase chars. |
| `description` | `string` | `"Managed by Terraform."` | No | Human-readable pipeline description. |
| `stages` | `list(object)` | none | Yes | Ordered promotion stages (first = dev, last = prod). Each sets exactly one of gke_cluster/run_location; supports require_approval, canary fields, profiles, and deploy_parameters. |
| `suspended` | `bool` | `false` | No | Freeze switch: when true no new rollouts are accepted. |
| `labels` | `map(string)` | `{}` | No | Extra labels merged onto the pipeline and every target. |
Outputs
| Name | Description |
|---|---|
| `pipeline_id` | Fully qualified resource ID of the delivery pipeline. |
| `pipeline_name` | Pipeline name to pass to `gcloud deploy releases create`. |
| `pipeline_uid` | Server-generated unique identifier of the pipeline. |
| `target_ids` | Map of target_id => fully qualified Cloud Deploy target resource ID. |
| `target_names` | Map of target_id => target resource name. |
Enterprise scenario
A payments platform runs 40+ microservices on three GKE clusters across asia-south1. The platform team publishes this module at v1.0.0 and each service team instantiates it with the same three-stage shape, so every service promotes dev → staging → prod identically, prod always carries require_approval = true, and prod rollouts go canary at 25% then 50% with automated verify before reaching 100%. Because the release object is immutable, the exact artifact validated in staging is the one promoted to prod, and the approval is recorded in GCP audit logs — satisfying the PCI change-control evidence their auditors ask for each quarter.
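One way to keep the three-stage shape identical across services is to stamp the module out with `for_each` from a single root, as sketched below. The service list is hypothetical, the cluster names are borrowed from the usage example, and the sketch assumes each service's Kubernetes Service and Deployment share the pipeline name. Teams that own separate root modules could instead share the same `stages` list through a common locals file.

```hcl
locals {
  project_id     = "kloudvin-prod-1a2b"
  region         = "asia-south1"
  services       = toset(["checkout-api", "ledger-api"]) # hypothetical subset of the 40+ services
  cluster_prefix = "projects/${local.project_id}/locations/${local.region}/clusters"
}

module "cloud_deploy" {
  for_each = local.services
  source   = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-deploy?ref=v1.0.0"

  project_id    = local.project_id
  region        = local.region
  pipeline_name = each.key # one pipeline per service

  # The same stage shape for every service: dev, staging, then gated canary prod.
  stages = [
    {
      target_id   = "dev"
      gke_cluster = "${local.cluster_prefix}/dev-gke"
      profiles    = ["dev"]
    },
    {
      target_id   = "staging"
      gke_cluster = "${local.cluster_prefix}/staging-gke"
      profiles    = ["staging"]
    },
    {
      target_id          = "prod"
      gke_cluster        = "${local.cluster_prefix}/prod-gke"
      profiles           = ["prod"]
      require_approval   = true
      canary_percentages = [25, 50] # 25% -> 50% -> 100%
      canary_verify      = true
      canary_service     = each.key # assumes the Service is named after the pipeline
      canary_deployment  = each.key # assumes the Deployment is named after the pipeline
    },
  ]

  labels = {
    team = "payments"
  }
}
```

Individual pipelines are then addressed as `module.cloud_deploy["checkout-api"]`, and adding a service becomes a one-line change to the `services` set.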
Best practices
- Force `require_approval = true` on the final stage and scope `roles/clouddeploy.approver` to a small group; separating who can promote from who can approve is your production change-control gate.
- Give each target its own least-privilege execution service account via `execution_service_account`; the dev target’s SA should not be able to deploy to the prod cluster. Never let targets share a broad SA.
- Set an explicit `artifact_storage_bucket` with a lifecycle rule (e.g. 90-day deletion); render artifacts accumulate per release and silently grow Cloud Storage cost if left in the default bucket.
- Use `verify` with your canary phases so a failed smoke test auto-aborts the rollout before 100%, and keep `execution_timeout` tight to avoid hung jobs holding Cloud Build/worker capacity.
- Name pipelines after the service, not the environment (`checkout-api`, not `prod-deploy`); one pipeline spans all environments, so an env-suffixed name is misleading and breaks the one-pipeline-per-service model.
- Flip `suspended = true` during incident freezes instead of deleting the pipeline; it blocks new rollouts while preserving history and rollback ability, and is a clean Terraform-tracked switch.
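Two of these practices translate directly into configuration. The sketch below creates the artifact bucket with a 90-day deletion rule and grants the approver role on the pipeline to a small group. It assumes the `module "cloud_deploy"` block from the usage example exists in the same root; the bucket location and the group address are hypothetical.

```hcl
# Artifact bucket with a 90-day deletion rule for render artifacts.
resource "google_storage_bucket" "clouddeploy_artifacts" {
  project                     = "kloudvin-prod-1a2b"
  name                        = "kloudvin-clouddeploy-artifacts"
  location                    = "ASIA-SOUTH1" # assumption: co-located with the pipeline
  uniform_bucket_level_access = true

  lifecycle_rule {
    condition {
      age = 90 # days since object creation
    }
    action {
      type = "Delete"
    }
  }
}

# Approver role scoped to this one pipeline and a small group (hypothetical address).
resource "google_clouddeploy_delivery_pipeline_iam_member" "prod_approvers" {
  project  = "kloudvin-prod-1a2b"
  location = "asia-south1"
  name     = module.cloud_deploy.pipeline_name
  role     = "roles/clouddeploy.approver"
  member   = "group:release-approvers@example.com"
}
```

Keeping the approver binding separate from the `roles/clouddeploy.releaser` binding shown earlier is what enforces the split between who can promote and who can approve.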