Terraform Module: GCP Cloud Run — Production-Ready Serverless Containers in One Block

Quick take — A reusable hashicorp/google ~> 5.0 module for google_cloud_run_v2_service: autoscaling, concurrency, secrets from Secret Manager, VPC egress, health probes, and a least-privilege runtime service account. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "google" {
  project = "my-project"
  region  = "us-central1"
}

module "cloud_run" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-run?ref=v1.0.0"

  project_id = "..."  # GCP project ID hosting the service.
  name       = "..."  # Service name; RFC1035, lowercase, <= 49 chars.
  location   = "..."  # Region, e.g. `asia-south1`.
  image      = "..."  # Container image, ideally pinned by digest.
}

Then run terraform init && terraform apply. Every other input has a sensible default; see Inputs below to override behaviour.
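
Once the apply finishes, the service URL is available from the module's uri output. A minimal sketch, assuming the module block is named cloud_run as in the Quickstart; the output name service_url is only an illustrative choice:

```hcl
# Surface the deployed service's URL after `terraform apply`.
# `service_url` is an illustrative name; `uri` is the module output defined in outputs.tf.
output "service_url" {
  description = "HTTPS URL of the Cloud Run service created by the module."
  value       = module.cloud_run.uri
}
```

terraform output service_url then prints the address. Note that invokers defaults to an empty list, so unauthenticated requests to that URL should be rejected until an invoker is granted.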

What this module is

Cloud Run is GCP’s fully managed serverless container platform. You hand it a container image, and it runs that image behind an HTTPS endpoint, scaling the number of instances from zero up to your ceiling based on incoming traffic — you pay (by default) only while a request is being served. There are no nodes to patch, no autoscaler to tune, and no load balancer to wire up for the basic case.

The trouble is that a correct Cloud Run service is rarely just an image and a port. In production you almost always need: a dedicated runtime service account (not the default Compute SA with project-wide Editor), CPU/memory limits, an autoscaling floor and ceiling, request concurrency tuning, secrets injected from Secret Manager rather than baked into the image, startup/liveness probes so bad revisions never take traffic, and frequently private egress through a VPC connector to reach Cloud SQL or internal APIs. Hand-writing the google_cloud_run_v2_service block for every service means every team re-derives those settings — and gets the security-sensitive ones subtly wrong.

This module wraps google_cloud_run_v2_service (the v2 / Knative-free API) into a single, opinionated, variable-driven block. It creates a least-privilege runtime service account, wires Secret Manager references as environment variables, sets sane resource and scaling defaults, and exposes the service URL and revision name as outputs so downstream resources (a load balancer, a DNS record, a Pub/Sub push subscription) can consume them.

When to use it

Skip it if you need long-lived stateful workloads, GPU/TPU batch jobs better suited to Cloud Run Jobs or GKE, or sub-millisecond cold-start guarantees that only always-on infrastructure provides.

Module structure

terraform-module-gcp-cloud-run/
├── versions.tf      # provider + required_version pins
├── main.tf          # runtime SA, IAM, the v2 service, invoker binding
├── variables.tf     # var-driven inputs with validation
└── outputs.tf       # id/name, url, revision, service account email

versions.tf

terraform {
  required_version = ">= 1.9.0" # service_account_email's validation reads another variable, which needs Terraform 1.9+

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

main.tf

locals {
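  # A stable, predictable runtime SA id derived from the service name.
  # SA account_id must be 6-30 chars, lowercase, start with a letter, and must
  # not end with a hyphen. Truncate the name first, strip any trailing hyphens
  # left by the cut, then append the suffix so "-run" is never chopped off.
  # (A one-character service name would still fall under the 6-char minimum.)
  service_account_id = "${replace(substr(var.name, 0, 26), "/-+$/", "")}-run"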
}

# Dedicated least-privilege runtime identity for this service.
resource "google_service_account" "runtime" {
  count = var.create_service_account ? 1 : 0

  project      = var.project_id
  account_id   = local.service_account_id
  display_name = "Cloud Run runtime SA for ${var.name}"
  description  = "Identity assumed by the ${var.name} Cloud Run service at runtime."
}

locals {
  runtime_sa_email = var.create_service_account ? google_service_account.runtime[0].email : var.service_account_email
}

# Allow the runtime SA to read each referenced secret. The module only grants
# access to the exact secrets the service consumes, at the secret level.
resource "google_secret_manager_secret_iam_member" "runtime_accessor" {
  for_each = { for s in var.secret_env : s.name => s }

  project   = var.project_id
  secret_id = each.value.secret_id
  role      = "roles/secretmanager.secretAccessor"
  member    = "serviceAccount:${local.runtime_sa_email}"
}

resource "google_cloud_run_v2_service" "this" {
  project             = var.project_id
  name                = var.name
  location            = var.location
  ingress             = var.ingress
  deletion_protection = var.deletion_protection

  labels = var.labels

  template {
    service_account                  = local.runtime_sa_email
    timeout                          = "${var.request_timeout_seconds}s"
    max_instance_request_concurrency = var.max_concurrency
    execution_environment            = var.execution_environment

    scaling {
      min_instance_count = var.min_instances
      max_instance_count = var.max_instances
    }

    # Optional private egress into a VPC (Cloud SQL private IP, internal APIs).
    dynamic "vpc_access" {
      for_each = var.vpc_connector == null && length(var.network_interfaces) == 0 ? [] : [1]
      content {
        connector = var.vpc_connector
        egress    = var.vpc_egress

        dynamic "network_interfaces" {
          for_each = var.network_interfaces
          content {
            network    = network_interfaces.value.network
            subnetwork = network_interfaces.value.subnetwork
            tags       = lookup(network_interfaces.value, "tags", null)
          }
        }
      }
    }

    containers {
      image = var.image

      dynamic "ports" {
        for_each = var.container_port == null ? [] : [var.container_port]
        content {
          container_port = ports.value
        }
      }

      resources {
        limits            = var.resource_limits
        cpu_idle          = var.cpu_idle
        startup_cpu_boost = var.startup_cpu_boost
      }

      # Plain (non-secret) environment variables.
      dynamic "env" {
        for_each = var.env
        content {
          name  = env.key
          value = env.value
        }
      }

      # Secret-backed environment variables sourced from Secret Manager.
      dynamic "env" {
        for_each = { for s in var.secret_env : s.name => s }
        content {
          name = env.value.name
          value_source {
            secret_key_ref {
              secret  = env.value.secret_id
              version = lookup(env.value, "version", "latest")
            }
          }
        }
      }

      # Startup probe: a revision only receives traffic once this passes.
      dynamic "startup_probe" {
        for_each = var.startup_probe_path == null ? [] : [1]
        content {
          initial_delay_seconds = var.startup_probe_initial_delay
          period_seconds        = var.startup_probe_period
          failure_threshold     = var.startup_probe_failure_threshold
          timeout_seconds       = var.startup_probe_timeout
          http_get {
            path = var.startup_probe_path
            port = var.container_port
          }
        }
      }

      # Liveness probe: a failing instance is restarted.
      dynamic "liveness_probe" {
        for_each = var.liveness_probe_path == null ? [] : [1]
        content {
          period_seconds    = var.liveness_probe_period
          failure_threshold = var.liveness_probe_failure_threshold
          timeout_seconds   = var.liveness_probe_timeout
          http_get {
            path = var.liveness_probe_path
            port = var.container_port
          }
        }
      }
    }
  }

  # Traffic always points at the latest healthy revision unless overridden.
  traffic {
    type    = "TRAFFIC_TARGET_ALLOCATION_TYPE_LATEST"
    percent = 100
  }
}

# Who may invoke the service. For a public endpoint, pass ["allUsers"].
# For internal-only, pass the calling service accounts as members.
resource "google_cloud_run_v2_service_iam_member" "invokers" {
  for_each = toset(var.invokers)

  project  = var.project_id
  location = google_cloud_run_v2_service.this.location
  name     = google_cloud_run_v2_service.this.name
  role     = "roles/run.invoker"
  member   = each.value
}
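
If runtime identities are already managed elsewhere, the count on google_service_account.runtime lets the module skip creating one. A sketch of that path, assuming a hypothetical service account named orders-runtime and a var.project_id declared by the caller; only the two service-account inputs differ from the Quickstart:

```hcl
# Hypothetical: an identity managed outside the module (names are placeholders).
resource "google_service_account" "orders_runtime" {
  project      = var.project_id
  account_id   = "orders-runtime"
  display_name = "Shared runtime SA for orders services"
}

module "cloud_run" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-run?ref=v1.0.0"

  project_id = var.project_id
  name       = "orders-api"
  location   = "asia-south1"
  image      = "..." # container image, ideally pinned by digest

  # Skip google_service_account.runtime and run as the existing identity instead.
  create_service_account = false
  service_account_email  = google_service_account.orders_runtime.email
}
```

The secret accessor bindings derived from secret_env are still created in this mode; they are simply granted to the supplied email.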

variables.tf

variable "project_id" {
  description = "GCP project ID that hosts the Cloud Run service."
  type        = string
}

variable "name" {
  description = "Cloud Run service name (lowercase, RFC1035: letters, digits, hyphens; <= 49 chars)."
  type        = string

  validation {
    condition     = can(regex("^[a-z]([-a-z0-9]*[a-z0-9])?$", var.name)) && length(var.name) <= 49
    error_message = "name must be lowercase RFC1035 (start with a letter, hyphens allowed) and <= 49 chars."
  }
}

variable "location" {
  description = "Region for the service, e.g. asia-south1, europe-west1, us-central1."
  type        = string
}

variable "image" {
  description = "Fully qualified container image, ideally pinned by digest (e.g. REGION-docker.pkg.dev/PROJ/REPO/app@sha256:...)."
  type        = string
}

variable "container_port" {
  description = "Port the container listens on. Set null to use Cloud Run's default ($PORT, 8080)."
  type        = number
  default     = 8080
}

variable "resource_limits" {
  description = "CPU and memory limits for the container. The gen2 execution environment (the module default) needs at least 512Mi of memory."
  type        = map(string)
  default = {
    cpu    = "1"
    memory = "512Mi"
  }
}

variable "cpu_idle" {
  description = "If true, CPU is throttled when no request is in flight (request-based billing). Set false for always-allocated CPU (background work)."
  type        = bool
  default     = true
}

variable "startup_cpu_boost" {
  description = "Temporarily double CPU during container startup to reduce cold-start latency."
  type        = bool
  default     = true
}

variable "min_instances" {
  description = "Minimum number of warm instances. 0 allows scale-to-zero; >= 1 removes cold starts at a cost."
  type        = number
  default     = 0

  validation {
    condition     = var.min_instances >= 0
    error_message = "min_instances must be >= 0."
  }
}

variable "max_instances" {
  description = "Maximum number of instances the service may scale to."
  type        = number
  default     = 10

  validation {
    condition     = var.max_instances >= 1
    error_message = "max_instances must be >= 1."
  }
}

variable "max_concurrency" {
  description = "Max concurrent requests per instance (1-1000). Lower for CPU-bound apps, higher for I/O-bound."
  type        = number
  default     = 80

  validation {
    condition     = var.max_concurrency >= 1 && var.max_concurrency <= 1000
    error_message = "max_concurrency must be between 1 and 1000."
  }
}

variable "request_timeout_seconds" {
  description = "Maximum request duration in seconds (1-3600)."
  type        = number
  default     = 300

  validation {
    condition     = var.request_timeout_seconds >= 1 && var.request_timeout_seconds <= 3600
    error_message = "request_timeout_seconds must be between 1 and 3600."
  }
}

variable "execution_environment" {
  description = "Sandbox generation: EXECUTION_ENVIRONMENT_GEN1 or EXECUTION_ENVIRONMENT_GEN2 (gen2 needed for NFS/some syscalls)."
  type        = string
  default     = "EXECUTION_ENVIRONMENT_GEN2"

  validation {
    condition     = contains(["EXECUTION_ENVIRONMENT_GEN1", "EXECUTION_ENVIRONMENT_GEN2"], var.execution_environment)
    error_message = "execution_environment must be EXECUTION_ENVIRONMENT_GEN1 or EXECUTION_ENVIRONMENT_GEN2."
  }
}

variable "ingress" {
  description = "Ingress setting: INGRESS_TRAFFIC_ALL, INGRESS_TRAFFIC_INTERNAL_ONLY, or INGRESS_TRAFFIC_INTERNAL_LOAD_BALANCER."
  type        = string
  default     = "INGRESS_TRAFFIC_ALL"

  validation {
    condition = contains([
      "INGRESS_TRAFFIC_ALL",
      "INGRESS_TRAFFIC_INTERNAL_ONLY",
      "INGRESS_TRAFFIC_INTERNAL_LOAD_BALANCER",
    ], var.ingress)
    error_message = "ingress must be one of INGRESS_TRAFFIC_ALL, INGRESS_TRAFFIC_INTERNAL_ONLY, INGRESS_TRAFFIC_INTERNAL_LOAD_BALANCER."
  }
}

variable "invokers" {
  description = "IAM members granted roles/run.invoker. Use [\"allUsers\"] for a public endpoint, or specific service accounts for private."
  type        = list(string)
  default     = []
}

variable "create_service_account" {
  description = "Create a dedicated runtime service account. If false, you must supply service_account_email."
  type        = bool
  default     = true
}

variable "service_account_email" {
  description = "Existing runtime SA email to use when create_service_account is false."
  type        = string
  default     = null

  validation {
    condition     = var.create_service_account || var.service_account_email != null
    error_message = "service_account_email is required when create_service_account is false."
  }
}

variable "env" {
  description = "Plain (non-secret) environment variables as a name => value map."
  type        = map(string)
  default     = {}
}

variable "secret_env" {
  description = "Secret-backed env vars from Secret Manager. Each: { name, secret_id, version }. version defaults to 'latest'."
  type = list(object({
    name      = string
    secret_id = string
    version   = optional(string, "latest")
  }))
  default = []
}

variable "vpc_connector" {
  description = "Serverless VPC Access connector ID for private egress. Mutually exclusive with network_interfaces (Direct VPC egress)."
  type        = string
  default     = null
}

variable "network_interfaces" {
  description = "Direct VPC egress interfaces. Each: { network, subnetwork, tags }. Leave empty to use vpc_connector or no VPC."
  type = list(object({
    network    = string
    subnetwork = string
    tags       = optional(list(string))
  }))
  default = []
}

variable "vpc_egress" {
  description = "Egress mode when a VPC is attached: ALL_TRAFFIC or PRIVATE_RANGES_ONLY."
  type        = string
  default     = "PRIVATE_RANGES_ONLY"

  validation {
    condition     = contains(["ALL_TRAFFIC", "PRIVATE_RANGES_ONLY"], var.vpc_egress)
    error_message = "vpc_egress must be ALL_TRAFFIC or PRIVATE_RANGES_ONLY."
  }
}

variable "startup_probe_path" {
  description = "HTTP path for the startup probe (e.g. /healthz). Null disables the probe."
  type        = string
  default     = null
}

variable "startup_probe_initial_delay" {
  description = "Seconds to wait before the first startup probe."
  type        = number
  default     = 0
}

variable "startup_probe_period" {
  description = "Seconds between startup probes."
  type        = number
  default     = 10
}

variable "startup_probe_failure_threshold" {
  description = "Consecutive startup probe failures before the revision is marked failed."
  type        = number
  default     = 3
}

variable "startup_probe_timeout" {
  description = "Per-attempt startup probe timeout in seconds."
  type        = number
  default     = 3
}

variable "liveness_probe_path" {
  description = "HTTP path for the liveness probe (e.g. /healthz). Null disables the probe."
  type        = string
  default     = null
}

variable "liveness_probe_period" {
  description = "Seconds between liveness probes."
  type        = number
  default     = 30
}

variable "liveness_probe_failure_threshold" {
  description = "Consecutive liveness probe failures before the instance is restarted."
  type        = number
  default     = 3
}

variable "liveness_probe_timeout" {
  description = "Per-attempt liveness probe timeout in seconds."
  type        = number
  default     = 3
}

variable "deletion_protection" {
  description = "Block accidental deletion of the service via Terraform."
  type        = bool
  default     = true
}

variable "labels" {
  description = "Labels applied to the Cloud Run service."
  type        = map(string)
  default     = {}
}
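
The validation blocks above each check a single input. Two relationships between inputs are not enforced: vpc_connector and network_interfaces are described as mutually exclusive, and a min_instances value above max_instances is unlikely to be intended. One possible guard, sketched here as an addition that is not part of the module as published, is a pair of preconditions on a terraform_data resource (the resource name input_guards is hypothetical):

```hcl
# Hypothetical addition to main.tf: cross-input sanity checks evaluated during plan.
resource "terraform_data" "input_guards" {
  lifecycle {
    precondition {
      condition     = var.min_instances <= var.max_instances
      error_message = "min_instances must not exceed max_instances."
    }

    precondition {
      condition     = var.vpc_connector == null || length(var.network_interfaces) == 0
      error_message = "Set either vpc_connector or network_interfaces, not both."
    }
  }
}
```

Preconditions can reference any number of variables, which makes them a reasonable home for checks that span more than one input.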

outputs.tf

output "id" {
  description = "Fully qualified Cloud Run service ID."
  value       = google_cloud_run_v2_service.this.id
}

output "name" {
  description = "Name of the Cloud Run service."
  value       = google_cloud_run_v2_service.this.name
}

output "uri" {
  description = "HTTPS URL of the service (the run.app address assigned by Cloud Run)."
  value       = google_cloud_run_v2_service.this.uri
}

output "location" {
  description = "Region the service is deployed in."
  value       = google_cloud_run_v2_service.this.location
}

output "latest_ready_revision" {
  description = "Name of the latest revision that is serving / ready."
  value       = google_cloud_run_v2_service.this.latest_ready_revision
}

output "service_account_email" {
  description = "Runtime service account email used by the service."
  value       = local.runtime_sa_email
}
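
These outputs are meant to be consumed by other resources. As one illustration, a Pub/Sub push subscription can take the uri output as its endpoint. This is a sketch under assumptions: the topic, subscription, and pusher service account are hypothetical names, and var.project_id is declared by the caller.

```hcl
# Hypothetical caller identity and topic; names are placeholders.
resource "google_service_account" "pubsub_pusher" {
  project    = var.project_id
  account_id = "orders-pubsub-pusher"
}

resource "google_pubsub_topic" "orders" {
  project = var.project_id
  name    = "orders-events"
}

resource "google_pubsub_subscription" "orders_push" {
  project = var.project_id
  name    = "orders-events-push"
  topic   = google_pubsub_topic.orders.id

  push_config {
    # Module output: the service URL becomes the push endpoint.
    push_endpoint = module.cloud_run.uri

    # Pub/Sub attaches an OIDC token for this identity to each push request.
    oidc_token {
      service_account_email = google_service_account.pubsub_pusher.email
    }
  }
}
```

For the pushes to be accepted, the module's invokers list would need to include that service account as a member. Because invokers feeds a for_each, the member string has to be known at plan time, so an identity that already exists (or an email assembled from known parts) is the safer input.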

How to use it

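# Secret container defined in the calling configuration; the module grants its runtime SA accessor on it.
# A secret version has to exist before the service can resolve "latest".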
resource "google_secret_manager_secret" "db_url" {
  project   = var.project_id
  secret_id = "orders-api-db-url"
  replication {
    auto {}
  }
}

module "cloud_run" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-run?ref=v1.0.0"

  project_id = var.project_id
  name       = "orders-api"
  location   = "asia-south1"

  # Pin by digest in real pipelines; tag shown for readability.
  image = "asia-south1-docker.pkg.dev/${var.project_id}/services/orders-api:1.8.2"

  container_port  = 8080
  min_instances   = 1     # keep one warm instance to avoid cold starts on a customer-facing API
  max_instances   = 30
  max_concurrency = 60

  resource_limits = {
    cpu    = "2"
    memory = "1Gi"
  }

  env = {
    LOG_LEVEL = "info"
    REGION    = "asia-south1"
  }

  secret_env = [
    {
      name      = "DATABASE_URL"
      secret_id = google_secret_manager_secret.db_url.secret_id
      version   = "latest"
    },
  ]

  # Private egress to Cloud SQL over the VPC.
  vpc_connector = "projects/${var.project_id}/locations/asia-south1/connectors/run-conn"
  vpc_egress    = "PRIVATE_RANGES_ONLY"

  startup_probe_path  = "/healthz"
  liveness_probe_path = "/healthz"

  # Fronted by an external HTTPS LB, so keep ingress restricted to the LB.
  ingress  = "INGRESS_TRAFFIC_INTERNAL_LOAD_BALANCER"
  invokers = ["allUsers"]

  labels = {
    team        = "payments"
    environment = "prod"
  }
}

# Downstream: attach the service to an external HTTPS Load Balancer via a Serverless NEG.
resource "google_compute_region_network_endpoint_group" "orders_neg" {
  project               = var.project_id
  name                  = "orders-api-neg"
  region                = "asia-south1"
  network_endpoint_type = "SERVERLESS"

  cloud_run {
    service = module.cloud_run.name # <- module output wires the NEG to the service
  }
}
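
The example above uses a Serverless VPC Access connector. The module also accepts network_interfaces for Direct VPC egress. A sketch of that variant, assuming hypothetical network and subnetwork names and that the pinned provider version supports the network_interfaces block:

```hcl
module "cloud_run_direct_vpc" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-run?ref=v1.0.0"

  project_id = var.project_id
  name       = "orders-api"
  location   = "asia-south1"
  image      = "..." # container image, ideally pinned by digest

  # Direct VPC egress: leave vpc_connector unset and describe the interface instead.
  network_interfaces = [
    {
      network    = "prod-vpc"         # hypothetical VPC network name
      subnetwork = "prod-asia-south1" # hypothetical subnetwork in the service's region
      tags       = ["orders-api"]     # optional network tags, e.g. for firewall rules
    },
  ]

  vpc_egress = "PRIVATE_RANGES_ONLY"
}
```

With vpc_connector left at null, the dynamic vpc_access block in main.tf is still rendered because network_interfaces is non-empty.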

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config in live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "gcs"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...GCS state bucket + prefix per path...
  }
}

2. Module config in live/prod/cloud_run/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-run?ref=v1.0.0"
}

inputs = {
  project_id = "..."
  name       = "..."
  location   = "..."
  image      = "..."
}

3. Deploy one environment, or roll out all modules together:

cd live/prod/cloud_run && terragrunt apply        # this module
terragrunt run-all apply                      # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
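
The root config in step 1 shows only the backend. The provider can be defined once in the same file with a Terragrunt generate block. A minimal sketch, reusing the placeholder project and region from the Quickstart (both are assumptions to replace):

```hcl
# live/terragrunt.hcl (root): write a provider.tf into every module's working directory.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "google" {
  project = "my-project"  # placeholder project, as in the Quickstart
  region  = "us-central1" # placeholder region
}
EOF
}
```

Each child terragrunt.hcl that includes the root then gets an identical provider.tf generated next to the module source, so the module itself stays free of provider configuration.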

Inputs

| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| project_id | string | - | yes | GCP project ID hosting the service. |
| name | string | - | yes | Service name; RFC1035, lowercase, <= 49 chars. |
| location | string | - | yes | Region, e.g. asia-south1. |
| image | string | - | yes | Container image, ideally pinned by digest. |
| container_port | number | 8080 | no | Port the container listens on; null for Cloud Run default. |
| resource_limits | map(string) | {cpu="1",memory="512Mi"} | no | CPU/memory limits per instance. |
| cpu_idle | bool | true | no | Throttle CPU between requests (request-based billing). |
| startup_cpu_boost | bool | true | no | Double CPU during startup to cut cold-start latency. |
| min_instances | number | 0 | no | Warm instance floor; 0 allows scale-to-zero. |
| max_instances | number | 10 | no | Instance ceiling. |
| max_concurrency | number | 80 | no | Concurrent requests per instance (1-1000). |
| request_timeout_seconds | number | 300 | no | Max request duration (1-3600). |
| execution_environment | string | EXECUTION_ENVIRONMENT_GEN2 | no | Sandbox generation (gen1/gen2). |
| ingress | string | INGRESS_TRAFFIC_ALL | no | All / internal-only / internal-LB ingress. |
| invokers | list(string) | [] | no | IAM members granted roles/run.invoker. |
| create_service_account | bool | true | no | Create a dedicated runtime SA. |
| service_account_email | string | null | no | Existing runtime SA email when not creating one. |
| env | map(string) | {} | no | Plain environment variables. |
| secret_env | list(object) | [] | no | Secret Manager-backed env vars {name, secret_id, version}. |
| vpc_connector | string | null | no | Serverless VPC Access connector ID for private egress. |
| network_interfaces | list(object) | [] | no | Direct VPC egress interfaces {network, subnetwork, tags}. |
| vpc_egress | string | PRIVATE_RANGES_ONLY | no | Egress mode when a VPC is attached. |
| startup_probe_path | string | null | no | Startup probe HTTP path; null disables. |
| startup_probe_initial_delay | number | 0 | no | Delay before first startup probe. |
| startup_probe_period | number | 10 | no | Seconds between startup probes. |
| startup_probe_failure_threshold | number | 3 | no | Failures before a revision is marked failed. |
| startup_probe_timeout | number | 3 | no | Per-attempt startup probe timeout. |
| liveness_probe_path | string | null | no | Liveness probe HTTP path; null disables. |
| liveness_probe_period | number | 30 | no | Seconds between liveness probes. |
| liveness_probe_failure_threshold | number | 3 | no | Failures before an instance restarts. |
| liveness_probe_timeout | number | 3 | no | Per-attempt liveness probe timeout. |
| deletion_protection | bool | true | no | Block Terraform deletion of the service. |
| labels | map(string) | {} | no | Labels on the service. |

Outputs

| Name | Description |
|---|---|
| id | Fully qualified Cloud Run service ID. |
| name | Service name. |
| uri | HTTPS URL of the service (the run.app address assigned by Cloud Run). |
| location | Region the service runs in. |
| latest_ready_revision | Name of the latest ready/serving revision. |
| service_account_email | Runtime service account email used by the service. |

Enterprise scenario

A payments platform runs roughly 40 internal microservices behind a single external HTTPS Load Balancer. Each team owns a thin root module that calls this Cloud Run module once per service, setting ingress = "INGRESS_TRAFFIC_INTERNAL_LOAD_BALANCER" so the only public path is through the LB (where Cloud Armor and WAF rules live), pulling DB credentials and API keys from Secret Manager via secret_env, and reaching private Cloud SQL through a shared VPC connector. Because every service gets its own runtime service account with accessor rights on only its own secrets, a compromised container cannot read another team’s credentials, and the platform team can audit the entire fleet’s IAM surface from one consistent pattern.
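
One way a root module might express this pattern for several services at once is a for_each over a service map. This is a sketch, not the platform's actual layout: the services map, its keys, and the secret IDs are hypothetical, the image values are left as placeholders, and var.project_id is assumed to be declared by the caller.

```hcl
locals {
  # Hypothetical service catalogue; each key becomes a Cloud Run service name.
  services = {
    "orders-api"   = { image = "...", secret_id = "orders-api-db-url" }
    "payments-api" = { image = "...", secret_id = "payments-api-db-url" }
  }
}

module "cloud_run" {
  for_each = local.services
  source   = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-run?ref=v1.0.0"

  project_id = var.project_id
  name       = each.key
  location   = "asia-south1"
  image      = each.value.image

  # Only reachable through the load balancer, as in the scenario above.
  ingress  = "INGRESS_TRAFFIC_INTERNAL_LOAD_BALANCER"
  invokers = ["allUsers"]

  # Each service reads only its own secret; version falls back to "latest".
  secret_env = [
    { name = "DATABASE_URL", secret_id = each.value.secret_id },
  ]

  # Shared connector for private Cloud SQL access.
  vpc_connector = "projects/${var.project_id}/locations/asia-south1/connectors/run-conn"
}
```

Since the module creates one runtime service account per call, each map entry ends up with its own identity and its own secret-level accessor binding.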

Best practices
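
One practice the module's own image description calls out is pinning the container image by digest rather than by tag. A minimal sketch of enforcing that in a calling root module, assuming a hypothetical image_digest variable supplied by the build pipeline and a var.project_id declared by the caller:

```hcl
# Hypothetical input: the digest emitted by the image build step.
variable "image_digest" {
  description = "Image digest produced by the build, in the form sha256:<64 hex characters>."
  type        = string

  validation {
    condition     = can(regex("^sha256:[a-f0-9]{64}$", var.image_digest))
    error_message = "image_digest must look like sha256:<64 lowercase hex characters>."
  }
}

module "cloud_run" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-run?ref=v1.0.0"

  project_id = var.project_id
  name       = "orders-api"
  location   = "asia-south1"

  # Reference the image by digest so a re-pushed tag cannot change what is deployed.
  image = "asia-south1-docker.pkg.dev/${var.project_id}/services/orders-api@${var.image_digest}"
}
```

A digest reference also makes every deployment show up as an explicit change in the plan, since the image string differs for each build.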


