Terraform Module: GCP Cloud Tasks — production-ready async queues with tuned rate and retry policy

Quick take — A reusable Terraform module for google_cloud_tasks_queue on hashicorp/google ~> 5.0: var-driven rate limits, retry config, HTTP target routing, and IAM enqueuer bindings for production async dispatch. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "google" {
  project = "my-project"
  region  = "us-central1"
}

module "cloud_tasks" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-tasks?ref=v1.0.0"

  project_id = "..."  # GCP project ID that hosts the queue.
  queue_name = "..."  # Queue ID (1-100 chars, lowercase, validated).
  location   = "..."  # Region for the queue; immutable after creation.
}

Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.

What this module is

Google Cloud Tasks is a fully managed queue for dispatching asynchronous work: it accepts a task, holds it durably, and then calls back into your own HTTP service (Cloud Run, GKE, an App Engine handler, or any reachable URL) at a rate you control, retrying with backoff until the handler returns a 2xx or the queue's retry limits are exhausted. Unlike Pub/Sub, which is a publish/subscribe bus built for fan-out to many subscribers (by pull or push), Cloud Tasks is a per-task, push-style dispatcher: each task is an explicit unit of work targeted at one endpoint, with its own schedule time, dedup name, and retry lifecycle. That makes it the natural fit for “do this thing later, exactly the way I asked, and don’t melt my backend doing it.”

The control surface that matters in production lives entirely on the queue resource: rate_limits (how fast tasks leave the queue and how many run concurrently) and retry_config (how aggressively failures are re-attempted). Getting these wrong is how teams either hammer a downstream database into the ground or let failed tasks silently spin for a week. Wrapping google_cloud_tasks_queue in a module lets you encode sane, reviewed defaults once, expose only the knobs teams actually tune, and bolt on the IAM binding that lets a service account enqueue tasks — so every queue across your estate is consistent, named to convention, and least-privilege by default.

When to use it

You need to offload slow or spiky work from a request path — sending email, transcoding, calling a flaky third-party API — without blocking the user.
You want explicit rate control into a downstream that can’t absorb bursts (a legacy SQL box, a vendor API with a strict QPS cap). Cloud Tasks’ max_dispatches_per_second is the throttle.
You need scheduled / deferred execution per item (schedule_time up to 30 days out) rather than a single cron, and deduplication by task name to make enqueues idempotent.
You are dispatching to an HTTP target (Cloud Run / GKE / external URL) with OIDC or OAuth auth, and want the retry/backoff handled for you instead of rebuilding it.
You are standing up many queues (per-tenant, per-priority, per-environment) and want them uniform and policy-compliant.

Reach for Pub/Sub instead if you need fan-out to many consumers or pull semantics.
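
For the many-queues case, one way to keep every queue uniform is to drive the module from a single map with for_each. The following is a minimal sketch, not part of the published module: the tier names, limits, project, and region are hypothetical placeholders chosen for illustration.

```hcl
# Hypothetical queue tiers; names and limits are illustrative, not recommendations.
locals {
  queues = {
    "billing-high" = { qps = 100, concurrency = 200 }
    "billing-low"  = { qps = 10, concurrency = 20 }
  }
}

# One module instance per map entry; the map key becomes the queue name.
module "cloud_tasks" {
  source   = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-tasks?ref=v1.0.0"
  for_each = local.queues

  project_id = "my-project"  # assumed project ID
  queue_name = each.key
  location   = "us-central1" # assumed region

  max_dispatches_per_second = each.value.qps
  max_concurrent_dispatches = each.value.concurrency
}

# Map of queue name => fully-qualified queue ID, for downstream consumers.
output "queue_ids" {
  value = { for name, q in module.cloud_tasks : name => q.id }
}
```

Adding a tier then becomes a one-line change to the map, and every queue inherits the module's retry defaults unless an entry overrides them.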

Module structure

terraform-module-gcp-cloud-tasks/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf

# versions.tf
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

# main.tf

resource "google_cloud_tasks_queue" "this" {
  project  = var.project_id
  name     = var.queue_name
  location = var.location

  rate_limits {
    max_dispatches_per_second = var.max_dispatches_per_second
    max_concurrent_dispatches = var.max_concurrent_dispatches
  }

  retry_config {
    max_attempts       = var.retry_max_attempts
    max_retry_duration = var.retry_max_duration
    min_backoff        = var.retry_min_backoff
    max_backoff        = var.retry_max_backoff
    max_doublings      = var.retry_max_doublings
  }

  # Per-queue routing override for HTTP targets. When set, the queue replaces
  # the scheme, host and path of every task's URL (enforce mode ALWAYS), and
  # injects OIDC auth when a service account is configured (no per-task
  # secrets). Only emitted when a host is supplied.
  dynamic "http_target" {
    for_each = var.http_target_host == null ? [] : [1]

    content {
      uri_override {
        scheme = var.http_target_scheme
        host   = var.http_target_host
        path_override {
          path = var.http_target_path
        }
        uri_override_enforce_mode = "ALWAYS"
      }

      # OIDC for Cloud Run / IAP-protected services; OAuth for Google APIs.
      dynamic "oidc_token" {
        for_each = var.http_oidc_service_account_email == null ? [] : [1]
        content {
          service_account_email = var.http_oidc_service_account_email
          audience              = var.http_oidc_audience
        }
      }
    }
  }

  # Optional Stackdriver sampling for per-task operation logging.
  dynamic "stackdriver_logging_config" {
    for_each = var.logging_sampling_ratio == null ? [] : [1]
    content {
      sampling_ratio = var.logging_sampling_ratio
    }
  }
}

# Least-privilege enqueue: grant roles/cloudtasks.enqueuer on THIS queue
# only, to the service accounts that are allowed to create tasks.
resource "google_cloud_tasks_queue_iam_member" "enqueuers" {
  for_each = toset(var.enqueuer_members)

  project  = google_cloud_tasks_queue.this.project
  location = google_cloud_tasks_queue.this.location
  name     = google_cloud_tasks_queue.this.name
  role     = "roles/cloudtasks.enqueuer"
  member   = each.value
}

# variables.tf

variable "project_id" {
  description = "GCP project ID that hosts the Cloud Tasks queue."
  type        = string
}

variable "queue_name" {
  description = "Queue ID. Lowercase letters, numbers and hyphens; 1-100 chars."
  type        = string

  validation {
    condition     = can(regex("^[a-z]([a-z0-9-]{0,98}[a-z0-9])?$", var.queue_name))
    error_message = "queue_name must be 1-100 chars, start with a letter, and contain only lowercase letters, numbers and hyphens."
  }
}

variable "location" {
  description = "Region for the queue (e.g. us-central1, europe-west1, asia-south1). Immutable after creation."
  type        = string
}

# ---- Rate limits ----------------------------------------------------------

variable "max_dispatches_per_second" {
  description = "Sustained dispatch rate ceiling for the queue (token-bucket QPS). Use this to protect a downstream from bursts."
  type        = number
  default     = 500

  validation {
    condition     = var.max_dispatches_per_second > 0 && var.max_dispatches_per_second <= 500
    error_message = "max_dispatches_per_second must be in the range (0, 500]."
  }
}

variable "max_concurrent_dispatches" {
  description = "Maximum number of tasks running at once across the queue. Cap this to the concurrency your handler can survive."
  type        = number
  default     = 1000

  validation {
    condition     = var.max_concurrent_dispatches >= 1 && var.max_concurrent_dispatches <= 5000
    error_message = "max_concurrent_dispatches must be between 1 and 5000."
  }
}

# ---- Retry config ---------------------------------------------------------

variable "retry_max_attempts" {
  description = "Number of attempts per task including the first. -1 means unlimited (bounded only by max_retry_duration)."
  type        = number
  default     = 10

  validation {
    condition     = var.retry_max_attempts == -1 || var.retry_max_attempts >= 1
    error_message = "retry_max_attempts must be -1 (unlimited) or a positive integer."
  }
}

variable "retry_max_duration" {
  description = "Time limit for retrying a failed task, as a duration string (e.g. \"3600s\"). \"0s\" means no time limit."
  type        = string
  default     = "0s"

  validation {
    condition     = can(regex("^[0-9]+(\\.[0-9]+)?s$", var.retry_max_duration))
    error_message = "retry_max_duration must be a duration ending in 's', e.g. \"3600s\"."
  }
}

variable "retry_min_backoff" {
  description = "Minimum wait before retrying a failed task, as a duration string (e.g. \"0.1s\")."
  type        = string
  default     = "0.1s"

  validation {
    condition     = can(regex("^[0-9]+(\\.[0-9]+)?s$", var.retry_min_backoff))
    error_message = "retry_min_backoff must be a duration ending in 's', e.g. \"0.1s\"."
  }
}

variable "retry_max_backoff" {
  description = "Maximum wait between retries, as a duration string (e.g. \"3600s\")."
  type        = string
  default     = "3600s"

  validation {
    condition     = can(regex("^[0-9]+(\\.[0-9]+)?s$", var.retry_max_backoff))
    error_message = "retry_max_backoff must be a duration ending in 's', e.g. \"3600s\"."
  }
}

variable "retry_max_doublings" {
  description = "How many times the retry interval doubles before increasing linearly. Controls the backoff curve shape."
  type        = number
  default     = 16

  validation {
    condition     = var.retry_max_doublings >= 0 && var.retry_max_doublings <= 16
    error_message = "retry_max_doublings must be between 0 and 16."
  }
}

# ---- HTTP target routing (optional) --------------------------------------

variable "http_target_host" {
  description = "Host to route all tasks to (queue-level override), e.g. \"worker.run.app\". Null disables the override; tasks then carry their own absolute URL."
  type        = string
  default     = null
}

variable "http_target_scheme" {
  description = "URI scheme for the HTTP target override."
  type        = string
  default     = "HTTPS"

  validation {
    condition     = contains(["HTTP", "HTTPS"], var.http_target_scheme)
    error_message = "http_target_scheme must be HTTP or HTTPS."
  }
}

variable "http_target_path" {
  description = "Path the queue forces on every dispatched task when http_target_host is set, e.g. \"/tasks/process\"."
  type        = string
  default     = "/"
}

variable "http_oidc_service_account_email" {
  description = "Service account whose OIDC token the queue mints when calling the HTTP target (for Cloud Run / IAP). Null disables OIDC auth."
  type        = string
  default     = null
}

variable "http_oidc_audience" {
  description = "OIDC token audience. Usually the fully-qualified target URL; leave null to default to the request URL."
  type        = string
  default     = null
}

# ---- Logging --------------------------------------------------------------

variable "logging_sampling_ratio" {
  description = "Fraction (0.0-1.0) of task operations written to Cloud Logging. Null disables Stackdriver logging config."
  type        = number
  default     = null

  validation {
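    # Conditional instead of ||: Terraform releases that evaluate both operands
    # of || would error on the numeric comparison when the value is null.
    condition     = var.logging_sampling_ratio == null ? true : (var.logging_sampling_ratio >= 0.0 && var.logging_sampling_ratio <= 1.0)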
    error_message = "logging_sampling_ratio must be between 0.0 and 1.0."
  }
}

# ---- IAM ------------------------------------------------------------------

variable "enqueuer_members" {
  description = "IAM members granted roles/cloudtasks.enqueuer on this queue, e.g. [\"serviceAccount:api@proj.iam.gserviceaccount.com\"]."
  type        = list(string)
  default     = []
}

# outputs.tf

output "id" {
  description = "Fully-qualified queue ID: projects/{project}/locations/{location}/queues/{name}."
  value       = google_cloud_tasks_queue.this.id
}

output "name" {
  description = "Short queue name (the queue ID without the resource path)."
  value       = google_cloud_tasks_queue.this.name
}

output "location" {
  description = "Region the queue is deployed in."
  value       = google_cloud_tasks_queue.this.location
}

output "project" {
  description = "Project the queue belongs to."
  value       = google_cloud_tasks_queue.this.project
}

output "max_dispatches_per_second" {
  description = "Effective sustained dispatch rate ceiling applied to the queue."
  value       = google_cloud_tasks_queue.this.rate_limits[0].max_dispatches_per_second
}

output "enqueuer_members" {
  description = "IAM members granted enqueue rights on this queue."
  value       = [for m in google_cloud_tasks_queue_iam_member.enqueuers : m.member]
}

How to use it

module "cloud_tasks" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-tasks?ref=v1.0.0"

  project_id = "kv-prod-platform"
  queue_name = "invoice-email-dispatch"
  location   = "asia-south1"

  # Protect a vendor email API capped at ~50 QPS, and keep handler
  # concurrency modest so we don't blow our SMTP relay's connection pool.
  max_dispatches_per_second = 45
  max_concurrent_dispatches = 80

  # Give up after 1 hour of failures with capped exponential backoff.
  retry_max_attempts  = -1
  retry_max_duration  = "3600s"
  retry_min_backoff   = "5s"
  retry_max_backoff   = "300s"
  retry_max_doublings = 6

  # Route every task to the Cloud Run worker, authenticated via OIDC.
  http_target_host                = "invoice-worker-abcd-el.a.run.app"
  http_target_path                = "/tasks/send-invoice"
  http_oidc_service_account_email = "tasks-invoker@kv-prod-platform.iam.gserviceaccount.com"

  logging_sampling_ratio = 0.1

  enqueuer_members = [
    "serviceAccount:invoice-api@kv-prod-platform.iam.gserviceaccount.com",
  ]
}

# Downstream: surface the queue name to the API service that enqueues
# tasks (e.g. injected as an env var into a Cloud Run revision).
resource "google_cloud_run_v2_service" "invoice_api" {
  name     = "invoice-api"
  location = "asia-south1"

  template {
    service_account = "invoice-api@kv-prod-platform.iam.gserviceaccount.com"

    containers {
      image = "asia-south1-docker.pkg.dev/kv-prod-platform/svc/invoice-api:latest"

      env {
        name  = "TASKS_QUEUE_ID"
        value = module.cloud_tasks.id
      }
      env {
        name  = "TASKS_QUEUE_REGION"
        value = module.cloud_tasks.location
      }
    }
  }
}
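
To see what the retry settings in the example above imply, the backoff curve can be tabulated with plain Terraform locals. This is a rough sketch based on the documented retry behaviour (start at the minimum backoff, double max_doublings times, then grow linearly, capped at the maximum backoff). The local and output names are hypothetical, and the service's actual timing may differ in detail.

```hcl
locals {
  min_backoff_s = 5   # mirrors retry_min_backoff = "5s"
  max_backoff_s = 300 # mirrors retry_max_backoff = "300s"
  max_doublings = 6   # mirrors retry_max_doublings = 6

  # Approximate wait before retry n (n = 0 is the first retry):
  # exponential while n <= max_doublings, linear afterwards, always capped.
  retry_intervals_s = [
    for n in range(10) : min(
      local.max_backoff_s,
      n <= local.max_doublings
      ? local.min_backoff_s * pow(2, n)
      : local.min_backoff_s * pow(2, local.max_doublings) * (n - local.max_doublings + 1)
    )
  ]
}

output "retry_intervals_s" {
  value = local.retry_intervals_s
}
```

Under these assumptions the list evaluates to 5, 10, 20, 40, 80, 160 and then 300 for every later retry, so the cap is reached on the seventh retry and roughly 16 retries fit inside the 3600s window, ignoring handler run time.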

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config — live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "gcs"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...GCS state bucket + prefix per path...
  }
}

2. Module config — live/prod/cloud_tasks/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-tasks?ref=v1.0.0"
}

inputs = {
  project_id = "..."
  queue_name = "..."
  location   = "..."
}

3. Deploy one environment, or roll out all modules together:

# Run either command from the repository root.
cd live/prod/cloud_tasks && terragrunt apply   # this module only
cd live/prod && terragrunt run-all apply       # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
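
As an illustration of per-environment overrides, a dev counterpart to the prod config might look like the sketch below. The path live/dev/cloud_tasks, the project ID, and the reduced limits are assumptions made for the example; only the input names come from the module.

```hcl
# live/dev/cloud_tasks/terragrunt.hcl (hypothetical dev environment)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-cloud-tasks?ref=v1.0.0"
}

inputs = {
  project_id = "kv-dev-platform" # assumed dev project ID
  queue_name = "invoice-email-dispatch"
  location   = "asia-south1"

  # Dev only needs a trickle of traffic and quick failure feedback.
  max_dispatches_per_second = 5
  max_concurrent_dispatches = 10
  retry_max_attempts        = 3
}
```

The module source and version stay identical across environments; only the inputs map changes, which keeps a policy difference between dev and prod visible in a single small file.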

Inputs

Name	Type	Default	Required	Description
`project_id`	`string`	—	Yes	GCP project ID that hosts the queue.
`queue_name`	`string`	—	Yes	Queue ID (1-100 chars, lowercase, validated).
`location`	`string`	—	Yes	Region for the queue; immutable after creation.
`max_dispatches_per_second`	`number`	`500`	No	Sustained dispatch QPS ceiling; range (0, 500].
`max_concurrent_dispatches`	`number`	`1000`	No	Max tasks running concurrently; range 1-5000.
`retry_max_attempts`	`number`	`10`	No	Attempts per task including first; `-1` = unlimited.
`retry_max_duration`	`string`	`"0s"`	No	Time limit for retrying a task; `"0s"` = no limit.
`retry_min_backoff`	`string`	`"0.1s"`	No	Minimum wait before a retry.
`retry_max_backoff`	`string`	`"3600s"`	No	Maximum wait between retries.
`retry_max_doublings`	`number`	`16`	No	Times the backoff interval doubles before going linear; 0-16.
`http_target_host`	`string`	`null`	No	Queue-level host override for all tasks; null disables it.
`http_target_scheme`	`string`	`"HTTPS"`	No	URI scheme for the HTTP target (HTTP or HTTPS).
`http_target_path`	`string`	`"/"`	No	Forced path on every dispatched task when host is set.
`http_oidc_service_account_email`	`string`	`null`	No	SA whose OIDC token the queue mints when calling the target.
`http_oidc_audience`	`string`	`null`	No	OIDC token audience; null defaults to the request URL.
`logging_sampling_ratio`	`number`	`null`	No	Fraction (0.0-1.0) of task ops logged; null disables it.
`enqueuer_members`	`list(string)`	`[]`	No	IAM members granted `roles/cloudtasks.enqueuer` on this queue.

Outputs

Name	Description
`id`	Fully-qualified queue ID: `projects/{project}/locations/{location}/queues/{name}`.
`name`	Short queue name without the resource path.
`location`	Region the queue is deployed in.
`project`	Project the queue belongs to.
`max_dispatches_per_second`	Effective sustained dispatch rate ceiling applied to the queue.
`enqueuer_members`	IAM members granted enqueue rights on this queue.

Enterprise scenario

A fintech runs a billing platform where every invoice triggers a downstream call to a third-party tax-calculation API that is contractually rate-limited to 50 requests/second. They deploy one queue per region via this module with max_dispatches_per_second = 45 and max_concurrent_dispatches = 80, routing all tasks to a Cloud Run worker over OIDC so no API keys live in task payloads. When the tax vendor has a brownout, retry_max_duration = "3600s" with capped exponential backoff drains the backlog gracefully once the vendor recovers — instead of a thundering-herd retry storm — and the per-queue cloudtasks.enqueuer binding ensures only the billing API service account, not the broader platform, can submit work.

Best practices

Treat max_dispatches_per_second as a downstream protection contract, not a performance dial. Set it to your handler’s safe steady-state QPS (or a vendor’s documented cap) and let the queue absorb bursts — that’s the whole point of Cloud Tasks over direct calls.
Always bound retries with retry_max_duration or finite retry_max_attempts. Unlimited attempts (-1) with no duration cap means a permanently-failing task retries forever, quietly burning dispatch budget and polluting logs. Pair -1 with a real duration ceiling.
Inject auth at the queue via OIDC, never in the task body. Use http_oidc_service_account_email so the queue mints short-lived tokens for Cloud Run / IAP targets; this keeps credentials out of payloads and out of state, and the worker should reject any request lacking a valid OIDC token.
Grant roles/cloudtasks.enqueuer on the specific queue, not at the project. The module’s enqueuer_members binds the role resource-scoped, so a compromised service can’t flood every queue in the project. Avoid handing out cloudtasks.admin to enqueuing services.
Use deterministic task names for idempotency, and keep concurrency honest. Naming tasks (e.g. by invoice ID) makes enqueues dedup-safe across retries of the producer; combine that with a max_concurrent_dispatches your database can actually tolerate.
Name and locate queues to convention, and pin the module by tag. Encode purpose and environment in queue_name (invoice-email-dispatch), co-locate the queue with its worker region to cut latency, and consume the module via ?ref=v1.0.0 so rate/retry policy changes roll out as reviewed version bumps.
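
The OIDC practice above only works if the token's service account is allowed to call the worker. A minimal sketch of that grant for a Cloud Run target follows. It assumes the worker service is named invoice-worker (inferred from the example host, not stated by the module) and reuses the project, region, and service account from the usage example.

```hcl
# Allow the queue's OIDC identity to invoke the Cloud Run worker.
resource "google_cloud_run_v2_service_iam_member" "tasks_invoker" {
  project  = "kv-prod-platform"
  location = "asia-south1"
  name     = "invoice-worker" # assumed Cloud Run service name
  role     = "roles/run.invoker"
  member   = "serviceAccount:tasks-invoker@kv-prod-platform.iam.gserviceaccount.com"
}
```

If the worker is deployed to require authentication, this binding is what lets dispatched tasks through while unauthenticated callers are rejected.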

Written by Vinod
