
Terraform Module: GCP Vertex AI Featurestore — autoscaled online serving with CMEK in one wrapper

Quick take — Provision a GCP Vertex AI Featurestore with Terraform: autoscaled or fixed-node online serving, customer-managed encryption (CMEK), labels and force_destroy, all behind a reusable module. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "google" {
  project = "my-project"
  region  = "us-central1"
}

module "vertex_featurestore" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-vertex-featurestore?ref=v1.0.0"

  name   = "..."  # Featurestore name; lowercase letters, digits, underscor…
  region = "..."  # GCP region for the regional Featurestore (e.g. `us-cent…
}

Then run terraform init && terraform apply. Every other input has a sensible default; see Inputs below to override behaviour.

What this module is

A Vertex AI Featurestore is GCP’s managed home for ML features — the engineered signals (a user’s 30-day spend, an item’s rolling click-through rate, a device’s fraud score) that both training jobs and live prediction services read. The top-level google_vertex_ai_featurestore resource is the container: it owns the regional storage and, critically, the online serving tier that answers low-latency point lookups at prediction time. Entity types and individual features live underneath it, but the Featurestore is where you make the decisions that cost money and drive latency: whether online serving runs on a fixed node count or autoscales, and whether the data is encrypted with a customer-managed key.

Those are exactly the knobs teams get wrong when they click through the console. A reusable module pins them down. It forces a deliberate choice between fixed_node_count (predictable cost, predictable throughput) and scaling (min/max nodes that follow traffic), wires in CMEK via encryption_spec so the data inherits your org’s key policy, and standardizes labels and the force_destroy flag so a terraform destroy in a sandbox doesn’t get blocked by stray entity types while production stays protected. Wrapping it means every Featurestore across your estate serves online traffic the same way, encrypts the same way, and tags the same way — without anyone re-deriving the right online_serving_config block from the docs each time.
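As a rough illustration of that choice, the sketch below calls the same module once in each serving mode. The module labels (fixed_store, autoscaled_store) and the store names are hypothetical; the inputs are the ones this module declares in variables.tf.

```hcl
# Hypothetical example: one module, two online serving modes.

# Fixed capacity: a constant three serving nodes (predictable cost and throughput).
module "fixed_store" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-vertex-featurestore?ref=v1.0.0"

  name             = "catalog_features" # hypothetical name
  region           = "us-central1"
  fixed_node_count = 3 # honoured because scaling is left at its null default
}

# Autoscaling: setting scaling takes precedence over fixed_node_count.
module "autoscaled_store" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-vertex-featurestore?ref=v1.0.0"

  name   = "session_features" # hypothetical name
  region = "us-central1"

  scaling = {
    min_node_count = 1
    max_node_count = 4
  }
}
```

Reading the online_serving_mode output of each call should report "fixed" and "autoscaling" respectively, which is a quick way to confirm which path a given configuration resolved to.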

When to use it

Module structure

terraform-module-gcp-vertex-featurestore/
├── versions.tf      # provider + Terraform version pins
├── main.tf          # google_vertex_ai_featurestore resource
├── variables.tf     # var-driven inputs with validation
└── outputs.tf       # id, name, region, serving mode, etc.

versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

main.tf

locals {
  # Exactly one online serving mode is active. When online serving is
  # disabled entirely, both blocks collapse to empty and Vertex AI
  # treats the store as offline-only.
  use_scaling = var.online_serving_enabled && var.scaling != null
  use_fixed   = var.online_serving_enabled && var.scaling == null && var.fixed_node_count > 0
}

resource "google_vertex_ai_featurestore" "this" {
  provider = google

  name   = var.name
  region = var.region
  labels = var.labels

  # If true, `terraform destroy` will delete the Featurestore even when
  # entity types still exist under it. Keep false in production.
  force_destroy = var.force_destroy

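  # Online serving is emitted only when a serving mode is active. Omitting
  # the block entirely leaves the store offline-only.
  dynamic "online_serving_config" {
    for_each = (local.use_scaling || local.use_fixed) ? [1] : []
    content {
      # Fixed-node path: a constant number of serving nodes.
      fixed_node_count = local.use_fixed ? var.fixed_node_count : null

      # Autoscaling path: nodes float between min and max with load.
      dynamic "scaling" {
        for_each = local.use_scaling ? [var.scaling] : []
        content {
          min_node_count = scaling.value.min_node_count
          max_node_count = scaling.value.max_node_count
        }
      }
    }
  }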

  # Customer-managed encryption (CMEK). Omitted entirely when no key is
  # supplied, in which case Google-managed encryption applies.
  dynamic "encryption_spec" {
    for_each = var.kms_key_name != null ? [1] : []
    content {
      kms_key_name = var.kms_key_name
    }
  }

  timeouts {
    create = var.create_timeout
    update = var.update_timeout
    delete = var.delete_timeout
  }
}

variables.tf

variable "name" {
  description = "Name of the Featurestore. Must be unique within the project and region."
  type        = string

  validation {
    condition     = can(regex("^[a-z][a-z0-9_]{0,59}$", var.name))
    error_message = "name must start with a lowercase letter and contain only lowercase letters, digits, and underscores (max 60 chars)."
  }
}

variable "region" {
  description = "GCP region for the Featurestore, e.g. us-central1. Featurestores are regional resources."
  type        = string
}

variable "labels" {
  description = "Key/value labels applied to the Featurestore for cost allocation and ownership."
  type        = map(string)
  default     = {}
}

variable "online_serving_enabled" {
  description = "Whether to provision online serving capacity. When false, the store is offline-only (no node cost)."
  type        = bool
  default     = true
}

variable "fixed_node_count" {
  description = "Number of nodes for fixed-capacity online serving. Used only when scaling is null. Set 0 to disable fixed serving."
  type        = number
  default     = 1

  validation {
    condition     = var.fixed_node_count >= 0 && var.fixed_node_count <= 100
    error_message = "fixed_node_count must be between 0 and 100."
  }
}

variable "scaling" {
  description = "Autoscaling config for online serving. When set, overrides fixed_node_count. Null disables autoscaling."
  type = object({
    min_node_count = number
    max_node_count = number
  })
  default = null

  validation {
    condition = var.scaling == null || (
      var.scaling.min_node_count >= 1 &&
      var.scaling.max_node_count >= var.scaling.min_node_count &&
      var.scaling.max_node_count <= 100
    )
    error_message = "scaling requires 1 <= min_node_count <= max_node_count <= 100."
  }
}

variable "kms_key_name" {
  description = "Full resource ID of a Cloud KMS key for CMEK, e.g. projects/p/locations/us-central1/keyRings/r/cryptoKeys/k. Null uses Google-managed encryption."
  type        = string
  default     = null

  validation {
    condition     = var.kms_key_name == null || can(regex("^projects/.+/locations/.+/keyRings/.+/cryptoKeys/.+$", var.kms_key_name))
    error_message = "kms_key_name must be a full Cloud KMS cryptoKey resource ID or null."
  }
}

variable "force_destroy" {
  description = "If true, allow destroying the Featurestore even when entity types still exist. Keep false in production."
  type        = bool
  default     = false
}

variable "create_timeout" {
  description = "Timeout for create operations."
  type        = string
  default     = "20m"
}

variable "update_timeout" {
  description = "Timeout for update operations."
  type        = string
  default     = "20m"
}

variable "delete_timeout" {
  description = "Timeout for delete operations."
  type        = string
  default     = "20m"
}
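
The validation rules above can be exercised without touching GCP. The following is a minimal sketch of a native Terraform test file, assuming Terraform 1.7 or newer (mock_provider is not available on the 1.5 floor that versions.tf pins) and a hypothetical path of tests/validation.tftest.hcl inside the module directory. The test names and sample values are illustrative only.

```hcl
# tests/validation.tftest.hcl (hypothetical file; assumes Terraform >= 1.7)

# Stand in for the real provider so planning needs no GCP credentials.
mock_provider "google" {}

# Baseline inputs shared by every run block below.
variables {
  name   = "fraud_features_test"
  region = "us-central1"
}

run "rejects_invalid_name" {
  command = plan

  variables {
    name = "Fraud-Features" # uppercase and hyphen violate the name regex
  }

  # The run passes only if the name validation fails.
  expect_failures = [var.name]
}

run "rejects_inverted_scaling_bounds" {
  command = plan

  variables {
    scaling = {
      min_node_count = 5
      max_node_count = 2 # max below min violates the scaling validation
    }
  }

  expect_failures = [var.scaling]
}

run "scaling_takes_precedence" {
  command = plan

  variables {
    scaling = {
      min_node_count = 2
      max_node_count = 8
    }
  }

  assert {
    condition     = output.online_serving_mode == "autoscaling"
    error_message = "Expected autoscaling mode when scaling is set."
  }
}
```

Running terraform test from the module directory would execute all three runs. Because every run uses command = plan, nothing is created.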

outputs.tf

output "id" {
  description = "Fully qualified Featurestore ID (projects/{project}/locations/{region}/featurestores/{name})."
  value       = google_vertex_ai_featurestore.this.id
}

output "name" {
  description = "Short name of the Featurestore."
  value       = google_vertex_ai_featurestore.this.name
}

output "region" {
  description = "Region the Featurestore is deployed in."
  value       = google_vertex_ai_featurestore.this.region
}

output "etag" {
  description = "Etag used for optimistic concurrency control on updates."
  value       = google_vertex_ai_featurestore.this.etag
}

output "online_serving_mode" {
  description = "Resolved online serving mode: 'autoscaling', 'fixed', or 'offline'."
  value       = local.use_scaling ? "autoscaling" : (local.use_fixed ? "fixed" : "offline")
}

output "kms_key_name" {
  description = "CMEK key in use, or null when Google-managed encryption applies."
  value       = var.kms_key_name
}

How to use it

module "vertex_ai_featurestore" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-vertex-featurestore?ref=v1.0.0"

  name   = "fraud_features_prod"
  region = "us-central1"

  # Autoscale online serving between 2 and 8 nodes as traffic rises.
  scaling = {
    min_node_count = 2
    max_node_count = 8
  }

  # Encrypt feature data at rest with our own KMS key.
  kms_key_name = "projects/kv-ml-prod/locations/us-central1/keyRings/vertex/cryptoKeys/featurestore"

  force_destroy = false

  labels = {
    team        = "risk-ml"
    environment = "prod"
    cost-center = "ml-platform"
  }
}

# Downstream: define an entity type inside the Featurestore returned above,
# referencing the module's id output.
resource "google_vertex_ai_featurestore_entitytype" "user" {
  name         = "user"
  featurestore = module.vertex_ai_featurestore.id

  description = "Per-user aggregated fraud signals."
}
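
One thing the module does not do is grant access to the key. For CMEK to work, the Vertex AI service agent normally needs the Cloud KMS CryptoKey Encrypter/Decrypter role on the key passed as kms_key_name. The sketch below shows one way to wire that up, assuming the key and the Featurestore live in the same project (kv-ml-prod, taken from the key path above). The resource and data source labels are hypothetical.

```hcl
# Look up the project number, which is part of the service agent's address.
data "google_project" "current" {
  project_id = "kv-ml-prod" # assumption: same project as the key above
}

# Allow the Vertex AI service agent to encrypt and decrypt with the CMEK key.
resource "google_kms_crypto_key_iam_member" "vertex_cmek" {
  crypto_key_id = "projects/kv-ml-prod/locations/us-central1/keyRings/vertex/cryptoKeys/featurestore"
  role          = "roles/cloudkms.cryptoKeyEncrypterDecrypter"
  member        = "serviceAccount:service-${data.google_project.current.number}@gcp-sa-aiplatform.iam.gserviceaccount.com"
}
```

If this grant sits in the same configuration as the module call, adding depends_on = [google_kms_crypto_key_iam_member.vertex_cmek] to the module block is one way to make sure the permission exists before the Featurestore is created.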

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config, live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "gcs"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...GCS state bucket + per-path prefix...
  }
}

2. Module config, live/prod/vertex_featurestore/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-vertex-featurestore?ref=v1.0.0"
}

inputs = {
  name   = "..."
  region = "..."
}

3. Deploy one environment, or roll out all modules together:

(cd live/prod/vertex_featurestore && terragrunt apply)   # this module only
(cd live/prod && terragrunt run-all apply)               # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
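
To make the per-environment override concrete, here is a sketch of what a dev counterpart to the prod config might look like. The path live/dev/vertex_featurestore/terragrunt.hcl, the store name, and the label values are hypothetical; only the inputs block differs from the prod file.

```hcl
# live/dev/vertex_featurestore/terragrunt.hcl (hypothetical dev counterpart)

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-vertex-featurestore?ref=v1.0.0"
}

inputs = {
  name   = "fraud_features_dev" # hypothetical name
  region = "us-central1"

  # Dev keeps cost flat with a single fixed node instead of autoscaling.
  fixed_node_count = 1

  # Allow teardown even if experiments left entity types behind.
  force_destroy = true

  labels = {
    environment = "dev"
  }
}
```

The module source and version pin stay identical across environments, so promoting a change is a matter of bumping ref in each file rather than editing module code.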

Inputs

| Name | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| name | string | (none) | Yes | Featurestore name; lowercase letters, digits, underscores, starts with a letter, max 60 chars. |
| region | string | (none) | Yes | GCP region for the regional Featurestore (e.g. us-central1). |
| labels | map(string) | {} | No | Labels for cost allocation and ownership. |
| online_serving_enabled | bool | true | No | Provision online serving capacity; false makes the store offline-only. |
| fixed_node_count | number | 1 | No | Fixed serving node count (0–100); used only when scaling is null. |
| scaling | object({ min_node_count, max_node_count }) | null | No | Autoscaling bounds for online serving; overrides fixed_node_count when set. |
| kms_key_name | string | null | No | Full Cloud KMS cryptoKey ID for CMEK; null uses Google-managed encryption. |
| force_destroy | bool | false | No | Allow destroy even when entity types exist. |
| create_timeout | string | "20m" | No | Timeout for create operations. |
| update_timeout | string | "20m" | No | Timeout for update operations. |
| delete_timeout | string | "20m" | No | Timeout for delete operations. |

Outputs

| Name | Description |
| --- | --- |
| id | Fully qualified Featurestore ID (projects/{project}/locations/{region}/featurestores/{name}). |
| name | Short name of the Featurestore. |
| region | Region the Featurestore is deployed in. |
| etag | Etag for optimistic concurrency control on updates. |
| online_serving_mode | Resolved serving mode: autoscaling, fixed, or offline. |
| kms_key_name | CMEK key in use, or null when Google-managed encryption applies. |
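
These outputs can be consumed from the calling configuration like any other module outputs. The sketch below assumes the module was called as module.vertex_ai_featurestore, as in How to use it. The output and check names are hypothetical, and the check block relies on Terraform 1.5 or newer, which matches the module's version pin.

```hcl
# Re-export the Featurestore ID so other stacks can read it from state.
output "featurestore_id" {
  description = "ID of the Featurestore created by the module."
  value       = module.vertex_ai_featurestore.id
}

# Warn on plan/apply if this store was accidentally configured offline-only.
check "featurestore_serves_online" {
  assert {
    condition     = module.vertex_ai_featurestore.online_serving_mode != "offline"
    error_message = "This Featurestore is expected to serve online traffic, but it resolved to offline mode."
  }
}
```

A failed check is reported as a warning rather than blocking the run, so it works as a guardrail without getting in the way of an intentional change.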

Enterprise scenario

A digital-payments company runs real-time fraud scoring on every transaction. Their risk-ML team uses this module to stand up fraud_features_prod in us-central1 with autoscaling online serving (2–8 nodes) so it absorbs payday and holiday traffic spikes without manual resizing, and pins a Cloud KMS key via kms_key_name to satisfy PCI-driven encryption requirements. The same module, with online_serving_enabled = false and force_destroy = true, provisions a throwaway offline-only store in their sandbox project for feature backfill experiments — same code, zero online node cost, and a clean teardown.
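
The sandbox half of that scenario might look roughly like the sketch below. The module label, store name, and label values are hypothetical; the two inputs that matter are online_serving_enabled and force_destroy.

```hcl
# Hypothetical sandbox call: offline-only store that is safe to tear down.
module "sandbox_featurestore" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-vertex-featurestore?ref=v1.0.0"

  name   = "fraud_features_sandbox" # hypothetical name
  region = "us-central1"

  # No online serving nodes are provisioned, so there is no node cost.
  online_serving_enabled = false

  # Let terraform destroy succeed even if backfill experiments left entity types behind.
  force_destroy = true

  labels = {
    team        = "risk-ml"
    environment = "sandbox"
  }
}
```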

Best practices
