Terraform Module: GCP Healthcare API — a HIPAA-ready dataset with FHIR, DICOM and HL7v2 stores in one wrapper

Quick take: Provision a Google Cloud Healthcare API dataset with FHIR R4, DICOM and HL7v2 stores, CMEK encryption, Pub/Sub notifications and BigQuery streaming using a reusable Terraform module for hashicorp/google ~> 5.0. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "google" {
  project = "my-project"
  region  = "us-central1"
}

module "healthcare_api" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-healthcare-api?ref=v1.0.0"

  project_id   = "..."  # GCP project ID hosting the dataset.
  dataset_name = "..."  # Dataset name; prefix for default store names.
  location     = "..."  # GCP region for the dataset (regions only, no multi-region).
}

Then run terraform init && terraform apply. Every other input has a sensible default; see Inputs below to override behaviour.

What this module is

The Google Cloud Healthcare API is a managed service for ingesting, storing, transforming and serving clinical data in the formats hospitals and payers actually use: FHIR (R4/STU3), DICOM (medical imaging) and HL7v2 (ADT, ORM, ORU messaging). Everything hangs off a single top-level container called a dataset (google_healthcare_dataset), which is regional, pins a time_zone for HL7v2/FHIR timestamp resolution, and acts as the IAM and encryption boundary for the FHIR, DICOM and HL7v2 stores created inside it.

Wiring this up by hand is repetitive and easy to get wrong: you have to enable the healthcare.googleapis.com service, create the dataset, then create one or more google_healthcare_fhir_store / google_healthcare_dicom_store / google_healthcare_hl7_v2_store resources, attach a CMEK key, configure Pub/Sub notifications for downstream pipelines, and set up per-store streaming to BigQuery for analytics. This module wraps all of that behind a handful of variables. You set enable_fhir = true and pick a fhir_version, and you get back a dataset plus a version-locked FHIR store with referential integrity, resource versioning and update-create semantics set to the module’s defaults, without copy-pasting 200 lines of HCL into every project.

When to use it

You are building a clinical data platform on GCP and need a consistent, audited way to stand up Healthcare API datasets per environment (dev / staging / prod) or per tenant.
You ingest HL7v2 feeds from an on-prem EHR (Epic, Cerner, MEDITECH) and want them landed in a dataset with Pub/Sub notifications driving a transform-to-FHIR pipeline.
You need DICOM storage for a PACS/imaging workload with a managed, de-identifiable backing store instead of running your own.
You require CMEK (customer-managed encryption keys) and BigQuery streaming for compliance and analytics, and want those toggles enforced as code rather than clicked in the console.
You want a single module that other Terraform (de-identification jobs, IAM bindings, BigQuery datasets) can reference by output.

If you only ever need a bare dataset with no stores, a raw resource is fine — the module earns its keep once FHIR/DICOM/HL7v2 stores, CMEK and streaming enter the picture.
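For comparison, the bare-dataset case mentioned above needs only a single resource. A minimal sketch, assuming placeholder project and dataset names (both hypothetical, not taken from the module):

```hcl
# Bare Healthcare API dataset: no stores, no CMEK, no streaming.
# "my-project" and "scratch-dataset" are placeholder values.
resource "google_healthcare_dataset" "bare" {
  project   = "my-project"
  name      = "scratch-dataset"
  location  = "us-central1"
  time_zone = "UTC"
}
```

Once a store, a key or a notification topic is added to this, the wiring grows quickly, which is the point at which the module becomes the shorter option.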

Module structure

terraform-module-gcp-healthcare-api/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf

versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

main.tf

locals {
  # FHIR stores stream to BigQuery only when a dataset target is supplied.
  fhir_streaming_enabled = var.enable_fhir && var.fhir_stream_bigquery_dataset != null
}

# Ensure the Healthcare API is enabled before any resource is created.
resource "google_project_service" "healthcare" {
  count   = var.manage_api_enablement ? 1 : 0
  project = var.project_id
  service = "healthcare.googleapis.com"

  disable_on_destroy = false
}

# Top-level dataset: the regional, IAM and encryption boundary for all stores.
resource "google_healthcare_dataset" "this" {
  project   = var.project_id
  name      = var.dataset_name
  location  = var.location
  time_zone = var.time_zone

  dynamic "encryption_spec" {
    for_each = var.kms_key_name != null ? [1] : []
    content {
      kms_key_name = var.kms_key_name
    }
  }

  depends_on = [google_project_service.healthcare]
}

# ---------- FHIR ----------
resource "google_healthcare_fhir_store" "this" {
  count   = var.enable_fhir ? 1 : 0
  name    = coalesce(var.fhir_store_name, "${var.dataset_name}-fhir")
  dataset = google_healthcare_dataset.this.id
  version = var.fhir_version

  enable_update_create          = var.fhir_enable_update_create
  disable_referential_integrity = var.fhir_disable_referential_integrity
  disable_resource_versioning   = var.fhir_disable_resource_versioning
  enable_history_import         = false

  complex_data_type_reference_parsing = "ENABLED"

  dynamic "notification_configs" {
    for_each = var.fhir_pubsub_topic != null ? [1] : []
    content {
      pubsub_topic                     = var.fhir_pubsub_topic
      send_full_resource               = true
      send_previous_resource_on_delete = false
    }
  }

  dynamic "stream_configs" {
    for_each = local.fhir_streaming_enabled ? [1] : []
    content {
      resource_types = var.fhir_stream_resource_types
      bigquery_destination {
        dataset_uri = var.fhir_stream_bigquery_dataset
        schema_config {
          schema_type               = "ANALYTICS_V2"
          recursive_structure_depth = 3
        }
      }
    }
  }

  labels = var.labels
}

# ---------- DICOM ----------
resource "google_healthcare_dicom_store" "this" {
  count   = var.enable_dicom ? 1 : 0
  name    = coalesce(var.dicom_store_name, "${var.dataset_name}-dicom")
  dataset = google_healthcare_dataset.this.id

  dynamic "notification_config" {
    for_each = var.dicom_pubsub_topic != null ? [1] : []
    content {
      pubsub_topic = var.dicom_pubsub_topic
    }
  }

  labels = var.labels
}

# ---------- HL7v2 ----------
resource "google_healthcare_hl7_v2_store" "this" {
  count   = var.enable_hl7v2 ? 1 : 0
  name    = coalesce(var.hl7v2_store_name, "${var.dataset_name}-hl7v2")
  dataset = google_healthcare_dataset.this.id

  parser_config {
    version           = "V3"
    allow_null_header = false
  }

  reject_duplicate_message = var.hl7v2_reject_duplicate_message

  dynamic "notification_configs" {
    for_each = var.hl7v2_pubsub_topic != null ? [1] : []
    content {
      pubsub_topic = var.hl7v2_pubsub_topic
      filter       = var.hl7v2_notification_filter
    }
  }

  labels = var.labels
}

variables.tf

variable "project_id" {
  description = "GCP project ID that will host the Healthcare dataset."
  type        = string
}

variable "dataset_name" {
  description = "Name of the Healthcare dataset. Used as the prefix for default store names."
  type        = string

  validation {
    condition     = can(regex("^[a-zA-Z0-9][a-zA-Z0-9_-]{0,253}$", var.dataset_name))
    error_message = "dataset_name must be 1-254 chars: letters, digits, underscores or hyphens, starting alphanumeric."
  }
}

variable "location" {
  description = "Region for the dataset (e.g. us-central1, europe-west2, asia-south1). Must be a region the Healthcare API supports."
  type        = string

  validation {
    condition     = length(regexall("-", var.location)) >= 1
    error_message = "location must be a GCP region such as us-central1; this module accepts regional locations only."
  }
}

variable "time_zone" {
  description = "IANA time zone used to resolve HL7v2/FHIR timestamps without an explicit offset (e.g. Asia/Kolkata, UTC)."
  type        = string
  default     = "UTC"
}

variable "manage_api_enablement" {
  description = "Whether this module should enable healthcare.googleapis.com on the project."
  type        = bool
  default     = true
}

variable "kms_key_name" {
  description = "Full resource ID of a Cloud KMS CryptoKey for CMEK at-rest encryption. Null uses Google-managed keys."
  type        = string
  default     = null

  validation {
    condition     = var.kms_key_name == null || can(regex("^projects/.+/locations/.+/keyRings/.+/cryptoKeys/.+$", var.kms_key_name))
    error_message = "kms_key_name must be a full CryptoKey ID: projects/P/locations/L/keyRings/R/cryptoKeys/K."
  }
}

variable "labels" {
  description = "Labels applied to every store created by the module."
  type        = map(string)
  default     = {}
}

# ---------- FHIR ----------
variable "enable_fhir" {
  description = "Create a FHIR store inside the dataset."
  type        = bool
  default     = true
}

variable "fhir_store_name" {
  description = "Override name for the FHIR store. Defaults to <dataset_name>-fhir."
  type        = string
  default     = null
}

variable "fhir_version" {
  description = "FHIR specification version for the store."
  type        = string
  default     = "R4"

  validation {
    condition     = contains(["DSTU2", "STU3", "R4"], var.fhir_version)
    error_message = "fhir_version must be one of DSTU2, STU3 or R4."
  }
}

variable "fhir_enable_update_create" {
  description = "Allow creating a resource via an update (PUT) to a not-yet-existing ID. Common for migrations."
  type        = bool
  default     = true
}

variable "fhir_disable_referential_integrity" {
  description = "Disable referential integrity checks. Keep false in production to enforce valid references."
  type        = bool
  default     = false
}

variable "fhir_disable_resource_versioning" {
  description = "Disable resource version history. Keep false to retain an audit trail of changes."
  type        = bool
  default     = false
}

variable "fhir_pubsub_topic" {
  description = "Pub/Sub topic ID notified on FHIR resource changes (projects/P/topics/T). Null disables notifications."
  type        = string
  default     = null
}

variable "fhir_stream_bigquery_dataset" {
  description = "BigQuery dataset URI (bq://project.dataset) to stream FHIR changes into. Null disables streaming."
  type        = string
  default     = null
}

variable "fhir_stream_resource_types" {
  description = "FHIR resource types to stream to BigQuery. Empty list streams all types."
  type        = list(string)
  default     = []
}

# ---------- DICOM ----------
variable "enable_dicom" {
  description = "Create a DICOM store inside the dataset."
  type        = bool
  default     = false
}

variable "dicom_store_name" {
  description = "Override name for the DICOM store. Defaults to <dataset_name>-dicom."
  type        = string
  default     = null
}

variable "dicom_pubsub_topic" {
  description = "Pub/Sub topic ID notified when DICOM instances are stored. Null disables notifications."
  type        = string
  default     = null
}

# ---------- HL7v2 ----------
variable "enable_hl7v2" {
  description = "Create an HL7v2 store inside the dataset."
  type        = bool
  default     = false
}

variable "hl7v2_store_name" {
  description = "Override name for the HL7v2 store. Defaults to <dataset_name>-hl7v2."
  type        = string
  default     = null
}

variable "hl7v2_reject_duplicate_message" {
  description = "Reject HL7v2 messages whose raw bytes match a message already ingested into this store."
  type        = bool
  default     = true
}

variable "hl7v2_pubsub_topic" {
  description = "Pub/Sub topic ID notified on HL7v2 message ingestion (projects/P/topics/T). Null disables notifications."
  type        = string
  default     = null
}

variable "hl7v2_notification_filter" {
  description = "Filter expression limiting which HL7v2 messages trigger notifications (e.g. by message type)."
  type        = string
  default     = ""
}

outputs.tf

output "dataset_id" {
  description = "Fully qualified dataset ID (projects/P/locations/L/datasets/D)."
  value       = google_healthcare_dataset.this.id
}

output "dataset_name" {
  description = "Short name of the dataset."
  value       = google_healthcare_dataset.this.name
}

output "dataset_self_link" {
  description = "Server-assigned self link for the dataset."
  value       = google_healthcare_dataset.this.self_link
}

output "fhir_store_id" {
  description = "ID of the FHIR store, or null when FHIR is disabled."
  value       = try(google_healthcare_fhir_store.this[0].id, null)
}

output "fhir_store_name" {
  description = "Name of the FHIR store, or null when FHIR is disabled."
  value       = try(google_healthcare_fhir_store.this[0].name, null)
}

output "dicom_store_id" {
  description = "ID of the DICOM store, or null when DICOM is disabled."
  value       = try(google_healthcare_dicom_store.this[0].id, null)
}

output "hl7v2_store_id" {
  description = "ID of the HL7v2 store, or null when HL7v2 is disabled."
  value       = try(google_healthcare_hl7_v2_store.this[0].id, null)
}

How to use it

Stand up a production dataset in Mumbai (asia-south1) with CMEK, a FHIR R4 store that sends Pub/Sub notifications and streams to BigQuery, and an HL7v2 store that receives an EHR feed and publishes its own Pub/Sub notifications:

module "healthcare_api" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-healthcare-api?ref=v1.0.0"

  project_id   = "kv-clinical-prod"
  dataset_name = "patient-records"
  location     = "asia-south1"
  time_zone    = "Asia/Kolkata"

  kms_key_name = google_kms_crypto_key.healthcare.id

  # FHIR R4, streamed to BigQuery for analytics
  enable_fhir                  = true
  fhir_version                 = "R4"
  fhir_pubsub_topic            = google_pubsub_topic.fhir_changes.id
  fhir_stream_bigquery_dataset = "bq://kv-clinical-prod.fhir_analytics"
  fhir_stream_resource_types   = ["Patient", "Observation", "Encounter"]

  # HL7v2 feed from the on-prem EHR
  enable_hl7v2                   = true
  hl7v2_pubsub_topic             = google_pubsub_topic.hl7_inbound.id
  hl7v2_reject_duplicate_message = true

  labels = {
    env        = "prod"
    compliance = "hipaa"
    owner      = "platform-clinical"
  }
}

# Downstream: grant a de-identification service account read access to the FHIR store
resource "google_healthcare_fhir_store_iam_member" "deid_reader" {
  fhir_store_id = module.healthcare_api.fhir_store_id
  role          = "roles/healthcare.fhirResourceReader"
  member        = "serviceAccount:${google_service_account.deid_pipeline.email}"
}

The fhir_store_id output flows straight into a google_healthcare_fhir_store_iam_member so the IAM binding is always scoped to the exact store this module created — no hardcoded resource paths.
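The example above references a KMS key, two Pub/Sub topics and a service account without defining them. The following is a hedged sketch of what those supporting resources might look like; every name and ID in it is an illustrative assumption, and for brevity the key sits in the same project rather than in a dedicated security project. It also grants the Cloud Healthcare service agent the roles it needs to use the key and to publish notifications.

```hcl
# Supporting resources for the module call above.
# All names and IDs are illustrative assumptions.

data "google_project" "this" {
  project_id = "kv-clinical-prod"
}

locals {
  # Cloud Healthcare service agent for the project.
  healthcare_agent = "serviceAccount:service-${data.google_project.this.number}@gcp-sa-healthcare.iam.gserviceaccount.com"
}

# CMEK key ring in the same location as the dataset.
resource "google_kms_key_ring" "healthcare" {
  project  = "kv-clinical-prod"
  name     = "healthcare-keyring"
  location = "asia-south1"
}

resource "google_kms_crypto_key" "healthcare" {
  name     = "healthcare-dataset-key"
  key_ring = google_kms_key_ring.healthcare.id
}

# Let the service agent encrypt and decrypt with the key.
resource "google_kms_crypto_key_iam_member" "healthcare_agent" {
  crypto_key_id = google_kms_crypto_key.healthcare.id
  role          = "roles/cloudkms.cryptoKeyEncrypterDecrypter"
  member        = local.healthcare_agent
}

# Notification topics referenced by the module inputs.
resource "google_pubsub_topic" "fhir_changes" {
  project = "kv-clinical-prod"
  name    = "fhir-changes"
}

resource "google_pubsub_topic" "hl7_inbound" {
  project = "kv-clinical-prod"
  name    = "hl7-inbound"
}

# Publish rights on both topics for the service agent.
resource "google_pubsub_topic_iam_member" "healthcare_publisher" {
  for_each = {
    fhir = google_pubsub_topic.fhir_changes.id
    hl7  = google_pubsub_topic.hl7_inbound.id
  }

  topic  = each.value
  role   = "roles/pubsub.publisher"
  member = local.healthcare_agent
}

# Service account used by the downstream de-identification pipeline.
resource "google_service_account" "deid_pipeline" {
  project      = "kv-clinical-prod"
  account_id   = "deid-pipeline"
  display_name = "De-identification pipeline"
}
```

The KMS grant has to exist before the dataset is created, and the service agent itself only appears once the Healthcare API has been enabled on the project. One way to enforce the ordering is a depends_on on the module block that points at google_kms_crypto_key_iam_member.healthcare_agent.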

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config — live/terragrunt.hcl (inherited by every module):

remote_state {
  backend  = "gcs"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...GCS state bucket + prefix per path...
  }
}

2. Module config — live/prod/healthcare_api/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-healthcare-api?ref=v1.0.0"
}

inputs = {
  project_id   = "..."
  dataset_name = "..."
  location     = "..."
}

3. Deploy this module alone, or roll out every module in the environment (each command is run from the repository root):

cd live/prod/healthcare_api && terragrunt apply   # this module only
cd live/prod && terragrunt run-all apply          # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
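The root config shown in step 1 carries only the remote_state block. To keep the provider in one place as well, a Terragrunt generate block can write a provider.tf into each module's working directory. A minimal sketch, assuming placeholder project and region values:

```hcl
# live/terragrunt.hcl (addition): generate a shared provider block.
# The project and region values are placeholders.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "google" {
  project = "my-project"
  region  = "us-central1"
}
EOF
}
```

With if_exists = "overwrite_terragrunt", Terragrunt replaces only files it generated itself and errors out if a hand-written provider.tf is already present.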

Inputs

Name	Type	Default	Required	Description
`project_id`	string	—	yes	GCP project ID hosting the dataset.
`dataset_name`	string	—	yes	Dataset name; prefix for default store names.
`location`	string	—	yes	GCP region for the dataset (regions only, no multi-region).
`time_zone`	string	`"UTC"`	no	IANA time zone for resolving HL7v2/FHIR timestamps.
`manage_api_enablement`	bool	`true`	no	Enable `healthcare.googleapis.com` from the module.
`kms_key_name`	string	`null`	no	Full CryptoKey ID for CMEK; null uses Google-managed keys.
`labels`	map(string)	`{}`	no	Labels applied to every store.
`enable_fhir`	bool	`true`	no	Create a FHIR store.
`fhir_store_name`	string	`null`	no	Override FHIR store name.
`fhir_version`	string	`"R4"`	no	FHIR version: `DSTU2`, `STU3` or `R4`.
`fhir_enable_update_create`	bool	`true`	no	Allow create-via-update (PUT to new ID).
`fhir_disable_referential_integrity`	bool	`false`	no	Disable referential integrity checks.
`fhir_disable_resource_versioning`	bool	`false`	no	Disable FHIR resource version history.
`fhir_pubsub_topic`	string	`null`	no	Pub/Sub topic for FHIR change notifications.
`fhir_stream_bigquery_dataset`	string	`null`	no	BigQuery dataset URI to stream FHIR changes into.
`fhir_stream_resource_types`	list(string)	`[]`	no	FHIR resource types to stream (empty = all).
`enable_dicom`	bool	`false`	no	Create a DICOM store.
`dicom_store_name`	string	`null`	no	Override DICOM store name.
`dicom_pubsub_topic`	string	`null`	no	Pub/Sub topic for DICOM instance notifications.
`enable_hl7v2`	bool	`false`	no	Create an HL7v2 store.
`hl7v2_store_name`	string	`null`	no	Override HL7v2 store name.
`hl7v2_reject_duplicate_message`	bool	`true`	no	Reject HL7v2 messages whose raw bytes match one already in the store.
`hl7v2_pubsub_topic`	string	`null`	no	Pub/Sub topic for HL7v2 ingestion notifications.
`hl7v2_notification_filter`	string	`""`	no	Filter limiting which HL7v2 messages notify.

Outputs

Name	Description
`dataset_id`	Fully qualified dataset ID (`projects/P/locations/L/datasets/D`).
`dataset_name`	Short name of the dataset.
`dataset_self_link`	Server-assigned self link for the dataset.
`fhir_store_id`	FHIR store ID, or `null` when FHIR is disabled.
`fhir_store_name`	FHIR store name, or `null` when FHIR is disabled.
`dicom_store_id`	DICOM store ID, or `null` when DICOM is disabled.
`hl7v2_store_id`	HL7v2 store ID, or `null` when HL7v2 is disabled.

Enterprise scenario

A national diagnostics chain runs an HL7v2 feed from each hospital’s lab system into a single Healthcare API dataset in asia-south1. The module provisions the dataset with a CMEK key held in the security team’s KMS project, an HL7v2 store with hl7v2_reject_duplicate_message enabled so that byte-identical retransmissions from flaky hospital links are rejected, and a FHIR R4 store streaming Patient, Observation and DiagnosticReport resources into BigQuery. A Cloud Run transform service converts inbound ORU messages to FHIR, and the BigQuery stream powers a Looker turnaround-time dashboard for the clinical operations team. The whole stack is reproducible per region by re-instantiating the module with a different location and dataset_name.
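One way to express the per-region re-instantiation described above is a for_each on the module block. This is a sketch under stated assumptions: the region keys, dataset names, BigQuery URIs and the two input variables are hypothetical, and the notification filter narrows Pub/Sub traffic to ORU messages for the transform service.

```hcl
# One module instance per region. Region keys, dataset names, BigQuery
# URIs and both input variables are illustrative assumptions.
variable "kms_keys_by_region" {
  description = "Map of region to CryptoKey ID; each key must live in the dataset's location."
  type        = map(string)
}

variable "hl7_topic" {
  description = "Pub/Sub topic ID (projects/P/topics/T) consumed by the transform service."
  type        = string
}

locals {
  lab_regions = {
    "asia-south1" = {
      dataset_name = "labs-in"
      time_zone    = "Asia/Kolkata"
      bq_dataset   = "bq://kv-clinical-prod.labs_in_fhir"
    }
    "europe-west2" = {
      dataset_name = "labs-uk"
      time_zone    = "Europe/London"
      bq_dataset   = "bq://kv-clinical-prod.labs_uk_fhir"
    }
  }
}

module "lab_datasets" {
  source   = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-gcp-healthcare-api?ref=v1.0.0"
  for_each = local.lab_regions

  project_id   = "kv-clinical-prod"
  dataset_name = each.value.dataset_name
  location     = each.key
  time_zone    = each.value.time_zone
  kms_key_name = var.kms_keys_by_region[each.key]

  # FHIR R4 store streaming the three resource types from the scenario.
  enable_fhir                  = true
  fhir_stream_bigquery_dataset = each.value.bq_dataset
  fhir_stream_resource_types   = ["Patient", "Observation", "DiagnosticReport"]

  # HL7v2 store that notifies only on ORU (result) messages.
  enable_hl7v2                   = true
  hl7v2_reject_duplicate_message = true
  hl7v2_pubsub_topic             = var.hl7_topic
  hl7v2_notification_filter      = "message_type = \"ORU\""
}
```

Outputs are then addressed per instance, for example module.lab_datasets["asia-south1"].fhir_store_id.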

Best practices

Always set CMEK in regulated environments. Pass kms_key_name so PHI is encrypted with a key you control and can revoke; keep the key in a dedicated security project with its own IAM. Google-managed keys are fine for non-PHI dev datasets only.
Never disable referential integrity or versioning in production. Leave fhir_disable_referential_integrity and fhir_disable_resource_versioning at false — versioning is your audit trail, and integrity checks stop dangling references that break downstream de-identification and export jobs.
Pick the region deliberately and never change it. Datasets are regional and cannot be moved; choose a region close to your EHRs and inside your data-residency boundary (e.g. asia-south1 for India PHI), because relocating means an export/re-import migration.
Scope IAM at the store, not the project. Bind roles such as roles/healthcare.fhirResourceReader or roles/healthcare.dicomViewer on individual stores using the module’s outputs, so a de-identification or imaging service account never gets blanket access to every dataset in the project.
Stream to BigQuery instead of polling FHIR. Use fhir_stream_bigquery_dataset with a constrained fhir_stream_resource_types list for analytics: streaming pushes each change to BigQuery as it happens, which typically beats periodic search or export jobs on latency and keeps analytical load off the operational store.
Name and label for compliance reporting. Keep the <dataset_name>-<store> convention and apply compliance = "hipaa" plus env labels on every store so cost, access-review and audit tooling can filter PHI workloads by label.
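Two of the practices above, store-scoped IAM and BigQuery streaming, translate into a handful of IAM resources. The sketch below assumes a module call named healthcare_api with enable_dicom = true and FHIR streaming into a BigQuery dataset called fhir_analytics in project kv-clinical-prod; the imaging service account is hypothetical.

```hcl
# Assumptions: module "healthcare_api" exists with enable_dicom = true and
# streams FHIR into BigQuery dataset "fhir_analytics" in "kv-clinical-prod".

data "google_project" "clinical" {
  project_id = "kv-clinical-prod"
}

locals {
  # Cloud Healthcare service agent, which performs the BigQuery streaming.
  healthcare_service_agent = "serviceAccount:service-${data.google_project.clinical.number}@gcp-sa-healthcare.iam.gserviceaccount.com"
}

# Hypothetical consumer of imaging data.
resource "google_service_account" "imaging" {
  project    = "kv-clinical-prod"
  account_id = "imaging-viewer"
}

# Store-scoped read access to DICOM data; nothing is granted at project level.
resource "google_healthcare_dicom_store_iam_member" "imaging_viewer" {
  dicom_store_id = module.healthcare_api.dicom_store_id
  role           = "roles/healthcare.dicomViewer"
  member         = "serviceAccount:${google_service_account.imaging.email}"
}

# The service agent needs write access to the streaming target dataset...
resource "google_bigquery_dataset_iam_member" "healthcare_stream_writer" {
  project    = "kv-clinical-prod"
  dataset_id = "fhir_analytics"
  role       = "roles/bigquery.dataEditor"
  member     = local.healthcare_service_agent
}

# ...and permission to run BigQuery jobs in the project.
resource "google_project_iam_member" "healthcare_stream_jobs" {
  project = "kv-clinical-prod"
  role    = "roles/bigquery.jobUser"
  member  = local.healthcare_service_agent
}
```

The jobUser role is the one grant here made at project level, because BigQuery jobs are a project-scoped resource; the data access itself stays limited to the single dataset.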