Terraform Module: Azure Cognitive Services — private-by-default AI accounts with key vault wiring

Quick take — A reusable hashicorp/azurerm ~> 4.0 Terraform module for azurerm_cognitive_account: pick the kind/SKU, lock it behind private endpoints and network rules, disable local keys, and stash secrets in Key Vault. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

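provider "azurerm" {
  features {}
  # azurerm 4.x also needs a subscription: set subscription_id here
  # or export ARM_SUBSCRIPTION_ID before running terraform.
}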

module "cognitive_services" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"

  name                = "..."  # Account name; also the default custom subdomain, so it must be globally unique.
  resource_group_name = "..."  # Resource group for the account and child resources.
  location            = "..."  # Azure region; must support your chosen `kind`/models.
  kind                = "..."  # API surface (e.g. `OpenAI`, `AIServices`, `ComputerVision`, `FormRecognizer`, `SpeechServices`, `ContentSafety`).
}

Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.

What this module is

Azure Cognitive Services (now grouped under Azure AI Services) is the family of managed AI APIs: OpenAI, ComputerVision, TextAnalytics, SpeechServices, FormRecognizer (Document Intelligence), ContentSafety and a dozen more. All are provisioned through a single Terraform resource, azurerm_cognitive_account. The kind argument decides which API surface you get, and sku_name (F0 free, S0 standard) decides how the account bills. For OpenAI the account itself stays on S0; the GlobalStandard, DataZoneStandard and provisioned tiers are chosen per model on azurerm_cognitive_deployment.

The raw resource is deceptively simple, but a production-grade Cognitive account has a lot of footguns: it ships with a public endpoint and two live access keys by default, the custom_subdomain_name is mandatory the moment you want Azure AD (Entra) token auth or a private endpoint, and the subdomain must be globally unique. This module wraps all of that so every AI account in your estate is born the same way — custom subdomain set, public network access off, local key auth disabled in favour of Entra RBAC, a system-assigned managed identity attached, an optional private endpoint, and (when you do need keys) the primary/secondary keys pushed straight into Key Vault instead of leaking into state outputs.

When to use it

You are standing up Azure OpenAI and need the account, a custom subdomain, and private networking before any azurerm_cognitive_deployment (GPT-4o, embeddings) can be attached.
You run many Cognitive accounts of different kinds (Vision for one app, Document Intelligence for another, Speech for a third) and want one consistent, audited shape instead of hand-rolled resources.
You have a “no public endpoints” or “no local keys” policy (Azure Policy / landing-zone guardrail) and need the module to enforce public_network_access_enabled = false and local_auth_enabled = false by default.
You want secrets handled correctly: keys (if used at all) land in Key Vault, never in a plaintext terraform output.

If you only need a throwaway F0 account for a spike, the bare resource is fine — reach for this module when the account is long-lived, networked, or governed.
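
For comparison, the bare resource for such a spike might look like the sketch below. The resource labels, names and region are placeholders rather than anything this module defines, and TextAnalytics is just one kind that offers an F0 tier:

resource "azurerm_resource_group" "spike" {
  name     = "rg-ai-spike" # hypothetical name
  location = "westeurope"  # hypothetical region
}

resource "azurerm_cognitive_account" "spike" {
  name                = "ai-spike-textanalytics" # hypothetical name
  resource_group_name = azurerm_resource_group.spike.name
  location            = azurerm_resource_group.spike.location

  kind     = "TextAnalytics"
  sku_name = "F0"

  # Nothing else is set, so the provider defaults apply:
  # public endpoint reachable and local (key) auth enabled.
}

That is the posture the module exists to avoid for anything long-lived.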

Module structure

terraform-module-azure-cognitive-services/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf

# versions.tf
terraform {
  required_version = ">= 1.6.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}

# main.tf

locals {
  # custom_subdomain_name must be globally unique and is REQUIRED for
  # AAD/token auth and for private endpoints. Default to the account name.
  custom_subdomain = coalesce(var.custom_subdomain_name, var.name)

  # Only push keys to Key Vault when both a vault is supplied AND local
  # auth is actually enabled (no keys exist when local_auth is disabled).
  store_keys = var.key_vault_id != null && var.local_auth_enabled
}

resource "azurerm_cognitive_account" "this" {
  name                = var.name
  resource_group_name = var.resource_group_name
  location            = var.location

  kind     = var.kind
  sku_name = var.sku_name

  custom_subdomain_name         = local.custom_subdomain
  public_network_access_enabled = var.public_network_access_enabled
  local_auth_enabled            = var.local_auth_enabled
  outbound_network_access_restricted = var.outbound_network_access_restricted

  # Managed identity; the block is omitted entirely when identity_type is null.
  dynamic "identity" {
    for_each = var.identity_type == null ? [] : [1]
    content {
      type         = var.identity_type
      identity_ids = var.identity_type == "UserAssigned" || var.identity_type == "SystemAssigned, UserAssigned" ? var.user_assigned_identity_ids : null
    }
  }

  # Network ACLs only make sense when the public endpoint is on but locked
  # to specific IPs/subnets, OR when a private endpoint needs a default deny.
  dynamic "network_acls" {
    for_each = var.network_acls == null ? [] : [var.network_acls]
    content {
      default_action = network_acls.value.default_action
      ip_rules       = network_acls.value.ip_rules

      dynamic "virtual_network_rules" {
        for_each = network_acls.value.virtual_network_subnet_ids
        content {
          subnet_id = virtual_network_rules.value
        }
      }
    }
  }

  # Customer-managed key encryption (BYOK) for compliance estates.
  dynamic "customer_managed_key" {
    for_each = var.customer_managed_key == null ? [] : [var.customer_managed_key]
    content {
      key_vault_key_id   = customer_managed_key.value.key_vault_key_id
      identity_client_id = customer_managed_key.value.identity_client_id
    }
  }

  tags = var.tags

  lifecycle {
    # Nothing is ignored today. The subdomain is immutable, so changing it
    # forces a destroy/recreate; this block does not prevent that.
    ignore_changes = []
  }
}

# Optional private endpoint so the account is reachable only over the VNet.
resource "azurerm_private_endpoint" "this" {
  count = var.private_endpoint == null ? 0 : 1

  name                = coalesce(var.private_endpoint.name, "${var.name}-pe")
  resource_group_name = var.resource_group_name
  location            = var.location
  subnet_id           = var.private_endpoint.subnet_id

  private_service_connection {
    name                           = "${var.name}-psc"
    private_connection_resource_id = azurerm_cognitive_account.this.id
    subresource_names              = ["account"]
    is_manual_connection           = false
  }

  dynamic "private_dns_zone_group" {
    for_each = length(var.private_endpoint.private_dns_zone_ids) == 0 ? [] : [1]
    content {
      name                 = "default"
      private_dns_zone_ids = var.private_endpoint.private_dns_zone_ids
    }
  }

  tags = var.tags
}

# Stash the live keys in Key Vault (only when local auth is enabled).
resource "azurerm_key_vault_secret" "primary_key" {
  count = local.store_keys ? 1 : 0

  name         = "${var.name}-primary-key"
  value        = azurerm_cognitive_account.this.primary_access_key
  key_vault_id = var.key_vault_id
  content_type = "cognitive-services-key"
  tags         = var.tags
}

resource "azurerm_key_vault_secret" "secondary_key" {
  count = local.store_keys ? 1 : 0

  name         = "${var.name}-secondary-key"
  value        = azurerm_cognitive_account.this.secondary_access_key
  key_vault_id = var.key_vault_id
  content_type = "cognitive-services-key"
  tags         = var.tags
}

# variables.tf

variable "name" {
  type        = string
  description = "Name of the Cognitive Services account. Also used as the default custom subdomain, so it must be globally unique."

  validation {
    condition     = can(regex("^[a-zA-Z0-9][a-zA-Z0-9-]{1,62}[a-zA-Z0-9]$", var.name))
    error_message = "name must be 3-64 chars, alphanumeric or hyphens, and start/end with an alphanumeric character."
  }
}

variable "resource_group_name" {
  type        = string
  description = "Resource group that will hold the account and its child resources."
}

variable "location" {
  type        = string
  description = "Azure region for the account (e.g. swedencentral, eastus2). Pick a region where your chosen kind/models are available."
}

variable "kind" {
  type        = string
  description = "The API surface to provision (decides which Cognitive Service you get)."

  validation {
    condition = contains([
      "OpenAI", "AIServices", "ComputerVision", "CustomVision.Training",
      "CustomVision.Prediction", "TextAnalytics", "Language", "SpeechServices",
      "FormRecognizer", "ContentSafety", "Face", "ContentModerator",
      "TextTranslation", "HealthInsights", "MetricsAdvisor", "Personalizer"
    ], var.kind)
    error_message = "kind must be a supported azurerm_cognitive_account kind (e.g. OpenAI, AIServices, ComputerVision, FormRecognizer, SpeechServices, ContentSafety)."
  }
}

variable "sku_name" {
  type        = string
  default     = "S0"
  description = "Pricing tier of the account. F0 = free (one per subscription per kind), S0 = standard. OpenAI accounts use S0; GlobalStandard, DataZoneStandard, ProvisionedManaged, etc. are deployment SKUs set on azurerm_cognitive_deployment."
}

variable "custom_subdomain_name" {
  type        = string
  default     = null
  description = "Custom subdomain (globally unique). Required for AAD token auth and private endpoints. Defaults to var.name when null."
}

variable "public_network_access_enabled" {
  type        = bool
  default     = false
  description = "Whether the public endpoint is reachable. Defaults to false (private-by-default); set true only with network_acls or for dev."
}

variable "local_auth_enabled" {
  type        = bool
  default     = false
  description = "Whether API-key (local) auth is allowed. Defaults to false so callers must use Entra ID tokens + RBAC. When false, no keys are stored in Key Vault."
}

variable "outbound_network_access_restricted" {
  type        = bool
  default     = false
  description = "Restrict the account's outbound network access (e.g. for OpenAI 'on your data' egress control)."
}

variable "identity_type" {
  type        = string
  default     = "SystemAssigned"
  description = "Managed identity type: SystemAssigned, UserAssigned, 'SystemAssigned, UserAssigned', or null for none."

  validation {
    condition = var.identity_type == null || contains([
      "SystemAssigned", "UserAssigned", "SystemAssigned, UserAssigned"
    ], var.identity_type)
    error_message = "identity_type must be one of: SystemAssigned, UserAssigned, 'SystemAssigned, UserAssigned', or null."
  }
}

variable "user_assigned_identity_ids" {
  type        = list(string)
  default     = []
  description = "User-assigned identity resource IDs (required when identity_type includes UserAssigned)."
}

variable "network_acls" {
  type = object({
    default_action             = string
    ip_rules                   = optional(list(string), [])
    virtual_network_subnet_ids = optional(list(string), [])
  })
  default     = null
  description = "Optional network rules. default_action is Allow or Deny; combine Deny with ip_rules/subnet_ids to allow-list."

  validation {
    condition     = var.network_acls == null || contains(["Allow", "Deny"], try(var.network_acls.default_action, ""))
    error_message = "network_acls.default_action must be either 'Allow' or 'Deny'."
  }
}

variable "private_endpoint" {
  type = object({
    name                 = optional(string)
    subnet_id            = string
    private_dns_zone_ids = optional(list(string), [])
  })
  default     = null
  description = "Optional private endpoint. private_dns_zone_ids should point at the privatelink.cognitiveservices.azure.com (or .openai.azure.com) zone."
}

variable "customer_managed_key" {
  type = object({
    key_vault_key_id   = string
    identity_client_id = optional(string)
  })
  default     = null
  description = "Optional BYOK encryption. key_vault_key_id is the versioned or versionless key ID; identity_client_id is the client ID of the UAMI used to reach the vault."
}

variable "key_vault_id" {
  type        = string
  default     = null
  description = "Key Vault to store the primary/secondary keys in. Only used when local_auth_enabled = true."
}

variable "tags" {
  type        = map(string)
  default     = {}
  description = "Tags applied to all resources created by the module."
}

# outputs.tf

output "id" {
  description = "Resource ID of the Cognitive Services account."
  value       = azurerm_cognitive_account.this.id
}

output "name" {
  description = "Name of the Cognitive Services account."
  value       = azurerm_cognitive_account.this.name
}

output "endpoint" {
  description = "The HTTPS endpoint used by SDKs/REST clients to reach the account."
  value       = azurerm_cognitive_account.this.endpoint
}

output "custom_subdomain_name" {
  description = "The custom subdomain assigned to the account (used in the token-auth endpoint)."
  value       = azurerm_cognitive_account.this.custom_subdomain_name
}

output "identity_principal_id" {
  description = "Principal ID of the system-assigned identity (null if no system identity)."
  value       = try(azurerm_cognitive_account.this.identity[0].principal_id, null)
}

output "private_endpoint_id" {
  description = "Resource ID of the private endpoint, if one was created."
  value       = try(azurerm_private_endpoint.this[0].id, null)
}

output "primary_key_secret_id" {
  description = "Key Vault secret ID holding the primary key (null when local auth is disabled or no vault supplied)."
  value       = try(azurerm_key_vault_secret.primary_key[0].id, null)
}

output "primary_access_key" {
  description = "Primary access key (sensitive, but still recorded in state). Empty when local_auth_enabled = false. Prefer primary_key_secret_id where possible."
  value       = azurerm_cognitive_account.this.primary_access_key
  sensitive   = true
}

How to use it

The example below provisions a private Azure OpenAI account with Entra-only auth (no keys), wires it to a VNet subnet via private endpoint, then references the module’s id output from a downstream azurerm_cognitive_deployment for GPT-4o.

module "cognitive_services" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"

  name                = "kv-prod-openai-swc"
  resource_group_name = azurerm_resource_group.ai.name
  location            = "swedencentral"

  kind     = "OpenAI"
  sku_name = "S0"

  # Private-by-default posture: no public endpoint, no local keys.
  public_network_access_enabled = false
  local_auth_enabled            = false

  identity_type = "SystemAssigned"

  private_endpoint = {
    subnet_id            = azurerm_subnet.privatelink.id
    private_dns_zone_ids = [azurerm_private_dns_zone.openai.id]
  }

  tags = {
    workload    = "rag-assistant"
    environment = "prod"
    costcenter  = "ml-platform"
  }
}

# Downstream: a model deployment hung off the account created by the module.
resource "azurerm_cognitive_deployment" "gpt4o" {
  name                 = "gpt-4o"
  cognitive_account_id = module.cognitive_services.id

  model {
    format  = "OpenAI"
    name    = "gpt-4o"
    version = "2024-08-06"
  }

  sku {
    name     = "DataZoneStandard"
    capacity = 50
  }
}

# Grant the app's managed identity data-plane access (Entra RBAC, no keys).
resource "azurerm_role_assignment" "app_openai_user" {
  scope                = module.cognitive_services.id
  role_definition_name = "Cognitive Services OpenAI User"
  principal_id         = azurerm_user_assigned_identity.app.principal_id
}
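
When a client cannot use Entra tokens yet, the same module can be switched to key auth with the keys parked in Key Vault. The following is a minimal sketch, assuming a vault named azurerm_key_vault.shared and the resource group azurerm_resource_group.ai exist in the calling configuration (the vault and the account name are hypothetical), and that the identity running Terraform is allowed to write secrets to that vault:

module "legacy_vision" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"

  name                = "kv-prod-vision-swc" # hypothetical account name
  resource_group_name = azurerm_resource_group.ai.name
  location            = "swedencentral"
  kind                = "ComputerVision"

  # Opt in to API keys; the module then writes both keys to the vault.
  local_auth_enabled = true
  key_vault_id       = azurerm_key_vault.shared.id # hypothetical vault
}

# Hand the app an App Service style Key Vault reference, not the key itself.
# primary_key_secret_id is null when local auth is off, so only build this
# string for accounts that actually store keys.
locals {
  vision_key_reference = "@Microsoft.KeyVault(SecretUri=${module.legacy_vision.primary_key_secret_id})"
}

Public network access still defaults to false in this variant, so the account also needs a private_endpoint (or an explicit network opening) before any client can reach it.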

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config — live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "azurerm"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...azurerm state storage account/container + key per path...
  }
}

2. Module config — live/prod/cognitive_services/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"
}

inputs = {
  name                = "..."
  resource_group_name = "..."
  location            = "..."
  kind                = "..."
}

3. Deploy one environment, or roll out all modules together:

cd live/prod/cognitive_services && terragrunt apply   # this module only
# or, starting again from the repository root:
cd live/prod && terragrunt run-all apply              # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
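
The root config shown earlier generates only the backend. One way to keep the provider in the same place is a Terragrunt generate block in that root file. This is a sketch, and supplying the subscription through the ARM_SUBSCRIPTION_ID environment variable is an assumption about how your pipeline passes credentials:

generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "azurerm" {
  features {}
  # subscription_id is taken from the ARM_SUBSCRIPTION_ID environment variable.
}
EOF
}

Terragrunt writes provider.tf into each module's working directory before running Terraform, so the module itself stays free of provider blocks.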

Inputs

Name	Type	Default	Required	Description
`name`	`string`	—	Yes	Account name; also the default custom subdomain, so it must be globally unique.
`resource_group_name`	`string`	—	Yes	Resource group for the account and child resources.
`location`	`string`	—	Yes	Azure region; must support your chosen `kind`/models.
`kind`	`string`	—	Yes	API surface (e.g. `OpenAI`, `AIServices`, `ComputerVision`, `FormRecognizer`, `SpeechServices`, `ContentSafety`).
`sku_name`	`string`	`"S0"`	No	Account pricing tier (`F0`, `S0`, …). `GlobalStandard`, `DataZoneStandard` and `ProvisionedManaged` are deployment SKUs, set on `azurerm_cognitive_deployment`.
`custom_subdomain_name`	`string`	`null`	No	Globally unique subdomain; required for AAD auth/private endpoints. Defaults to `name`.
`public_network_access_enabled`	`bool`	`false`	No	Whether the public endpoint is reachable.
`local_auth_enabled`	`bool`	`false`	No	Allow API-key auth. When `false`, no keys are written to Key Vault.
`outbound_network_access_restricted`	`bool`	`false`	No	Restrict the account’s outbound egress.
`identity_type`	`string`	`"SystemAssigned"`	No	`SystemAssigned`, `UserAssigned`, `"SystemAssigned, UserAssigned"`, or `null`.
`user_assigned_identity_ids`	`list(string)`	`[]`	No	UAMI resource IDs, required when `identity_type` includes `UserAssigned`.
`network_acls`	`object`	`null`	No	`default_action` (`Allow`/`Deny`) plus `ip_rules` and `virtual_network_subnet_ids`.
`private_endpoint`	`object`	`null`	No	`subnet_id` (required), optional `name`, and `private_dns_zone_ids`.
`customer_managed_key`	`object`	`null`	No	BYOK: `key_vault_key_id` and optional `identity_client_id`.
`key_vault_id`	`string`	`null`	No	Vault to store primary/secondary keys (only when `local_auth_enabled = true`).
`tags`	`map(string)`	`{}`	No	Tags applied to all created resources.

Outputs

Name	Description
`id`	Resource ID of the Cognitive Services account.
`name`	Name of the account.
`endpoint`	HTTPS endpoint used by SDKs/REST clients.
`custom_subdomain_name`	The custom subdomain assigned to the account.
`identity_principal_id`	Principal ID of the system-assigned identity (null if none).
`private_endpoint_id`	Resource ID of the private endpoint, if created.
`primary_key_secret_id`	Key Vault secret ID for the primary key (null when local auth is off).
`primary_access_key`	Primary access key (sensitive; empty when local auth is disabled).

Enterprise scenario

A financial-services platform team runs a RAG copilot over internal documents and must keep every AI call off the public internet for regulatory reasons. They consume this module once per region (swedencentral, eastus2) with kind = "OpenAI", public_network_access_enabled = false, local_auth_enabled = false, and a private endpoint into the shared hub’s privatelink.openai.azure.com zone — so the GPT-4o endpoint resolves only inside the VNet and every request carries an Entra token mapped to the Cognitive Services OpenAI User role. Because keys are disabled at the source, an Azure Policy “deny local auth” audit passes with zero exceptions, and the platform team can grant or revoke app access purely through role assignments, no secret rotation involved.
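
The per-region rollout described above could be expressed with for_each on the module call. This is a sketch only: the variable names, the account naming pattern and the single shared resource group are hypothetical, and each region is assumed to have its own private-endpoint subnet:

variable "openai_regions" {
  description = "Hypothetical input: one entry per region, keyed by region name."
  type = map(object({
    subnet_id = string # private-endpoint subnet in that region
  }))
}

variable "openai_private_dns_zone_id" {
  description = "Hypothetical input: ID of the hub's privatelink.openai.azure.com zone."
  type        = string
}

variable "resource_group_name" {
  description = "Hypothetical input: resource group holding the accounts."
  type        = string
}

module "openai" {
  for_each = var.openai_regions
  source   = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"

  name                = "fin-rag-openai-${each.key}" # hypothetical naming pattern
  resource_group_name = var.resource_group_name
  location            = each.key
  kind                = "OpenAI"

  # Both already default to false; repeated here so the intent is explicit.
  public_network_access_enabled = false
  local_auth_enabled            = false

  private_endpoint = {
    subnet_id            = each.value.subnet_id
    private_dns_zone_ids = [var.openai_private_dns_zone_id]
  }
}

Each instance is then addressed by its region key, for example module.openai["swedencentral"].id when attaching a deployment or a role assignment.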

Best practices

Disable local keys, use Entra RBAC. Keep local_auth_enabled = false and assign data-plane roles (Cognitive Services OpenAI User, Cognitive Services User) to managed identities. It removes an entire class of leaked-key incidents and the rotation burden that comes with them.
Always set the custom subdomain — and never change it. AAD auth and private endpoints both require it, and it is immutable; renaming forces a destroy/recreate that nukes any attached deployments. Let the module default it to the account name and pick the name carefully.
Lock the network at the source. Prefer public_network_access_enabled = false with a private endpoint; if you must keep the public endpoint, pair it with network_acls = { default_action = "Deny" } plus an explicit ip_rules / virtual_network_subnet_ids allow-list rather than leaving it open.
Watch the cost model per kind. F0 is one-per-subscription and rate-limited; S0 is pay-as-you-go. For OpenAI, the billing model is picked per deployment rather than per account: choose deliberately between token-based GlobalStandard/DataZoneStandard and capacity-reserved ProvisionedManaged (PTUs). PTUs bill whether or not you send traffic, so size deployment capacity to real TPM demand.
Tag for showback and pin data residency. Stamp workload/costcenter tags so AI spend is attributable, and deploy into a region that satisfies your data-residency requirement (use DataZoneStandard to keep OpenAI inference within a geography).
If you ever do store keys, store them in Key Vault only. Never expose primary_access_key through a non-sensitive output or a local-exec; consume the primary_key_secret_id reference and let the app read the secret at runtime.
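
As an illustration of the network advice above, keeping the public endpoint but allow-listing callers might look like the sketch below. The account name, azurerm_subnet.apps and the CIDR (a documentation range) are hypothetical, and azurerm_resource_group.ai is assumed to exist in the calling configuration:

module "speech_allowlisted" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"

  name                = "kv-dev-speech-swc" # hypothetical account name
  resource_group_name = azurerm_resource_group.ai.name
  location            = "swedencentral"
  kind                = "SpeechServices"

  # Public endpoint stays on, but everything not listed below is denied.
  public_network_access_enabled = true

  network_acls = {
    default_action = "Deny"
    # Example range only; replace with your real egress addresses.
    ip_rules = ["203.0.113.0/24"]
    # The subnet needs the Microsoft.CognitiveServices service endpoint.
    virtual_network_subnet_ids = [azurerm_subnet.apps.id]
  }
}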

Written by Vinod
