Terraform Module: Azure Cognitive Services — private-by-default AI accounts with key vault wiring

Quick take — A reusable hashicorp/azurerm ~> 4.0 Terraform module for azurerm_cognitive_account: pick the kind/SKU, lock it behind private endpoints and network rules, disable local keys, and stash secrets in Key Vault. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "azurerm" {
  features {}
}

module "cognitive_services" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"

  name                = "..."  # Account name; also the default custom subdomain, so it …
  resource_group_name = "..."  # Resource group for the account and child resources.
  location            = "..."  # Azure region; must support your chosen `kind`/models.
  kind                = "..."  # API surface (e.g. `OpenAI`, `AIServices`, `ComputerVisi…
}

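Then run terraform init && terraform apply. With azurerm 4.x the subscription ID must also be supplied, either as subscription_id in the provider block or through the ARM_SUBSCRIPTION_ID environment variable. Every other input has a sensible default; see Inputs below to override behaviour.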

What this module is

Azure Cognitive Services (now grouped under Azure AI Services) is the family of managed AI APIs, including OpenAI, ComputerVision, TextAnalytics, SpeechServices, FormRecognizer (Document Intelligence), ContentSafety and a dozen more, all provisioned through a single Terraform resource: azurerm_cognitive_account. The kind argument decides which API surface you get, and sku_name (F0 free, S0 standard) sets the account's pricing tier. For OpenAI, the data-zone, global and provisioned tiers are chosen per model deployment, in the sku block of azurerm_cognitive_deployment, not on the account.

The raw resource is deceptively simple, but a production-grade Cognitive account has a lot of footguns: it ships with a public endpoint and two live access keys by default, the custom_subdomain_name is mandatory the moment you want Azure AD (Entra) token auth or a private endpoint, and the subdomain must be globally unique. This module wraps all of that so every AI account in your estate is born the same way — custom subdomain set, public network access off, local key auth disabled in favour of Entra RBAC, a system-assigned managed identity attached, an optional private endpoint, and (when you do need keys) the primary/secondary keys pushed straight into Key Vault instead of leaking into state outputs.

When to use it

If you only need a throwaway F0 account for a spike, the bare resource is fine — reach for this module when the account is long-lived, networked, or governed.

Module structure

terraform-module-azure-cognitive-services/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf

# versions.tf
terraform {
  required_version = ">= 1.6.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}

# main.tf

locals {
  # custom_subdomain_name must be globally unique and is REQUIRED for
  # AAD/token auth and for private endpoints. Default to the account name.
  custom_subdomain = coalesce(var.custom_subdomain_name, var.name)

  # Only push keys to Key Vault when both a vault is supplied AND local
  # auth is actually enabled (no keys exist when local_auth is disabled).
  store_keys = var.key_vault_id != null && var.local_auth_enabled
}

resource "azurerm_cognitive_account" "this" {
  name                = var.name
  resource_group_name = var.resource_group_name
  location            = var.location

  kind     = var.kind
  sku_name = var.sku_name

  custom_subdomain_name         = local.custom_subdomain
  public_network_access_enabled = var.public_network_access_enabled
  local_auth_enabled            = var.local_auth_enabled
  outbound_network_access_restricted = var.outbound_network_access_restricted

  # Managed identity (system- and/or user-assigned); omitted entirely when
  # identity_type is null.
  dynamic "identity" {
    for_each = var.identity_type == null ? [] : [1]
    content {
      type         = var.identity_type
      identity_ids = var.identity_type == "UserAssigned" || var.identity_type == "SystemAssigned, UserAssigned" ? var.user_assigned_identity_ids : null
    }
  }

  # Network ACLs only make sense when the public endpoint is on but locked
  # to specific IPs/subnets, OR when a private endpoint needs a default deny.
  dynamic "network_acls" {
    for_each = var.network_acls == null ? [] : [var.network_acls]
    content {
      default_action = network_acls.value.default_action
      ip_rules       = network_acls.value.ip_rules

      dynamic "virtual_network_rules" {
        for_each = network_acls.value.virtual_network_subnet_ids
        content {
          subnet_id = virtual_network_rules.value
        }
      }
    }
  }

  # Customer-managed key encryption (BYOK) for compliance estates.
  dynamic "customer_managed_key" {
    for_each = var.customer_managed_key == null ? [] : [var.customer_managed_key]
    content {
      key_vault_key_id   = customer_managed_key.value.key_vault_key_id
      identity_client_id = customer_managed_key.value.identity_client_id
    }
  }

  tags = var.tags

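  lifecycle {
    # custom_subdomain_name is immutable, so changing it forces replacement
    # of the account (and the data it fronts).
    # NOTE: an empty ignore_changes list is a no-op. To actually block
    # accidental recreation, set `prevent_destroy = true` here instead (it
    # must be a literal, so it cannot be driven by a variable).
    ignore_changes = []
  }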
}

# Optional private endpoint so the account is reachable only over the VNet.
resource "azurerm_private_endpoint" "this" {
  count = var.private_endpoint == null ? 0 : 1

  name                = coalesce(var.private_endpoint.name, "${var.name}-pe")
  resource_group_name = var.resource_group_name
  location            = var.location
  subnet_id           = var.private_endpoint.subnet_id

  private_service_connection {
    name                           = "${var.name}-psc"
    private_connection_resource_id = azurerm_cognitive_account.this.id
    subresource_names              = ["account"]
    is_manual_connection           = false
  }

  dynamic "private_dns_zone_group" {
    for_each = length(var.private_endpoint.private_dns_zone_ids) == 0 ? [] : [1]
    content {
      name                 = "default"
      private_dns_zone_ids = var.private_endpoint.private_dns_zone_ids
    }
  }

  tags = var.tags
}

# Stash the live keys in Key Vault (only when local auth is enabled).
resource "azurerm_key_vault_secret" "primary_key" {
  count = local.store_keys ? 1 : 0

  name         = "${var.name}-primary-key"
  value        = azurerm_cognitive_account.this.primary_access_key
  key_vault_id = var.key_vault_id
  content_type = "cognitive-services-key"
  tags         = var.tags
}

resource "azurerm_key_vault_secret" "secondary_key" {
  count = local.store_keys ? 1 : 0

  name         = "${var.name}-secondary-key"
  value        = azurerm_cognitive_account.this.secondary_access_key
  key_vault_id = var.key_vault_id
  content_type = "cognitive-services-key"
  tags         = var.tags
}

# variables.tf

variable "name" {
  type        = string
  description = "Name of the Cognitive Services account. Also used as the default custom subdomain, so it must be globally unique."

  validation {
    condition     = can(regex("^[a-zA-Z0-9][a-zA-Z0-9-]{1,62}[a-zA-Z0-9]$", var.name))
    error_message = "name must be 3-64 chars, alphanumeric or hyphens, and start/end with an alphanumeric character."
  }
}

variable "resource_group_name" {
  type        = string
  description = "Resource group that will hold the account and its child resources."
}

variable "location" {
  type        = string
  description = "Azure region for the account (e.g. swedencentral, eastus2). Pick a region where your chosen kind/models are available."
}

variable "kind" {
  type        = string
  description = "The API surface to provision (decides which Cognitive Service you get)."

  validation {
    condition = contains([
      "OpenAI", "AIServices", "ComputerVision", "CustomVision.Training",
      "CustomVision.Prediction", "TextAnalytics", "Language", "SpeechServices",
      "FormRecognizer", "ContentSafety", "Face", "ContentModerator",
      "TextTranslation", "HealthInsights", "MetricsAdvisor", "Personalizer"
    ], var.kind)
    error_message = "kind must be a supported azurerm_cognitive_account kind (e.g. OpenAI, AIServices, ComputerVision, FormRecognizer, SpeechServices, ContentSafety)."
  }
}

variable "sku_name" {
  type        = string
  default     = "S0"
  description = "Account pricing tier. F0 = free (one per subscription per kind), S0 = standard. OpenAI deployment tiers (DataZoneStandard, GlobalStandard, ProvisionedManaged, etc.) are set per deployment on azurerm_cognitive_deployment, not here."
}

variable "custom_subdomain_name" {
  type        = string
  default     = null
  description = "Custom subdomain (globally unique). Required for AAD token auth and private endpoints. Defaults to var.name when null."
}

variable "public_network_access_enabled" {
  type        = bool
  default     = false
  description = "Whether the public endpoint is reachable. Defaults to false (private-by-default); set true only with network_acls or for dev."
}

variable "local_auth_enabled" {
  type        = bool
  default     = false
  description = "Whether API-key (local) auth is allowed. Defaults to false so callers must use Entra ID tokens + RBAC. When false, no keys are stored in Key Vault."
}

variable "outbound_network_access_restricted" {
  type        = bool
  default     = false
  description = "Restrict the account's outbound network access (e.g. for OpenAI 'on your data' egress control)."
}

variable "identity_type" {
  type        = string
  default     = "SystemAssigned"
  description = "Managed identity type: SystemAssigned, UserAssigned, 'SystemAssigned, UserAssigned', or null for none."

  validation {
    condition = var.identity_type == null || contains([
      "SystemAssigned", "UserAssigned", "SystemAssigned, UserAssigned"
    ], var.identity_type)
    error_message = "identity_type must be one of: SystemAssigned, UserAssigned, 'SystemAssigned, UserAssigned', or null."
  }
}

variable "user_assigned_identity_ids" {
  type        = list(string)
  default     = []
  description = "User-assigned identity resource IDs (required when identity_type includes UserAssigned)."
}

variable "network_acls" {
  type = object({
    default_action             = string
    ip_rules                   = optional(list(string), [])
    virtual_network_subnet_ids = optional(list(string), [])
  })
  default     = null
  description = "Optional network rules. default_action is Allow or Deny; combine Deny with ip_rules/subnet_ids to allow-list."

  validation {
    condition     = var.network_acls == null || contains(["Allow", "Deny"], try(var.network_acls.default_action, ""))
    error_message = "network_acls.default_action must be either 'Allow' or 'Deny'."
  }
}

variable "private_endpoint" {
  type = object({
    name                 = optional(string)
    subnet_id            = string
    private_dns_zone_ids = optional(list(string), [])
  })
  default     = null
  description = "Optional private endpoint. private_dns_zone_ids should point at the privatelink.cognitiveservices.azure.com zone (or privatelink.openai.azure.com for OpenAI accounts)."
}

variable "customer_managed_key" {
  type = object({
    key_vault_key_id   = string
    identity_client_id = optional(string)
  })
  default     = null
  description = "Optional BYOK encryption. key_vault_key_id is the versioned or versionless key ID; identity_client_id is the client ID of the UAMI used to reach the vault."
}

variable "key_vault_id" {
  type        = string
  default     = null
  description = "Key Vault to store the primary/secondary keys in. Only used when local_auth_enabled = true."
}

variable "tags" {
  type        = map(string)
  default     = {}
  description = "Tags applied to all resources created by the module."
}

# outputs.tf

output "id" {
  description = "Resource ID of the Cognitive Services account."
  value       = azurerm_cognitive_account.this.id
}

output "name" {
  description = "Name of the Cognitive Services account."
  value       = azurerm_cognitive_account.this.name
}

output "endpoint" {
  description = "The HTTPS endpoint used by SDKs/REST clients to reach the account."
  value       = azurerm_cognitive_account.this.endpoint
}

output "custom_subdomain_name" {
  description = "The custom subdomain assigned to the account (used in the token-auth endpoint)."
  value       = azurerm_cognitive_account.this.custom_subdomain_name
}

output "identity_principal_id" {
  description = "Principal ID of the system-assigned identity (null if no system identity)."
  value       = try(azurerm_cognitive_account.this.identity[0].principal_id, null)
}

output "private_endpoint_id" {
  description = "Resource ID of the private endpoint, if one was created."
  value       = try(azurerm_private_endpoint.this[0].id, null)
}

output "primary_key_secret_id" {
  description = "Key Vault secret ID holding the primary key (null when local auth is disabled or no vault supplied)."
  value       = try(azurerm_key_vault_secret.primary_key[0].id, null)
}

output "primary_access_key" {
  description = "Primary access key. Empty when local_auth_enabled = false. Marked sensitive."
  value       = azurerm_cognitive_account.this.primary_access_key
  sensitive   = true
}
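
The defaults and validation rules above can be checked without touching Azure. The following is a minimal sketch of a native Terraform test file, not something the module is known to ship with. It assumes Terraform 1.7 or later (for mock_provider, which is newer than the module's own >= 1.6.0 floor), and the file path and all input values are illustrative.

```hcl
# tests/validation.tftest.hcl (hypothetical path inside the module directory)
# Assumes Terraform >= 1.7; the mocked provider means nothing is created.

mock_provider "azurerm" {}

# Illustrative inputs shared by every run block below.
variables {
  name                = "kv-test-openai"
  resource_group_name = "rg-test"
  location            = "swedencentral"
  kind                = "OpenAI"
}

# With only the required inputs set, the account should be private-by-default.
run "defaults_are_private" {
  command = plan

  assert {
    condition     = azurerm_cognitive_account.this.public_network_access_enabled == false
    error_message = "Public network access should default to off."
  }

  assert {
    condition     = azurerm_cognitive_account.this.local_auth_enabled == false
    error_message = "Local (key) auth should default to off."
  }

  assert {
    condition     = azurerm_cognitive_account.this.custom_subdomain_name == "kv-test-openai"
    error_message = "The custom subdomain should fall back to the account name."
  }

  assert {
    condition     = length(azurerm_key_vault_secret.primary_key) == 0
    error_message = "No key secret should be planned when local auth is off."
  }
}

# The name validation should reject a leading hyphen.
run "rejects_bad_name" {
  command = plan

  variables {
    name = "-bad-name"
  }

  expect_failures = [var.name]
}

# The kind validation should reject an unknown API surface.
run "rejects_unknown_kind" {
  command = plan

  variables {
    kind = "NotARealKind"
  }

  expect_failures = [var.kind]
}
```

Running terraform test from the module directory should pick the file up from the tests folder. Because each run uses command = plan against a mocked provider, this only exercises the module's own logic (defaults, locals, validation), not Azure's behaviour.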

How to use it

The example below provisions a private Azure OpenAI account with Entra-only auth (no keys), wires it to a VNet subnet via private endpoint, then references the module’s id output from a downstream azurerm_cognitive_deployment for GPT-4o.

module "cognitive_services" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"

  name                = "kv-prod-openai-swc"
  resource_group_name = azurerm_resource_group.ai.name
  location            = "swedencentral"

  kind     = "OpenAI"
  sku_name = "S0"

  # Private-by-default posture: no public endpoint, no local keys.
  public_network_access_enabled = false
  local_auth_enabled            = false

  identity_type = "SystemAssigned"

  private_endpoint = {
    subnet_id            = azurerm_subnet.privatelink.id
    private_dns_zone_ids = [azurerm_private_dns_zone.openai.id]
  }

  tags = {
    workload    = "rag-assistant"
    environment = "prod"
    costcenter  = "ml-platform"
  }
}

# Downstream: a model deployment hung off the account created by the module.
resource "azurerm_cognitive_deployment" "gpt4o" {
  name                 = "gpt-4o"
  cognitive_account_id = module.cognitive_services.id

  model {
    format  = "OpenAI"
    name    = "gpt-4o"
    version = "2024-08-06"
  }

  sku {
    name     = "DataZoneStandard"
    capacity = 50
  }
}

# Grant the app's managed identity data-plane access (Entra RBAC, no keys).
resource "azurerm_role_assignment" "app_openai_user" {
  scope                = module.cognitive_services.id
  role_definition_name = "Cognitive Services OpenAI User"
  principal_id         = azurerm_user_assigned_identity.app.principal_id
}
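
The example above takes the keyless path. For a client that still needs API keys, the Key Vault wiring might be exercised roughly as follows. This is a sketch under assumptions: azurerm_key_vault.platform, azurerm_resource_group.ai and azurerm_subnet.apps are hypothetical resources in the calling configuration, and the account name and IP address are placeholders.

```hcl
# Sketch: key-based access for a legacy client, with the keys held in Key Vault.
# All referenced resources and names are hypothetical.
module "vision_legacy" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"

  name                = "kv-prod-vision-swc"
  resource_group_name = azurerm_resource_group.ai.name
  location            = "swedencentral"
  kind                = "ComputerVision"

  # Keys only exist when local auth is on, so both inputs are needed for the
  # module's store_keys local to evaluate to true.
  local_auth_enabled = true
  key_vault_id       = azurerm_key_vault.platform.id

  # Public endpoint on, but default-deny with an explicit allow-list.
  public_network_access_enabled = true
  network_acls = {
    default_action             = "Deny"
    ip_rules                   = ["203.0.113.10"] # documentation-range placeholder
    virtual_network_subnet_ids = [azurerm_subnet.apps.id]
  }
}

# Hand consumers the secret ID so they read the key from Key Vault.
output "vision_primary_key_secret_id" {
  value = module.vision_legacy.primary_key_secret_id
}
```

Two things this sketch takes for granted: the identity running Terraform can write secrets to the vault, and the allow-listed subnet has the Microsoft.CognitiveServices service endpoint enabled.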

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config, live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "azurerm"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...azurerm storage account/container + state key per path...
  }
}

2. Module config, live/prod/cognitive_services/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"
}

inputs = {
  name                = "..."
  resource_group_name = "..."
  location            = "..."
  kind                = "..."
}

3. Deploy one environment, or roll out all modules together:

# Run either command from the repository root.
(cd live/prod/cognitive_services && terragrunt apply)   # this module only
(cd live/prod && terragrunt run-all apply)               # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
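
To make the per-environment override concrete, a dev counterpart to the prod config might look like the sketch below. The live/dev path, the names and the IP range are all illustrative assumptions, not values from the module's repository.

```hcl
# live/dev/cognitive_services/terragrunt.hcl (hypothetical dev counterpart)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"
}

inputs = {
  # Illustrative values only.
  name                = "kv-dev-openai-swc"
  resource_group_name = "rg-ai-dev"
  location            = "swedencentral"
  kind                = "OpenAI"

  # Dev relaxes the network posture but still denies by default.
  public_network_access_enabled = true
  network_acls = {
    default_action = "Deny"
    ip_rules       = ["203.0.113.0/24"] # documentation-range placeholder
  }

  tags = {
    environment = "dev"
  }
}
```

Only the inputs block differs from prod; the backend and the module source are inherited or repeated unchanged, which is the point of the layout.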

Inputs

| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| name | string | (none) | Yes | Account name; also the default custom subdomain, so it must be globally unique. |
| resource_group_name | string | (none) | Yes | Resource group for the account and child resources. |
| location | string | (none) | Yes | Azure region; must support your chosen kind/models. |
| kind | string | (none) | Yes | API surface (e.g. OpenAI, AIServices, ComputerVision, FormRecognizer, SpeechServices, ContentSafety). |
| sku_name | string | "S0" | No | Account pricing tier (F0, S0, …). OpenAI deployment tiers such as GlobalStandard, DataZoneStandard and ProvisionedManaged are set on azurerm_cognitive_deployment, not here. |
| custom_subdomain_name | string | null | No | Globally unique subdomain; required for AAD auth/private endpoints. Defaults to name. |
| public_network_access_enabled | bool | false | No | Whether the public endpoint is reachable. |
| local_auth_enabled | bool | false | No | Allow API-key auth. When false, no keys are written to Key Vault. |
| outbound_network_access_restricted | bool | false | No | Restrict the account’s outbound egress. |
| identity_type | string | "SystemAssigned" | No | SystemAssigned, UserAssigned, "SystemAssigned, UserAssigned", or null. |
| user_assigned_identity_ids | list(string) | [] | No | UAMI resource IDs, required when identity_type includes UserAssigned. |
| network_acls | object | null | No | default_action (Allow/Deny) plus ip_rules and virtual_network_subnet_ids. |
| private_endpoint | object | null | No | subnet_id (required), optional name, and private_dns_zone_ids. |
| customer_managed_key | object | null | No | BYOK: key_vault_key_id and optional identity_client_id. |
| key_vault_id | string | null | No | Vault to store primary/secondary keys (only when local_auth_enabled = true). |
| tags | map(string) | {} | No | Tags applied to all created resources. |
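
The identity and customer_managed_key inputs are meant to be used together. A minimal sketch of that combination follows, assuming an RBAC-enabled vault and hypothetical resources azurerm_key_vault.cmk, azurerm_key_vault_key.cognitive and azurerm_resource_group.ai in the calling configuration.

```hcl
# Sketch: BYOK encryption through a user-assigned identity.
# All referenced resources and names are hypothetical.
resource "azurerm_user_assigned_identity" "cognitive_cmk" {
  name                = "id-cognitive-cmk"
  resource_group_name = azurerm_resource_group.ai.name
  location            = azurerm_resource_group.ai.location
}

# Let the identity wrap/unwrap with the key (assumes Key Vault RBAC, not
# access policies).
resource "azurerm_role_assignment" "cmk_crypto_user" {
  scope                = azurerm_key_vault.cmk.id
  role_definition_name = "Key Vault Crypto Service Encryption User"
  principal_id         = azurerm_user_assigned_identity.cognitive_cmk.principal_id
}

module "openai_cmk" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"

  name                = "kv-prod-openai-cmk"
  resource_group_name = azurerm_resource_group.ai.name
  location            = azurerm_resource_group.ai.location
  kind                = "OpenAI"

  # Keep the system identity for RBAC and add the UAMI for vault access.
  identity_type              = "SystemAssigned, UserAssigned"
  user_assigned_identity_ids = [azurerm_user_assigned_identity.cognitive_cmk.id]

  customer_managed_key = {
    key_vault_key_id   = azurerm_key_vault_key.cognitive.id
    identity_client_id = azurerm_user_assigned_identity.cognitive_cmk.client_id
  }

  # The role must exist before the account tries to read the key.
  depends_on = [azurerm_role_assignment.cmk_crypto_user]
}
```

Creating the identity outside the module sidesteps the ordering problem of granting vault access to an identity that does not exist until the account does. The vault is assumed to have soft delete and purge protection enabled, which customer-managed keys generally require.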

Outputs

| Name | Description |
|---|---|
| id | Resource ID of the Cognitive Services account. |
| name | Name of the account. |
| endpoint | HTTPS endpoint used by SDKs/REST clients. |
| custom_subdomain_name | The custom subdomain assigned to the account. |
| identity_principal_id | Principal ID of the system-assigned identity (null if none). |
| private_endpoint_id | Resource ID of the private endpoint, if created. |
| primary_key_secret_id | Key Vault secret ID for the primary key (null when local auth is off). |
| primary_access_key | Primary access key (sensitive; empty when local auth is disabled). |
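
As an illustration of consuming these outputs, the sketch below grants the account's own system-assigned identity read access to a document store and surfaces the endpoint for application config. azurerm_storage_account.documents is a hypothetical resource, and the module label matches the earlier usage example.

```hcl
# Sketch: let the account's system-assigned identity read source documents.
# azurerm_storage_account.documents is hypothetical.
resource "azurerm_role_assignment" "account_reads_documents" {
  scope                = azurerm_storage_account.documents.id
  role_definition_name = "Storage Blob Data Reader"
  principal_id         = module.cognitive_services.identity_principal_id
}

# Surface the endpoint so applications can be configured without keys.
output "cognitive_endpoint" {
  value = module.cognitive_services.endpoint
}
```

identity_principal_id is null when identity_type is "UserAssigned" or null, so a role assignment like this only makes sense when a system-assigned identity is present.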

Enterprise scenario

A financial-services platform team runs a RAG copilot over internal documents and must keep every AI call off the public internet for regulatory reasons. They consume this module once per region (swedencentral, eastus2) with kind = "OpenAI", public_network_access_enabled = false, local_auth_enabled = false, and a private endpoint into the shared hub’s privatelink.openai.azure.com zone — so the GPT-4o endpoint resolves only inside the VNet and every request carries an Entra token mapped to the Cognitive Services OpenAI User role. Because keys are disabled at the source, an Azure Policy “deny local auth” audit passes with zero exceptions, and the platform team can grant or revoke app access purely through role assignments, no secret rotation involved.
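
The once-per-region consumption described above could be expressed with for_each on the module block. This is a sketch, not the team's actual configuration: the subnets, resource group and DNS zone are hypothetical, and each subnet is assumed to live in the matching region.

```hcl
locals {
  # Region => private endpoint subnet ID. Both subnets are hypothetical and
  # must sit in a VNet in the same region as the key.
  openai_regions = {
    swedencentral = azurerm_subnet.privatelink_swc.id
    eastus2       = azurerm_subnet.privatelink_eus2.id
  }
}

module "openai" {
  source   = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"
  for_each = local.openai_regions

  # The region suffix keeps each name (and default subdomain) distinct.
  name                = "kv-prod-openai-${each.key}"
  resource_group_name = azurerm_resource_group.ai.name
  location            = each.key
  kind                = "OpenAI"

  # Same posture in every region: no public endpoint, no keys.
  public_network_access_enabled = false
  local_auth_enabled            = false

  private_endpoint = {
    subnet_id            = each.value
    private_dns_zone_ids = [azurerm_private_dns_zone.openai.id]
  }
}

# Map of region => endpoint for downstream configuration.
output "openai_endpoints" {
  value = { for region, account in module.openai : region => account.endpoint }
}
```

Since the module places the private endpoint in var.location, a single shared resource group works here only if that is acceptable for the estate; per-region resource groups would need one more value in the map.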

Best practices

Need this built for real?

Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.