Quick take — A reusable hashicorp/azurerm ~> 4.0 Terraform module for azurerm_cognitive_account: pick the kind/SKU, lock it behind private endpoints and network rules, disable local keys, and stash secrets in Key Vault. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.
Quickstart (copy-paste)
Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):
provider "azurerm" {
  features {}
  # azurerm 4.x needs a subscription: set subscription_id here or export ARM_SUBSCRIPTION_ID.
}

module "cognitive_services" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"

  name                = "..." # Account name; also the default custom subdomain, so it …
  resource_group_name = "..." # Resource group for the account and child resources.
  location            = "..." # Azure region; must support your chosen `kind`/models.
  kind                = "..." # API surface (e.g. `OpenAI`, `AIServices`, `ComputerVisi…
}
Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.
What this module is
Azure Cognitive Services (now grouped under Azure AI Services) is the family of managed AI APIs, including OpenAI, ComputerVision, TextAnalytics, SpeechServices, FormRecognizer (Document Intelligence), ContentSafety and about a dozen more, all provisioned through a single Terraform resource: azurerm_cognitive_account. The kind argument decides which API surface you get, and sku_name (F0 free, S0 standard) decides how the account itself bills. For OpenAI, the token-based and provisioned tiers (GlobalStandard, DataZoneStandard, ProvisionedManaged) are chosen per model deployment on azurerm_cognitive_deployment, not on the account.
The raw resource is deceptively simple, but a production-grade Cognitive account has a lot of footguns: it ships with a public endpoint and two live access keys by default, the custom_subdomain_name is mandatory the moment you want Azure AD (Entra) token auth or a private endpoint, and the subdomain must be globally unique. This module wraps all of that so every AI account in your estate is born the same way — custom subdomain set, public network access off, local key auth disabled in favour of Entra RBAC, a system-assigned managed identity attached, an optional private endpoint, and (when you do need keys) the primary/secondary keys pushed straight into Key Vault instead of leaking into state outputs.
When to use it
- You are standing up Azure OpenAI and need the account, a custom subdomain, and private networking before any azurerm_cognitive_deployment (GPT-4o, embeddings) can be attached.
- You run many Cognitive accounts of different kinds (Vision for one app, Document Intelligence for another, Speech for a third) and want one consistent, audited shape instead of hand-rolled resources.
- You have a “no public endpoints” or “no local keys” policy (Azure Policy / landing-zone guardrail) and need the module to enforce public_network_access_enabled = false and local_auth_enabled = false by default.
- You want secrets handled correctly: keys (if used at all) land in Key Vault, never in a plaintext terraform output.
If you only need a throwaway F0 account for a spike, the bare resource is fine — reach for this module when the account is long-lived, networked, or governed.
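For contrast, a bare account for a short-lived spike might look like the sketch below. The resource label, account name, resource group, and region are placeholders invented for illustration; the point is only to show which provider defaults stay in effect when nothing else is set.

```hcl
# Hypothetical throwaway account; every name and value here is illustrative.
resource "azurerm_cognitive_account" "spike" {
  name                = "example-vision-spike"
  resource_group_name = "rg-example-spike" # assumed to exist already
  location            = "westeurope"
  kind                = "ComputerVision"
  sku_name            = "F0"

  # Left unset, so the provider defaults apply:
  #   public_network_access_enabled = true  (public endpoint reachable)
  #   local_auth_enabled            = true  (both access keys are live)
  #   custom_subdomain_name         unset   (no Entra token auth, no private endpoint)
}
```

Those three defaults are what the module flips for long-lived accounts.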
Module structure
terraform-module-azure-cognitive-services/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf
# versions.tf
terraform {
  required_version = ">= 1.6.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}
# main.tf
locals {
  # custom_subdomain_name must be globally unique and is REQUIRED for
  # AAD/token auth and for private endpoints. Default to the account name.
  custom_subdomain = coalesce(var.custom_subdomain_name, var.name)

  # Only push keys to Key Vault when both a vault is supplied AND local
  # auth is actually enabled (no keys exist when local_auth is disabled).
  store_keys = var.key_vault_id != null && var.local_auth_enabled
}

resource "azurerm_cognitive_account" "this" {
  name                               = var.name
  resource_group_name                = var.resource_group_name
  location                           = var.location
  kind                               = var.kind
  sku_name                           = var.sku_name
  custom_subdomain_name              = local.custom_subdomain
  public_network_access_enabled      = var.public_network_access_enabled
  local_auth_enabled                 = var.local_auth_enabled
  outbound_network_access_restricted = var.outbound_network_access_restricted

  # Managed identity; the block is omitted entirely when identity_type is null.
  dynamic "identity" {
    for_each = var.identity_type == null ? [] : [1]
    content {
      type         = var.identity_type
      identity_ids = var.identity_type == "UserAssigned" || var.identity_type == "SystemAssigned, UserAssigned" ? var.user_assigned_identity_ids : null
    }
  }

  # Network ACLs only make sense when the public endpoint is on but locked
  # to specific IPs/subnets, OR when a private endpoint needs a default deny.
  dynamic "network_acls" {
    for_each = var.network_acls == null ? [] : [var.network_acls]
    content {
      default_action = network_acls.value.default_action
      ip_rules       = network_acls.value.ip_rules

      dynamic "virtual_network_rules" {
        for_each = network_acls.value.virtual_network_subnet_ids
        content {
          subnet_id = virtual_network_rules.value
        }
      }
    }
  }

  # Customer-managed key encryption (BYOK) for compliance estates.
  dynamic "customer_managed_key" {
    for_each = var.customer_managed_key == null ? [] : [var.customer_managed_key]
    content {
      key_vault_key_id   = customer_managed_key.value.key_vault_key_id
      identity_client_id = customer_managed_key.value.identity_client_id
    }
  }

  tags = var.tags

  lifecycle {
    # The subdomain cannot be changed once set: altering var.name or
    # var.custom_subdomain_name forces the account to be replaced.
    # Nothing is ignored here, so any drift still shows up in the plan;
    # the empty list is a placeholder and offers no protection by itself.
    ignore_changes = []
  }
}
# Optional private endpoint so the account is reachable only over the VNet.
resource "azurerm_private_endpoint" "this" {
  count               = var.private_endpoint == null ? 0 : 1
  name                = coalesce(var.private_endpoint.name, "${var.name}-pe")
  resource_group_name = var.resource_group_name
  location            = var.location
  subnet_id           = var.private_endpoint.subnet_id

  private_service_connection {
    name                           = "${var.name}-psc"
    private_connection_resource_id = azurerm_cognitive_account.this.id
    subresource_names              = ["account"]
    is_manual_connection           = false
  }

  dynamic "private_dns_zone_group" {
    for_each = length(var.private_endpoint.private_dns_zone_ids) == 0 ? [] : [1]
    content {
      name                 = "default"
      private_dns_zone_ids = var.private_endpoint.private_dns_zone_ids
    }
  }

  tags = var.tags
}

# Stash the live keys in Key Vault (only when local auth is enabled).
resource "azurerm_key_vault_secret" "primary_key" {
  count        = local.store_keys ? 1 : 0
  name         = "${var.name}-primary-key"
  value        = azurerm_cognitive_account.this.primary_access_key
  key_vault_id = var.key_vault_id
  content_type = "cognitive-services-key"
  tags         = var.tags
}

resource "azurerm_key_vault_secret" "secondary_key" {
  count        = local.store_keys ? 1 : 0
  name         = "${var.name}-secondary-key"
  value        = azurerm_cognitive_account.this.secondary_access_key
  key_vault_id = var.key_vault_id
  content_type = "cognitive-services-key"
  tags         = var.tags
}
# variables.tf
variable "name" {
  type        = string
  description = "Name of the Cognitive Services account. Also used as the default custom subdomain, so it must be globally unique."

  validation {
    condition     = can(regex("^[a-zA-Z0-9][a-zA-Z0-9-]{1,62}[a-zA-Z0-9]$", var.name))
    error_message = "name must be 3-64 chars, alphanumeric or hyphens, and start/end with an alphanumeric character."
  }
}

variable "resource_group_name" {
  type        = string
  description = "Resource group that will hold the account and its child resources."
}

variable "location" {
  type        = string
  description = "Azure region for the account (e.g. swedencentral, eastus2). Pick a region where your chosen kind/models are available."
}

variable "kind" {
  type        = string
  description = "The API surface to provision (decides which Cognitive Service you get)."

  validation {
    condition = contains([
      "OpenAI", "AIServices", "ComputerVision", "CustomVision.Training",
      "CustomVision.Prediction", "TextAnalytics", "Language", "SpeechServices",
      "FormRecognizer", "ContentSafety", "Face", "ContentModerator",
      "TextTranslation", "HealthInsights", "MetricsAdvisor", "Personalizer"
    ], var.kind)
    error_message = "kind must be a supported azurerm_cognitive_account kind (e.g. OpenAI, AIServices, ComputerVision, FormRecognizer, SpeechServices, ContentSafety)."
  }
}

variable "sku_name" {
  type        = string
  default     = "S0"
  description = "Account pricing tier. F0 = free (one per subscription per kind), S0 = standard. OpenAI tiers such as GlobalStandard, DataZoneStandard and ProvisionedManaged are set on azurerm_cognitive_deployment, not here."
}

variable "custom_subdomain_name" {
  type        = string
  default     = null
  description = "Custom subdomain (globally unique). Required for AAD token auth and private endpoints. Defaults to var.name when null."
}

variable "public_network_access_enabled" {
  type        = bool
  default     = false
  description = "Whether the public endpoint is reachable. Defaults to false (private-by-default); set true only with network_acls or for dev."
}

variable "local_auth_enabled" {
  type        = bool
  default     = false
  description = "Whether API-key (local) auth is allowed. Defaults to false so callers must use Entra ID tokens + RBAC. When false, no keys are stored in Key Vault."
}

variable "outbound_network_access_restricted" {
  type        = bool
  default     = false
  description = "Restrict the account's outbound network access (e.g. for OpenAI 'on your data' egress control)."
}

variable "identity_type" {
  type        = string
  default     = "SystemAssigned"
  description = "Managed identity type: SystemAssigned, UserAssigned, 'SystemAssigned, UserAssigned', or null for none."

  validation {
    condition = var.identity_type == null || contains([
      "SystemAssigned", "UserAssigned", "SystemAssigned, UserAssigned"
    ], var.identity_type)
    error_message = "identity_type must be one of: SystemAssigned, UserAssigned, 'SystemAssigned, UserAssigned', or null."
  }
}

variable "user_assigned_identity_ids" {
  type        = list(string)
  default     = []
  description = "User-assigned identity resource IDs (required when identity_type includes UserAssigned)."
}

variable "network_acls" {
  type = object({
    default_action             = string
    ip_rules                   = optional(list(string), [])
    virtual_network_subnet_ids = optional(list(string), [])
  })
  default     = null
  description = "Optional network rules. default_action is Allow or Deny; combine Deny with ip_rules/subnet_ids to allow-list."

  validation {
    condition     = var.network_acls == null || contains(["Allow", "Deny"], try(var.network_acls.default_action, ""))
    error_message = "network_acls.default_action must be either 'Allow' or 'Deny'."
  }
}

variable "private_endpoint" {
  type = object({
    name                 = optional(string)
    subnet_id            = string
    private_dns_zone_ids = optional(list(string), [])
  })
  default     = null
  description = "Optional private endpoint. private_dns_zone_ids should point at the privatelink.cognitiveservices.azure.com (or .openai.azure.com) zone."
}

variable "customer_managed_key" {
  type = object({
    key_vault_key_id   = string
    identity_client_id = optional(string)
  })
  default     = null
  description = "Optional BYOK encryption. key_vault_key_id is the versionless or versioned key ID; identity_client_id is the UAMI used to reach the vault."
}

variable "key_vault_id" {
  type        = string
  default     = null
  description = "Key Vault to store the primary/secondary keys in. Only used when local_auth_enabled = true."
}

variable "tags" {
  type        = map(string)
  default     = {}
  description = "Tags applied to all resources created by the module."
}
# outputs.tf
output "id" {
  description = "Resource ID of the Cognitive Services account."
  value       = azurerm_cognitive_account.this.id
}

output "name" {
  description = "Name of the Cognitive Services account."
  value       = azurerm_cognitive_account.this.name
}

output "endpoint" {
  description = "The HTTPS endpoint used by SDKs/REST clients to reach the account."
  value       = azurerm_cognitive_account.this.endpoint
}

output "custom_subdomain_name" {
  description = "The custom subdomain assigned to the account (used in the token-auth endpoint)."
  value       = azurerm_cognitive_account.this.custom_subdomain_name
}

output "identity_principal_id" {
  description = "Principal ID of the system-assigned identity (null if no system identity)."
  value       = try(azurerm_cognitive_account.this.identity[0].principal_id, null)
}

output "private_endpoint_id" {
  description = "Resource ID of the private endpoint, if one was created."
  value       = try(azurerm_private_endpoint.this[0].id, null)
}

output "primary_key_secret_id" {
  description = "Key Vault secret ID holding the primary key (null when local auth is disabled or no vault supplied)."
  value       = try(azurerm_key_vault_secret.primary_key[0].id, null)
}

output "primary_access_key" {
  description = "Primary access key. Empty when local_auth_enabled = false. Marked sensitive."
  value       = azurerm_cognitive_account.this.primary_access_key
  sensitive   = true
}
How to use it
The example below provisions a private Azure OpenAI account with Entra-only auth (no keys), wires it to a VNet subnet via private endpoint, then references the module’s id output from a downstream azurerm_cognitive_deployment for GPT-4o.
module "cognitive_services" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"

  name                = "kv-prod-openai-swc"
  resource_group_name = azurerm_resource_group.ai.name
  location            = "swedencentral"
  kind                = "OpenAI"
  sku_name            = "S0"

  # Private-by-default posture: no public endpoint, no local keys.
  public_network_access_enabled = false
  local_auth_enabled            = false
  identity_type                 = "SystemAssigned"

  private_endpoint = {
    subnet_id            = azurerm_subnet.privatelink.id
    private_dns_zone_ids = [azurerm_private_dns_zone.openai.id]
  }

  tags = {
    workload    = "rag-assistant"
    environment = "prod"
    costcenter  = "ml-platform"
  }
}

# Downstream: a model deployment hung off the account created by the module.
resource "azurerm_cognitive_deployment" "gpt4o" {
  name                 = "gpt-4o"
  cognitive_account_id = module.cognitive_services.id

  model {
    format  = "OpenAI"
    name    = "gpt-4o"
    version = "2024-08-06"
  }

  sku {
    name     = "DataZoneStandard"
    capacity = 50
  }
}

# Grant the app's managed identity data-plane access (Entra RBAC, no keys).
resource "azurerm_role_assignment" "app_openai_user" {
  scope                = module.cognitive_services.id
  role_definition_name = "Cognitive Services OpenAI User"
  principal_id         = azurerm_user_assigned_identity.app.principal_id
}
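Not every account can go fully private. As a rough sketch of the other supported posture (public endpoint kept, but allow-listed, with keys landing in Key Vault), the call below uses only inputs the module already defines. The module label, account name, IP address, subnet, and the data.azurerm_key_vault.platform data source are all assumptions made for illustration.

```hcl
# Hypothetical dev account: public endpoint on, but default-deny with an allow-list.
module "vision_dev" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"

  name                = "kv-dev-vision-weu" # illustrative name
  resource_group_name = azurerm_resource_group.ai.name
  location            = "westeurope"
  kind                = "ComputerVision"

  public_network_access_enabled = true
  local_auth_enabled            = true # keys exist, so they must be stored somewhere safe

  network_acls = {
    default_action             = "Deny"
    ip_rules                   = ["203.0.113.10"]         # documentation-range IP, replace with yours
    virtual_network_subnet_ids = [azurerm_subnet.apps.id] # assumed subnet
  }

  # Assumed to be an existing vault read through a data source, so the ID is
  # known at plan time and the module's count on the secrets can be resolved.
  key_vault_id = data.azurerm_key_vault.platform.id
}
```

The identity running Terraform needs permission to write secrets to that vault, and a subnet used in a virtual network rule typically needs the Microsoft.CognitiveServices service endpoint enabled.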
With Terragrunt
Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.
1. Root config — live/terragrunt.hcl (inherited by every module):
remote_state {
  backend  = "azurerm"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...azurerm state bucket/container + key per path...
  }
}
2. Module config — live/prod/cognitive_services/terragrunt.hcl:
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"
}

inputs = {
  name                = "..."
  resource_group_name = "..."
  location            = "..."
  kind                = "..."
}
3. Deploy one environment, or roll out all modules together:
(cd live/prod/cognitive_services && terragrunt apply) # this module only
(cd live/prod && terragrunt run-all apply)            # every module under live/prod
Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
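To make the per-environment override and cross-module ordering concrete, here is a sketch of what a dev config might look like. The path, the sibling network module, its output names, and all input values are hypothetical; only the include, terraform, dependency, and inputs blocks are standard Terragrunt.

```hcl
# live/dev/cognitive_services/terragrunt.hcl (hypothetical path)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"
}

# Assumed sibling module that exposes the subnet and DNS zone IDs.
# run-all uses this block to apply the network before the account.
dependency "network" {
  config_path = "../network"

  # Placeholder IDs so validate/plan work before the network exists.
  mock_outputs = {
    privatelink_subnet_id      = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/mock/providers/Microsoft.Network/virtualNetworks/mock/subnets/mock"
    openai_private_dns_zone_id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/mock/providers/Microsoft.Network/privateDnsZones/privatelink.openai.azure.com"
  }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  name                = "kv-dev-openai-swc" # illustrative
  resource_group_name = "rg-ai-dev"         # illustrative
  location            = "swedencentral"
  kind                = "OpenAI"

  private_endpoint = {
    subnet_id            = dependency.network.outputs.privatelink_subnet_id
    private_dns_zone_ids = [dependency.network.outputs.openai_private_dns_zone_id]
  }
}
```

A prod config would differ only in the inputs map, which is the point of keeping the module itself unforked.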
Inputs
| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| name | string | (none) | Yes | Account name; also the default custom subdomain, so it must be globally unique. |
| resource_group_name | string | (none) | Yes | Resource group for the account and child resources. |
| location | string | (none) | Yes | Azure region; must support your chosen kind/models. |
| kind | string | (none) | Yes | API surface (e.g. OpenAI, AIServices, ComputerVision, FormRecognizer, SpeechServices, ContentSafety). |
| sku_name | string | "S0" | No | Account pricing tier (F0 free, S0 standard). OpenAI tiers such as GlobalStandard, DataZoneStandard and ProvisionedManaged are set per deployment, not here. |
| custom_subdomain_name | string | null | No | Globally unique subdomain; required for AAD auth/private endpoints. Defaults to name. |
| public_network_access_enabled | bool | false | No | Whether the public endpoint is reachable. |
| local_auth_enabled | bool | false | No | Allow API-key auth. When false, no keys are written to Key Vault. |
| outbound_network_access_restricted | bool | false | No | Restrict the account’s outbound egress. |
| identity_type | string | "SystemAssigned" | No | SystemAssigned, UserAssigned, "SystemAssigned, UserAssigned", or null. |
| user_assigned_identity_ids | list(string) | [] | No | UAMI resource IDs, required when identity_type includes UserAssigned. |
| network_acls | object | null | No | default_action (Allow/Deny) plus ip_rules and virtual_network_subnet_ids. |
| private_endpoint | object | null | No | subnet_id (required), optional name, and private_dns_zone_ids. |
| customer_managed_key | object | null | No | BYOK: key_vault_key_id and optional identity_client_id. |
| key_vault_id | string | null | No | Vault to store primary/secondary keys (only when local_auth_enabled = true). |
| tags | map(string) | {} | No | Tags applied to all created resources. |
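The identity and customer_managed_key inputs work together, which a table does not show well. The sketch below is one plausible way to wire them; the user-assigned identity, the Key Vault, the key, and the module label are hypothetical resources assumed to be defined elsewhere, and the role name assumes an RBAC-enabled vault.

```hcl
# Give the (assumed) user-assigned identity access to the key before the account exists.
resource "azurerm_role_assignment" "cmk_crypto_user" {
  scope                = azurerm_key_vault.platform.id # assumed vault
  role_definition_name = "Key Vault Crypto Service Encryption User"
  principal_id         = azurerm_user_assigned_identity.cmk.principal_id
}

module "docintel_cmk" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"

  name                = "kv-prod-docintel-swc" # illustrative name
  resource_group_name = azurerm_resource_group.ai.name
  location            = "swedencentral"
  kind                = "FormRecognizer"

  # A user-assigned identity can be granted key access ahead of time,
  # unlike a system-assigned one that only exists after creation.
  identity_type              = "UserAssigned"
  user_assigned_identity_ids = [azurerm_user_assigned_identity.cmk.id]

  customer_managed_key = {
    key_vault_key_id   = azurerm_key_vault_key.cognitive.id # assumed key
    identity_client_id = azurerm_user_assigned_identity.cmk.client_id
  }

  # Make sure the role assignment lands before the account tries to use the key.
  depends_on = [azurerm_role_assignment.cmk_crypto_user]
}
```

Customer-managed keys generally also expect soft delete and purge protection on the vault, so check that before applying.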
Outputs
| Name | Description |
|---|---|
| id | Resource ID of the Cognitive Services account. |
| name | Name of the account. |
| endpoint | HTTPS endpoint used by SDKs/REST clients. |
| custom_subdomain_name | The custom subdomain assigned to the account. |
| identity_principal_id | Principal ID of the system-assigned identity (null if none). |
| private_endpoint_id | Resource ID of the private endpoint, if created. |
| primary_key_secret_id | Key Vault secret ID for the primary key (null when local auth is off). |
| primary_access_key | Primary access key (sensitive; empty when local auth is disabled). |
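As a small illustration of consuming these outputs from the calling configuration, the sketch below re-exports the endpoint and grants the account's own identity read access to a storage account. The storage account azurerm_storage_account.docs and the output name are assumptions; the module reference matches the usage example earlier.

```hcl
# Surface the endpoint so an app pipeline can pick it up (no secret involved).
output "openai_endpoint" {
  description = "Endpoint of the account created by the module."
  value       = module.cognitive_services.endpoint
}

# Let the account's system-assigned identity read documents from an
# assumed storage account, e.g. as a data source for the service.
resource "azurerm_role_assignment" "account_blob_reader" {
  scope                = azurerm_storage_account.docs.id # hypothetical resource
  role_definition_name = "Storage Blob Data Reader"
  principal_id         = module.cognitive_services.identity_principal_id
}
```

identity_principal_id is null when the module is called with identity_type = null or a purely user-assigned identity, so the role assignment only makes sense with a system-assigned identity.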
Enterprise scenario
A financial-services platform team runs a RAG copilot over internal documents and must keep every AI call off the public internet for regulatory reasons. They consume this module once per region (swedencentral, eastus2) with kind = "OpenAI", public_network_access_enabled = false, local_auth_enabled = false, and a private endpoint into the shared hub’s privatelink.openai.azure.com zone — so the GPT-4o endpoint resolves only inside the VNet and every request carries an Entra token mapped to the Cognitive Services OpenAI User role. Because keys are disabled at the source, an Azure Policy “deny local auth” audit passes with zero exceptions, and the platform team can grant or revoke app access purely through role assignments, no secret rotation involved.
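One way the "once per region" consumption could be expressed is a single module call with for_each. This is a sketch under assumptions: the region map, the name suffixes, and the per-region resource groups and subnets (keyed by region name) are hypothetical, while the DNS zone and app identity reuse the names from the usage example.

```hcl
locals {
  # Region name => short suffix used in the account name (illustrative values).
  openai_regions = {
    swedencentral = "swc"
    eastus2       = "eus2"
  }
}

module "openai" {
  for_each = local.openai_regions
  source   = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-cognitive-services?ref=v1.0.0"

  name                = "kv-prod-openai-${each.value}"
  resource_group_name = azurerm_resource_group.ai[each.key].name # assumed per-region RGs
  location            = each.key
  kind                = "OpenAI"

  public_network_access_enabled = false
  local_auth_enabled            = false

  private_endpoint = {
    subnet_id            = azurerm_subnet.privatelink[each.key].id # assumed per-region subnets
    private_dns_zone_ids = [azurerm_private_dns_zone.openai.id]    # shared hub zone
  }
}

# Access is granted per account through RBAC only; there are no keys to rotate.
resource "azurerm_role_assignment" "copilot_openai_user" {
  for_each = local.openai_regions

  scope                = module.openai[each.key].id
  role_definition_name = "Cognitive Services OpenAI User"
  principal_id         = azurerm_user_assigned_identity.app.principal_id
}
```

Iterating the role assignment over the same static map, rather than over the module result, keeps the for_each keys known at plan time.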
Best practices
- Disable local keys, use Entra RBAC. Keep local_auth_enabled = false and assign data-plane roles (Cognitive Services OpenAI User, Cognitive Services User) to managed identities. It removes an entire class of leaked-key incidents and the rotation burden that comes with them.
- Always set the custom subdomain, and never change it. AAD auth and private endpoints both require it, and it is immutable; renaming forces a destroy/recreate that nukes any attached deployments. Let the module default it to the account name and pick the name carefully.
- Lock the network at the source. Prefer public_network_access_enabled = false with a private endpoint; if you must keep the public endpoint, pair it with network_acls { default_action = "Deny" } and an explicit allow-list rather than leaving it open.
- Watch the cost model per kind. F0 is one-per-subscription and rate-limited; S0 is pay-as-you-go. For OpenAI, choose between token-based GlobalStandard/DataZoneStandard and capacity-reserved ProvisionedManaged (PTUs) deliberately: PTUs bill whether or not you send traffic, so size deployment capacity to real TPM demand.
- Tag for showback and pin data residency. Stamp workload/costcenter tags so AI spend is attributable, and deploy into a region that satisfies your data-residency requirement (use DataZoneStandard to keep OpenAI inference within a geography).
- If you ever do store keys, store them in Key Vault only. Never expose primary_access_key through a non-sensitive output or a local-exec; consume the primary_key_secret_id reference and let the app read the secret at runtime.
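Two of the practices above could also be surfaced inside the module itself. The sketch below uses Terraform check blocks (available from Terraform 1.5, so within the module's >= 1.6.0 constraint); the file name and check names are hypothetical and nothing like this ships with the module as described. Check blocks only emit warnings, so treat this as a nudge rather than enforcement.

```hcl
# checks.tf (hypothetical extra file inside the module)

# Warn when the public endpoint is on without a default-deny allow-list.
check "public_endpoint_is_allow_listed" {
  assert {
    condition = (
      !var.public_network_access_enabled ||
      try(var.network_acls.default_action, "") == "Deny"
    )
    error_message = "public_network_access_enabled is true without network_acls.default_action = \"Deny\"; the endpoint is open to any source IP."
  }
}

# Warn when keys are enabled but no vault is supplied to hold them.
check "keys_have_a_vault" {
  assert {
    condition     = !var.local_auth_enabled || var.key_vault_id != null
    error_message = "local_auth_enabled is true but key_vault_id is null; access keys exist but are not stored in Key Vault."
  }
}
```

If a hard failure is preferred, the same conditions could be moved into a precondition on the account resource instead.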