
Terraform Module: Azure AI Search — private, identity-bound search clusters in one call

Quick take — Reusable Terraform module for azurerm_search_service on azurerm ~> 4.0: tuned SKU, replica/partition capacity, RBAC-only auth, system-assigned identity, and a locked-down private endpoint. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "azurerm" {
  features {}
}

module "search_service" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-search-service?ref=v1.0.0"

  name                = "..."  # Globally unique service name; 2-60 lowercase alphanumer…
  resource_group_name = "..."  # Resource group for the service and private endpoint.
  location            = "..."  # Azure region (e.g. `centralindia`).
}

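Then run terraform init && terraform apply. Every other input has a default; see Inputs below to override behaviour. Note that the defaults disable the public endpoint and API keys and create no private endpoint, so set private_endpoint_subnet_id (or enable public access) before you expect to query the service.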
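Because the module defaults to a closed public endpoint, a first smoke test from a workstation might look like the sketch below. It only uses inputs the module already exposes (sku, replica_count, public_network_access_enabled, allowed_ips); the service name, resource group, and the address 203.0.113.10 (a documentation-range IP) are placeholders, not values from this article. Callers still need an Azure AD token, since API keys stay off by default.

```hcl
module "search_service_dev" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-search-service?ref=v1.0.0"

  name                = "example-search-dev" # placeholder; must be globally unique
  resource_group_name = "rg-example-dev"     # placeholder resource group
  location            = "centralindia"

  # Smaller footprint for a throwaway environment (one replica carries no SLA).
  sku           = "basic"
  replica_count = 1

  # Open the public endpoint, but only to a single caller IP.
  public_network_access_enabled = true
  allowed_ips                   = ["203.0.113.10"] # placeholder address
}
```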

What this module is

Azure AI Search (formerly Cognitive Search) is a managed search-as-a-service: you push documents into an index and it gives you full-text, faceted, vector, and semantic ranking over them with a query API. It’s the retrieval engine behind site search, product catalogs, and — increasingly — the R in RAG, where it stores embeddings and serves the nearest-neighbour chunks your LLM grounds on.

The trouble is that a production-grade search service is never just azurerm_search_service { sku = "standard" }. You need replicas sized for query QPS and SLA, partitions sized for index size, an identity so the indexer can reach Blob/SQL/Cosmos without keys, RBAC-only authentication so nobody is passing admin keys around, and a private endpoint so the service never answers on its public IP. Wrapping all of that in a module means every search service across your estate is born compliant, with the same naming, the same network posture, and the same diagnostic wiring, instead of each team rediscovering the hard way that the SKU is fixed at creation and that only certain partition counts are accepted.

When to use it

Module structure

terraform-module-azure-search-service/
├── versions.tf      # provider + Terraform version pins
├── main.tf          # azurerm_search_service + private endpoint + diagnostics
├── variables.tf     # var-driven inputs with validation
└── outputs.tf       # id, name, endpoint, identity principal, key attrs

versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}

main.tf

locals {
  # AI Search only supports 1, 2, 3, 4, 6, or 12 partitions.
  valid_partition_counts = [1, 2, 3, 4, 6, 12]

  # Free and basic tiers don't support high replica counts; the precondition
  # below uses this flag. Despite the name, it is false for basic (a paid
  # tier) as well as free.
  is_billable_tier = !contains(["free", "basic"], lower(var.sku))
}

resource "azurerm_search_service" "this" {
  name                = var.name
  resource_group_name = var.resource_group_name
  location            = var.location
  sku                 = var.sku

  replica_count   = var.replica_count
  partition_count = var.partition_count

  # Hosting mode "highDensity" is only valid for the standard3 SKU and lets a
  # single service hold up to 1,000 indexes. Anything else must be "default".
  hosting_mode = var.hosting_mode

  # Security posture: turn the public endpoint off and force AAD/RBAC.
  public_network_access_enabled = var.public_network_access_enabled

  # "false" disables admin/query API keys entirely (RBAC only).
  # "true" enables API keys; RBAC tokens are accepted alongside them only
  # when authentication_failure_mode is also set. We default to RBAC-only.
  local_authentication_enabled = var.local_authentication_enabled

  # Only configurable when local auth is enabled. Setting it makes the service
  # accept both API keys and RBAC tokens, and selects the HTTP response
  # (401 with bearer challenge, or 403) returned when authentication fails.
  authentication_failure_mode = var.local_authentication_enabled ? var.authentication_failure_mode : null

  # Restrict which networks may hit the (still-public) endpoint. Left unset
  # when public access is disabled, but kept for services that must stay
  # public. allowed_ips is a set-of-strings argument, not a nested block.
  allowed_ips = var.public_network_access_enabled ? toset(var.allowed_ips) : null

  # A system-assigned identity lets indexers and skillsets reach data sources
  # and Azure OpenAI without keys. User-assigned can be layered in via var.
  identity {
    type         = var.identity_type
    identity_ids = var.identity_type == "UserAssigned" || var.identity_type == "SystemAssigned, UserAssigned" ? var.identity_ids : null
  }

  tags = var.tags

  lifecycle {
    precondition {
      condition     = contains(local.valid_partition_counts, var.partition_count)
      error_message = "partition_count must be one of 1, 2, 3, 4, 6, or 12."
    }
    precondition {
      condition     = local.is_billable_tier || var.replica_count <= 3
      error_message = "free and basic SKUs support at most 3 replicas; scale up the SKU for more."
    }
  }
}

# Private endpoint: the recommended way to reach the service from a VNet with
# no public exposure. Created only when a subnet is supplied.
resource "azurerm_private_endpoint" "this" {
  count = var.private_endpoint_subnet_id != null ? 1 : 0

  name                = "pe-${var.name}"
  resource_group_name = var.resource_group_name
  location            = var.location
  subnet_id           = var.private_endpoint_subnet_id

  private_service_connection {
    name                           = "psc-${var.name}"
    private_connection_resource_id = azurerm_search_service.this.id
    subresource_names              = ["searchService"]
    is_manual_connection           = false
  }

  dynamic "private_dns_zone_group" {
    for_each = var.private_dns_zone_ids != null ? [1] : []
    content {
      name                 = "search-dns"
      private_dns_zone_ids = var.private_dns_zone_ids
    }
  }

  tags = var.tags
}

# Optional diagnostic settings -> Log Analytics for query/indexer telemetry.
resource "azurerm_monitor_diagnostic_setting" "this" {
  count = var.log_analytics_workspace_id != null ? 1 : 0

  name                       = "diag-${var.name}"
  target_resource_id         = azurerm_search_service.this.id
  log_analytics_workspace_id = var.log_analytics_workspace_id

  enabled_log {
    category = "OperationLogs"
  }

  enabled_metric {
    category = "AllMetrics"
  }
}
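The comments and variable descriptions state two rules that main.tf does not enforce: highDensity is only valid on standard3, and identity_ids is required when the identity type includes UserAssigned. A minimal sketch of how they could be checked, assuming the two preconditions are added inside the existing lifecycle block of azurerm_search_service.this (strcontains needs Terraform 1.5 or later, which matches the version pin above):

```hcl
# Hypothetical extra checks; place inside the existing
# lifecycle { ... } block of azurerm_search_service.this.
precondition {
  # highDensity is described above as standard3-only.
  condition     = var.hosting_mode != "highDensity" || lower(var.sku) == "standard3"
  error_message = "hosting_mode \"highDensity\" requires sku \"standard3\"."
}

precondition {
  # identity_ids must be non-empty whenever a user-assigned identity is requested.
  condition     = !strcontains(var.identity_type, "UserAssigned") || length(var.identity_ids) > 0
  error_message = "identity_ids must list at least one identity when identity_type includes UserAssigned."
}
```

Both conditions depend only on input variables, so a bad combination fails at plan time rather than part-way through an apply.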

variables.tf

variable "name" {
  description = "Name of the Azure AI Search service. Globally unique, 2-60 chars, lowercase letters/digits/hyphens, no leading/trailing hyphen."
  type        = string

  validation {
    condition     = can(regex("^[a-z0-9][a-z0-9-]{0,58}[a-z0-9]$", var.name))
    error_message = "name must be 2-60 chars, lowercase alphanumerics and hyphens only, and may not start or end with a hyphen."
  }
}

variable "resource_group_name" {
  description = "Resource group that will hold the search service and its private endpoint."
  type        = string
}

variable "location" {
  description = "Azure region (e.g. southeastasia, centralindia)."
  type        = string
}

variable "sku" {
  description = "Pricing tier: free, basic, standard, standard2, standard3, storage_optimized_l1, or storage_optimized_l2. Immutable after creation."
  type        = string
  default     = "standard"

  validation {
    condition = contains([
      "free", "basic", "standard", "standard2", "standard3",
      "storage_optimized_l1", "storage_optimized_l2"
    ], lower(var.sku))
    error_message = "sku must be one of free, basic, standard, standard2, standard3, storage_optimized_l1, storage_optimized_l2."
  }
}

variable "replica_count" {
  description = "Number of replicas. >=2 is required for read (query) SLA, >=3 for read+write SLA. Free/basic cap at 3."
  type        = number
  default     = 3

  validation {
    condition     = var.replica_count >= 1 && var.replica_count <= 12
    error_message = "replica_count must be between 1 and 12."
  }
}

variable "partition_count" {
  description = "Number of partitions (index storage/scale). Allowed values: 1, 2, 3, 4, 6, 12. Immutable on free/basic SKUs."
  type        = number
  default     = 1
}

variable "hosting_mode" {
  description = "default, or highDensity (standard3 SKU only) for up to 1,000 indexes per service."
  type        = string
  default     = "default"

  validation {
    condition     = contains(["default", "highDensity"], var.hosting_mode)
    error_message = "hosting_mode must be either default or highDensity."
  }
}

variable "public_network_access_enabled" {
  description = "Allow the public endpoint. Set false when fronting the service with a private endpoint."
  type        = bool
  default     = false
}

variable "local_authentication_enabled" {
  description = "Allow admin/query API keys. false = RBAC (Azure AD) authentication only, which is the recommended posture."
  type        = bool
  default     = false
}

variable "authentication_failure_mode" {
  description = "When local auth is enabled, setting this also allows RBAC tokens and selects the response for failed authentication: http401WithBearerChallenge or http403. Ignored when local auth is disabled."
  type        = string
  default     = "http403"

  validation {
    condition     = contains(["http401WithBearerChallenge", "http403"], var.authentication_failure_mode)
    error_message = "authentication_failure_mode must be http401WithBearerChallenge or http403."
  }
}

variable "identity_type" {
  description = "Managed identity for the service: SystemAssigned, UserAssigned, or 'SystemAssigned, UserAssigned'."
  type        = string
  default     = "SystemAssigned"

  validation {
    condition     = contains(["SystemAssigned", "UserAssigned", "SystemAssigned, UserAssigned"], var.identity_type)
    error_message = "identity_type must be SystemAssigned, UserAssigned, or 'SystemAssigned, UserAssigned'."
  }
}

variable "identity_ids" {
  description = "User-assigned identity resource IDs. Required when identity_type includes UserAssigned."
  type        = list(string)
  default     = []
}

variable "allowed_ips" {
  description = "IPv4 addresses or CIDR ranges permitted to reach the public endpoint. Only applied when public_network_access_enabled is true."
  type        = list(string)
  default     = []
}

variable "private_endpoint_subnet_id" {
  description = "Subnet ID for the private endpoint. When null, no private endpoint is created."
  type        = string
  default     = null
}

variable "private_dns_zone_ids" {
  description = "Private DNS zone IDs (privatelink.search.windows.net) to bind the private endpoint A record. Null skips DNS integration."
  type        = list(string)
  default     = null
}

variable "log_analytics_workspace_id" {
  description = "Log Analytics workspace ID for diagnostic settings. When null, no diagnostics are configured."
  type        = string
  default     = null
}

variable "tags" {
  description = "Tags applied to the search service and private endpoint."
  type        = map(string)
  default     = {}
}
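The validation rules above can be exercised without touching Azure. The following is a sketch only, assuming Terraform 1.7 or later (for mock_provider) and a hypothetical file tests/validation.tftest.hcl that is not part of the module structure shown earlier; the resource group and names are placeholders.

```hcl
# tests/validation.tftest.hcl (hypothetical file; needs Terraform >= 1.7)

# Mock the provider so `terraform test` needs no Azure credentials.
mock_provider "azurerm" {}

# Shared placeholder inputs for every run block.
variables {
  resource_group_name = "rg-test"
  location            = "centralindia"
}

run "rejects_invalid_name" {
  command = plan

  variables {
    name = "Bad_Name" # uppercase and underscore are not allowed
  }

  # The plan is expected to fail on the name variable's validation rule.
  expect_failures = [var.name]
}

run "rejects_unsupported_partition_count" {
  command = plan

  variables {
    name            = "valid-name"
    partition_count = 5 # not one of 1, 2, 3, 4, 6, 12
  }

  # The failing check here is the precondition on the search service resource.
  expect_failures = [azurerm_search_service.this]
}
```

Run it with terraform test from the module directory.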

outputs.tf

output "id" {
  description = "Resource ID of the Azure AI Search service."
  value       = azurerm_search_service.this.id
}

output "name" {
  description = "Name of the search service."
  value       = azurerm_search_service.this.name
}

output "endpoint" {
  description = "HTTPS query/management endpoint for the service."
  value       = "https://${azurerm_search_service.this.name}.search.windows.net"
}

output "identity_principal_id" {
  description = "Principal ID of the system-assigned identity, for granting it data-source RBAC roles. Null when no system identity is enabled."
  value       = try(azurerm_search_service.this.identity[0].principal_id, null)
}

output "primary_key" {
  description = "Primary admin API key. Empty when local authentication is disabled (RBAC-only)."
  value       = azurerm_search_service.this.primary_key
  sensitive   = true
}

output "query_keys" {
  description = "List of read-only query keys (key + name). Empty when local authentication is disabled."
  value       = azurerm_search_service.this.query_keys
  sensitive   = true
}

output "private_endpoint_ip" {
  description = "Private IP allocated to the search service private endpoint, or null when no private endpoint was created."
  value       = try(azurerm_private_endpoint.this[0].private_service_connection[0].private_ip_address, null)
}

How to use it

module "ai_search" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-search-service?ref=v1.0.0"

  name                = "kv-rag-search-prod"
  resource_group_name = azurerm_resource_group.platform.name
  location            = "centralindia"

  # Standard tier, sized for SLA: 3 replicas (read+write SLA), 2 partitions.
  sku             = "standard"
  replica_count   = 3
  partition_count = 2

  # Locked down: no public endpoint, RBAC-only (no admin/query keys).
  public_network_access_enabled = false
  local_authentication_enabled  = false

  # System identity so the indexer can pull from Blob and call Azure OpenAI.
  identity_type = "SystemAssigned"

  # Private endpoint into the app subnet, with DNS integration.
  private_endpoint_subnet_id = azurerm_subnet.app.id
  private_dns_zone_ids       = [azurerm_private_dns_zone.search.id]

  log_analytics_workspace_id = azurerm_log_analytics_workspace.platform.id

  tags = {
    environment = "prod"
    workload    = "rag"
    owner       = "platform-team"
  }
}

# Downstream: grant the search service's managed identity read access to the
# storage account holding documents to be indexed, using the module's output.
resource "azurerm_role_assignment" "search_reads_blob" {
  scope                = azurerm_storage_account.documents.id
  role_definition_name = "Storage Blob Data Reader"
  principal_id         = module.ai_search.identity_principal_id
}

# Downstream: surface the endpoint URL so callers (for example an App Service)
# can be configured to reach the service.
output "search_endpoint" {
  value = module.ai_search.endpoint
}
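The example above references azurerm_private_dns_zone.search without defining it. A minimal sketch of that supporting piece follows, assuming a virtual network named azurerm_virtual_network.platform (hypothetical, not shown in this article) that contains the app subnet; the link name is also a placeholder.

```hcl
# Hypothetical supporting resources; azurerm_virtual_network.platform is assumed.
resource "azurerm_private_dns_zone" "search" {
  name                = "privatelink.search.windows.net"
  resource_group_name = azurerm_resource_group.platform.name
}

# Link the zone to the VNet so workloads in it resolve the service's
# hostname to the private endpoint IP.
resource "azurerm_private_dns_zone_virtual_network_link" "search" {
  name                  = "search-dns-link" # placeholder name
  resource_group_name   = azurerm_resource_group.platform.name
  private_dns_zone_name = azurerm_private_dns_zone.search.name
  virtual_network_id    = azurerm_virtual_network.platform.id
}
```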

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config, live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "azurerm"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...azurerm state bucket/container + key per path...
  }
}

2. Module config, live/prod/search_service/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-search-service?ref=v1.0.0"
}

inputs = {
  name                = "..."
  resource_group_name = "..."
  location            = "..."
}

3. Deploy one environment, or roll out all modules together:

cd live/prod/search_service && terragrunt apply   # this module only
cd live/prod && terragrunt run-all apply          # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
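To make the dependency orchestration concrete, here is a sketch of how the search service could consume a subnet from a sibling stack. The ../network stack and its private_endpoint_subnet_id output are hypothetical; only the module input of the same name comes from this article. The inputs block shown would replace the one in the module config above.

```hcl
# Added to live/prod/search_service/terragrunt.hcl.
# "../network" and its "private_endpoint_subnet_id" output are hypothetical.
dependency "network" {
  config_path = "../network"

  # Placeholder value so `terragrunt validate` and `plan` work before the
  # network stack has been applied.
  mock_outputs = {
    private_endpoint_subnet_id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/mock/providers/Microsoft.Network/virtualNetworks/mock/subnets/mock"
  }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  name                = "..."
  resource_group_name = "..."
  location            = "..."

  # Resolved from the network stack's state at apply time.
  private_endpoint_subnet_id = dependency.network.outputs.private_endpoint_subnet_id
}
```

With the dependency declared, run-all applies the network stack before the search service.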

Inputs

| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| name | string | - | Yes | Globally unique service name; 2-60 lowercase alphanumerics/hyphens, no leading/trailing hyphen. |
| resource_group_name | string | - | Yes | Resource group for the service and private endpoint. |
| location | string | - | Yes | Azure region (e.g. centralindia). |
| sku | string | "standard" | No | Tier: free, basic, standard, standard2, standard3, storage_optimized_l1, or storage_optimized_l2. Immutable. |
| replica_count | number | 3 | No | Replicas; >=2 for read SLA, >=3 for read+write SLA (1-12). |
| partition_count | number | 1 | No | Partitions for index scale; must be 1, 2, 3, 4, 6, or 12. |
| hosting_mode | string | "default" | No | default or highDensity (standard3 only). |
| public_network_access_enabled | bool | false | No | Allow the public endpoint; set false when using a private endpoint. |
| local_authentication_enabled | bool | false | No | Allow admin/query keys; false = RBAC-only. |
| authentication_failure_mode | string | "http403" | No | Response for failed authentication when local auth is on: http401WithBearerChallenge or http403. |
| identity_type | string | "SystemAssigned" | No | SystemAssigned, UserAssigned, or "SystemAssigned, UserAssigned". |
| identity_ids | list(string) | [] | No | User-assigned identity IDs; required when identity_type includes UserAssigned. |
| allowed_ips | list(string) | [] | No | IPv4/CIDR allowlist for the public endpoint (applied only when public access is on). |
| private_endpoint_subnet_id | string | null | No | Subnet for the private endpoint; null skips creation. |
| private_dns_zone_ids | list(string) | null | No | privatelink.search.windows.net DNS zone IDs for the endpoint A record. |
| log_analytics_workspace_id | string | null | No | Log Analytics workspace for diagnostics; null skips. |
| tags | map(string) | {} | No | Tags applied to all created resources. |

Outputs

| Name | Description |
|---|---|
| id | Resource ID of the search service. |
| name | Name of the search service. |
| endpoint | HTTPS endpoint URL (https://<name>.search.windows.net). |
| identity_principal_id | System-assigned identity principal ID for granting data-source RBAC; null if disabled. |
| primary_key | Primary admin API key (sensitive); empty under RBAC-only auth. |
| query_keys | Read-only query keys (sensitive); empty under RBAC-only auth. |
| private_endpoint_ip | Private IP of the search private endpoint; null if none created. |

Enterprise scenario

A retail bank builds an internal “policy copilot” that lets relationship managers ask plain-language questions over thousands of product and compliance PDFs. The platform team uses this module to deploy one standard AI Search service per environment with public_network_access_enabled = false and local_authentication_enabled = false, so the only path to the index is a private endpoint from the AKS cluster running the RAG API, and the only credential is an Azure AD token. The module’s identity_principal_id output is fed straight into role assignments granting the search service’s managed identity Storage Blob Data Reader on the document store and Cognitive Services OpenAI User on the embedding model — meaning no admin key, query key, or storage connection string exists anywhere in the pipeline. When indexing volume grew, they bumped partition_count from 2 to 4 and replica_count to 4 in one PR, keeping the read+write SLA intact.
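The role wiring in this scenario might be sketched as follows. The resource names azurerm_cognitive_account.openai (the Azure OpenAI account) and azurerm_user_assigned_identity.rag_api (the identity the RAG API runs as on AKS) are assumptions for illustration; the scenario does not say how the bank models them. The second assignment, which lets the API query the index with a token, is likewise an assumed detail implied by RBAC-only access.

```hcl
# Hypothetical names: azurerm_cognitive_account.openai is the Azure OpenAI
# account, azurerm_user_assigned_identity.rag_api is the AKS workload identity.

# Search service identity -> may call the embedding model during indexing.
resource "azurerm_role_assignment" "search_calls_openai" {
  scope                = azurerm_cognitive_account.openai.id
  role_definition_name = "Cognitive Services OpenAI User"
  principal_id         = module.ai_search.identity_principal_id
}

# RAG API identity -> may query index documents (data plane, read-only).
resource "azurerm_role_assignment" "rag_api_queries_search" {
  scope                = module.ai_search.id
  role_definition_name = "Search Index Data Reader"
  principal_id         = azurerm_user_assigned_identity.rag_api.principal_id
}
```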

Best practices


Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.
