Quick take — A reusable Terraform module for Azure Elastic SAN on azurerm ~> 4.0: provision the SAN, size base/extra TiB, carve volume groups with private endpoints, and expose iSCSI targets to your VMs and AKS. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.
Quickstart (copy-paste)
Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):
provider "azurerm" {
features {}
}
module "elastic_san" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-elastic-san?ref=v1.0.0"
name = "..." # Name of the Elastic SAN (3-63 chars, lowercase alphanum…
resource_group_name = "..." # Resource group that will hold the SAN.
location = "..." # Azure region (Elastic SAN is region-limited).
base_size_in_tib = 1 # Provisioned base capacity in TiB (carries performance),…
volume_groups = {} # Volume groups with per-group network/encryption posture…
}
Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.
What this module is
Azure Elastic SAN is a fully managed, cloud-native storage area network. Instead of attaching one managed disk per VM, you provision a single SAN appliance with a pool of capacity, slice that pool into volume groups, and carve individual volumes that are exposed as iSCSI targets. Multiple compute clients — IaaS VMs, VMSS, or AKS nodes via the CSI driver — connect to those targets over the storage network. The big wins are shared throughput/IOPS that can be dynamically distributed across volumes, large scale (hundreds of TiB from one resource), and a billing model split into base capacity (provisioned performance) and additional capacity (cheaper, capacity-only).
The raw provider surface is fiddly: capacity is expressed in TiB with a minimum base size, SKU encodes both tier and redundancy (Premium_LRS vs Premium_ZRS), volume groups own the network ACLs and encryption settings, and each volume’s size_in_gib interacts with the SAN’s total provisioned TiB. Wrapping azurerm_elastic_san, azurerm_elastic_san_volume_group, and azurerm_elastic_san_volume in one module gives you validated inputs (no more “ZRS is only valid in some regions” surprises), consistent naming, optional private-endpoint lockdown per volume group, and clean outputs (the SAN id, volume ids, and the iSCSI target IQNs) that downstream compute can consume directly.
When to use it
- You need shared, scalable block storage for clustered workloads (SQL FCI, SAP, Kafka/Cassandra) where many nodes attach iSCSI LUNs from one managed pool.
- You are consolidating dozens of individual premium managed disks and want a single capacity pool with dynamically shared performance and simpler cost accounting (base TiB + additional TiB).
- You run AKS and want persistent volumes backed by Elastic SAN via the CSI driver, with the SAN’s network locked to the cluster’s subnets.
- You want zone-redundant block storage (Premium_ZRS) without managing replication yourself, and need it reproducible across dev/test/prod via code.
- You do not need Elastic SAN if a handful of standalone managed disks already satisfy your IOPS/capacity: the SAN's base capacity has a minimum spend that only pays off at scale.
Module structure
terraform-module-azure-elastic-san/
├── versions.tf # provider + Terraform version pins
├── main.tf # elastic_san + volume_group(s) + volume(s) + optional private endpoints
├── variables.tf # var-driven inputs with validation
└── outputs.tf # SAN id/name, volume group ids, volume ids + iSCSI targets
# versions.tf
terraform {
required_version = ">= 1.5.0"
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = "~> 4.0"
}
}
}
# main.tf
# Redundancy (LRS vs ZRS) is encoded in the sku_name variable; zones only apply to LRS.
# Elastic SAN base capacity is measured in TiB and carries the provisioned
# performance; extended (additional) capacity is capacity-only and cheaper.
resource "azurerm_elastic_san" "this" {
name = var.name
resource_group_name = var.resource_group_name
location = var.location
base_size_in_tib = var.base_size_in_tib
extended_size_in_tib = var.extended_size_in_tib
sku {
name = var.sku_name
tier = var.sku_tier
}
# Pin availability zones for LRS SANs that are zone-aware.
# For Premium_ZRS leave this null (the platform spreads across zones).
zones = var.sku_name == "Premium_ZRS" ? null : var.zones
tags = var.tags
}
resource "azurerm_elastic_san_volume_group" "this" {
for_each = var.volume_groups
name = each.key
elastic_san_id = azurerm_elastic_san.this.id
# iSCSI is the supported protocol type today.
protocol_type = "Iscsi"
# Platform-managed or customer-managed key encryption.
encryption_type = each.value.encryption_key_url == null ? "EncryptionAtRestWithPlatformKey" : "EncryptionAtRestWithCustomerManagedKey"
dynamic "encryption" {
for_each = each.value.encryption_key_url == null ? [] : [each.value]
content {
key_vault_key_id = encryption.value.encryption_key_url
user_assigned_identity_id = encryption.value.identity_id
}
}
dynamic "identity" {
for_each = each.value.identity_id == null ? [] : [each.value.identity_id]
content {
type = "UserAssigned"
identity_ids = [identity.value]
}
}
# Default-deny network rules; only the listed subnets may reach the targets.
dynamic "network_rule" {
for_each = each.value.allowed_subnet_ids
content {
subnet_id = network_rule.value
action = "Allow"
}
}
}
resource "azurerm_elastic_san_volume" "this" {
for_each = local.volumes
name = each.value.volume_name
volume_group_id = azurerm_elastic_san_volume_group.this[each.value.group_name].id
size_in_gib = each.value.size_in_gib
# Optionally seed a new volume from a disk or snapshot source.
dynamic "create_source" {
for_each = each.value.create_source_id == null ? [] : [each.value]
content {
source_type = create_source.value.create_source_type
source_id = create_source.value.create_source_id
}
}
}
# Private endpoint per volume group for full network isolation (no public path).
resource "azurerm_private_endpoint" "this" {
for_each = {
for k, v in var.volume_groups : k => v
if v.private_endpoint_subnet_id != null
}
name = "pe-${var.name}-${each.key}"
resource_group_name = var.resource_group_name
location = var.location
subnet_id = each.value.private_endpoint_subnet_id
private_service_connection {
name = "psc-${var.name}-${each.key}"
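# The private link resource is the Elastic SAN itself; the volume group is
# selected by name as the subresource (group ID). Referencing the volume
# group resource here also keeps the dependency ordering correct.
private_connection_resource_id = azurerm_elastic_san.this.id
subresource_names = [azurerm_elastic_san_volume_group.this[each.key].name]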
is_manual_connection = false
}
tags = var.tags
}
locals {
# Flatten volume_groups -> volumes into a single addressable map.
volumes = merge([
for group_name, group in var.volume_groups : {
for vol_name, vol in group.volumes :
"${group_name}/${vol_name}" => {
group_name = group_name
volume_name = vol_name
size_in_gib = vol.size_in_gib
create_source_id = vol.create_source_id
create_source_type = vol.create_source_type
}
}
]...)
}
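To make the flattening step concrete, here is a minimal standalone sketch that applies the same merge/for expression to a hard-coded sample. The names sample_groups, sample_volumes, and sample_volume_keys are illustrative only and are not part of the module.

```hcl
# Standalone sketch: the same flattening pattern as local.volumes,
# applied to a hard-coded sample instead of var.volume_groups.
locals {
  sample_groups = {
    "vg-sqlcluster" = {
      volumes = {
        "data01" = { size_in_gib = 2048 }
        "log01"  = { size_in_gib = 512 }
      }
    }
  }

  # One inner map per group, merged into a single map keyed "group/volume".
  sample_volumes = merge([
    for group_name, group in local.sample_groups : {
      for vol_name, vol in group.volumes :
      "${group_name}/${vol_name}" => {
        group_name  = group_name
        volume_name = vol_name
        size_in_gib = vol.size_in_gib
      }
    }
  ]...)
}

# Evaluates to ["vg-sqlcluster/data01", "vg-sqlcluster/log01"]
output "sample_volume_keys" {
  value = keys(local.sample_volumes)
}
```

The composite "group/volume" key is what lets a single for_each address every volume while still resolving its parent volume group through group_name.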
# variables.tf
variable "name" {
type = string
description = "Name of the Elastic SAN resource."
validation {
condition = can(regex("^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$", var.name))
error_message = "name must be 3-63 chars, lowercase alphanumeric and hyphens, starting/ending alphanumeric."
}
}
variable "resource_group_name" {
type = string
description = "Resource group that will hold the Elastic SAN."
}
variable "location" {
type = string
description = "Azure region. Elastic SAN is only available in a subset of regions."
}
variable "base_size_in_tib" {
type = number
description = "Provisioned base capacity in TiB (carries the SAN performance). Minimum 1."
validation {
condition = var.base_size_in_tib >= 1 && var.base_size_in_tib <= 100
error_message = "base_size_in_tib must be between 1 and 100 TiB."
}
}
variable "extended_size_in_tib" {
type = number
default = 0
description = "Additional capacity-only TiB on top of base (cheaper, no extra performance). 0 to disable."
validation {
condition = var.extended_size_in_tib >= 0 && var.extended_size_in_tib <= 900
error_message = "extended_size_in_tib must be between 0 and 900 TiB."
}
}
variable "sku_name" {
type = string
default = "Premium_LRS"
description = "SKU name encoding tier + redundancy: Premium_LRS or Premium_ZRS."
validation {
condition = contains(["Premium_LRS", "Premium_ZRS"], var.sku_name)
error_message = "sku_name must be Premium_LRS or Premium_ZRS."
}
}
variable "sku_tier" {
type = string
default = "Premium"
description = "SKU tier. Premium is the supported tier for Elastic SAN."
validation {
condition = contains(["Premium"], var.sku_tier)
error_message = "sku_tier must be Premium."
}
}
variable "zones" {
type = list(string)
default = null
description = "Availability zones for an LRS SAN (e.g. [\"1\"]). Leave null for Premium_ZRS or region-default placement."
validation {
condition = var.zones == null ? true : alltrue([for z in var.zones : contains(["1", "2", "3"], z)])
error_message = "zones may only contain \"1\", \"2\", or \"3\"."
}
}
variable "volume_groups" {
description = "Map of volume groups. Each owns its network/encryption posture and a map of volumes."
type = map(object({
allowed_subnet_ids = optional(list(string), [])
private_endpoint_subnet_id = optional(string)
encryption_key_url = optional(string)
identity_id = optional(string)
volumes = map(object({
size_in_gib = number
create_source_id = optional(string)
create_source_type = optional(string, "Disk")
}))
}))
validation {
condition = alltrue([
for g in values(var.volume_groups) : alltrue([
for v in values(g.volumes) : v.size_in_gib >= 1 && v.size_in_gib <= 65536
])
])
error_message = "Every volume size_in_gib must be between 1 and 65536 GiB."
}
validation {
condition = alltrue([
for g in values(var.volume_groups) :
g.encryption_key_url == null || g.identity_id != null
])
error_message = "A volume group with a customer-managed encryption_key_url must also set identity_id."
}
}
variable "tags" {
type = map(string)
default = {}
description = "Tags applied to the Elastic SAN and private endpoints."
}
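As an illustration of the customer-managed-key validation above, a volume group input might look like the following sketch. The resources azurerm_key_vault_key.esan, azurerm_user_assigned_identity.esan, azurerm_subnet.db, and azurerm_resource_group.storage are assumed to exist in the calling configuration, and the names and sizes are placeholders. The identity is also assumed to already have permission to use the key.

```hcl
# Hypothetical caller: one volume group encrypted with a customer-managed key.
module "elastic_san_regulated" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-elastic-san?ref=v1.0.0"

  name                = "esan-prod-ledger-weu"
  resource_group_name = azurerm_resource_group.storage.name
  location            = "westeurope"
  base_size_in_tib    = 2

  volume_groups = {
    "vg-regulated" = {
      allowed_subnet_ids = [azurerm_subnet.db.id]

      # Both values must be set together; the module's validation rejects
      # encryption_key_url without identity_id.
      encryption_key_url = azurerm_key_vault_key.esan.id
      identity_id        = azurerm_user_assigned_identity.esan.id

      volumes = {
        "ledger01" = { size_in_gib = 1024 }
      }
    }
  }
}
```

Dropping identity_id from this input should fail at plan time with the "must also set identity_id" error rather than at apply time.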
# outputs.tf
output "id" {
description = "Resource ID of the Elastic SAN."
value = azurerm_elastic_san.this.id
}
output "name" {
description = "Name of the Elastic SAN."
value = azurerm_elastic_san.this.name
}
output "total_size_in_tib" {
description = "Total provisioned capacity (base + extended) in TiB."
value = azurerm_elastic_san.this.total_size_in_tib
}
output "total_volume_size_in_gib" {
description = "Total size of all volumes currently provisioned across the SAN, in GiB."
value = azurerm_elastic_san.this.total_volume_size_in_gib
}
output "volume_group_ids" {
description = "Map of volume group name => volume group resource ID."
value = { for k, vg in azurerm_elastic_san_volume_group.this : k => vg.id }
}
output "volume_ids" {
description = "Map of \"group/volume\" => volume resource ID."
value = { for k, v in azurerm_elastic_san_volume.this : k => v.id }
}
output "volume_iscsi_targets" {
description = "Map of \"group/volume\" => iSCSI target details (target IQN, portal hostname, portal port)."
value = {
for k, v in azurerm_elastic_san_volume.this : k => {
target_iqn = v.target_iqn
target_portal_host = v.target_portal_hostname
target_portal_port = v.target_portal_port
volume_id_for_mount = v.volume_id
}
}
}
output "private_endpoint_ids" {
description = "Map of volume group name => private endpoint resource ID (only for groups with a PE subnet)."
value = { for k, pe in azurerm_private_endpoint.this : k => pe.id }
}
How to use it
module "elastic_san" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-elastic-san?ref=v1.0.0"
name = "esan-prod-sql-weu"
resource_group_name = azurerm_resource_group.storage.name
location = "westeurope"
# 4 TiB of provisioned performance, +20 TiB cheaper capacity, zone-redundant.
base_size_in_tib = 4
extended_size_in_tib = 20
sku_name = "Premium_ZRS"
sku_tier = "Premium"
volume_groups = {
"vg-sqlcluster" = {
allowed_subnet_ids = [azurerm_subnet.db.id]
private_endpoint_subnet_id = azurerm_subnet.privatelink.id
volumes = {
"data01" = { size_in_gib = 2048 }
"data02" = { size_in_gib = 2048 }
"log01" = { size_in_gib = 512 }
}
}
}
tags = {
environment = "prod"
workload = "sql-fci"
owner = "platform-team"
}
}
# Downstream: hand the iSCSI target IQN of the data01 volume to a VM
# extension / cloud-init that runs `iscsiadm` to log in and mount the LUN.
output "sql_data01_iscsi_iqn" {
value = module.elastic_san.volume_iscsi_targets["vg-sqlcluster/data01"].target_iqn
}
resource "azurerm_role_assignment" "san_reader" {
scope = module.elastic_san.id
role_definition_name = "Reader"
principal_id = azuread_group.dba_team.object_id
}
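The hand-off to iscsiadm mentioned in the comment above could be sketched as follows. This assumes Linux clients with open-iscsi installed; the local names (data01, data01_portal, iscsi_login_script) are illustrative and not part of the module.

```hcl
# Sketch: render an iSCSI login script from the module's outputs.
locals {
  data01        = module.elastic_san.volume_iscsi_targets["vg-sqlcluster/data01"]
  data01_portal = "${local.data01.target_portal_host}:${local.data01.target_portal_port}"

  iscsi_login_script = <<-EOT
    #!/bin/bash
    set -euo pipefail
    # Register the target node record, then log in to it.
    iscsiadm -m node --targetname ${local.data01.target_iqn} --portal ${local.data01_portal} -o new
    iscsiadm -m node --targetname ${local.data01.target_iqn} --portal ${local.data01_portal} --login
  EOT
}
```

One way to deliver the script is to pass base64encode(local.iscsi_login_script) as custom_data on a Linux VM. Formatting and mounting the resulting block device is left to the workload.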
With Terragrunt
Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.
1. Root config — live/terragrunt.hcl (inherited by every module):
remote_state {
backend = "azurerm"
generate = { path = "backend.tf", if_exists = "overwrite" }
config = {
# ...azurerm state bucket/container + key per path...
}
}
2. Module config — live/prod/elastic_san/terragrunt.hcl:
include "root" {
path = find_in_parent_folders()
}
terraform {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-elastic-san?ref=v1.0.0"
}
inputs = {
name = "..."
resource_group_name = "..."
location = "..."
base_size_in_tib = 1
volume_groups = {}
}
3. Deploy one environment, or roll out all modules together:
(cd live/prod/elastic_san && terragrunt apply) # this module only
(cd live/prod && terragrunt run-all apply) # every module under live/prod
Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
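The cross-module dependency handling mentioned above can be sketched with a Terragrunt dependency block. The sibling ../network module and its outputs db_subnet_id and privatelink_subnet_id are hypothetical; substitute whatever your networking module actually exposes.

```hcl
# live/prod/elastic_san/terragrunt.hcl (sketch with a dependency on a network module)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-elastic-san?ref=v1.0.0"
}

# Hypothetical sibling module that creates the subnets.
dependency "network" {
  config_path = "../network"
}

inputs = {
  name                = "esan-prod-sql-weu"
  resource_group_name = "..."
  location            = "westeurope"
  base_size_in_tib    = 4

  volume_groups = {
    "vg-sqlcluster" = {
      allowed_subnet_ids         = [dependency.network.outputs.db_subnet_id]
      private_endpoint_subnet_id = dependency.network.outputs.privatelink_subnet_id
      volumes = {
        "data01" = { size_in_gib = 2048 }
      }
    }
  }
}
```

With this in place, run-all applies the network module before the SAN because the dependency block declares the ordering.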
Inputs
| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| name | string | n/a | Yes | Name of the Elastic SAN (3-63 chars, lowercase alphanumeric + hyphens). |
| resource_group_name | string | n/a | Yes | Resource group that will hold the SAN. |
| location | string | n/a | Yes | Azure region (Elastic SAN is region-limited). |
| base_size_in_tib | number | n/a | Yes | Provisioned base capacity in TiB (carries performance), 1-100. |
| extended_size_in_tib | number | 0 | No | Additional capacity-only TiB (cheaper), 0-900. |
| sku_name | string | "Premium_LRS" | No | Premium_LRS or Premium_ZRS (tier + redundancy). |
| sku_tier | string | "Premium" | No | SKU tier; Premium is supported. |
| zones | list(string) | null | No | Zones for an LRS SAN (e.g. ["1"]); null for ZRS/region default. |
| volume_groups | map(object) | n/a | Yes | Volume groups with per-group network/encryption posture and a map of volumes (size_in_gib, optional create_source_id/create_source_type). |
| tags | map(string) | {} | No | Tags for the SAN and private endpoints. |
Outputs
| Name | Description |
|---|---|
| id | Resource ID of the Elastic SAN. |
| name | Name of the Elastic SAN. |
| total_size_in_tib | Total provisioned capacity (base + extended) in TiB. |
| total_volume_size_in_gib | Total size of all provisioned volumes in GiB. |
| volume_group_ids | Map of volume group name => volume group resource ID. |
| volume_ids | Map of "group/volume" => volume resource ID. |
| volume_iscsi_targets | Map of "group/volume" => iSCSI target details (IQN, portal host, portal port, volume id). |
| private_endpoint_ids | Map of volume group name => private endpoint resource ID. |
Enterprise scenario
A financial-services platform team migrates a SQL Server Failover Cluster Instance off a sprawl of 30+ individual P40 managed disks onto a single zone-redundant Elastic SAN. They provision a 4 TiB base + 20 TiB extended Premium_ZRS SAN, expose three volumes (two data, one log) from a vg-sqlcluster volume group locked to the database subnet via a private endpoint, and let both cluster nodes attach the same LUNs over iSCSI. Shared performance across the pool absorbs end-of-month reporting spikes that previously required over-provisioning every disk, and the extended (capacity-only) TiB keeps the cold archive volumes cheap while staying inside the same SAN and the same Terraform state.
Best practices
- Lock down the network per volume group. Always set allowed_subnet_ids and, for production, a private_endpoint_subnet_id so iSCSI targets are unreachable from the public internet; the module defaults volume groups to deny-by-default network rules.
- Right-size base vs extended capacity. Base TiB carries the SAN's IOPS/throughput and is the expensive part; put steady high-performance workloads against base and push cold/capacity-only data into extended_size_in_tib to cut cost without a second resource.
- Choose redundancy by region and RTO. Use Premium_ZRS for synchronous zone resilience where the region supports it; only fall back to Premium_LRS with pinned zones when you intentionally co-locate the SAN with single-zone compute to avoid cross-zone latency.
- Use customer-managed keys for regulated data. Pass encryption_key_url plus a user-assigned identity_id on sensitive volume groups; the module's validation rejects a CMK URL without an identity so you never deploy a half-configured key setup.
- Name for scale and observability. Follow a convention like esan-<env>-<workload>-<region> for the SAN and keep group/volume keys stable (vg-sqlcluster, data01), since those keys flow straight into the volume_iscsi_targets output that compute consumes. Renaming a key later makes Terraform replace the resource and forces LUN re-attachment.
- Watch the capacity ceiling. Monitor total_volume_size_in_gib against total_size_in_tib; you cannot over-allocate volumes beyond provisioned capacity, so wire an alert before volumes fill the pool and block new LUNs.
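The capacity-ceiling practice can also be caught at plan time. The following is a minimal sketch, not part of the published module, that assumes it is placed inside the module next to variables.tf; the names requested_volume_gib, provisioned_gib, and volumes_fit_in_san are illustrative. It relies on check blocks, which the pinned Terraform version (>= 1.5.0) supports.

```hcl
# Sketch: warn when the requested volumes exceed the provisioned SAN capacity.
locals {
  # Sum of every requested volume size. The leading 0 keeps sum() valid
  # when no volumes are defined.
  requested_volume_gib = sum(concat([0], flatten([
    for g in values(var.volume_groups) : [
      for v in values(g.volumes) : v.size_in_gib
    ]
  ])))

  # Base + extended capacity, converted from TiB to GiB.
  provisioned_gib = (var.base_size_in_tib + var.extended_size_in_tib) * 1024
}

check "volumes_fit_in_san" {
  assert {
    condition     = local.requested_volume_gib <= local.provisioned_gib
    error_message = "Requested volumes (${local.requested_volume_gib} GiB) exceed provisioned SAN capacity (${local.provisioned_gib} GiB)."
  }
}
```

For the usage example earlier (4 TiB base + 20 TiB extended, volumes of 2048 + 2048 + 512 GiB), the check compares 4608 GiB against 24576 GiB and passes. A failing check block is reported as a warning and does not stop the apply; if a hard failure is preferred, the same condition could go into a variable validation or a resource precondition.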