
Terraform Module: Azure Storage Sync (File Sync) — Centralise file shares in Azure with a VNet-locked sync service

Quick take — A reusable hashicorp/azurerm ~> 4.0 Terraform module for Azure Storage Sync (File Sync): a sync service with traffic policy, sync groups, and cloud endpoints wiring Azure file shares to your on-prem servers. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "azurerm" {
  features {}
}

module "storage_sync" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-storage-sync?ref=v1.0.0"

  name                = "..."  # Storage Sync Service name; 1-260 chars, starts with a l…
  resource_group_name = "..."  # Resource group to create the sync service in.
  location            = "..."  # Azure region (e.g. `centralindia`).
}

Then run terraform init && terraform apply. Every other input has a sensible default; see Inputs below to override behaviour.

What this module is

Azure File Sync turns an Azure file share into the authoritative copy of a Windows file server’s data, then caches a hot subset of that data on each registered server through cloud tiering. The control-plane object that anchors the whole thing is the Storage Sync Service (azurerm_storage_sync) — a regional resource that owns one or more sync groups (azurerm_storage_sync_group), each of which binds exactly one cloud endpoint (an Azure file share, azurerm_storage_sync_cloud_endpoint) to many server endpoints (paths on your registered file servers).

The footprint is small, but the wiring is fiddly and easy to get wrong: the cloud endpoint needs the storage account’s full resource ID and — when the storage account and the Sync Service live in different tenants or the Microsoft.StorageSync first-party app hasn’t been granted access — an explicit storage_account_tenant_id. Forget to lock incoming_traffic_policy down and your sync service accepts registration traffic from the public internet rather than only your private network.
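To make the "private network only" point concrete, here is a minimal sketch of a private endpoint in front of the sync service. It assumes the Quickstart's module "storage_sync" and its id output, an existing subnet passed in through a hypothetical variable, and the "afs" private-link sub-resource that Azure File Sync uses. The endpoint names are illustrative, and private DNS (for example a privatelink.afs.azure.net zone) is left out.

```hcl
# Sketch only: gives registered servers a private path to the sync service
# when incoming_traffic_policy = "AllowVirtualNetworksOnly".
# var.private_endpoint_subnet_id and both endpoint names are hypothetical.
variable "private_endpoint_subnet_id" {
  description = "ID of an existing subnet that will host the private endpoint."
  type        = string
}

resource "azurerm_private_endpoint" "storage_sync" {
  name                = "pe-storage-sync"
  resource_group_name = "..." # same placeholders as the Quickstart
  location            = "..."
  subnet_id           = var.private_endpoint_subnet_id

  private_service_connection {
    name                           = "psc-storage-sync"
    private_connection_resource_id = module.storage_sync.id
    subresource_names              = ["afs"] # Azure File Sync sub-resource
    is_manual_connection           = false
  }
}
```

Keeping the endpoint outside the module is one possible design: the module stays network-agnostic and the platform team decides which subnets may reach the service.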

This module wraps azurerm_storage_sync and its cloud-side children behind a single variable-driven interface. A platform team provisions a secure-by-default sync service (incoming_traffic_policy = "AllowVirtualNetworksOnly", with sync groups and cloud endpoints declared as entries in the sync_groups map) and hands application and server teams the one thing they actually need: the sync service ID to register their servers against. Terraform owns the cloud-side topology; per-server agent registration, which depends on a registered_server_id that only exists after the agent is installed, stays out of scope.

When to use it

Reach for the raw resources only for a one-off lab. Note that the server endpoint itself is deliberately not in this module: it requires a registered_server_id produced by installing the Azure File Sync agent on a server and registering it — a day-2, out-of-band step that does not belong in the same apply that stands up the cloud topology.
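For orientation, the day-2 step could look roughly like the sketch below, kept in a separate configuration from the module. It assumes the server is already registered and that its ID, plus one of the module's sync_group_ids values, are passed in as variables. The variable names, endpoint name, local path, and the 20% free-space setting are all illustrative.

```hcl
# Day-2 sketch, applied separately from the module.
# All names and values here are hypothetical.
variable "sync_group_id" {
  description = "A sync group ID, e.g. one value from the module's sync_group_ids output."
  type        = string
}

variable "registered_server_id" {
  description = "ID of a server that has already been registered with the sync service."
  type        = string
}

resource "azurerm_storage_sync_server_endpoint" "finance" {
  name                  = "finance-server-endpoint"
  storage_sync_group_id = var.sync_group_id
  registered_server_id  = var.registered_server_id
  server_local_path     = "D:\\Finance" # path on the registered Windows server

  # Cloud tiering: keep hot files local and tier the rest to the Azure file share.
  cloud_tiering_enabled     = true
  volume_free_space_percent = 20
}
```

Because registered_server_id only exists after the agent is installed, this configuration can only be planned once registration has happened, which is why it sits outside the module's apply.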

Module structure

terraform-module-azure-storage-sync/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf

versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}

main.tf

# The Storage Sync Service: the regional control-plane object that owns sync
# groups and against which file servers register.
resource "azurerm_storage_sync" "this" {
  name                = var.name
  resource_group_name = var.resource_group_name
  location            = var.location

  # AllowVirtualNetworksOnly rejects server registration / sync traffic from the
  # public internet, accepting it only over private networking. Lock this down
  # in production; AllowAllTraffic is the permissive default we override.
  incoming_traffic_policy = var.incoming_traffic_policy

  tags = var.tags
}

# Sync groups: each group ties ONE Azure file share (cloud endpoint) to MANY
# server endpoints. A sync service can host many groups (e.g. one per share set).
resource "azurerm_storage_sync_group" "this" {
  for_each = var.sync_groups

  name            = each.key
  storage_sync_id = azurerm_storage_sync.this.id
}

# Cloud endpoints: the Azure file share side of a sync group. Exactly one per
# group. storage_account_tenant_id is only required for cross-tenant scenarios
# or when the Microsoft.StorageSync app lacks default access to the account.
resource "azurerm_storage_sync_cloud_endpoint" "this" {
  for_each = var.sync_groups

  name                  = "${each.key}-cloud-endpoint"
  storage_sync_group_id = azurerm_storage_sync_group.this[each.key].id

  file_share_name           = each.value.file_share_name
  storage_account_id        = each.value.storage_account_id
  storage_account_tenant_id = each.value.storage_account_tenant_id
}

variables.tf

variable "name" {
  description = "Name of the Storage Sync Service. 1-260 chars; must start with a letter or number; letters, numbers, spaces, and . - _ are allowed."
  type        = string

  validation {
    condition     = can(regex("^[A-Za-z0-9][A-Za-z0-9 ._-]{0,259}$", var.name))
    error_message = "name must be 1-260 characters and start with a letter or number (letters, numbers, spaces, '.', '-', '_')."
  }
}

variable "resource_group_name" {
  description = "Name of the resource group in which to create the Storage Sync Service."
  type        = string
}

variable "location" {
  description = "Azure region for the Storage Sync Service (e.g. centralindia, eastus)."
  type        = string
}

variable "incoming_traffic_policy" {
  description = "Network policy for incoming sync/registration traffic: AllowAllTraffic or AllowVirtualNetworksOnly."
  type        = string
  default     = "AllowVirtualNetworksOnly"

  validation {
    condition     = contains(["AllowAllTraffic", "AllowVirtualNetworksOnly"], var.incoming_traffic_policy)
    error_message = "incoming_traffic_policy must be either AllowAllTraffic or AllowVirtualNetworksOnly."
  }
}

variable "sync_groups" {
  description = <<-EOT
    Map of sync groups to create, keyed by sync group name. Each group binds one
    Azure file share (the cloud endpoint) and is the target servers register into.
    - file_share_name:            name of an existing Azure file share in the storage account.
    - storage_account_id:         resource ID of the storage account hosting the file share.
    - storage_account_tenant_id:  tenant ID of the storage account; only needed for cross-tenant
                                  setups or when Microsoft.StorageSync lacks default access. Leave null otherwise.
  EOT
  type = map(object({
    file_share_name           = string
    storage_account_id        = string
    storage_account_tenant_id = optional(string)
  }))
  default = {}

  validation {
    condition = alltrue([
      for g in values(var.sync_groups) : can(regex("^/subscriptions/.+/storageAccounts/.+$", g.storage_account_id))
    ])
    error_message = "Each sync_groups[*].storage_account_id must be a full storage account resource ID (/subscriptions/.../storageAccounts/<name>)."
  }
}

variable "tags" {
  description = "Tags to apply to the Storage Sync Service."
  type        = map(string)
  default     = {}
}

outputs.tf

output "id" {
  description = "The resource ID of the Storage Sync Service. Register file servers against this."
  value       = azurerm_storage_sync.this.id
}

output "name" {
  description = "The name of the Storage Sync Service."
  value       = azurerm_storage_sync.this.name
}

output "registered_servers" {
  description = "List of registered server IDs currently associated with the Storage Sync Service."
  value       = azurerm_storage_sync.this.registered_servers
}

output "sync_group_ids" {
  description = "Map of sync group name => sync group resource ID."
  value       = { for k, g in azurerm_storage_sync_group.this : k => g.id }
}

output "cloud_endpoint_ids" {
  description = "Map of sync group name => cloud endpoint resource ID."
  value       = { for k, e in azurerm_storage_sync_cloud_endpoint.this : k => e.id }
}

How to use it

module "storage_sync_file_sync_branch" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-storage-sync?ref=v1.0.0"

  name                = "ss-kv-branch-prod-cin"
  resource_group_name = azurerm_resource_group.platform.name
  location            = azurerm_resource_group.platform.location

  # Reject public-internet registration; only private networking may sync.
  incoming_traffic_policy = "AllowVirtualNetworksOnly"

  sync_groups = {
    "finance-shares" = {
      file_share_name    = azurerm_storage_share.finance.name
      storage_account_id = azurerm_storage_account.files.id
    }
    "engineering-shares" = {
      file_share_name    = azurerm_storage_share.engineering.name
      storage_account_id = azurerm_storage_account.files.id
    }
  }

  tags = {
    environment = "production"
    owner       = "platform-team"
    costcenter  = "cc-1042"
  }
}

# Downstream: grant an admin group (or an automation principal) rights at the
# sync service scope, using the module's id output. Server endpoints are then
# created out-of-band after each agent is installed and registered against
# this same sync service.
# Note: the role below carries Azure file share data actions, so confirm it is
# the role you intend at this scope rather than on the storage account.
resource "azurerm_role_assignment" "sync_admin" {
  scope                = module.storage_sync_file_sync_branch.id
  role_definition_name = "Storage File Data Privileged Contributor"
  principal_id         = azuread_group.filesync_admins.object_id
}

# Feed the finance sync group's cloud endpoint ID into a monitoring/alert config.
output "finance_cloud_endpoint_id" {
  value = module.storage_sync_file_sync_branch.cloud_endpoint_ids["finance-shares"]
}
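The cross-tenant case mentioned earlier only changes one field per sync group. A minimal sketch, assuming a storage account that lives in another tenant; the module label, group name, share name, subscription GUID, and tenant GUID below are placeholders, not real values:

```hcl
# Sketch: a sync group whose storage account lives in a different tenant.
# Every identifier below is a made-up placeholder.
module "storage_sync_cross_tenant" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-storage-sync?ref=v1.0.0"

  name                = "..."
  resource_group_name = "..."
  location            = "..."

  sync_groups = {
    "partner-shares" = {
      file_share_name    = "partner-data"
      storage_account_id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-partner/providers/Microsoft.Storage/storageAccounts/stpartnerfiles"

      # Only set this for cross-tenant setups or when Microsoft.StorageSync
      # lacks default access to the account; omit it (null) otherwise.
      storage_account_tenant_id = "11111111-1111-1111-1111-111111111111"
    }
  }
}
```

The literal storage_account_id is written out in full because the module's validation rejects anything that is not a complete storage account resource ID.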

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config, live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "azurerm"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...azurerm state storage account/container + key per path...
  }
}

2. Module config, live/prod/storage_sync/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-storage-sync?ref=v1.0.0"
}

inputs = {
  name = "..."
  resource_group_name = "..."
  location = "..."
}

3. Deploy one environment, or roll out all modules together:

(cd live/prod/storage_sync && terragrunt apply)   # this module only
(cd live/prod && terragrunt run-all apply)        # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
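As an illustration of the cross-module dependency point, the module-level terragrunt.hcl could pull the storage account ID and share name from a sibling unit instead of hard-coding them. This is a sketch under assumptions: the ../storage_account unit and its storage_account_id and file_share_name outputs are hypothetical, and the mock values exist only so plan can run before that unit has been applied.

```hcl
# live/prod/storage_sync/terragrunt.hcl (sketch)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-storage-sync?ref=v1.0.0"
}

# Hypothetical sibling unit that creates the storage account and file share.
dependency "files" {
  config_path = "../storage_account"

  # Placeholder outputs used only while the dependency has no real state yet.
  mock_outputs = {
    storage_account_id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-mock/providers/Microsoft.Storage/storageAccounts/stmock"
    file_share_name    = "mock-share"
  }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  name                = "..."
  resource_group_name = "..."
  location            = "..."

  sync_groups = {
    "finance-shares" = {
      file_share_name    = dependency.files.outputs.file_share_name
      storage_account_id = dependency.files.outputs.storage_account_id
    }
  }
}
```

With the dependency declared, run-all applies the storage account unit before this one.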

Inputs

| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| name | string | | Yes | Storage Sync Service name; 1-260 chars, starts with a letter/number. |
| resource_group_name | string | | Yes | Resource group to create the sync service in. |
| location | string | | Yes | Azure region (e.g. centralindia). |
| incoming_traffic_policy | string | "AllowVirtualNetworksOnly" | No | Incoming traffic policy: AllowAllTraffic or AllowVirtualNetworksOnly. |
| sync_groups | map(object) | {} | No | Sync groups keyed by name; each defines file_share_name, storage_account_id, optional storage_account_tenant_id. |
| tags | map(string) | {} | No | Tags applied to the Storage Sync Service. |

Outputs

| Name | Description |
|---|---|
| id | The resource ID of the Storage Sync Service (register servers against this). |
| name | The name of the Storage Sync Service. |
| registered_servers | List of registered server IDs currently associated with the sync service. |
| sync_group_ids | Map of sync group name to sync group resource ID. |
| cloud_endpoint_ids | Map of sync group name to cloud endpoint resource ID. |

Enterprise scenario

A national retailer runs a Windows file server in each of 180 stores, every one of them previously backed up over the WAN to head office. They deploy this module once per region with incoming_traffic_policy = "AllowVirtualNetworksOnly" and a finance-shares sync group whose cloud endpoint points at a GZRS Azure file share. Each store’s server runs the File Sync agent, registers against the module’s id output, and gets a server endpoint with cloud tiering set to a 20% free-space policy — so the store keeps only its hot files locally while the authoritative copy lives in Azure. A failed store server is replaced by registering a new box against the same sync service and letting it rehydrate, turning a multi-day restore into an afternoon.
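One way to express "once per region" is a for_each on the module call. The sketch below assumes a hypothetical regions map keyed by Azure region name; the naming pattern, the "finance" share name, and the variable shape are illustrative and not part of the module.

```hcl
# Sketch: one sync service per region, driven by a map.
# var.regions, the naming pattern, and the share name are hypothetical.
variable "regions" {
  description = "Per-region settings, keyed by Azure region name (e.g. centralindia)."
  type = map(object({
    resource_group_name = string
    storage_account_id  = string # full resource ID of that region's storage account
  }))
}

module "storage_sync" {
  source   = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-storage-sync?ref=v1.0.0"
  for_each = var.regions

  name                = "ss-retail-${each.key}"
  resource_group_name = each.value.resource_group_name
  location            = each.key

  incoming_traffic_policy = "AllowVirtualNetworksOnly"

  sync_groups = {
    "finance-shares" = {
      file_share_name    = "finance"
      storage_account_id = each.value.storage_account_id
    }
  }
}

# Region => sync service ID, handed to store teams for agent registration.
output "sync_service_ids" {
  value = { for region, m in module.storage_sync : region => m.id }
}
```

Each store's agent then registers against the ID for its own region, and the server endpoints stay in the separate day-2 workflow.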

Best practices


Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.
