Quick take — A reusable hashicorp/azurerm ~> 4.0 Terraform module for Azure Storage Sync (File Sync): a sync service with traffic policy, sync groups, and cloud endpoints wiring Azure file shares to your on-prem servers. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.
Quickstart (copy-paste)
Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):
provider "azurerm" {
  features {}
}

module "storage_sync" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-storage-sync?ref=v1.0.0"

  name                = "..." # Storage Sync Service name; 1-260 chars, starts with a l…
  resource_group_name = "..." # Resource group to create the sync service in.
  location            = "..." # Azure region (e.g. `centralindia`).
}
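Then run terraform init && terraform apply. Note that azurerm 4.x requires a subscription ID, so export ARM_SUBSCRIPTION_ID (or set subscription_id in the provider block) before applying. Every other input has a sensible default; see Inputs below to override behaviour.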
What this module is
Azure File Sync turns an Azure file share into the authoritative copy of a Windows file server’s data, then caches a hot subset of that data on each registered server through cloud tiering. The control-plane object that anchors the whole thing is the Storage Sync Service (azurerm_storage_sync) — a regional resource that owns one or more sync groups (azurerm_storage_sync_group), each of which binds exactly one cloud endpoint (an Azure file share, azurerm_storage_sync_cloud_endpoint) to many server endpoints (paths on your registered file servers).
The footprint is small, but the wiring is fiddly and easy to get wrong: the cloud endpoint needs the storage account’s full resource ID and — when the storage account and the Sync Service live in different tenants or the Microsoft.StorageSync first-party app hasn’t been granted access — an explicit storage_account_tenant_id. Forget to lock incoming_traffic_policy down and your sync service accepts registration traffic from the public internet rather than only your private network.
This module wraps azurerm_storage_sync and its cloud-side children behind a single var-driven interface so a platform team provisions a secure-by-default sync service — incoming_traffic_policy = "AllowVirtualNetworksOnly", sync groups and cloud endpoints created as data — and hands application/server teams the one thing they actually need: the sync service ID to register their servers against. Terraform owns everything in the cloud; the per-server agent registration (which depends on a registered_server_id that only exists after the agent is installed) stays correctly out of scope.
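As a rough illustration of the cross-tenant case described above, the sketch below passes storage_account_tenant_id for one sync group. The two variables, the share name "partner", the resource group name, and the service name are all hypothetical placeholders, not values taken from the module.

```hcl
# Hypothetical inputs: supply the foreign storage account's resource ID and tenant ID.
variable "partner_storage_account_id" {
  type        = string
  description = "Full resource ID of a storage account that lives in another tenant."
}

variable "partner_tenant_id" {
  type        = string
  description = "Tenant ID that owns the storage account above."
}

module "storage_sync_cross_tenant" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-storage-sync?ref=v1.0.0"

  name                = "ss-kv-partner-prod-cin" # placeholder name
  resource_group_name = "rg-platform-prod"       # placeholder resource group
  location            = "centralindia"

  sync_groups = {
    "partner-shares" = {
      file_share_name    = "partner" # assumed to exist already in the account
      storage_account_id = var.partner_storage_account_id

      # Only set this for cross-tenant accounts, or when the
      # Microsoft.StorageSync app lacks default access; omit it otherwise.
      storage_account_tenant_id = var.partner_tenant_id
    }
  }
}
```

Same-tenant groups simply leave storage_account_tenant_id out, and the optional attribute resolves to null.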
When to use it
- You are consolidating distributed Windows file servers (branch offices, remote sites) onto Azure Files and want the Azure-side topology — sync service, sync groups, cloud endpoints — defined once and reviewed in a PR.
- You run a hub-and-spoke or branch-cache pattern where many servers in many regions need to share one set of file shares, and you want incoming_traffic_policy pinned to private networks across the fleet.
- You need a repeatable lift-and-shift or DR target: an Azure file share kept continuously in sync with on-prem, ready to be the recovery copy.
- You are building a landing zone or service catalog where teams request a sync service + sync group through a module call rather than clicking through the portal.
Reach for the raw resources only for a one-off lab. Note that the server endpoint itself is deliberately not in this module: it requires a registered_server_id produced by installing the Azure File Sync agent on a server and registering it — a day-2, out-of-band step that does not belong in the same apply that stands up the cloud topology.
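For orientation, the day-2 step might look something like the sketch below, kept in a separate configuration that runs after the agent is installed and registered. It assumes the azurerm_storage_sync_server_endpoint resource and the arguments shown are available in your provider version (check the provider docs); the variable names, endpoint name, and local path are hypothetical.

```hcl
# Day-2 configuration, applied separately from the module above.
# Both IDs are hypothetical inputs: the sync group ID comes from the module's
# sync_group_ids output, and the registered server ID exists only after the
# File Sync agent has been installed and registered on the server.
variable "sync_group_id" {
  type = string
}

variable "registered_server_id" {
  type = string
}

resource "azurerm_storage_sync_server_endpoint" "finance" {
  name                  = "store-001-finance" # placeholder name
  storage_sync_group_id = var.sync_group_id
  registered_server_id  = var.registered_server_id

  # Path on the registered Windows server (placeholder).
  server_local_path = "D:\\Shares\\Finance"

  # Cloud tiering: keep this share of the volume free and tier cold files to Azure.
  cloud_tiering_enabled     = true
  volume_free_space_percent = 20
}
```

Keeping this in its own state means the cloud topology can be planned and applied before any server exists.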
Module structure
terraform-module-azure-storage-sync/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf
versions.tf
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}
main.tf
# The Storage Sync Service: the regional control-plane object that owns sync
# groups and against which file servers register.
resource "azurerm_storage_sync" "this" {
  name                = var.name
  resource_group_name = var.resource_group_name
  location            = var.location

  # AllowVirtualNetworksOnly rejects server registration / sync traffic from the
  # public internet, accepting it only over private networking. Lock this down
  # in production; AllowAllTraffic is the permissive default we override.
  incoming_traffic_policy = var.incoming_traffic_policy

  tags = var.tags
}

# Sync groups: each group ties ONE Azure file share (cloud endpoint) to MANY
# server endpoints. A sync service can host many groups (e.g. one per share set).
resource "azurerm_storage_sync_group" "this" {
  for_each = var.sync_groups

  name            = each.key
  storage_sync_id = azurerm_storage_sync.this.id
}

# Cloud endpoints: the Azure file share side of a sync group. Exactly one per
# group. storage_account_tenant_id is only required for cross-tenant scenarios
# or when the Microsoft.StorageSync app lacks default access to the account.
resource "azurerm_storage_sync_cloud_endpoint" "this" {
  for_each = var.sync_groups

  name                      = "${each.key}-cloud-endpoint"
  storage_sync_group_id     = azurerm_storage_sync_group.this[each.key].id
  file_share_name           = each.value.file_share_name
  storage_account_id        = each.value.storage_account_id
  storage_account_tenant_id = each.value.storage_account_tenant_id
}
variables.tf
variable "name" {
  description = "Name of the Storage Sync Service. 1-260 chars; letters, numbers, spaces, and . - _ are allowed."
  type        = string

  validation {
    condition     = can(regex("^[A-Za-z0-9][A-Za-z0-9 ._-]{0,259}$", var.name))
    error_message = "name must be 1-260 characters and start with a letter or number (letters, numbers, spaces, '.', '-', '_')."
  }
}

variable "resource_group_name" {
  description = "Name of the resource group in which to create the Storage Sync Service."
  type        = string
}

variable "location" {
  description = "Azure region for the Storage Sync Service (e.g. centralindia, eastus)."
  type        = string
}

variable "incoming_traffic_policy" {
  description = "Network policy for incoming sync/registration traffic: AllowAllTraffic or AllowVirtualNetworksOnly."
  type        = string
  default     = "AllowVirtualNetworksOnly"

  validation {
    condition     = contains(["AllowAllTraffic", "AllowVirtualNetworksOnly"], var.incoming_traffic_policy)
    error_message = "incoming_traffic_policy must be either AllowAllTraffic or AllowVirtualNetworksOnly."
  }
}

variable "sync_groups" {
  description = <<-EOT
    Map of sync groups to create, keyed by sync group name. Each group binds one
    Azure file share (the cloud endpoint) and is the target servers register into.
    - file_share_name: name of an existing Azure file share in the storage account.
    - storage_account_id: resource ID of the storage account hosting the file share.
    - storage_account_tenant_id: tenant ID of the storage account; only needed for cross-tenant
      setups or when Microsoft.StorageSync lacks default access. Leave null otherwise.
  EOT
  type = map(object({
    file_share_name           = string
    storage_account_id        = string
    storage_account_tenant_id = optional(string)
  }))
  default = {}

  validation {
    condition = alltrue([
      for g in values(var.sync_groups) : can(regex("^/subscriptions/.+/storageAccounts/.+$", g.storage_account_id))
    ])
    error_message = "Each sync_groups[*].storage_account_id must be a full storage account resource ID (/subscriptions/.../storageAccounts/<name>)."
  }
}

variable "tags" {
  description = "Tags to apply to the Storage Sync Service."
  type        = map(string)
  default     = {}
}
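One way to exercise the validation rules above without touching Azure is a native Terraform test file. This is only a sketch: it assumes Terraform 1.7 or newer (for mock_provider), which is newer than the module's stated minimum, and the file name and all input values are hypothetical.

```hcl
# tests/validation.tftest.hcl (hypothetical file name), run with `terraform test`.

# Mock the provider so the plan needs no Azure credentials.
mock_provider "azurerm" {}

run "rejects_unknown_traffic_policy" {
  command = plan

  variables {
    name                    = "ss-test"
    resource_group_name     = "rg-test"
    location                = "centralindia"
    incoming_traffic_policy = "AllowPublic" # not in the allowed list
  }

  # The run passes only if this variable's validation block fails.
  expect_failures = [var.incoming_traffic_policy]
}

run "rejects_short_storage_account_id" {
  command = plan

  variables {
    name                = "ss-test"
    resource_group_name = "rg-test"
    location            = "centralindia"
    sync_groups = {
      "finance-shares" = {
        file_share_name    = "finance"
        storage_account_id = "stfiles" # a bare name, not a full resource ID
      }
    }
  }

  expect_failures = [var.sync_groups]
}
```

Each run asserts that the plan is rejected by the named variable, which keeps the guard rails from silently regressing when the module is edited.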
outputs.tf
output "id" {
  description = "The resource ID of the Storage Sync Service. Register file servers against this."
  value       = azurerm_storage_sync.this.id
}

output "name" {
  description = "The name of the Storage Sync Service."
  value       = azurerm_storage_sync.this.name
}

output "registered_servers" {
  description = "List of registered server IDs currently associated with the Storage Sync Service."
  value       = azurerm_storage_sync.this.registered_servers
}

output "sync_group_ids" {
  description = "Map of sync group name => sync group resource ID."
  value       = { for k, g in azurerm_storage_sync_group.this : k => g.id }
}

output "cloud_endpoint_ids" {
  description = "Map of sync group name => cloud endpoint resource ID."
  value       = { for k, e in azurerm_storage_sync_cloud_endpoint.this : k => e.id }
}
How to use it
module "storage_sync_file_sync_branch" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-storage-sync?ref=v1.0.0"

  name                = "ss-kv-branch-prod-cin"
  resource_group_name = azurerm_resource_group.platform.name
  location            = azurerm_resource_group.platform.location

  # Reject public-internet registration; only private networking may sync.
  incoming_traffic_policy = "AllowVirtualNetworksOnly"

  sync_groups = {
    "finance-shares" = {
      file_share_name    = azurerm_storage_share.finance.name
      storage_account_id = azurerm_storage_account.files.id
    }
    "engineering-shares" = {
      file_share_name    = azurerm_storage_share.engineering.name
      storage_account_id = azurerm_storage_account.files.id
    }
  }

  tags = {
    environment = "production"
    owner       = "platform-team"
    costcenter  = "cc-1042"
  }
}

# Downstream: grant the Azure File Sync first-party identity (or an automation
# principal) the rights it needs on the sync service scope, using the module's
# id output. Server endpoints are then created out-of-band after each agent
# is installed and registered against this same sync service.
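# NOTE: "Storage File Data Privileged Contributor" is a data-plane role for
# Azure file share contents; confirm it is the role you intend at the
# sync-service scope before applying.
resource "azurerm_role_assignment" "sync_admin" {
  scope                = module.storage_sync_file_sync_branch.id
  role_definition_name = "Storage File Data Privileged Contributor"
  principal_id         = azuread_group.filesync_admins.object_id
}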
# Feed the finance sync group's cloud endpoint ID into a monitoring/alert config.
output "finance_cloud_endpoint_id" {
  value = module.storage_sync_file_sync_branch.cloud_endpoint_ids["finance-shares"]
}
With Terragrunt
Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.
1. Root config — live/terragrunt.hcl (inherited by every module):
remote_state {
  backend  = "azurerm"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...azurerm state storage account/container + key per path...
  }
}
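The root config above covers only the backend. Because Terragrunt runs the module itself as the root configuration, the provider block from the Quickstart also has to come from somewhere; a generate block in the same root file is one common way to do that. The following is a sketch, not part of the original root config.

```hcl
# live/terragrunt.hcl (continued): write a provider.tf into every module's
# working directory so the azurerm provider is configured exactly once.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<-EOF
    provider "azurerm" {
      features {}
      # azurerm 4.x needs a subscription ID: set subscription_id here or
      # export ARM_SUBSCRIPTION_ID in the environment that runs Terragrunt.
    }
  EOF
}
```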
2. Module config — live/prod/storage_sync/terragrunt.hcl:
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-storage-sync?ref=v1.0.0"
}

inputs = {
  name                = "..."
  resource_group_name = "..."
  location            = "..."
}
3. Deploy one environment, or roll out all modules together:
# Run either command from the repository root.
(cd live/prod/storage_sync && terragrunt apply)   # this module only
(cd live/prod && terragrunt run-all apply)        # every module under live/prod
Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
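To make the cross-module dependency point concrete, here is a sketch of a module config that pulls the storage account ID and share name from a sibling Terragrunt module. The sibling path ../storage_account and its output names (storage_account_id, finance_share_name) are hypothetical; adjust them to whatever your storage module actually exports.

```hcl
# live/prod/storage_sync/terragrunt.hcl, extended with a dependency (sketch)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-storage-sync?ref=v1.0.0"
}

# Hypothetical sibling module that creates the storage account and file share.
dependency "files" {
  config_path = "../storage_account"
}

inputs = {
  name                = "..."
  resource_group_name = "..."
  location            = "..."

  sync_groups = {
    "finance-shares" = {
      # Output names below are assumptions about the sibling module.
      file_share_name    = dependency.files.outputs.finance_share_name
      storage_account_id = dependency.files.outputs.storage_account_id
    }
  }
}
```

With the dependency declared, run-all applies the storage module before this one.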
Inputs
| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| name | string | - | Yes | Storage Sync Service name; 1-260 chars, starts with a letter/number. |
| resource_group_name | string | - | Yes | Resource group to create the sync service in. |
| location | string | - | Yes | Azure region (e.g. centralindia). |
| incoming_traffic_policy | string | "AllowVirtualNetworksOnly" | No | Incoming traffic policy: AllowAllTraffic or AllowVirtualNetworksOnly. |
| sync_groups | map(object) | {} | No | Sync groups keyed by name; each defines file_share_name, storage_account_id, optional storage_account_tenant_id. |
| tags | map(string) | {} | No | Tags applied to the Storage Sync Service. |
Outputs
| Name | Description |
|---|---|
| id | The resource ID of the Storage Sync Service (register servers against this). |
| name | The name of the Storage Sync Service. |
| registered_servers | List of registered server IDs currently associated with the sync service. |
| sync_group_ids | Map of sync group name to sync group resource ID. |
| cloud_endpoint_ids | Map of sync group name to cloud endpoint resource ID. |
Enterprise scenario
A national retailer runs a Windows file server in each of 180 stores, every one of them previously backed up over the WAN to head office. They deploy this module once per region with incoming_traffic_policy = "AllowVirtualNetworksOnly" and a finance-shares sync group whose cloud endpoint points at a GZRS Azure file share. Each store’s server runs the File Sync agent, registers against the module’s id output, and gets a server endpoint with cloud tiering set to a 20% free-space policy — so the store keeps only its hot files locally while the authoritative copy lives in Azure. A failed store server is replaced by registering a new box against the same sync service and letting it rehydrate, turning a multi-day restore into an afternoon.
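The "once per region" rollout in this scenario could be expressed with for_each on the module call. The sketch below is illustrative only: the regions variable, its shape, the naming pattern, and the share name "finance" are assumptions, not details from the retailer's setup.

```hcl
# Hypothetical input: one entry per Azure region, keyed by region name.
variable "regions" {
  type = map(object({
    resource_group_name = string
    storage_account_id  = string # GZRS account hosting that region's share
  }))
}

module "storage_sync_regional" {
  for_each = var.regions

  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-storage-sync?ref=v1.0.0"

  name                    = "ss-kv-retail-prod-${each.key}" # placeholder naming
  resource_group_name     = each.value.resource_group_name
  location                = each.key
  incoming_traffic_policy = "AllowVirtualNetworksOnly"

  sync_groups = {
    "finance-shares" = {
      file_share_name    = "finance" # assumed to exist in the account
      storage_account_id = each.value.storage_account_id
    }
  }
}

# Region => sync service ID, for the teams registering store servers.
output "sync_service_ids" {
  value = { for region, m in module.storage_sync_regional : region => m.id }
}
```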
Best practices
- Pin incoming_traffic_policy = "AllowVirtualNetworksOnly". This is the single most important File Sync hardening switch: it stops server registration and sync traffic from traversing the public internet, accepting it only over your private network (ExpressRoute/VPN/VNet). The module defaults to it; override to AllowAllTraffic only with a deliberate reason.
- Set storage_account_tenant_id only when you must. For same-tenant deployments leave it null and let the Microsoft.StorageSync first-party app use its default access. Supplying a tenant ID is exclusively for cross-tenant shares or where that app has been denied default access; getting this wrong is the usual cause of a “cannot create cloud endpoint” failure.
- One cloud endpoint per sync group, sized for the share. Because each sync group binds exactly one file share, model your shares-to-groups mapping up front; use ZRS/GZRS on the underlying storage account so the cloud copy (your source of truth and DR target) survives a zone or region event.
- Keep server endpoints and cloud tiering out of this apply. Register agents and create server endpoints as a day-2 step (they need a registered_server_id that only exists post-registration). Tune cloud tiering volume-free-space / date policies there to control how much data each branch caches locally; that is where File Sync cost and WAN savings are actually won.
- Name for region and role. Adopt a convention like ss-<org>-<workload>-<env>-<region> (e.g. ss-kv-branch-prod-cin) so a sync service is self-describing; name sync groups after the data they carry (finance-shares) rather than a server, since servers come and go but the group is permanent.
- Tag for ownership and recovery tier. Apply environment, owner, and costcenter tags so the sync service shows up correctly in cost and DR-scope reports; File Sync itself is inexpensive, but the file shares behind it are not, and clear ownership speeds incident response.
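The naming and tagging conventions above can be enforced in the calling configuration rather than by hand. A minimal sketch, assuming four hypothetical variables whose defaults reproduce the example name ss-kv-branch-prod-cin:

```hcl
# Hypothetical naming inputs; the defaults mirror the example in the text.
variable "org" {
  type    = string
  default = "kv"
}

variable "workload" {
  type    = string
  default = "branch"
}

variable "env" {
  type    = string
  default = "prod"
}

variable "region_code" {
  type    = string
  default = "cin"
}

locals {
  # ss-<org>-<workload>-<env>-<region>
  sync_service_name = "ss-${var.org}-${var.workload}-${var.env}-${var.region_code}"

  # Tag keys from the text; the values here are placeholders.
  common_tags = {
    environment = var.env
    owner       = "platform-team"
    costcenter  = "cc-1042"
  }
}

output "sync_service_name" {
  value = local.sync_service_name
}
```

Pass local.sync_service_name and local.common_tags to the module's name and tags inputs so every sync service follows the same pattern.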