Quick take — A production-ready Terraform module for AWS Storage Gateway: activation, gateway type and timezone, cache/upload disks, CloudWatch log group, SMB/AD settings, and a maintenance window — fully var-driven. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.
Quickstart (copy-paste)
Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):
provider "aws" {
region = "us-east-1"
}
module "storage_gateway" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-storage-gateway?ref=v1.0.0"
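gateway_name = "..." # Name of the gateway; used as tag and default log group …
gateway_ip_address = "..." # Appliance IP the provider fetches the activation key from; set activation_key instead if you already hold a key
}
Then run terraform init && terraform apply. Exactly one of activation_key or gateway_ip_address must be set (the module enforces this with a precondition); every other input has a sensible default. See Inputs below to override behaviour.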
What this module is
AWS Storage Gateway is a hybrid-cloud service that puts an on-premises (or in-VPC) appliance in front of AWS storage so your existing applications keep talking to local NFS, SMB, or iSCSI endpoints while the data actually lands in S3, FSx for Windows File Server, EBS snapshots, or virtual tape in Glacier. It is how you give a legacy file server, a backup application that only speaks iSCSI tape, or a branch office a fast local cache while the durable copy lives in the cloud — without rewriting the application.
The piece Terraform owns is aws_storagegateway_gateway: the activation of a gateway appliance and its core configuration. That covers the gateway type (FILE_S3, FILE_FSX_SMB, STORED, CACHED, or VTL), its timezone, the CloudWatch log group it ships health events to, the SMB security strategy and Active Directory join for file gateways, and the maintenance window. But a gateway is never useful on its own: file, CACHED, and VTL gateways need a local disk assigned as cache, and volume and tape gateways (CACHED/STORED/VTL) need an upload buffer disk before they can serve anything, and operators want those configured in the same apply. Doing activation by hand (fetching the activation key from the appliance’s HTTP endpoint, picking disk roles, wiring the log group, joining AD) is fiddly and easy to get wrong per environment. This module turns the whole bring-up into a few validated variables: it activates the gateway, attaches the CloudWatch log group, configures SMB/AD where relevant, and assigns the cache disks (and, for CACHED/STORED/VTL, the upload buffer disks) through aws_storagegateway_cache and a companion upload-buffer resource.
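The gateway_ip_address path only works when the machine running Terraform can reach the appliance over HTTP (port 80). Where that is not possible, the key can be fetched out of band and passed in directly. A minimal sketch, assuming a hypothetical root-module variable named sgw_activation_key and an illustrative gateway name:

```hcl
# Hypothetical root-module variable holding a key fetched out of band
# from the appliance's activation endpoint.
variable "sgw_activation_key" {
  description = "Pre-fetched Storage Gateway activation key."
  type        = string
}

module "storage_gateway" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-storage-gateway?ref=v1.0.0"

  gateway_name   = "branch-example-fgw" # illustrative name
  activation_key = var.sgw_activation_key

  # gateway_ip_address stays unset: the module's precondition requires
  # exactly one of the two activation inputs.
}
```

Because the key is only read on first activation, later applies do not need a fresh one.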
When to use it
Reach for this module when:
- An on-prem application needs cloud-backed file storage but expects a local share: a CAD workstation farm, a media ingest server, or a department file server that mounts SMB/NFS and shouldn’t know the bytes end up in S3 or FSx (File Gateway: FILE_S3/FILE_FSX_SMB).
- A backup product only knows how to write to tape. Veeam, NetBackup, Commvault, and friends can write to a virtual tape library (VTL) that the gateway exposes over iSCSI, then offloads to S3 and Glacier, replacing a physical tape robot.
- You want low-latency block volumes with cloud durability: a CACHED volume gateway keeps the hot working set local and stores the full volume in AWS, or a STORED gateway keeps the full dataset on-prem and asynchronously snapshots it to EBS for DR.
- You are standardising hybrid edge sites and need every branch’s gateway activated, logging to CloudWatch, joined to AD, and patched on the same maintenance window, identically, from code.
Skip Storage Gateway when the workload is already cloud-native and can talk to S3/EFS/FSx directly — the gateway exists to bridge existing on-prem or VM-based applications to AWS storage, and it adds an appliance, a cache disk, and bandwidth to manage. If you just need a shared POSIX filesystem inside AWS, use EFS; if you need an object store, use S3 directly.
Module structure
terraform-module-aws-storage-gateway/
├── versions.tf # provider + Terraform version pins
├── main.tf # aws_storagegateway_gateway + log group + cache + upload buffer
├── variables.tf # var-driven inputs with validations
└── outputs.tf # id/arn/name + gateway network/state outputs
versions.tf
terraform {
required_version = ">= 1.5.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
main.tf
locals {
name_tag = { Name = var.gateway_name }
# File gateways (FILE_S3 / FILE_FSX_SMB) use a cache only — they have no
# upload buffer. Volume/tape gateways (CACHED/STORED/VTL) also take an
# upload (working storage) buffer disk.
is_file_gateway = contains(["FILE_S3", "FILE_FSX_SMB"], var.gateway_type)
# SMB settings only apply to file gateways that serve SMB
# (FILE_FSX_SMB always; FILE_S3 when smb shares are used).
configure_smb = local.is_file_gateway && var.smb_security_strategy != null
}
# CloudWatch log group the gateway ships health and audit events to.
resource "aws_cloudwatch_log_group" "this" {
count = var.create_log_group ? 1 : 0
name = coalesce(var.log_group_name, "/aws/storagegateway/${var.gateway_name}")
retention_in_days = var.log_retention_in_days
kms_key_id = var.log_kms_key_arn
tags = merge(var.tags, local.name_tag)
}
# The gateway activation itself. The activation_key is read once from the
# appliance's HTTP endpoint (?activationRegion=...); after first apply AWS
# stores it and the key is not re-checked. gateway_ip_address is an
# alternative that lets the provider fetch the key for you.
resource "aws_storagegateway_gateway" "this" {
gateway_name = var.gateway_name
gateway_timezone = var.gateway_timezone
gateway_type = var.gateway_type
gateway_vpc_endpoint = var.gateway_vpc_endpoint
# Provide exactly one of these (enforced by a precondition below).
activation_key = var.activation_key
gateway_ip_address = var.gateway_ip_address
# Ship events to CloudWatch Logs.
cloudwatch_log_group_arn = var.create_log_group ? aws_cloudwatch_log_group.this[0].arn : var.cloudwatch_log_group_arn
# Maintenance/patch window, expressed in the gateway's own timezone
# (gateway_timezone). Either day_of_week+hour (weekly) or
# day_of_month+hour (monthly), set via the typed variable.
maintenance_start_time {
hour_of_day = var.maintenance_start_time.hour_of_day
minute_of_hour = var.maintenance_start_time.minute_of_hour
day_of_week = var.maintenance_start_time.day_of_week
day_of_month = var.maintenance_start_time.day_of_month
}
# --- SMB / Active Directory (file gateways serving SMB only) ---
smb_security_strategy = local.configure_smb ? var.smb_security_strategy : null
smb_guest_password = local.configure_smb ? var.smb_guest_password : null
dynamic "smb_active_directory_settings" {
for_each = local.configure_smb && var.smb_active_directory_settings != null ? [var.smb_active_directory_settings] : []
content {
domain_name = smb_active_directory_settings.value.domain_name
username = smb_active_directory_settings.value.username
password = smb_active_directory_settings.value.password
organizational_unit = smb_active_directory_settings.value.organizational_unit
domain_controllers = smb_active_directory_settings.value.domain_controllers
timeout_in_seconds = smb_active_directory_settings.value.timeout_in_seconds
}
}
tags = merge(var.tags, local.name_tag)
lifecycle {
precondition {
condition = (var.activation_key != null) != (var.gateway_ip_address != null)
error_message = "Provide exactly one of activation_key or gateway_ip_address."
}
}
}
# Look up the local disks the appliance presents so we can assign roles.
data "aws_storagegateway_local_disk" "cache" {
for_each = toset(var.cache_disk_node_paths)
gateway_arn = aws_storagegateway_gateway.this.arn
disk_node = each.value
}
data "aws_storagegateway_local_disk" "upload" {
for_each = local.is_file_gateway ? toset([]) : toset(var.upload_buffer_disk_node_paths)
gateway_arn = aws_storagegateway_gateway.this.arn
disk_node = each.value
}
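# Assign local disks as the read/write cache. File, CACHED, and VTL gateways
# need a cache before they can serve data; STORED volume gateways keep the
# full dataset locally and use only an upload buffer, so leave
# cache_disk_node_paths empty for them.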
resource "aws_storagegateway_cache" "this" {
for_each = toset(var.cache_disk_node_paths)
gateway_arn = aws_storagegateway_gateway.this.arn
disk_id = data.aws_storagegateway_local_disk.cache[each.value].disk_id
}
# Assign the upload buffer for CACHED/STORED/VTL gateways. AWS documents the
# older working-storage API as stored-volume only and points cached volume
# gateways at AddUploadBuffer, so the upload buffer resource covers all three.
resource "aws_storagegateway_upload_buffer" "this" {
for_each = local.is_file_gateway ? toset([]) : toset(var.upload_buffer_disk_node_paths)
gateway_arn = aws_storagegateway_gateway.this.arn
disk_id = data.aws_storagegateway_local_disk.upload[each.value].disk_id
}
variables.tf
variable "gateway_name" {
description = "Name of the Storage Gateway; used as a tag and the default log group suffix."
type = string
validation {
condition = can(regex("^[a-zA-Z0-9][a-zA-Z0-9-_.]{1,254}$", var.gateway_name))
error_message = "gateway_name must be 2-255 chars, start alphanumeric, and contain only letters, digits, hyphens, underscores, or dots."
}
}
variable "gateway_type" {
description = "Gateway type: FILE_S3, FILE_FSX_SMB, CACHED, STORED, or VTL."
type = string
default = "FILE_S3"
validation {
condition = contains(["FILE_S3", "FILE_FSX_SMB", "CACHED", "STORED", "VTL"], var.gateway_type)
error_message = "gateway_type must be one of FILE_S3, FILE_FSX_SMB, CACHED, STORED, or VTL."
}
}
variable "gateway_timezone" {
description = "Gateway timezone in GMT[+/-]hh:mm form, used for maintenance scheduling (e.g. GMT-5:00, GMT+5:30)."
type = string
default = "GMT"
validation {
condition = can(regex("^GMT([+-](1[0-2]|0?[0-9]):[0-5][0-9])?$", var.gateway_timezone))
error_message = "gateway_timezone must be 'GMT' or 'GMT[+/-]hh:mm' (e.g. GMT-5:00, GMT+5:30)."
}
}
variable "activation_key" {
description = "Activation key obtained from the gateway appliance's HTTP endpoint. Provide this OR gateway_ip_address, not both."
type = string
default = null
}
variable "gateway_ip_address" {
description = "IP address of the gateway appliance; the provider fetches the activation key from it. Provide this OR activation_key, not both."
type = string
default = null
}
variable "gateway_vpc_endpoint" {
description = "VPC endpoint DNS name to activate the gateway against a private Storage Gateway endpoint (PrivateLink). Null for the public endpoint."
type = string
default = null
}
variable "maintenance_start_time" {
description = "Weekly or monthly maintenance window, in the gateway's timezone (gateway_timezone). Set day_of_week (0=Sun..6=Sat) OR day_of_month (1-28), and hour/minute."
type = object({
hour_of_day = number
minute_of_hour = optional(number, 0)
day_of_week = optional(number)
day_of_month = optional(string)
})
default = {
hour_of_day = 3
day_of_week = 0
}
validation {
condition = var.maintenance_start_time.hour_of_day >= 0 && var.maintenance_start_time.hour_of_day <= 23
error_message = "maintenance_start_time.hour_of_day must be between 0 and 23."
}
validation {
condition = (
(var.maintenance_start_time.day_of_week != null) != (var.maintenance_start_time.day_of_month != null)
)
error_message = "Set exactly one of day_of_week (weekly) or day_of_month (monthly) in maintenance_start_time."
}
}
variable "create_log_group" {
description = "Create and attach a CloudWatch log group for gateway health/audit events."
type = bool
default = true
}
variable "log_group_name" {
description = "Name for the created CloudWatch log group. Defaults to /aws/storagegateway/<gateway_name>."
type = string
default = null
}
variable "log_retention_in_days" {
description = "Retention for the created log group (days). 0 = never expire."
type = number
default = 90
validation {
condition = contains([0, 1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1096, 1827, 2192, 2557, 2922, 3288, 3653], var.log_retention_in_days)
error_message = "log_retention_in_days must be a value CloudWatch Logs accepts."
}
}
variable "log_kms_key_arn" {
description = "Optional KMS key ARN to encrypt the created CloudWatch log group."
type = string
default = null
}
variable "cloudwatch_log_group_arn" {
description = "ARN of an existing log group to use when create_log_group = false."
type = string
default = null
}
variable "smb_security_strategy" {
description = "SMB security strategy for file gateways: ClientSpecified, MandatorySigning, or MandatoryEncryption. Null to leave default."
type = string
default = null
validation {
condition = var.smb_security_strategy == null || contains(["ClientSpecified", "MandatorySigning", "MandatoryEncryption"], var.smb_security_strategy)
error_message = "smb_security_strategy must be ClientSpecified, MandatorySigning, MandatoryEncryption, or null."
}
}
variable "smb_guest_password" {
description = "Guest password for SMB shares that allow guest access. Sensitive; leave null to disable guest access."
type = string
default = null
sensitive = true
}
variable "smb_active_directory_settings" {
description = "Join a file gateway to Active Directory for authenticated SMB. Null to skip AD join."
type = object({
domain_name = string
username = string
password = string
organizational_unit = optional(string)
domain_controllers = optional(list(string))
timeout_in_seconds = optional(number, 20)
})
default = null
sensitive = true
}
variable "cache_disk_node_paths" {
description = "Local disk node paths (e.g. /dev/sdb) to assign as the gateway read/write cache."
type = list(string)
default = []
}
variable "upload_buffer_disk_node_paths" {
description = "Local disk node paths to assign as the upload (working storage) buffer. CACHED/STORED/VTL only; ignored for file gateways."
type = list(string)
default = []
}
variable "tags" {
description = "Tags applied to all resources created by the module."
type = map(string)
default = {}
}
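The maintenance_start_time object accepts either cadence, and the validation rejects objects that set both or neither. As an illustration only (the gateway name, IP address, and schedule are assumptions, not module defaults), a caller could move a site to a monthly window like this:

```hcl
module "storage_gateway" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-storage-gateway?ref=v1.0.0"

  gateway_name       = "branch-example-fgw" # illustrative name
  gateway_ip_address = "10.0.0.10"          # illustrative appliance IP

  # Monthly cadence: day_of_month is set and day_of_week is omitted (null),
  # so the "exactly one" validation passes. day_of_month is declared as a
  # string in variables.tf, hence the quotes.
  maintenance_start_time = {
    hour_of_day    = 2
    minute_of_hour = 30
    day_of_month   = "15"
  }
}
```

Omitting minute_of_hour falls back to the declared default of 0.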
outputs.tf
output "id" {
description = "The ID of the Storage Gateway (e.g. sgw-12A3456B)."
value = aws_storagegateway_gateway.this.gateway_id
}
output "arn" {
description = "The ARN of the Storage Gateway, used by shares, volumes, and tape resources."
value = aws_storagegateway_gateway.this.arn
}
output "gateway_name" {
description = "The name of the gateway."
value = aws_storagegateway_gateway.this.gateway_name
}
output "gateway_type" {
description = "The activated gateway type (FILE_S3, FILE_FSX_SMB, CACHED, STORED, VTL)."
value = aws_storagegateway_gateway.this.gateway_type
}
output "ec2_instance_id" {
description = "EC2 instance ID of the gateway when it runs as an EC2 appliance (empty for on-prem/VM gateways)."
value = aws_storagegateway_gateway.this.ec2_instance_id
}
output "host_environment" {
description = "Where the gateway runs (EC2, VMWARE, HYPER-V, KVM, SNOWBALL, etc.)."
value = aws_storagegateway_gateway.this.host_environment
}
output "gateway_network_interface" {
description = "Network interfaces the gateway uses, including the IPv4 address(es) clients connect to."
value = aws_storagegateway_gateway.this.gateway_network_interface
}
output "log_group_arn" {
description = "ARN of the CloudWatch log group attached to the gateway (created or supplied)."
value = var.create_log_group ? aws_cloudwatch_log_group.this[0].arn : var.cloudwatch_log_group_arn
}
output "cache_disk_ids" {
description = "Map of disk node path to the disk_id assigned as cache."
value = { for p, d in data.aws_storagegateway_local_disk.cache : p => d.disk_id }
}
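A short sketch of how a calling stack might consume these outputs, assuming the module instance is named storage_gateway as in the Quickstart. The output names below are hypothetical; the ipv4_address attribute comes from the provider's gateway_network_interface block.

```hcl
# Addresses clients mount shares or iSCSI targets from.
output "gateway_ipv4_addresses" {
  description = "IPv4 addresses of the gateway's network interfaces."
  value = [
    for nic in module.storage_gateway.gateway_network_interface : nic.ipv4_address
  ]
}

# Handy for wiring metric filters or subscriptions onto the gateway's logs.
output "gateway_log_group_arn" {
  description = "Log group the gateway ships events to."
  value       = module.storage_gateway.log_group_arn
}
```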
How to use it
module "storage_gateway" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-storage-gateway?ref=v1.0.0"
gateway_name = "branch-sydney-fgw"
gateway_type = "FILE_S3"
gateway_timezone = "GMT+10:00"
# Let the provider pull the activation key from the appliance on the VPC.
gateway_ip_address = aws_instance.gateway_appliance.private_ip
# Cache disk presented to the appliance (e.g. a 500 GiB EBS volume).
cache_disk_node_paths = ["/dev/sdb"]
# Authenticated SMB: join the corporate domain and require signing.
smb_security_strategy = "MandatorySigning"
smb_active_directory_settings = {
domain_name = "corp.kloudvin.com"
username = "svc-storagegw"
password = var.ad_join_password
organizational_unit = "OU=Gateways,OU=Servers,DC=corp,DC=kloudvin,DC=com"
}
# Patch on Sunday 03:00 local; keep gateway logs for a year.
maintenance_start_time = {
hour_of_day = 3
day_of_week = 0
}
log_retention_in_days = 365
tags = {
Environment = "production"
Site = "sydney"
Team = "infra-storage"
}
}
# Downstream: an SMB file share that maps this gateway onto an S3 bucket.
resource "aws_storagegateway_smb_file_share" "team_share" {
gateway_arn = module.storage_gateway.arn
location_arn = aws_s3_bucket.team_files.arn
role_arn = aws_iam_role.gateway_s3.arn
authentication = "ActiveDirectory"
audit_destination_arn = module.storage_gateway.log_group_arn
valid_user_list = ["@corp.kloudvin.com\\file-users"]
}
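The example above covers a file gateway. For a volume gateway the same module call takes an upload buffer disk as well, and the downstream resource is an iSCSI volume rather than a share. A minimal sketch, assuming an illustrative gateway name, appliance IP, disk paths, and volume size:

```hcl
module "volume_gateway" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-storage-gateway?ref=v1.0.0"

  gateway_name       = "dc1-cached-vgw" # illustrative name
  gateway_type       = "CACHED"
  gateway_timezone   = "GMT-5:00"
  gateway_ip_address = "10.20.0.15" # illustrative appliance IP

  # CACHED gateways need both roles assigned before volumes can be created.
  cache_disk_node_paths         = ["/dev/sdb"]
  upload_buffer_disk_node_paths = ["/dev/sdc"]
}

# Downstream: a 100 GiB cached iSCSI volume exposed by the gateway.
resource "aws_storagegateway_cached_iscsi_volume" "data" {
  gateway_arn          = module.volume_gateway.arn
  network_interface_id = module.volume_gateway.gateway_network_interface[0].ipv4_address
  target_name          = "data"
  volume_size_in_bytes = 107374182400 # 100 * 1024^3

  # Wait for the cache and upload buffer assignments inside the module.
  depends_on = [module.volume_gateway]
}
```

The explicit depends_on matters here because the volume only references the gateway ARN, which is known before the disks are assigned.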
With Terragrunt
Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.
1. Root config — live/terragrunt.hcl (inherited by every module):
remote_state {
backend = "s3"
generate = { path = "backend.tf", if_exists = "overwrite" }
config = {
# ...S3 state bucket + key per path...
}
}
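The root config shown covers only the backend. A provider block can be generated from the same file so child configs do not repeat it. A minimal sketch, assuming the us-east-1 region from the Quickstart:

```hcl
# live/terragrunt.hcl (continued): write provider.tf into every module dir.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "us-east-1"
}
EOF
}
```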
2. Module config — live/prod/storage_gateway/terragrunt.hcl:
include "root" {
path = find_in_parent_folders()
}
terraform {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-storage-gateway?ref=v1.0.0"
}
inputs = {
gateway_name = "..."
}
3. Deploy one environment, or roll out all modules together:
cd live/prod/storage_gateway && terragrunt apply # this module only
cd live/prod && terragrunt run-all apply # every module under live/prod
Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
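Cross-module ordering under run-all comes from dependency blocks. A sketch of the module config with one added, assuming a hypothetical sibling stack at ../gateway_appliance that launches the appliance and exports a private_ip output:

```hcl
# live/prod/storage_gateway/terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-storage-gateway?ref=v1.0.0"
}

# Hypothetical sibling stack; run-all applies it before this module.
dependency "appliance" {
  config_path = "../gateway_appliance"

  # Placeholder value so validate/plan work before the appliance exists.
  mock_outputs = {
    private_ip = "10.0.0.10"
  }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  gateway_name       = "..."
  gateway_ip_address = dependency.appliance.outputs.private_ip
}
```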
Inputs
| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| gateway_name | string | (none) | Yes | Name of the gateway; used as tag and default log group suffix. |
| gateway_type | string | "FILE_S3" | No | FILE_S3, FILE_FSX_SMB, CACHED, STORED, or VTL. |
| gateway_timezone | string | "GMT" | No | Timezone as GMT or GMT[+/-]hh:mm for maintenance scheduling. |
| activation_key | string | null | No* | Activation key from the appliance. Provide this OR gateway_ip_address. |
| gateway_ip_address | string | null | No* | Appliance IP the provider fetches the key from. Provide this OR activation_key. |
| gateway_vpc_endpoint | string | null | No | PrivateLink VPC endpoint DNS to activate against a private endpoint. |
| maintenance_start_time | object | {hour_of_day=3, day_of_week=0} | No | Weekly/monthly patch window in the gateway's timezone; set day_of_week OR day_of_month. |
| create_log_group | bool | true | No | Create and attach a CloudWatch log group for the gateway. |
| log_group_name | string | null | No | Name for the created log group; defaults to /aws/storagegateway/<name>. |
| log_retention_in_days | number | 90 | No | Retention for the created log group (0 = never expire). |
| log_kms_key_arn | string | null | No | KMS key ARN to encrypt the created log group. |
| cloudwatch_log_group_arn | string | null | No | Existing log group ARN to use when create_log_group = false. |
| smb_security_strategy | string | null | No | ClientSpecified, MandatorySigning, or MandatoryEncryption (file gateways). |
| smb_guest_password | string | null | No | Guest password for SMB shares allowing guest access (sensitive). |
| smb_active_directory_settings | object | null | No | Active Directory join settings for authenticated SMB (sensitive). |
| cache_disk_node_paths | list(string) | [] | No | Local disk node paths to assign as the read/write cache. |
| upload_buffer_disk_node_paths | list(string) | [] | No | Local disk node paths for the upload buffer (CACHED/STORED/VTL only). |
| tags | map(string) | {} | No | Tags applied to all created resources. |
* Exactly one of activation_key or gateway_ip_address is required (enforced by a precondition).
Outputs
| Name | Description |
|---|---|
| id | The ID of the Storage Gateway (e.g. sgw-12A3456B). |
| arn | The ARN of the gateway, used by share/volume/tape resources. |
| gateway_name | The name of the gateway. |
| gateway_type | The activated gateway type. |
| ec2_instance_id | EC2 instance ID when the gateway runs as an EC2 appliance. |
| host_environment | Where the gateway runs (EC2, VMWARE, HYPER-V, KVM, SNOWBALL). |
| gateway_network_interface | Network interfaces including the IPv4 address clients connect to. |
| log_group_arn | ARN of the attached CloudWatch log group. |
| cache_disk_ids | Map of disk node path to the disk_id assigned as cache. |
Enterprise scenario
A national engineering firm runs forty branch offices, each with a local file server full of large CAD and survey datasets that staff open over SMB all day. The infrastructure team deploys this module per site against an EC2 or VMware FILE_S3 gateway: a 1 TiB EBS cache disk keeps the active drawings local for LAN-speed reads, every gateway joins the corporate AD with MandatorySigning, all SMB audit events flow to a KMS-encrypted CloudWatch log group, and patching is pinned to Sunday 03:00 in each site’s own timezone. The durable copy of every drawing lands in a single versioned S3 bucket, so a failed branch appliance is recovered by re-activating a new gateway against the same bucket instead of restoring terabytes from tape.
Best practices
- Never commit the AD join password or SMB guest password. Source smb_active_directory_settings and smb_guest_password from a secret store (AWS Secrets Manager / SSM SecureString or your CI’s secret vault) and mark the variables sensitive: they are domain credentials that grant the gateway machine-account rights.
- Activate over a PrivateLink VPC endpoint by setting gateway_vpc_endpoint so activation and data transfer to AWS never traverse the public internet. This matters for regulated workloads and for gateways with no NAT egress.
- Size the cache disk for the hot working set, not the whole dataset. File and CACHED gateways only need enough local cache (and, for volume/tape, upload buffer) to hold actively used data; over-provisioning EBS for the cache is wasted spend since the durable copy lives in S3/FSx/Glacier.
- Enforce SMB signing or encryption. Prefer MandatorySigning (or MandatoryEncryption for sensitive data) over ClientSpecified so older clients can’t negotiate down to unsigned or unencrypted SMB sessions on a share backed by cloud storage.
- Stagger maintenance windows by site and keep audit logs. Use each gateway’s real gateway_timezone and a per-site maintenance_start_time to avoid a fleet-wide reboot, and retain the CloudWatch log group (e.g. 365 days) so SMB file-access auditing survives an appliance rebuild.
- Name and tag for the bridge they are. Encode site and gateway type in gateway_name (e.g. branch-sydney-fgw) and tag Environment/Site so the inevitable forty-gateway fleet stays discoverable and cost-attributable per branch.
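The first two practices can be sketched together. This is one possible wiring, not the module's prescribed method: the secret name, its JSON shape, and every variable below are assumptions made for illustration.

```hcl
# Hypothetical inputs describing the network the appliance lives in.
variable "vpc_id" { type = string }
variable "endpoint_subnet_ids" { type = list(string) }
variable "endpoint_security_group_ids" { type = list(string) }
variable "appliance_ip" { type = string }

# Assumed secret containing JSON like {"username":"...","password":"..."}.
data "aws_secretsmanager_secret_version" "ad_join" {
  secret_id = "prod/storage-gateway/ad-join"
}

locals {
  ad_join = jsondecode(data.aws_secretsmanager_secret_version.ad_join.secret_string)
}

# Interface endpoint so activation and data transfer stay on PrivateLink.
resource "aws_vpc_endpoint" "storage_gateway" {
  vpc_id             = var.vpc_id
  service_name       = "com.amazonaws.us-east-1.storagegateway"
  vpc_endpoint_type  = "Interface"
  subnet_ids         = var.endpoint_subnet_ids
  security_group_ids = var.endpoint_security_group_ids
}

module "storage_gateway" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-storage-gateway?ref=v1.0.0"

  gateway_name         = "branch-sydney-fgw"
  gateway_ip_address   = var.appliance_ip
  gateway_vpc_endpoint = aws_vpc_endpoint.storage_gateway.dns_entry[0].dns_name

  smb_security_strategy = "MandatorySigning"
  smb_active_directory_settings = {
    domain_name = "corp.kloudvin.com"
    username    = local.ad_join.username
    password    = local.ad_join.password
  }
}
```

Values read this way still end up in Terraform state, so the state backend needs the same protection as the secret store.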