IaC AWS

Terraform Module: AWS FSx for Lustre — high-throughput HPC scratch storage as code

Quick take — Provision an Amazon FSx for Lustre file system with Terraform: deployment type, throughput-per-TiB, S3 data repository linkage, compression, logging and encryption — all var-driven and reusable. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "aws" {
  region = "us-east-1"
}

module "fsx" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-fsx?ref=v1.0.0"

  name               = "..."           # Logical name; applied as the Name tag.
  subnet_id          = "..."           # Subnet for the single-AZ file system.
  security_group_ids = ["...", "..."]  # SGs allowing Lustre ports 988 and 1018-1023.
}

Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.

What this module is

Amazon FSx for Lustre is a fully managed, POSIX-compliant parallel file system purpose-built for workloads that need to chew through data fast: HPC simulations, genomics pipelines, machine-learning training, seismic processing, and large-scale media rendering. Lustre delivers sub-millisecond latencies and aggregate throughput that scales linearly with capacity — hundreds of GB/s and millions of IOPS — by striping files across many storage targets and mounting over a high-performance client.

The catch is that aws_fsx_lustre_file_system has a lot of inter-dependent knobs that only make sense in certain combinations: the deployment_type dictates whether per_unit_storage_throughput is even allowed, which throughput tiers are valid, what the minimum storage_capacity is, and whether data_compression_type or metadata_configuration apply. Wiring an S3 data repository association, enabling Lustre logging to CloudWatch, and pinning a KMS key on top of that is easy to get subtly wrong.

This module wraps all of that into a single, opinionated, var-driven unit. You hand it a subnet, a security group, a deployment type, and a throughput tier; it returns a correctly configured file system plus the mount name and DNS name your compute fleet needs. Validations catch the most common foot-guns (illegal throughput tiers, bad capacity increments) at plan time instead of as a failed apply after a five-minute provisioning wait.

When to use it

Reach for plain aws_fsx_lustre_file_system directly only for a one-off experiment. For anything that lives in a pipeline, the module’s guardrails pay for themselves.

Module structure

terraform-module-aws-fsx/
├── versions.tf      # provider + Terraform version pins
├── main.tf          # FSx Lustre file system + optional S3 data repo association
├── variables.tf     # all inputs, with validations
└── outputs.tf       # id, dns_name, mount_name, ARN, network interfaces
# versions.tf
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
# main.tf
locals {
  # PERSISTENT_2 is the only deployment type that supports a data repository
  # association resource; SCRATCH_* use the inline import_path on the FS.
  enable_dra = var.deployment_type == "PERSISTENT_2" && var.data_repository_path != null

  tags = merge(
    var.tags,
    {
      Name      = var.name
      ManagedBy = "terraform"
      Module    = "terraform-module-aws-fsx"
    }
  )
}

resource "aws_fsx_lustre_file_system" "this" {
  storage_capacity = var.storage_capacity
  subnet_ids       = [var.subnet_id]
  security_group_ids = var.security_group_ids

  deployment_type             = var.deployment_type
  storage_type                = var.storage_type
  per_unit_storage_throughput = var.per_unit_storage_throughput
  data_compression_type       = var.data_compression_type
  file_system_type_version    = var.file_system_type_version

  # KMS encryption at rest (PERSISTENT_* only; SCRATCH uses an AWS-owned key).
  kms_key_id = var.deployment_type == "SCRATCH_2" ? null : var.kms_key_id

  # Inline S3 linkage for SCRATCH deployments (DRA is used for PERSISTENT_2).
  import_path                    = var.deployment_type == "SCRATCH_2" ? var.data_repository_path : null
  export_path                    = var.deployment_type == "SCRATCH_2" ? var.export_path : null
  imported_file_chunk_size       = var.deployment_type == "SCRATCH_2" ? var.imported_file_chunk_size : null
  auto_import_policy             = var.deployment_type == "SCRATCH_2" ? var.auto_import_policy : null

  # Weekly maintenance + backups apply to PERSISTENT deployments.
  weekly_maintenance_start_time   = var.weekly_maintenance_start_time
  automatic_backup_retention_days = var.deployment_type == "SCRATCH_2" ? null : var.automatic_backup_retention_days
  daily_automatic_backup_start_time = (
    var.deployment_type != "SCRATCH_2" && var.automatic_backup_retention_days > 0
  ) ? var.daily_automatic_backup_start_time : null

  dynamic "log_configuration" {
    for_each = var.logging_destination_arn != null ? [1] : []
    content {
      level       = var.logging_level
      destination = var.logging_destination_arn
    }
  }

  dynamic "metadata_configuration" {
    for_each = (
      var.deployment_type == "PERSISTENT_2" && var.metadata_iops != null
    ) ? [1] : []
    content {
      mode = "USER_PROVISIONED"
      iops = var.metadata_iops
    }
  }

  tags = local.tags
}

# Data Repository Association: links a sub-path of the FS to an S3 prefix and
# keeps them in sync. Only valid on PERSISTENT_2 file systems.
resource "aws_fsx_data_repository_association" "s3" {
  count = local.enable_dra ? 1 : 0

  file_system_id       = aws_fsx_lustre_file_system.this.id
  data_repository_path = var.data_repository_path
  file_system_path     = var.file_system_mount_path
  batch_import_meta_data_on_create = var.batch_import_metadata

  imported_file_chunk_size = var.imported_file_chunk_size

  s3 {
    auto_import_policy {
      events = var.dra_auto_import_events
    }
    auto_export_policy {
      events = var.dra_auto_export_events
    }
  }

  tags = local.tags
}
# variables.tf
variable "name" {
  description = "Logical name for the file system; applied as the Name tag."
  type        = string
}

variable "subnet_id" {
  description = "Subnet ID in which to create the (single-AZ) Lustre file system."
  type        = string
}

variable "security_group_ids" {
  description = "Security group IDs to attach. Must allow Lustre ports 988 and 1018-1023 from clients."
  type        = list(string)
}

variable "deployment_type" {
  description = "FSx Lustre deployment type: SCRATCH_2 (ephemeral) or PERSISTENT_2 (durable)."
  type        = string
  default     = "PERSISTENT_2"

  validation {
    condition     = contains(["SCRATCH_2", "PERSISTENT_2"], var.deployment_type)
    error_message = "deployment_type must be SCRATCH_2 or PERSISTENT_2 (SCRATCH_1 and PERSISTENT_1 are legacy and not supported by this module)."
  }
}

variable "storage_type" {
  description = "SSD or HDD. HDD is only valid with PERSISTENT_2 at 12/40 MB/s tiers."
  type        = string
  default     = "SSD"

  validation {
    condition     = contains(["SSD", "HDD"], var.storage_type)
    error_message = "storage_type must be SSD or HDD."
  }
}

variable "storage_capacity" {
  description = "Capacity in GiB. SCRATCH_2: min 1200, then 2400+ in 2400 steps. PERSISTENT_2 SSD: min 1200, then 2400+ in 2400 steps."
  type        = number
  default     = 1200

  validation {
    condition     = var.storage_capacity == 1200 || (var.storage_capacity >= 2400 && var.storage_capacity % 1200 == 0)
    error_message = "storage_capacity must be 1200, or 2400 and above in multiples of 1200 GiB."
  }
}

variable "per_unit_storage_throughput" {
  description = "Throughput per TiB of storage (MB/s/TiB). PERSISTENT_2 SSD: 125, 250, 500, or 1000."
  type        = number
  default     = 250

  validation {
    condition     = contains([12, 40, 50, 100, 125, 200, 250, 500, 1000], var.per_unit_storage_throughput)
    error_message = "per_unit_storage_throughput must be one of the valid tiers; for PERSISTENT_2 SSD use 125, 250, 500, or 1000."
  }
}

variable "data_compression_type" {
  description = "In-file compression for data at rest and in transit: NONE or LZ4."
  type        = string
  default     = "LZ4"

  validation {
    condition     = contains(["NONE", "LZ4"], var.data_compression_type)
    error_message = "data_compression_type must be NONE or LZ4."
  }
}

variable "file_system_type_version" {
  description = "Lustre software version (e.g. 2.15). Leave null to use the AWS default."
  type        = string
  default     = null
}

variable "kms_key_id" {
  description = "KMS key ARN for encryption at rest (PERSISTENT_2 only). Null uses the AWS-managed FSx key."
  type        = string
  default     = null
}

variable "metadata_iops" {
  description = "User-provisioned metadata IOPS for PERSISTENT_2 (1500/3000/6000/12000...). Null lets AWS auto-provision."
  type        = number
  default     = null

  validation {
    condition     = var.metadata_iops == null || contains([1500, 3000, 6000, 12000], var.metadata_iops)
    error_message = "metadata_iops must be one of 1500, 3000, 6000, or 12000 when set."
  }
}

# ---- S3 linkage ----

variable "data_repository_path" {
  description = "S3 path (s3://bucket/prefix) to link. PERSISTENT_2 uses a DRA; SCRATCH_2 uses import_path."
  type        = string
  default     = null
}

variable "export_path" {
  description = "S3 export path for SCRATCH_2 inline export. Ignored for PERSISTENT_2."
  type        = string
  default     = null
}

variable "file_system_mount_path" {
  description = "Mount point inside the FS for the DRA (e.g. /data). PERSISTENT_2 only."
  type        = string
  default     = "/data"
}

variable "imported_file_chunk_size" {
  description = "Chunk size (MiB) used when importing files from S3 (1 - 512000)."
  type        = number
  default     = 1024
}

variable "auto_import_policy" {
  description = "SCRATCH_2 inline auto-import policy: NONE, NEW, NEW_CHANGED, or NEW_CHANGED_DELETED."
  type        = string
  default     = "NEW_CHANGED"
}

variable "batch_import_metadata" {
  description = "Whether the DRA runs a batch metadata import of all existing S3 objects on creation."
  type        = bool
  default     = true
}

variable "dra_auto_import_events" {
  description = "S3 events that trigger import into the FS via the DRA."
  type        = list(string)
  default     = ["NEW", "CHANGED", "DELETED"]
}

variable "dra_auto_export_events" {
  description = "FS events that trigger export to S3 via the DRA."
  type        = list(string)
  default     = ["NEW", "CHANGED", "DELETED"]
}

# ---- Logging / backups / maintenance ----

variable "logging_destination_arn" {
  description = "CloudWatch Logs log group ARN for Lustre access logging. Null disables logging."
  type        = string
  default     = null
}

variable "logging_level" {
  description = "Lustre logging level: DISABLED, WARN_ONLY, ERROR_ONLY, or WARN_ERROR."
  type        = string
  default     = "WARN_ERROR"

  validation {
    condition     = contains(["DISABLED", "WARN_ONLY", "ERROR_ONLY", "WARN_ERROR"], var.logging_level)
    error_message = "logging_level must be DISABLED, WARN_ONLY, ERROR_ONLY, or WARN_ERROR."
  }
}

variable "automatic_backup_retention_days" {
  description = "Days to retain automatic backups (PERSISTENT_2 only). 0 disables backups."
  type        = number
  default     = 7

  validation {
    condition     = var.automatic_backup_retention_days >= 0 && var.automatic_backup_retention_days <= 90
    error_message = "automatic_backup_retention_days must be between 0 and 90."
  }
}

variable "daily_automatic_backup_start_time" {
  description = "Daily backup window start time in UTC, HH:MM format."
  type        = string
  default     = "03:00"
}

variable "weekly_maintenance_start_time" {
  description = "Weekly maintenance window, d:HH:MM (1 = Monday). Null lets AWS choose."
  type        = string
  default     = null
}

variable "tags" {
  description = "Additional tags to apply to all created resources."
  type        = map(string)
  default     = {}
}
# outputs.tf
output "id" {
  description = "FSx for Lustre file system ID."
  value       = aws_fsx_lustre_file_system.this.id
}

output "arn" {
  description = "ARN of the file system."
  value       = aws_fsx_lustre_file_system.this.arn
}

output "name" {
  description = "Name tag of the file system."
  value       = var.name
}

output "dns_name" {
  description = "DNS name used to mount the file system."
  value       = aws_fsx_lustre_file_system.this.dns_name
}

output "mount_name" {
  description = "Lustre mount name (the fsxZ... token used in the client mount command)."
  value       = aws_fsx_lustre_file_system.this.mount_name
}

output "network_interface_ids" {
  description = "ENIs created for the file system."
  value       = aws_fsx_lustre_file_system.this.network_interface_ids
}

output "mount_command" {
  description = "Ready-to-run Lustre mount command for clients."
  value = format(
    "sudo mount -t lustre -o relatime,flock %s@tcp:/%s /fsx",
    aws_fsx_lustre_file_system.this.dns_name,
    aws_fsx_lustre_file_system.this.mount_name
  )
}

output "data_repository_association_id" {
  description = "ID of the S3 Data Repository Association, if one was created."
  value       = try(aws_fsx_data_repository_association.s3[0].id, null)
}

How to use it

module "fsx_for_lustre" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-fsx?ref=v1.0.0"

  name               = "genomics-training-prod"
  subnet_id          = module.network.private_subnet_ids[0]
  security_group_ids = [aws_security_group.fsx_lustre.id]

  deployment_type             = "PERSISTENT_2"
  storage_type                = "SSD"
  storage_capacity            = 4800
  per_unit_storage_throughput = 500
  data_compression_type       = "LZ4"
  kms_key_id                  = aws_kms_key.fsx.arn

  # Lazy-load reference datasets straight from S3, export results back.
  data_repository_path   = "s3://kv-genomics-datasets/grch38"
  file_system_mount_path = "/refdata"
  batch_import_metadata  = true

  logging_destination_arn         = aws_cloudwatch_log_group.fsx.arn
  logging_level                   = "WARN_ERROR"
  automatic_backup_retention_days = 14

  tags = {
    Environment = "prod"
    CostCenter  = "hpc-genomics"
  }
}

# Downstream: feed the mount details into a compute launch template so every
# node in the Auto Scaling group mounts the Lustre file system on boot.
resource "aws_launch_template" "hpc_node" {
  name_prefix = "hpc-node-"
  image_id    = data.aws_ami.al2023.id

  user_data = base64encode(<<-EOT
    #!/bin/bash
    yum install -y lustre-client
    mkdir -p /fsx
    ${module.fsx_for_lustre.mount_command}
    echo "${module.fsx_for_lustre.dns_name}@tcp:/${module.fsx_for_lustre.mount_name} /fsx lustre relatime,flock,_netdev 0 0" >> /etc/fstab
  EOT
  )
}

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root configlive/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "s3"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...s3 state bucket/container + key per path...
  }
}

2. Module configlive/prod/fsx/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-fsx?ref=v1.0.0"
}

inputs = {
  name = "..."
  subnet_id = "..."
  security_group_ids = ["...", "..."]
}

3. Deploy one environment, or roll out all modules together:

cd live/prod/fsx && terragrunt apply        # this module
terragrunt run-all apply                      # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.

Inputs

Name Type Default Required Description
name string yes Logical name; applied as the Name tag.
subnet_id string yes Subnet for the single-AZ file system.
security_group_ids list(string) yes SGs allowing Lustre ports 988 and 1018-1023.
deployment_type string "PERSISTENT_2" no SCRATCH_2 or PERSISTENT_2.
storage_type string "SSD" no SSD or HDD (HDD only for low PERSISTENT_2 tiers).
storage_capacity number 1200 no GiB; 1200 or 2400+ in 1200 increments.
per_unit_storage_throughput number 250 no MB/s/TiB throughput tier.
data_compression_type string "LZ4" no NONE or LZ4 in-file compression.
file_system_type_version string null no Lustre version (e.g. 2.15).
kms_key_id string null no KMS key ARN (PERSISTENT_2 only).
metadata_iops number null no User-provisioned metadata IOPS for PERSISTENT_2.
data_repository_path string null no S3 path to link (DRA for PERSISTENT_2, import_path for SCRATCH_2).
export_path string null no SCRATCH_2 inline export path.
file_system_mount_path string "/data" no DRA mount point inside the FS.
imported_file_chunk_size number 1024 no S3 import chunk size in MiB.
auto_import_policy string "NEW_CHANGED" no SCRATCH_2 inline auto-import policy.
batch_import_metadata bool true no Batch-import existing S3 metadata on DRA create.
dra_auto_import_events list(string) ["NEW","CHANGED","DELETED"] no S3 events importing into the FS.
dra_auto_export_events list(string) ["NEW","CHANGED","DELETED"] no FS events exporting to S3.
logging_destination_arn string null no CloudWatch Logs ARN for access logging.
logging_level string "WARN_ERROR" no Lustre logging level.
automatic_backup_retention_days number 7 no Backup retention (0-90); PERSISTENT_2 only.
daily_automatic_backup_start_time string "03:00" no UTC HH:MM backup window start.
weekly_maintenance_start_time string null no d:HH:MM maintenance window.
tags map(string) {} no Extra tags for all resources.

Outputs

Name Description
id FSx for Lustre file system ID.
arn ARN of the file system.
name Name tag of the file system.
dns_name DNS name used to mount the file system.
mount_name Lustre mount name token for the client mount command.
network_interface_ids ENIs created for the file system.
mount_command Ready-to-run mount -t lustre command for clients.
data_repository_association_id ID of the S3 DRA, if created.

Enterprise scenario

A genomics platform team runs nightly variant-calling pipelines on a 600-node EC2 Spot fleet. They call this module once per region to stand up a 4,800 GiB PERSISTENT_2 SSD file system at 500 MB/s/TiB (about 2.4 GB/s aggregate), with a Data Repository Association lazy-loading the GRCh38 reference panel from an S3 bucket so the pipeline starts reading immediately without a full pre-copy. LZ4 compression cuts the on-disk footprint of the highly compressible FASTQ inputs, KMS encryption satisfies their HIPAA controls, and the mount_command output is injected directly into the compute launch template so every Spot node mounts /fsx on boot with zero manual steps.

Best practices

TerraformAWSFSx for LustreModuleIaC
Need this built for real?

Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.

Work with me

Comments

Keep Reading