Terraform Module: Azure Linux Virtual Machine — production-ready compute with hardened defaults

Quick take — A reusable hashicorp/azurerm ~> 4.0 module for azurerm_linux_virtual_machine: SSH-key-only auth, managed identity, boot diagnostics, OS-disk encryption and optional data disks, all var-driven. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "azurerm" {
  features {}
}

module "linux_virtual_machine" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-linux-virtual-machine?ref=v1.0.0"

  name                  = "..."           # VM name; also prefixes disk names. 2-64 chars, alphanum…
  resource_group_name   = "..."           # Resource group the VM and disks live in.
  location              = "..."           # Azure region (e.g. `centralindia`).
  admin_username        = "..."           # SSH admin user; reserved names are rejected.
  admin_ssh_public_key  = "..."           # OpenSSH public key for password-less SSH.
  network_interface_ids = ["...", "..."]  # NIC IDs; first is primary. At least one required.
}

Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.

What this module is

azurerm_linux_virtual_machine provisions a single Linux VM on Azure: it picks a VM size (SKU), boots a marketplace or custom image, attaches a managed OS disk, and binds one or more pre-created network interfaces. Unlike the older azurerm_virtual_machine resource, the typed azurerm_linux_virtual_machine resource is opinionated about Linux — it defaults to SSH-key authentication, exposes a first-class identity block for managed identity, and treats the OS disk and data disks as managed disks only.

The raw resource is deceptively simple but easy to ship insecurely: people set disable_password_authentication = false for convenience, forget to enable boot diagnostics (so there is no serial console to fall back on when SSH dies), skip patch orchestration, and hand-roll data-disk attachments inconsistently across environments. Wrapping it in a module lets you bake in the non-negotiables once (key-only auth, a SystemAssigned identity, boot diagnostics to a managed storage account, encryption at host, and a clean pattern for data disks) and then stamp out dev/test/prod VMs from the same vetted code with nothing but a tfvars change.

This module owns the VM, its identity, and (optionally) a set of data disks with their attachments. It deliberately does not create the NIC, subnet, or NSG: networking is a separate concern with its own lifecycle, so you pass in network_interface_ids and keep the module composable.
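Because networking stays outside the module, the caller creates the NIC and hands its ID over. The following is a minimal sketch of that caller-side piece, not part of the module itself. It assumes a resource group azurerm_resource_group.app and a subnet azurerm_subnet.app already exist in the calling configuration; both names, and the NIC name, are hypothetical.

```hcl
# Caller-side NIC (illustrative). azurerm_resource_group.app and
# azurerm_subnet.app are assumed to be defined elsewhere in the root module.
resource "azurerm_network_interface" "orders_api" {
  name                = "nic-orders-api-prod-01"
  resource_group_name = azurerm_resource_group.app.name
  location            = azurerm_resource_group.app.location

  ip_configuration {
    name                          = "internal"
    subnet_id                     = azurerm_subnet.app.id
    private_ip_address_allocation = "Dynamic"
  }
}

# The module then receives the ID through its network_interface_ids input:
#   network_interface_ids = [azurerm_network_interface.orders_api.id]
```

Keeping the NIC in the caller means NSG associations, public IPs, and subnet moves can change without touching the VM module's version.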

When to use it

Module structure

terraform-module-azure-linux-virtual-machine/
├── versions.tf      # provider + Terraform version pins
├── main.tf          # the VM, identity, boot diagnostics, data disks
├── variables.tf     # all inputs with validation
└── outputs.tf       # id, name, identity principal, private IP, disks

versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}

main.tf

locals {
  # Normalise tags and stamp the module on every resource for traceability.
  common_tags = merge(
    var.tags,
    {
      managed_by = "terraform"
      module     = "terraform-module-azure-linux-virtual-machine"
    }
  )
}

resource "azurerm_linux_virtual_machine" "this" {
  name                = var.name
  resource_group_name = var.resource_group_name
  location            = var.location
  size                = var.size

  # Attach pre-created NIC(s); the first one is primary by default.
  network_interface_ids = var.network_interface_ids

  admin_username                  = var.admin_username
  disable_password_authentication = true

  # Optional zonal pinning for higher availability SLAs.
  zone = var.zone

  # Alternatively, place a non-zonal VM in an availability set, if provided.
  availability_set_id = var.availability_set_id

  # Guest patching: how patches are applied and how missing ones are assessed.
  patch_mode            = var.patch_mode
  patch_assessment_mode = var.patch_assessment_mode

  # Encryption-at-host protects temp disk + OS/data disk caches at the host.
  encryption_at_host_enabled = var.encryption_at_host_enabled

  admin_ssh_key {
    username   = var.admin_username
    public_key = var.admin_ssh_public_key
  }

  os_disk {
    name                 = "${var.name}-osdisk"
    caching              = "ReadWrite"
    storage_account_type = var.os_disk_storage_account_type
    disk_size_gb         = var.os_disk_size_gb

    # Server-side encryption with a customer-managed key, when supplied.
    disk_encryption_set_id = var.disk_encryption_set_id
  }

  source_image_reference {
    publisher = var.source_image_reference.publisher
    offer     = var.source_image_reference.offer
    sku       = var.source_image_reference.sku
    version   = var.source_image_reference.version
  }

  identity {
    type         = var.identity_type
    identity_ids = var.identity_ids
  }

  # cloud-init / custom data; the caller must pass it already base64-encoded.
  custom_data = var.custom_data

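  # "" => managed (platform) storage account for the serial console;
  # null => omit the block entirely, which disables boot diagnostics.
  dynamic "boot_diagnostics" {
    for_each = var.boot_diagnostics_storage_account_uri == null ? [] : [1]

    content {
      storage_account_uri = var.boot_diagnostics_storage_account_uri == "" ? null : var.boot_diagnostics_storage_account_uri
    }
  }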

  # Prevent accidental image-version drift from forcing a rebuild in CI.
  lifecycle {
    ignore_changes = [
      source_image_reference[0].version,
    ]
  }

  tags = local.common_tags
}

# Optional managed data disks, created and attached in lockstep with the VM.
resource "azurerm_managed_disk" "data" {
  for_each = var.data_disks

  name                 = "${var.name}-${each.key}"
  resource_group_name  = var.resource_group_name
  location             = var.location
  storage_account_type = each.value.storage_account_type
  create_option        = "Empty"
  disk_size_gb         = each.value.disk_size_gb
  zone                 = var.zone

  disk_encryption_set_id = var.disk_encryption_set_id

  tags = local.common_tags
}

resource "azurerm_virtual_machine_data_disk_attachment" "data" {
  for_each = var.data_disks

  managed_disk_id    = azurerm_managed_disk.data[each.key].id
  virtual_machine_id = azurerm_linux_virtual_machine.this.id
  lun                = each.value.lun
  caching            = each.value.caching
}

variables.tf

variable "name" {
  description = "Name of the Linux virtual machine (also used as a prefix for its disks)."
  type        = string

  validation {
    condition     = can(regex("^[a-zA-Z0-9][a-zA-Z0-9-]{0,62}[a-zA-Z0-9]$", var.name)) && length(var.name) <= 64
    error_message = "VM name must be 2-64 chars, alphanumeric or hyphen, and not start/end with a hyphen."
  }
}

variable "resource_group_name" {
  description = "Name of the resource group the VM is deployed into."
  type        = string
}

variable "location" {
  description = "Azure region for the VM and its disks (e.g. centralindia)."
  type        = string
}

variable "size" {
  description = "VM size / SKU (e.g. Standard_B2s for dev, Standard_D2as_v5 for prod)."
  type        = string
  default     = "Standard_B2s"
}

variable "admin_username" {
  description = "Administrator username for SSH access. Avoid reserved names like root or admin."
  type        = string

  validation {
    condition     = !contains(["root", "admin", "administrator", "1", "test", "user"], lower(var.admin_username))
    error_message = "admin_username must not be a disallowed value (root, admin, administrator, 1, test, user)."
  }
}

variable "admin_ssh_public_key" {
  description = "OpenSSH-format public key used for password-less SSH (ssh-rsa/ssh-ed25519 ...)."
  type        = string

  validation {
    condition     = can(regex("^(ssh-rsa|ssh-ed25519) ", var.admin_ssh_public_key))
    error_message = "admin_ssh_public_key must be an OpenSSH public key starting with ssh-rsa or ssh-ed25519."
  }
}

variable "network_interface_ids" {
  description = "Ordered list of NIC resource IDs to attach; the first is the primary interface."
  type        = list(string)

  validation {
    condition     = length(var.network_interface_ids) >= 1
    error_message = "At least one network_interface_id must be provided."
  }
}

variable "zone" {
  description = "Availability zone to pin the VM and its disks to (\"1\", \"2\", \"3\"), or null for regional."
  type        = string
  default     = null

  validation {
    condition     = var.zone == null || contains(["1", "2", "3"], var.zone)
    error_message = "zone must be one of \"1\", \"2\", \"3\", or null."
  }
}

variable "availability_set_id" {
  description = "Optional availability set ID. Mutually exclusive with zone."
  type        = string
  default     = null
}

variable "source_image_reference" {
  description = "Marketplace image reference for the OS disk."
  type = object({
    publisher = string
    offer     = string
    sku       = string
    version   = string
  })
  default = {
    publisher = "Canonical"
    offer     = "ubuntu-24_04-lts"
    sku       = "server"
    version   = "latest"
  }
}

variable "os_disk_storage_account_type" {
  description = "OS disk SKU: Standard_LRS, StandardSSD_LRS, StandardSSD_ZRS, Premium_LRS, or Premium_ZRS."
  type        = string
  default     = "StandardSSD_LRS"

  validation {
    condition     = contains(["Standard_LRS", "StandardSSD_LRS", "Premium_LRS", "Premium_ZRS", "StandardSSD_ZRS"], var.os_disk_storage_account_type)
    error_message = "os_disk_storage_account_type must be a valid managed disk SKU."
  }
}

variable "os_disk_size_gb" {
  description = "OS disk size in GB. Null keeps the image default."
  type        = number
  default     = null

  validation {
    condition     = var.os_disk_size_gb == null || (var.os_disk_size_gb >= 30 && var.os_disk_size_gb <= 4095)
    error_message = "os_disk_size_gb must be between 30 and 4095 when set."
  }
}

variable "identity_type" {
  description = "Managed identity type: SystemAssigned, UserAssigned, or \"SystemAssigned, UserAssigned\"."
  type        = string
  default     = "SystemAssigned"

  validation {
    condition     = contains(["SystemAssigned", "UserAssigned", "SystemAssigned, UserAssigned"], var.identity_type)
    error_message = "identity_type must be SystemAssigned, UserAssigned, or \"SystemAssigned, UserAssigned\"."
  }
}

variable "identity_ids" {
  description = "User-assigned identity resource IDs (required when identity_type includes UserAssigned)."
  type        = list(string)
  default     = null
}

variable "patch_mode" {
  description = "OS patch orchestration: ImageDefault or AutomaticByPlatform."
  type        = string
  default     = "AutomaticByPlatform"

  validation {
    condition     = contains(["ImageDefault", "AutomaticByPlatform"], var.patch_mode)
    error_message = "patch_mode must be ImageDefault or AutomaticByPlatform."
  }
}

variable "patch_assessment_mode" {
  description = "Patch assessment mode: ImageDefault or AutomaticByPlatform."
  type        = string
  default     = "AutomaticByPlatform"

  validation {
    condition     = contains(["ImageDefault", "AutomaticByPlatform"], var.patch_assessment_mode)
    error_message = "patch_assessment_mode must be ImageDefault or AutomaticByPlatform."
  }
}

variable "encryption_at_host_enabled" {
  description = "Enable encryption at host for temp disk and OS/data disk caches. Requires the feature to be registered on the subscription."
  type        = bool
  default     = true
}

variable "disk_encryption_set_id" {
  description = "Optional Disk Encryption Set ID for customer-managed-key encryption of OS and data disks."
  type        = string
  default     = null
}

variable "custom_data" {
  description = "Base64-encoded cloud-init / custom data run on first boot. Null to omit."
  type        = string
  default     = null
  sensitive   = true
}

variable "boot_diagnostics_storage_account_uri" {
  description = "Boot diagnostics storage URI. Empty string uses a managed storage account (recommended); null disables it."
  type        = string
  default     = ""
}

variable "data_disks" {
  description = "Map of data disks to create and attach, keyed by a short logical name."
  type = map(object({
    disk_size_gb         = number
    lun                  = number
    storage_account_type = optional(string, "Premium_LRS")
    caching              = optional(string, "ReadWrite")
  }))
  default = {}

  validation {
    condition     = alltrue([for d in values(var.data_disks) : d.lun >= 0 && d.lun <= 63])
    error_message = "Each data disk lun must be between 0 and 63."
  }
}

variable "tags" {
  description = "Tags applied to the VM and all of its disks."
  type        = map(string)
  default     = {}
}
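Some constraints in the variable descriptions span more than one input (zone versus availability_set_id, identity_ids versus identity_type), and a per-variable validation block cannot express them on the Terraform 1.5 floor this module pins. One possible way to enforce them is with lifecycle preconditions. The sketch below uses a hypothetical terraform_data resource named input_guards that is not in the original module; the same precondition blocks could instead sit in the VM resource's existing lifecycle block.

```hcl
# Hypothetical guard resource: it creates nothing in Azure and exists only
# to fail the plan early when inputs contradict each other.
resource "terraform_data" "input_guards" {
  lifecycle {
    # zone and availability_set_id are documented as mutually exclusive.
    precondition {
      condition     = var.zone == null || var.availability_set_id == null
      error_message = "Set either zone or availability_set_id, not both."
    }

    # identity_ids is required whenever the type includes UserAssigned.
    # try() turns the error from length(null) into false.
    precondition {
      condition = (
        !contains(["UserAssigned", "SystemAssigned, UserAssigned"], var.identity_type)
        || try(length(var.identity_ids) > 0, false)
      )
      error_message = "identity_ids must list at least one identity when identity_type includes UserAssigned."
    }

    # Two disks cannot share a LUN on the same VM.
    precondition {
      condition     = length(distinct([for d in values(var.data_disks) : d.lun])) == length(var.data_disks)
      error_message = "Each data disk must use a unique lun."
    }
  }
}
```

The trade-off is one extra, empty entry in state in exchange for errors that name the conflicting inputs at plan time rather than an Azure API failure during apply.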

outputs.tf

output "id" {
  description = "Resource ID of the Linux virtual machine."
  value       = azurerm_linux_virtual_machine.this.id
}

output "name" {
  description = "Name of the Linux virtual machine."
  value       = azurerm_linux_virtual_machine.this.name
}

output "private_ip_address" {
  description = "Primary private IP address assigned to the VM."
  value       = azurerm_linux_virtual_machine.this.private_ip_address
}

output "public_ip_address" {
  description = "Primary public IP address, if the attached NIC has one (empty otherwise)."
  value       = azurerm_linux_virtual_machine.this.public_ip_address
}

output "identity_principal_id" {
  description = "Principal ID of the system-assigned identity, for RBAC role assignments."
  value       = try(azurerm_linux_virtual_machine.this.identity[0].principal_id, null)
}

output "virtual_machine_id" {
  description = "Stable VM ID (vmId) usable as a unique machine fingerprint."
  value       = azurerm_linux_virtual_machine.this.virtual_machine_id
}

output "data_disk_ids" {
  description = "Map of logical disk name to managed data disk resource ID."
  value       = { for k, d in azurerm_managed_disk.data : k => d.id }
}

How to use it

module "linux_virtual_machine" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-linux-virtual-machine?ref=v1.0.0"

  name                = "vm-orders-api-prod-01"
  resource_group_name = azurerm_resource_group.app.name
  location            = azurerm_resource_group.app.location
  size                = "Standard_D2as_v5"
  zone                = "1"

  admin_username       = "kvadmin"
  admin_ssh_public_key = file("${path.module}/keys/orders-api.pub")

  network_interface_ids = [azurerm_network_interface.orders_api.id]

  source_image_reference = {
    publisher = "Canonical"
    offer     = "ubuntu-24_04-lts"
    sku       = "server"
    version   = "latest"
  }

  os_disk_storage_account_type = "Premium_LRS"
  os_disk_size_gb              = 64
  encryption_at_host_enabled   = true

  # cloud-init to install the app runtime on first boot.
  custom_data = base64encode(file("${path.module}/cloud-init/orders-api.yaml"))

  data_disks = {
    data = {
      disk_size_gb         = 256
      lun                  = 0
      storage_account_type = "Premium_LRS"
      caching              = "None"
    }
  }

  tags = {
    environment = "prod"
    workload    = "orders-api"
    cost_center = "ecom-platform"
  }
}

# Downstream: grant the VM's managed identity read access to a Key Vault,
# using the identity_principal_id output.
resource "azurerm_role_assignment" "vm_kv_secrets" {
  scope                = azurerm_key_vault.app.id
  role_definition_name = "Key Vault Secrets User"
  principal_id         = module.linux_virtual_machine.identity_principal_id
}
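The example above relies on the default system-assigned identity. When several VMs should share one identity, a user-assigned identity can be passed in through identity_type and identity_ids instead. This is a sketch under assumptions: the identity, the second NIC (azurerm_network_interface.orders_api_02), and the module label are hypothetical names, and the resource group and Key Vault are the ones referenced in the example above.

```hcl
# Hypothetical shared identity, created outside the module.
resource "azurerm_user_assigned_identity" "orders_api" {
  name                = "id-orders-api-prod"
  resource_group_name = azurerm_resource_group.app.name
  location            = azurerm_resource_group.app.location
}

module "linux_virtual_machine_uai" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-linux-virtual-machine?ref=v1.0.0"

  name                  = "vm-orders-api-prod-02"
  resource_group_name   = azurerm_resource_group.app.name
  location              = azurerm_resource_group.app.location
  admin_username        = "kvadmin"
  admin_ssh_public_key  = file("${path.module}/keys/orders-api.pub")
  network_interface_ids = [azurerm_network_interface.orders_api_02.id] # hypothetical NIC

  identity_type = "UserAssigned"
  identity_ids  = [azurerm_user_assigned_identity.orders_api.id]
}

# Role assignments target the identity resource directly, because the
# module's identity_principal_id output describes the system-assigned identity.
resource "azurerm_role_assignment" "uai_kv_secrets" {
  scope                = azurerm_key_vault.app.id
  role_definition_name = "Key Vault Secrets User"
  principal_id         = azurerm_user_assigned_identity.orders_api.principal_id
}
```

A shared identity lets the role assignment outlive any single VM, which helps when machines are rebuilt often.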

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config in live/terragrunt.hcl (inherited by every module):

remote_state {
  backend  = "azurerm"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...azurerm storage account/container + state key per path...
  }
}

2. Module config in live/prod/linux_virtual_machine/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-linux-virtual-machine?ref=v1.0.0"
}

inputs = {
  name                  = "..."
  resource_group_name   = "..."
  location              = "..."
  admin_username        = "..."
  admin_ssh_public_key  = "..."
  network_interface_ids = ["...", "..."]
}

3. Deploy one environment, or roll out all modules together:

cd live/prod/linux_virtual_machine && terragrunt apply   # this module only
cd live/prod && terragrunt run-all apply                 # alternatively: every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
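Two of the points above can be made concrete: generating the provider from the root config, and letting Terragrunt resolve a cross-module dependency. The sketch below shows fragments of two separate files in one listing. It assumes a hypothetical sibling unit at live/prod/network that exposes an output named network_interface_id; that unit and its output name are illustrative and not part of the module.

```hcl
# --- live/terragrunt.hcl (root): generate the provider once ---
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<-EOF
    provider "azurerm" {
      features {}
    }
  EOF
}

# --- live/prod/linux_virtual_machine/terragrunt.hcl: wire in the NIC ---
# Hypothetical dependency on a sibling unit that creates the NIC.
dependency "network" {
  config_path = "../network"
}

inputs = {
  # ...other inputs as in step 2...
  network_interface_ids = [dependency.network.outputs.network_interface_id]
}
```

With the dependency declared, run-all applies the network unit before the VM unit, so the NIC ID is available when the VM is planned.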

Inputs

| Name | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| name | string | - | Yes | VM name; also prefixes disk names. 2-64 chars, alphanumeric/hyphen. |
| resource_group_name | string | - | Yes | Resource group the VM and disks live in. |
| location | string | - | Yes | Azure region (e.g. centralindia). |
| size | string | "Standard_B2s" | No | VM size / SKU. |
| admin_username | string | - | Yes | SSH admin user; reserved names are rejected. |
| admin_ssh_public_key | string | - | Yes | OpenSSH public key for password-less SSH. |
| network_interface_ids | list(string) | - | Yes | NIC IDs; first is primary. At least one required. |
| zone | string | null | No | Availability zone "1"/"2"/"3", or null for regional. |
| availability_set_id | string | null | No | Availability set ID (mutually exclusive with zone). |
| source_image_reference | object | Ubuntu 24.04 LTS | No | Marketplace image publisher/offer/sku/version. |
| os_disk_storage_account_type | string | "StandardSSD_LRS" | No | OS disk SKU. |
| os_disk_size_gb | number | null | No | OS disk size (30-4095 GB); null keeps image default. |
| identity_type | string | "SystemAssigned" | No | Managed identity type. |
| identity_ids | list(string) | null | No | User-assigned identity IDs (when type includes UserAssigned). |
| patch_mode | string | "AutomaticByPlatform" | No | OS patch orchestration mode. |
| patch_assessment_mode | string | "AutomaticByPlatform" | No | Patch assessment mode. |
| encryption_at_host_enabled | bool | true | No | Encrypt temp disk and disk caches at the host. |
| disk_encryption_set_id | string | null | No | Disk Encryption Set ID for customer-managed keys. |
| custom_data | string | null | No | Base64-encoded cloud-init run on first boot. |
| boot_diagnostics_storage_account_uri | string | "" | No | "" = managed storage; null disables boot diagnostics. |
| data_disks | map(object) | {} | No | Data disks to create and attach (size, lun, SKU, caching). |
| tags | map(string) | {} | No | Tags applied to the VM and its disks. |

Outputs

| Name | Description |
| --- | --- |
| id | Resource ID of the Linux virtual machine. |
| name | Name of the Linux virtual machine. |
| private_ip_address | Primary private IP address of the VM. |
| public_ip_address | Primary public IP (empty if the NIC has none). |
| identity_principal_id | System-assigned identity principal ID for RBAC. |
| virtual_machine_id | Stable vmId usable as a unique machine fingerprint. |
| data_disk_ids | Map of logical disk name to managed disk resource ID. |

Enterprise scenario

A fintech platform team runs self-hosted Azure DevOps build agents on Linux VMs that must reach an internal package feed and pull signing certificates from Key Vault. They instantiate this module once per agent with identity_type = "SystemAssigned", then use the identity_principal_id output to assign Key Vault Secrets User on the shared vault — no PATs or static credentials ever touch the image. patch_mode = "AutomaticByPlatform" keeps the fleet patched under Azure Update Manager maintenance windows, and a 256 GB Premium_LRS data disk mounted at /agent/_work isolates build I/O from the OS disk so a runaway build can’t fill /.
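A rough sketch of how that fleet might be expressed, using for_each on the module so each agent gets its own VM, identity, and work disk. Everything not named in the scenario is an assumption for illustration: the agent keys, azurerm_resource_group.agents, azurerm_network_interface.agent, azurerm_key_vault.signing, var.agent_ssh_public_key, and the admin username. Mounting the disk at /agent/_work would be handled by cloud-init and is not shown.

```hcl
locals {
  # Hypothetical agent suffixes; each needs its own pre-created NIC.
  build_agents = toset(["01", "02", "03"])
}

module "build_agent" {
  for_each = local.build_agents
  source   = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-linux-virtual-machine?ref=v1.0.0"

  name                  = "vm-ado-agent-${each.key}"
  resource_group_name   = azurerm_resource_group.agents.name
  location              = azurerm_resource_group.agents.location
  admin_username        = "agentadmin"
  admin_ssh_public_key  = var.agent_ssh_public_key
  network_interface_ids = [azurerm_network_interface.agent[each.key].id]

  identity_type = "SystemAssigned"
  patch_mode    = "AutomaticByPlatform"

  # Dedicated work disk so build I/O stays off the OS disk.
  data_disks = {
    work = {
      disk_size_gb         = 256
      lun                  = 0
      storage_account_type = "Premium_LRS"
    }
  }
}

# One role assignment per agent, keyed the same way as the module instances.
resource "azurerm_role_assignment" "agent_kv_secrets" {
  for_each = module.build_agent

  scope                = azurerm_key_vault.signing.id
  role_definition_name = "Key Vault Secrets User"
  principal_id         = each.value.identity_principal_id
}
```

Adding or retiring an agent is then a one-line change to the build_agents set, and the matching role assignment follows automatically.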

Best practices

Need this built for real?

Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.
