Quick take — A reusable hashicorp/azurerm ~> 4.0 module for azurerm_linux_virtual_machine: SSH-key-only auth, managed identity, boot diagnostics, OS-disk encryption and optional data disks, all var-driven. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.
Quickstart (copy-paste)
Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):
provider "azurerm" {
  features {}
  # azurerm 4.x also needs a subscription: set subscription_id here or export ARM_SUBSCRIPTION_ID.
}

module "linux_virtual_machine" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-linux-virtual-machine?ref=v1.0.0"

  name                  = "..."          # VM name; also prefixes disk names. 2-64 chars, alphanum…
  resource_group_name   = "..."          # Resource group the VM and disks live in.
  location              = "..."          # Azure region (e.g. `centralindia`).
  admin_username        = "..."          # SSH admin user; reserved names are rejected.
  admin_ssh_public_key  = "..."          # OpenSSH public key for password-less SSH.
  network_interface_ids = ["...", "..."] # NIC IDs; first is primary. At least one required.
}
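Then run terraform init && terraform apply. Every other input has a sensible default; see Inputs below to override behaviour. One default to check first: encryption_at_host_enabled is true, which requires the encryption-at-host feature to be registered on the subscription, so set it to false if that feature is not available to you.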
What this module is
azurerm_linux_virtual_machine provisions a single Linux VM on Azure: it picks a VM size (SKU), boots a marketplace or custom image, attaches a managed OS disk, and binds one or more pre-created network interfaces. Unlike the older azurerm_virtual_machine resource, the typed azurerm_linux_virtual_machine resource is opinionated about Linux — it defaults to SSH-key authentication, exposes a first-class identity block for managed identity, and treats the OS disk and data disks as managed disks only.
The raw resource is deceptively simple but easy to ship insecurely: people set disable_password_authentication = false to get a quick password login, forget to enable boot diagnostics (so you can’t see the serial console when SSH dies), skip patch orchestration, and hand-roll data-disk attachments inconsistently across environments. Wrapping it in a module lets you bake in the non-negotiables once (key-only auth, a SystemAssigned identity, boot diagnostics to a managed storage account, encryption at host, and a clean pattern for data disks), then stamp out dev/test/prod VMs from the same vetted code with nothing but a tfvars change.
This module owns the VM, its identity, and (optionally) a set of data disks with their attachments. It deliberately does not create the NIC, subnet, or NSG: networking is a separate concern with its own lifecycle, so you pass in network_interface_ids and keep the module composable.
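As a rough sketch of that split, the NIC can live in the calling configuration so that only its ID crosses into the module. The names azurerm_resource_group.app, azurerm_subnet.app, nic-orders-api and the key path are placeholders assumed to exist in the caller; the module itself defines none of them.

```hcl
# Caller-owned networking: the NIC has its own lifecycle, outside the module.
# azurerm_resource_group.app and azurerm_subnet.app are assumed to exist already.
resource "azurerm_network_interface" "orders_api" {
  name                = "nic-orders-api"
  resource_group_name = azurerm_resource_group.app.name
  location            = azurerm_resource_group.app.location

  ip_configuration {
    name                          = "internal"
    subnet_id                     = azurerm_subnet.app.id
    private_ip_address_allocation = "Dynamic"
  }
}

# The module only receives the NIC ID; it never manages the NIC itself.
module "linux_virtual_machine" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-linux-virtual-machine?ref=v1.0.0"

  name                  = "vm-orders-api-dev-01"
  resource_group_name   = azurerm_resource_group.app.name
  location              = azurerm_resource_group.app.location
  admin_username        = "kvadmin"
  admin_ssh_public_key  = file("${path.module}/keys/orders-api.pub")
  network_interface_ids = [azurerm_network_interface.orders_api.id]
}
```

Keeping the NIC outside means the VM can be destroyed and rebuilt without touching the NIC or anything attached to it.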
When to use it
- You run stateful or long-lived Linux workloads (app servers, jump boxes, build agents, self-hosted runners, legacy middleware) that don’t fit a PaaS or container model.
- You need consistent, auditable VM baselines across many teams — identical auth, identity, diagnostics, and tagging — without copy-pasting HCL.
- You want managed identity on the VM so it can pull secrets from Key Vault or images from ACR without storing credentials.
- You need predictable data-disk layouts (e.g. a separate /var/lib/docker or database volume) defined as code.
- Reach for a VM Scale Set (azurerm_linux_virtual_machine_scale_set) instead when you need autoscaling stateless capacity, or AKS/Container Apps when the workload is genuinely containerised; this module is for individually-addressable, individually-patched instances.
Module structure
terraform-module-azure-linux-virtual-machine/
├── versions.tf # provider + Terraform version pins
├── main.tf # the VM, identity, boot diagnostics, data disks
├── variables.tf # all inputs with validation
└── outputs.tf # id, name, identity principal, private IP, disks
versions.tf
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}
main.tf
locals {
  # Normalise tags and stamp the module on every resource for traceability.
  common_tags = merge(
    var.tags,
    {
      managed_by = "terraform"
      module     = "terraform-module-azure-linux-virtual-machine"
    }
  )
}

resource "azurerm_linux_virtual_machine" "this" {
  name                = var.name
  resource_group_name = var.resource_group_name
  location            = var.location
  size                = var.size

  # Attach pre-created NIC(s); the first one is primary by default.
  network_interface_ids = var.network_interface_ids

  admin_username                  = var.admin_username
  disable_password_authentication = true

  # Optional zonal pinning for higher availability SLAs.
  zone = var.zone

  # Place non-zonal VMs in an availability set, if one is provided.
  availability_set_id = var.availability_set_id

  patch_mode            = var.patch_mode
  patch_assessment_mode = var.patch_assessment_mode

  # Encryption-at-host protects temp disk + OS/data disk caches at the host.
  encryption_at_host_enabled = var.encryption_at_host_enabled

  admin_ssh_key {
    username   = var.admin_username
    public_key = var.admin_ssh_public_key
  }

  os_disk {
    name                 = "${var.name}-osdisk"
    caching              = "ReadWrite"
    storage_account_type = var.os_disk_storage_account_type
    disk_size_gb         = var.os_disk_size_gb

    # Server-side encryption with a customer-managed key, when supplied.
    disk_encryption_set_id = var.disk_encryption_set_id
  }

  source_image_reference {
    publisher = var.source_image_reference.publisher
    offer     = var.source_image_reference.offer
    sku       = var.source_image_reference.sku
    version   = var.source_image_reference.version
  }

  identity {
    type         = var.identity_type
    identity_ids = var.identity_ids
  }

  # cloud-init / custom data; the caller must pass it base64-encoded.
  custom_data = var.custom_data

  # "" => managed (platform) storage account; null => block omitted, diagnostics off.
  # The provider expects a null URI for managed storage, so "" is mapped to null here.
  dynamic "boot_diagnostics" {
    for_each = var.boot_diagnostics_storage_account_uri == null ? [] : [1]
    content {
      storage_account_uri = var.boot_diagnostics_storage_account_uri == "" ? null : var.boot_diagnostics_storage_account_uri
    }
  }

  # Prevent accidental image-version drift from forcing a rebuild in CI.
  lifecycle {
    ignore_changes = [
      source_image_reference[0].version,
    ]
  }

  tags = local.common_tags
}

# Optional managed data disks, created and attached in lockstep with the VM.
resource "azurerm_managed_disk" "data" {
  for_each = var.data_disks

  name                   = "${var.name}-${each.key}"
  resource_group_name    = var.resource_group_name
  location               = var.location
  storage_account_type   = each.value.storage_account_type
  create_option          = "Empty"
  disk_size_gb           = each.value.disk_size_gb
  zone                   = var.zone
  disk_encryption_set_id = var.disk_encryption_set_id
  tags                   = local.common_tags
}

resource "azurerm_virtual_machine_data_disk_attachment" "data" {
  for_each = var.data_disks

  managed_disk_id    = azurerm_managed_disk.data[each.key].id
  virtual_machine_id = azurerm_linux_virtual_machine.this.id
  lun                = each.value.lun
  caching            = each.value.caching
}
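Each disk attached to a VM needs its own LUN, and the range check on lun in the data_disks variable does not catch two disks that share one. A possible extra validation block is sketched below. It is an illustrative addition, not part of the module as published, and the variable is repeated in full only so the snippet stands on its own.

```hcl
variable "data_disks" {
  description = "Map of data disks to create and attach, keyed by a short logical name."
  type = map(object({
    disk_size_gb         = number
    lun                  = number
    storage_account_type = optional(string, "Premium_LRS")
    caching              = optional(string, "ReadWrite")
  }))
  default = {}

  # Hypothetical addition: fail at plan time if two disks share a LUN.
  # distinct() drops repeated LUNs, so the lengths differ when one is reused.
  validation {
    condition     = length(distinct([for d in values(var.data_disks) : d.lun])) == length(var.data_disks)
    error_message = "Each data disk must use a unique lun."
  }
}
```

Catching the clash in validation surfaces it during plan rather than as an attachment failure partway through apply.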
variables.tf
variable "name" {
  description = "Name of the Linux virtual machine (also used as a prefix for its disks)."
  type        = string

  validation {
    condition     = can(regex("^[a-zA-Z0-9][a-zA-Z0-9-]{0,62}[a-zA-Z0-9]$", var.name)) && length(var.name) <= 64
    error_message = "VM name must be 2-64 chars, alphanumeric or hyphen, and not start/end with a hyphen."
  }
}

variable "resource_group_name" {
  description = "Name of the resource group the VM is deployed into."
  type        = string
}

variable "location" {
  description = "Azure region for the VM and its disks (e.g. centralindia)."
  type        = string
}

variable "size" {
  description = "VM size / SKU (e.g. Standard_B2s for dev, Standard_D2as_v5 for prod)."
  type        = string
  default     = "Standard_B2s"
}

variable "admin_username" {
  description = "Administrator username for SSH access. Avoid reserved names like root or admin."
  type        = string

  validation {
    condition     = !contains(["root", "admin", "administrator", "1", "test", "user"], lower(var.admin_username))
    error_message = "admin_username must not be a disallowed value (root, admin, administrator, 1, test, user)."
  }
}

variable "admin_ssh_public_key" {
  description = "OpenSSH-format public key used for password-less SSH (ssh-rsa/ssh-ed25519 ...)."
  type        = string

  # Only the key-type prefix is checked. The original pattern also listed
  # "ecdsa-sha2-" followed by a space, which no real ECDSA key matches, so that
  # dead branch is dropped and the message now states what is actually accepted.
  validation {
    condition     = can(regex("^(ssh-rsa|ssh-ed25519) ", var.admin_ssh_public_key))
    error_message = "admin_ssh_public_key must be an OpenSSH public key starting with ssh-rsa or ssh-ed25519."
  }
}

variable "network_interface_ids" {
  description = "Ordered list of NIC resource IDs to attach; the first is the primary interface."
  type        = list(string)

  validation {
    condition     = length(var.network_interface_ids) >= 1
    error_message = "At least one network_interface_id must be provided."
  }
}

variable "zone" {
  description = "Availability zone to pin the VM and its disks to (\"1\", \"2\", \"3\"), or null for regional."
  type        = string
  default     = null

  validation {
    condition     = var.zone == null || contains(["1", "2", "3"], var.zone)
    error_message = "zone must be one of \"1\", \"2\", \"3\", or null."
  }
}

variable "availability_set_id" {
  description = "Optional availability set ID. Mutually exclusive with zone."
  type        = string
  default     = null
}

variable "source_image_reference" {
  description = "Marketplace image reference for the OS disk."
  type = object({
    publisher = string
    offer     = string
    sku       = string
    version   = string
  })
  default = {
    publisher = "Canonical"
    offer     = "ubuntu-24_04-lts"
    sku       = "server"
    version   = "latest"
  }
}

variable "os_disk_storage_account_type" {
  description = "OS disk SKU: Standard_LRS, StandardSSD_LRS, StandardSSD_ZRS, Premium_LRS, or Premium_ZRS."
  type        = string
  default     = "StandardSSD_LRS"

  validation {
    condition     = contains(["Standard_LRS", "StandardSSD_LRS", "Premium_LRS", "Premium_ZRS", "StandardSSD_ZRS"], var.os_disk_storage_account_type)
    error_message = "os_disk_storage_account_type must be a valid managed disk SKU."
  }
}

variable "os_disk_size_gb" {
  description = "OS disk size in GB. Null keeps the image default."
  type        = number
  default     = null

  validation {
    condition     = var.os_disk_size_gb == null || (var.os_disk_size_gb >= 30 && var.os_disk_size_gb <= 4095)
    error_message = "os_disk_size_gb must be between 30 and 4095 when set."
  }
}

variable "identity_type" {
  description = "Managed identity type: \"SystemAssigned\", \"UserAssigned\", or \"SystemAssigned, UserAssigned\"."
  type        = string
  default     = "SystemAssigned"

  validation {
    condition     = contains(["SystemAssigned", "UserAssigned", "SystemAssigned, UserAssigned"], var.identity_type)
    error_message = "identity_type must be SystemAssigned, UserAssigned, or \"SystemAssigned, UserAssigned\"."
  }
}

variable "identity_ids" {
  description = "User-assigned identity resource IDs (required when identity_type includes UserAssigned)."
  type        = list(string)
  default     = null
}

variable "patch_mode" {
  description = "OS patch orchestration: ImageDefault or AutomaticByPlatform."
  type        = string
  default     = "AutomaticByPlatform"

  validation {
    condition     = contains(["ImageDefault", "AutomaticByPlatform"], var.patch_mode)
    error_message = "patch_mode must be ImageDefault or AutomaticByPlatform."
  }
}

variable "patch_assessment_mode" {
  description = "Patch assessment mode: ImageDefault or AutomaticByPlatform."
  type        = string
  default     = "AutomaticByPlatform"

  validation {
    condition     = contains(["ImageDefault", "AutomaticByPlatform"], var.patch_assessment_mode)
    error_message = "patch_assessment_mode must be ImageDefault or AutomaticByPlatform."
  }
}

variable "encryption_at_host_enabled" {
  description = "Enable encryption at host for temp disk and OS/data disk caches. Requires the feature to be registered on the subscription."
  type        = bool
  default     = true
}

variable "disk_encryption_set_id" {
  description = "Optional Disk Encryption Set ID for customer-managed-key encryption of OS and data disks."
  type        = string
  default     = null
}

variable "custom_data" {
  description = "Base64-encoded cloud-init / custom data run on first boot. Null to omit."
  type        = string
  default     = null
  sensitive   = true
}

variable "boot_diagnostics_storage_account_uri" {
  description = "Boot diagnostics storage URI. Empty string uses a managed storage account (recommended); null disables it."
  type        = string
  default     = ""
}

variable "data_disks" {
  description = "Map of data disks to create and attach, keyed by a short logical name."
  type = map(object({
    disk_size_gb         = number
    lun                  = number
    storage_account_type = optional(string, "Premium_LRS")
    caching              = optional(string, "ReadWrite")
  }))
  default = {}

  validation {
    condition     = alltrue([for d in values(var.data_disks) : d.lun >= 0 && d.lun <= 63])
    error_message = "Each data disk lun must be between 0 and 63."
  }
}

variable "tags" {
  description = "Tags applied to the VM and all of its disks."
  type        = map(string)
  default     = {}
}
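Two constraints stated in the descriptions above are documented but not enforced: availability_set_id is mutually exclusive with zone, and identity_ids is required when identity_type includes UserAssigned. On the Terraform 1.5 baseline this module pins, a validation block can only refer to its own variable, so one option is a pair of precondition blocks. The sketch below hangs them on a standalone terraform_data resource named guard (a hypothetical name) so it can be tried in isolation; the same blocks could equally go in the lifecycle block of the VM resource.

```hcl
variable "zone" {
  type    = string
  default = null
}

variable "availability_set_id" {
  type    = string
  default = null
}

variable "identity_type" {
  type    = string
  default = "SystemAssigned"
}

variable "identity_ids" {
  type    = list(string)
  default = null
}

# Hypothetical guard resource: it creates nothing in Azure and exists only to
# carry the cross-variable checks.
resource "terraform_data" "guard" {
  lifecycle {
    # At most one of zone / availability_set_id may be set.
    precondition {
      condition     = var.zone == null || var.availability_set_id == null
      error_message = "Set either zone or availability_set_id, not both."
    }

    # try() turns the error from length(null) into false, so a null list fails
    # the check whenever a user-assigned identity is requested.
    precondition {
      condition     = !strcontains(var.identity_type, "UserAssigned") || try(length(var.identity_ids) > 0, false)
      error_message = "identity_ids must list at least one identity when identity_type includes UserAssigned."
    }
  }
}
```

With both checks in place, a bad combination fails during plan with a readable message instead of surfacing later as a provider or API error.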
outputs.tf
output "id" {
  description = "Resource ID of the Linux virtual machine."
  value       = azurerm_linux_virtual_machine.this.id
}

output "name" {
  description = "Name of the Linux virtual machine."
  value       = azurerm_linux_virtual_machine.this.name
}

output "private_ip_address" {
  description = "Primary private IP address assigned to the VM."
  value       = azurerm_linux_virtual_machine.this.private_ip_address
}

output "public_ip_address" {
  description = "Primary public IP address, if the attached NIC has one (empty otherwise)."
  value       = azurerm_linux_virtual_machine.this.public_ip_address
}

output "identity_principal_id" {
  description = "Principal ID of the system-assigned identity, for RBAC role assignments."
  value       = try(azurerm_linux_virtual_machine.this.identity[0].principal_id, null)
}

output "virtual_machine_id" {
  description = "Stable VM ID (vmId) usable as a unique machine fingerprint."
  value       = azurerm_linux_virtual_machine.this.virtual_machine_id
}

output "data_disk_ids" {
  description = "Map of logical disk name to managed data disk resource ID."
  value       = { for k, d in azurerm_managed_disk.data : k => d.id }
}
How to use it
module "linux_virtual_machine" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-linux-virtual-machine?ref=v1.0.0"

  name                = "vm-orders-api-prod-01"
  resource_group_name = azurerm_resource_group.app.name
  location            = azurerm_resource_group.app.location
  size                = "Standard_D2as_v5"
  zone                = "1"

  admin_username        = "kvadmin"
  admin_ssh_public_key  = file("${path.module}/keys/orders-api.pub")
  network_interface_ids = [azurerm_network_interface.orders_api.id]

  source_image_reference = {
    publisher = "Canonical"
    offer     = "ubuntu-24_04-lts"
    sku       = "server"
    version   = "latest"
  }

  os_disk_storage_account_type = "Premium_LRS"
  os_disk_size_gb              = 64
  encryption_at_host_enabled   = true

  # cloud-init to install the app runtime on first boot.
  custom_data = base64encode(file("${path.module}/cloud-init/orders-api.yaml"))

  data_disks = {
    data = {
      disk_size_gb         = 256
      lun                  = 0
      storage_account_type = "Premium_LRS"
      caching              = "None"
    }
  }

  tags = {
    environment = "prod"
    workload    = "orders-api"
    cost_center = "ecom-platform"
  }
}

# Downstream: grant the VM's managed identity read access to a Key Vault,
# using the identity_principal_id output.
resource "azurerm_role_assignment" "vm_kv_secrets" {
  scope                = azurerm_key_vault.app.id
  role_definition_name = "Key Vault Secrets User"
  principal_id         = module.linux_virtual_machine.identity_principal_id
}
With Terragrunt
Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.
1. Root config — live/terragrunt.hcl (inherited by every module):
remote_state {
  backend  = "azurerm"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...azurerm state storage account/container + key per path...
  }
}
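The root config above shows only the backend. To define the provider once as well, Terragrunt's generate block can write a provider file into each module's working directory. This is a minimal sketch: the file name provider.tf is a choice made here, and the subscription is assumed to come from the ARM_SUBSCRIPTION_ID environment variable rather than from the file.

```hcl
# live/terragrunt.hcl (continued): write provider.tf next to every module.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "azurerm" {
  features {}
}
EOF
}
```

Because the module itself declares no provider block, the generated file is the only place the provider is configured, which keeps the module reusable outside Terragrunt too.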
2. Module config — live/prod/linux_virtual_machine/terragrunt.hcl:
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-linux-virtual-machine?ref=v1.0.0"
}

inputs = {
  name                  = "..."
  resource_group_name   = "..."
  location              = "..."
  admin_username        = "..."
  admin_ssh_public_key  = "..."
  network_interface_ids = ["...", "..."]
}
3. Deploy one environment, or roll out all modules together:
cd live/prod/linux_virtual_machine && terragrunt apply   # this module only
cd live/prod && terragrunt run-all apply                 # every module under live/prod
Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
Inputs
| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| name | string | - | Yes | VM name; also prefixes disk names. 2-64 chars, alphanumeric/hyphen. |
| resource_group_name | string | - | Yes | Resource group the VM and disks live in. |
| location | string | - | Yes | Azure region (e.g. centralindia). |
| size | string | "Standard_B2s" | No | VM size / SKU. |
| admin_username | string | - | Yes | SSH admin user; reserved names are rejected. |
| admin_ssh_public_key | string | - | Yes | OpenSSH public key for password-less SSH. |
| network_interface_ids | list(string) | - | Yes | NIC IDs; first is primary. At least one required. |
| zone | string | null | No | Availability zone "1"/"2"/"3", or null for regional. |
| availability_set_id | string | null | No | Availability set ID (mutually exclusive with zone). |
| source_image_reference | object | Ubuntu 24.04 LTS | No | Marketplace image publisher/offer/sku/version. |
| os_disk_storage_account_type | string | "StandardSSD_LRS" | No | OS disk SKU. |
| os_disk_size_gb | number | null | No | OS disk size (30-4095 GB); null keeps image default. |
| identity_type | string | "SystemAssigned" | No | Managed identity type. |
| identity_ids | list(string) | null | No | User-assigned identity IDs (when type includes UserAssigned). |
| patch_mode | string | "AutomaticByPlatform" | No | OS patch orchestration mode. |
| patch_assessment_mode | string | "AutomaticByPlatform" | No | Patch assessment mode. |
| encryption_at_host_enabled | bool | true | No | Encrypt temp disk and disk caches at the host. |
| disk_encryption_set_id | string | null | No | Disk Encryption Set ID for customer-managed keys. |
| custom_data | string | null | No | Base64-encoded cloud-init run on first boot. |
| boot_diagnostics_storage_account_uri | string | "" | No | "" = managed storage; null disables boot diagnostics. |
| data_disks | map(object) | {} | No | Data disks to create and attach (size, lun, SKU, caching). |
| tags | map(string) | {} | No | Tags applied to the VM and its disks. |
Outputs
| Name | Description |
|---|---|
| id | Resource ID of the Linux virtual machine. |
| name | Name of the Linux virtual machine. |
| private_ip_address | Primary private IP address of the VM. |
| public_ip_address | Primary public IP (empty if the NIC has none). |
| identity_principal_id | System-assigned identity principal ID for RBAC. |
| virtual_machine_id | Stable vmId usable as a unique machine fingerprint. |
| data_disk_ids | Map of logical disk name to managed disk resource ID. |
Enterprise scenario
A fintech platform team runs self-hosted Azure DevOps build agents on Linux VMs that must reach an internal package feed and pull signing certificates from Key Vault. They instantiate this module once per agent with identity_type = "SystemAssigned", then use the identity_principal_id output to assign Key Vault Secrets User on the shared vault — no PATs or static credentials ever touch the image. patch_mode = "AutomaticByPlatform" keeps the fleet patched under Azure Update Manager maintenance windows, and a 256 GB Premium_LRS data disk mounted at /agent/_work isolates build I/O from the OS disk so a runaway build can’t fill /.
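One way the scenario above might be wired up is with for_each on the module call, so each agent gets its own VM, identity and role assignment. Everything named here is assumed for illustration: the agent names, the key path, azurerm_resource_group.agents, azurerm_key_vault.shared, and a for_each NIC resource azurerm_network_interface.agent keyed by the same agent names. Mounting the data disk at /agent/_work happens inside the guest (for example via cloud-init) and is not shown.

```hcl
locals {
  # Hypothetical agent names; one VM is created per entry.
  agents = toset(["agent-01", "agent-02"])
}

module "build_agent" {
  for_each = local.agents
  source   = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-linux-virtual-machine?ref=v1.0.0"

  name                  = "vm-ado-${each.key}"
  resource_group_name   = azurerm_resource_group.agents.name
  location              = azurerm_resource_group.agents.location
  admin_username        = "kvadmin"
  admin_ssh_public_key  = file("${path.module}/keys/agents.pub")
  network_interface_ids = [azurerm_network_interface.agent[each.key].id]

  identity_type = "SystemAssigned"
  patch_mode    = "AutomaticByPlatform"

  # 256 GB build workspace; storage_account_type falls back to the module
  # default of Premium_LRS.
  data_disks = {
    work = {
      disk_size_gb = 256
      lun          = 0
    }
  }
}

# One role assignment per agent, driven by the module's identity output.
resource "azurerm_role_assignment" "agent_kv" {
  for_each = module.build_agent

  scope                = azurerm_key_vault.shared.id
  role_definition_name = "Key Vault Secrets User"
  principal_id         = each.value.identity_principal_id
}
```

Adding or retiring an agent then comes down to editing the set of names; the VM, its disk and its Key Vault access follow from that single change.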
Best practices
- Never enable password auth. This module hard-codes disable_password_authentication = true and validates that admin_ssh_public_key is a real OpenSSH key; rotate keys via the NIC/identity layer, and front SSH with Azure Bastion or just-in-time access rather than a public IP.
- Right-size and burst deliberately. Use Standard_B-series burstable SKUs for dev/CI to bank credits and cut spend, but switch to D/E-series for steady production load, because B-series throttle hard once credits drain, which looks like a mystery outage.
- Pin a zone and use Premium/ZRS disks for prod. Set zone and Premium_LRS (or Premium_ZRS) OS/data disks to qualify for the single-instance VM SLA; leave dev VMs regional on StandardSSD_LRS to save money.
- Separate data from the OS disk. Put databases, container storage, and build workspaces on data_disks with caching = "None" for write-heavy volumes, so resizing or reimaging the OS disk never risks the data.
- Keep boot diagnostics on managed storage. The default boot_diagnostics_storage_account_uri = "" gives you serial console and screenshot access with zero storage-account plumbing, which is invaluable when SSH is down and you need to see kernel panics or cloud-init failures.
- Encrypt at host and use CMK where compliance demands it. encryption_at_host_enabled = true covers temp disk and caches; layer a disk_encryption_set_id for customer-managed keys, and adopt a consistent name pattern like vm-<workload>-<env>-<nn> so cost reports and NSG rules stay legible across the fleet.
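The sizing, zone and disk recommendations above can be captured as per-environment profiles so the choice is made in one place. This is only a sketch: the environment variable, the vm_profiles local and its keys are names invented here, and the resource group, NIC and key path are assumed to exist in the caller.

```hcl
variable "environment" {
  description = "Hypothetical selector for the profile below: dev or prod."
  type        = string
  default     = "dev"
}

locals {
  # dev: burstable, regional, Standard SSD. prod: D-series, zonal, Premium SSD.
  vm_profiles = {
    dev = {
      size         = "Standard_B2s"
      zone         = null
      os_disk_type = "StandardSSD_LRS"
    }
    prod = {
      size         = "Standard_D2as_v5"
      zone         = "1"
      os_disk_type = "Premium_LRS"
    }
  }

  profile = local.vm_profiles[var.environment]
}

module "linux_virtual_machine" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-linux-virtual-machine?ref=v1.0.0"

  name                  = "vm-orders-api-${var.environment}-01"
  resource_group_name   = azurerm_resource_group.app.name
  location              = azurerm_resource_group.app.location
  admin_username        = "kvadmin"
  admin_ssh_public_key  = file("${path.module}/keys/orders-api.pub")
  network_interface_ids = [azurerm_network_interface.orders_api.id]

  size                         = local.profile.size
  zone                         = local.profile.zone
  os_disk_storage_account_type = local.profile.os_disk_type
}
```

The VM name follows the vm-<workload>-<env>-<nn> pattern from the last bullet, so the environment shows up in both the name and the sizing.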