Azure Managed Disks Deep Dive: Every Disk Type, Caching, Encryption & Performance

Almost every interesting failure, cost surprise, and performance complaint on Azure IaaS eventually traces back to a disk. A SQL Server that “randomly” stalls under load is usually a Standard SSD pretending to be Premium. A surprise line on the bill is usually an Ultra Disk left attached to a deallocated VM, or a forgotten 4 TB Premium snapshot. A VM that won’t boot after a “harmless” caching change is a data disk that had write-back caching turned on while the application assumed durability. Disks are where the abstraction of “just a virtual machine” meets the very physical reality of IOPS, throughput, latency, replication, and money.

This is the deep dive that makes disks stop being a mystery. An Azure managed disk is a block-storage device that Azure provisions and manages for you as a first-class Azure Resource Manager (ARM) resource — you pick a type and a size, and Azure handles the underlying storage account, replication, and placement. You will leave this lesson knowing every disk type and when to choose it, how disk size maps to performance, what host caching actually does (and when it will corrupt your data if you get it wrong), every encryption option Azure offers and how they stack, and the operational toolkit — snapshots, images, shared disks, performance tiers, online resize, OS-disk swap, and ephemeral OS disks. We will cover the settings you choose when you create a disk and the ones you can (and cannot) change afterwards, with working az CLI and Bicep for each core operation.

Learning objectives

By the end of this lesson you will be able to:

Prerequisites & where this fits

You should be comfortable creating a virtual machine and reasoning about regions and resource groups; if VMs are new, read the Azure Virtual Machines deep dive first, since the Disks tab of VM creation is where most people first meet these options. You will get more out of the encryption section if you have seen Azure Key Vault before. This lesson sits in the Compute module of the Azure Zero-to-Hero course, immediately after the VM and VM-resilience lessons and immediately before networking — disks are the storage layer that every VM stands on, so we cover them while VMs are fresh and before we move the discussion to the network.

Core concepts

Managed vs unmanaged (the history that explains the model). In the original Azure model you created storage accounts yourself and dropped VM disks into them as page blobs (“unmanaged disks”). You had to spread VMs across many storage accounts to avoid hitting a per-account IOPS cap (20,000 IOPS), you managed your own naming and containers, and an availability-set deployment could silently put two VMs’ disks in the same storage scale unit and defeat the whole point of the availability set. Managed disks, now the default and the only type you should use, make the disk itself the ARM resource: Azure picks and manages the backing storage, enforces the per-disk performance you provisioned, automatically distributes disks of availability-set VMs across fault domains, and gives you role-based access control, resource locks, tags, and Azure Policy on the disk like any other resource. Unmanaged disks are deprecated and being retired — treat “disk” and “managed disk” as synonyms from here on.

The three roles a disk can play. Every VM has exactly one OS disk (a bootable disk, max 4 TiB, mounted as C: on Windows or / on Linux, with ReadWrite caching on by default). A VM can have zero or more data disks: empty block devices you attach for application data, databases, and logs. The number you can attach is capped by the VM size (a small VM might allow 4, a large one 64). Many VM sizes also ship a temporary disk (the “temp disk”: D: on Windows; typically /dev/sdb, mounted at /mnt, on Linux). It is a local SSD physically attached to the host, not a managed disk, not persisted, and wiped on deallocate, host maintenance, or resize. The temp disk is free and fast and perfect for OS paging files, tempdb, and scratch, and catastrophic if you ever store anything you care about on it.

Provisioned performance is what you pay for. With the classic tiers (Standard HDD/SSD, Premium SSD v1), performance is a fixed function of the size you pick — choosing a bigger disk is how you buy more IOPS. With the newer types (Premium SSD v2 and Ultra), capacity and performance are decoupled: you provision GiB, IOPS, and MB/s independently and are billed for each. The mental model to carry: you pay for provisioned capacity and (on v2/Ultra) provisioned performance, not for what you actually use.

IOPS vs throughput vs latency. IOPS is operations per second (matters for small random I/O — databases, OLTP). Throughput is MB/s (matters for large sequential I/O — backups, analytics, media). Latency is the time per operation (matters for chatty, latency-sensitive apps). A disk can be IOPS-bound, throughput-bound, or latency-bound, and the VM size has its own disk IOPS/throughput ceiling — your effective performance is the minimum of the disk limits and the VM’s limits. A Premium disk on an undersized VM, or a VM with no Premium support, will never hit the disk’s rated numbers.
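The "minimum of the two limits" rule can be sketched in a few lines of shell. The helper name effective_limit and the sample figures are illustrative assumptions, not the limits of any particular disk or VM size.

```shell
# effective_limit is a hypothetical helper: it returns the smaller of the
# disk's limit and the VM's limit (the same rule applies to IOPS and MB/s).
effective_limit() {
  local disk_limit=$1 vm_limit=$2
  if [ "$disk_limit" -lt "$vm_limit" ]; then
    echo "$disk_limit"
  else
    echo "$vm_limit"
  fi
}

# Made-up example: a disk rated for 5000 IOPS on a VM capped at 3200 IOPS.
effective_limit 5000 3200   # prints 3200
```

Run the same check for throughput; whichever of the two results you hit first under load is the bottleneck you will actually observe.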

Disk types: the master comparison

This is the single most important table in the lesson. The five managed-disk types, side by side:

| Disk type | Media | Max size | Max IOPS (per disk) | Max throughput | Typical latency | Performance model | Best for |
|---|---|---|---|---|---|---|---|
| Standard HDD | Magnetic | 32 TiB | ~2,000 | ~500 MB/s | 10 ms+, variable | Fixed by size tier (S) | Dev/test, backup, infrequent/cold, cost-first |
| Standard SSD | SSD | 32 TiB | ~6,000 (with bursting) | ~750 MB/s | single-digit ms | Fixed by size tier (E) | Web servers, light prod, dev/test that needs consistency |
| Premium SSD v1 | SSD | 32 TiB | 20,000 | 900 MB/s | low single-digit ms | Fixed by size tier (P) | Production, databases, latency-sensitive; Premium or Ultra on every disk is needed for the 99.9% single-VM SLA |
| Premium SSD v2 | SSD | 64 TiB | 80,000 | 1,200 MB/s | sub-ms | Independently provisioned IOPS + MB/s | Most new production; best price/performance |
| Ultra Disk | NVMe-class | 64 TiB | 400,000 | 10,000 MB/s | sub-ms | Independently provisioned IOPS + MB/s, adjustable live | Top-tier OLTP, SAP HANA, high-end SQL, message queues |

Read that as a ladder of price and capability: Standard HDD is the cheapest and slowest; Ultra is the fastest and (for high performance) the priciest. A few load-bearing nuances behind the numbers:

Provisioned vs on-demand, and bursting

There are two different “elasticity” stories you must not conflate:

Provisioned vs on-demand performance (a property of the type):

Bursting — two distinct models, both about handling short spikes above baseline:

| Bursting model | Applies to | How it works | Cost |
|---|---|---|---|
| Credit-based bursting | Premium SSD v1 (P20 and smaller), Standard SSD (E30 and smaller) | Disk accrues burst credits while idle/below baseline; spends them to burst up to a fixed ceiling (e.g. 3,500 IOPS / 170 MB/s on Premium v1) for up to ~30 min. Automatic, on by default. | Free |
| On-demand bursting | Premium SSD v1 (P30 and larger) | Disk can burst up to a much higher ceiling (e.g. 30,000 IOPS / 1,000 MB/s) with no credit limit: burst as long as you need. Must be explicitly enabled; billed per transaction above baseline plus an enablement fee. | Paid |

Premium SSD v2 and Ultra don’t “burst” in this sense — you simply provision the IOPS/MB/s you want. The interview-grade summary: credit-based bursting is free, automatic, and capped by accrued credits and a low ceiling; on-demand bursting is paid, uncapped in duration, and only on larger Premium v1 disks.

OS disk vs data disk vs temp disk

Putting the three roles together with the operational facts you must remember:

| Aspect | OS disk | Data disk | Temporary disk |
|---|---|---|---|
| Count per VM | Exactly 1 | 0 to (VM-size limit, up to 64) | 0 or 1 (depends on size) |
| Persisted (managed)? | Yes | Yes | No (local to host) |
| Survives deallocate? | Yes | Yes | No (wiped) |
| Default caching | ReadWrite | ReadOnly (Premium) / None | n/a |
| Max size | 4 TiB | up to 64 TiB (type-dependent) | fixed by VM size |
| Typical use | Boot volume, OS | App data, DB files, logs | Page file, tempdb, scratch |
| Billed | Yes | Yes | Free (included in VM) |

Three rules that prevent most data-loss incidents: never put durable data on the temp disk; never store data you need on the OS disk if you can use a data disk (separating OS and data makes resize, swap, and backup cleaner); and remember that on Linux the temp disk device letter can change — mount data disks by UUID in /etc/fstab, not by /dev/sdX, or a reboot/resize can mount the wrong device.
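As a sketch of the mount-by-UUID rule, the helper below formats one /etc/fstab line from a filesystem UUID. The function name fstab_entry, the UUID, and the mount point are placeholders chosen for illustration; nofail is included on the assumption that you want the VM to keep booting if the disk is ever detached.

```shell
# fstab_entry is a hypothetical helper that formats a single /etc/fstab line.
# Arguments: filesystem UUID, mount point, filesystem type.
fstab_entry() {
  # nofail: do not block boot if the disk is missing or detached.
  printf 'UUID=%s %s %s defaults,nofail 0 2\n' "$1" "$2" "$3"
}

# On a real VM, read the UUID from the formatted partition first, e.g.:
#   sudo blkid -s UUID -o value /dev/sdc1
# The UUID and mount point below are placeholders.
fstab_entry 11111111-2222-3333-4444-555555555555 /datadrive ext4
```

Append the resulting line to /etc/fstab and run sudo mount -a to confirm it mounts cleanly before the next reboot.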

Disk size tiers (P/E/S) and how size maps to performance

The classic types use lettered size tiers, and each tier is a fixed bundle of capacity + baseline performance:

The pattern to internalise: on classic types, bigger = faster. If a P10 doesn’t give you enough IOPS, you don’t tune IOPS — you move to P15/P20/P30. This is also why people over-provision capacity just to buy IOPS, and exactly the pain that Premium SSD v2 removes.

| Need | Classic-type answer | v2/Ultra answer |
|---|---|---|
| More capacity | Pick a larger P/E/S tier | Increase provisioned GiB |
| More IOPS | Pick a larger tier (or set a higher performance tier on Premium v1) | Increase provisioned IOPS independently |
| More throughput | Pick a larger tier | Increase provisioned MB/s independently |

A subtle gotcha: on the classic types a disk is billed at the tier it rounds up to, not at the size you asked for. A 20 GiB Premium disk, for example, is billed as a P4 (32 GiB) even if you only format part of it, so there’s rarely a reason to provision odd or tiny sizes.

Host caching: None / ReadOnly / ReadWrite

Host caching uses the VM host’s RAM and local SSD as a cache in front of a managed disk. It is one of the highest-impact and most dangerous settings, because the wrong choice silently trades durability for speed.

| Caching mode | What it caches | Effect | Use it for | Danger |
|---|---|---|---|---|
| None | Nothing | All reads/writes go straight to the disk | Write-heavy disks, log disks, any disk where every write must be durable immediately; required for Premium v2/Ultra (they don’t support caching) | None (safest) |
| ReadOnly | Reads | Read hits served from host cache (fast, low-latency reads); writes go straight through to disk | Read-heavy data disks, database data files (read-mostly), boot performance | None for durability; cache can serve stale data only if another writer bypasses the host (rare) |
| ReadWrite | Reads and writes | Writes are acknowledged from the host cache (write-back) before hitting the disk | The OS disk (default), and only apps that manage their own write consistency / flushing | Data loss / corruption if the host fails before cached writes are flushed and the app assumed the write was durable |

The rules that matter in practice and in exams:

The single sentence to memorise: ReadWrite caching is safe for the OS disk and dangerous for data you can’t afford to lose; logs get None; read-heavy data gets ReadOnly.
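The rule of thumb above can be written as a small lookup. This is only a sketch: the function name recommended_caching and the role labels (os, read-heavy, log, write-heavy) are invented for illustration, while PremiumV2_LRS and UltraSSD_LRS are the SKU names for the two types that do not support host caching.

```shell
# recommended_caching is a hypothetical helper encoding the rule of thumb.
# Arguments: disk role, optional disk SKU (defaults to Premium_LRS).
recommended_caching() {
  local role=$1 sku=${2:-Premium_LRS}
  # Premium SSD v2 and Ultra do not support host caching at all.
  case "$sku" in
    PremiumV2_LRS|UltraSSD_LRS) echo None; return 0 ;;
  esac
  case "$role" in
    os)              echo ReadWrite ;;  # OS disk default
    read-heavy)      echo ReadOnly ;;   # e.g. database data files
    log|write-heavy) echo None ;;       # every write must be durable
    *) echo "unknown role: $role" >&2; return 1 ;;
  esac
}

recommended_caching log   # prints None
```

The output can be passed straight to the --caching flag of az vm disk attach.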

Encryption: every option and how they stack

Azure gives you several encryption mechanisms that operate at different layers. They are not mutually exclusive — some compose. Work through them as a stack.

| Mechanism | Where it runs | Key owner | Encrypts | When required / chosen |
|---|---|---|---|---|
| SSE with platform-managed keys (PMK) | Azure storage infrastructure | Microsoft | Data at rest on the disk (and snapshots) | Always on by default, free, transparent, no action needed |
| SSE with customer-managed keys (CMK) | Azure storage infrastructure | You (in Key Vault, via a Disk Encryption Set) | Data at rest | Compliance/regulatory key-control, key rotation, revocation |
| Encryption at host | The VM host (hypervisor) | Microsoft (PMK) or you (CMK) | OS disk, data disks, AND the temp disk + caches, end to end from the host | When you need the temp disk and host caches encrypted too; modern recommended default |
| Azure Disk Encryption (ADE) | Inside the guest OS (BitLocker on Windows, DM-Crypt on Linux) | You (Key Vault) | OS and data volumes from inside the guest | Legacy/compliance requiring in-guest, OS-level encryption keys |
| Double encryption at rest | Storage infra (two layers: platform + CMK) | Microsoft + you | Data at rest, twice with two different keys/algorithms | Highest at-rest assurance / specific compliance mandates |
| Confidential disk encryption | Confidential VM, key bound to a vTPM in the TEE | You/Microsoft, bound to the VM’s TEE | OS disk, with the key protected inside the confidential compute boundary | Confidential VMs (AMD SEV-SNP / Intel TDX) needing the OS disk key sealed to the TEE |

How to reason about them:

These compose along their layers: you can run encryption at host with a CMK, and confidential disks build on the confidential-VM TEE. ADE and encryption-at-host are generally not combined. Exam reflex: PMK is automatic and free; CMK gives you key control via a Disk Encryption Set; encryption at host is the platform-level option that also covers the temp disk and host caches; ADE is in-guest BitLocker/DM-Crypt.

Snapshots: full vs incremental

A snapshot is a read-only, point-in-time copy of a managed disk, stored as its own ARM resource. Two flavours:

| Aspect | Full snapshot | Incremental snapshot |
|---|---|---|
| What it stores | The entire disk every time | Only the changes since the previous snapshot of that disk |
| Cost | Billed for full disk size each time | Billed only for the delta (much cheaper for a snapshot chain) |
| Redundancy | LRS or ZRS | LRS or ZRS |
| Restore | Standalone | Each incremental is independently restorable (Azure stitches the chain) |
| Recommendation | Legacy | Use these: cheaper and the modern default |

Always prefer incremental snapshots (--incremental true). Despite the name, each incremental snapshot is independently restorable — Azure manages the chain, so deleting an old one doesn’t break newer ones. Snapshots are crash-consistent by default; for application-consistent backups (e.g. quiescing a database) use Azure Backup, which coordinates with the guest. Snapshots inherit encryption from the source and can themselves be PMK/CMK-encrypted. A common cost leak: orphaned full snapshots of large disks — audit and prefer incrementals.

Images vs snapshots

A managed image captures a generalized VM (one that has been run through sysprep on Windows or waagent -deprovision on Linux to strip machine-specific identity) so you can deploy many new VMs from it. A snapshot captures a single disk’s bytes as-is (specialized, not generalized) for backup or for cloning one disk. Rule of thumb: snapshot = back up or clone one disk; image = template for new VMs. For production image management at scale, use the Azure Compute Gallery (versioning, replication across regions, scaling) rather than standalone managed images — covered in the VMSS lesson.

Shared disks (clustering)

A shared disk is a managed disk attached to multiple VMs simultaneously (maxShares > 1), exposing shared block storage for clustered applications that bring their own cluster manager — Windows Server Failover Cluster with SCSI Persistent Reservations, SQL Server Failover Cluster Instances, Linux Pacemaker/SBD, scale-out file servers. Key facts and gotchas:

Performance tiers (Premium SSD v1)

For Premium SSD v1, a performance tier lets you temporarily provision the IOPS/throughput of a larger tier without changing the disk’s capacity. A P10 (128 GiB) can be set to run at P30 performance for a predictable busy period (a sale, a batch window), then dialled back — without resizing the disk or any downtime. You pay for the higher tier while it’s active. This is distinct from credit-based bursting (free, automatic, credit-limited) and on-demand bursting (paid, transaction-billed). On Premium SSD v2 and Ultra you don’t need performance tiers at all — you just change the provisioned IOPS/MB/s dials directly.

Resize without downtime (online expansion)

You can grow a managed disk’s capacity, and on supported configurations you can do it without deallocating the VM (“online resize”/live resize for the data disk path). The constraints:

Swap OS disk

You can replace a VM’s OS disk with a different managed disk (or one restored from a snapshot) while keeping the same VM resource: its name, NICs, IP, size, and data disks all stay. This is the standard recovery move when the OS disk is corrupted or you want to roll the VM back to a known-good snapshot: create a new disk from the snapshot, deallocate the VM, point the VM at the new OS disk, start. The new and old OS disks must be the same OS type and (ideally) generation. The swap itself is a single az vm update --os-disk command.
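The recovery sequence can be sketched as a shell function. The function name swap_os_disk, the AZ variable, and every resource name are assumptions made for illustration; the az subcommands and flags are the standard ones. Setting AZ="echo az" prints the commands instead of running them, which is a cheap way to review the plan first.

```shell
# swap_os_disk is a hypothetical wrapper around the four recovery steps.
# Arguments: resource group, VM name, snapshot ID, name for the new OS disk.
# Set AZ="echo az" for a dry run that only prints the commands.
AZ=${AZ:-az}

swap_os_disk() {
  local rg=$1 vm=$2 snapshot_id=$3 new_disk=$4
  # 1. Materialise the known-good snapshot as a new managed disk.
  $AZ disk create -g "$rg" -n "$new_disk" --source "$snapshot_id" || return 1
  # 2. The VM must be deallocated before its OS disk can be swapped.
  $AZ vm deallocate -g "$rg" -n "$vm" || return 1
  # 3. Point the VM at the new OS disk (a disk name works within the same resource group).
  $AZ vm update -g "$rg" -n "$vm" --os-disk "$new_disk" || return 1
  # 4. Boot from the restored disk.
  $AZ vm start -g "$rg" -n "$vm"
}
```

The old OS disk is left in place after the swap, so remember to delete it once the VM is confirmed healthy, or it keeps billing.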

Ephemeral OS disks

An ephemeral OS disk stores the OS disk on the VM host’s local storage (cache or temp disk) instead of in remote managed storage. The trade:

Set it with --ephemeral-os-disk true (optionally --ephemeral-os-disk-placement CacheDisk|ResourceDisk). The mental model: ephemeral OS disk = cattle, not pets — fast and free, but the OS is disposable.

The disk landscape at a glance

The diagram below ties the pieces together: a VM with its single OS disk, multiple persistent data disks, and the non-persistent temp disk; how the size tier or provisioned dials set performance; where host caching sits between the VM host and the disk; and where the encryption layers and snapshots attach.

[Figure: Azure managed disks anatomy. A VM with its OS disk, data disks and ephemeral temp disk, host caching between host and disk, the disk-type performance ladder (Standard HDD/SSD, Premium SSD v1/v2, Ultra), and the encryption and snapshot layers.]

Use it as the map for the rest of this lesson — every term in the diagram has a section above explaining its choices, defaults, and trade-offs.

Creating and configuring disks: every setting

When you add a disk in the portal (VM creation Disks tab, or Disks → Create and attach a new disk), or via az/Bicep, these are the fields and the what/choices/default/when/trade-off treatment:

| Setting | Choices | Default | When / trade-off / gotcha |
|---|---|---|---|
| Disk SKU (type) | Standard HDD / Standard SSD / Premium SSD v1 / Premium SSD v2 / Ultra | Premium SSD (varies by VM size) | The master choice (see comparison table). v2/Ultra need VM support; Premium or Ultra on every disk is needed for the 99.9% single-VM SLA. |
| Size / capacity | Tier (P/E/S) or GiB (v2/Ultra) | | Classic: size also sets performance. v2/Ultra: capacity is independent of performance. Round-up billing on tiny disks. |
| Provisioned IOPS | Number (v2/Ultra only) | 3,000 (v2 baseline) | Independent dial; billed. Capped by disk size and VM limits. |
| Provisioned throughput | MB/s (v2/Ultra only) | 125 MB/s (v2 baseline) | Independent dial; billed. |
| Host caching | None / ReadOnly / ReadWrite | OS=ReadWrite, data=ReadOnly | See caching section. v2/Ultra force None. |
| Encryption type | PMK (SSE) / CMK (SSE) / double / confidential | PMK | CMK needs a Disk Encryption Set + Key Vault. |
| Enable shared disk | Yes (maxShares) / No | No | Standard SSD, Premium v1/v2, and Ultra only; caching must be None; cluster SW must coordinate. |
| Bursting | On-demand on/off (Premium v1 ≥P30) | Off (on-demand); credit-based auto-on | On-demand is paid; credit-based is free. |
| Network access | Public / private endpoint / deny | Public (with auth) | For disk export/import; lock down with private endpoints for sensitive estates. |
| Availability zone | None / 1 / 2 / 3 | Inherit VM | Zonal disks must match the VM’s zone. |
| LUN (data disk) | 0–63 | next free | The logical unit number the guest sees the disk on; keep stable for automation. |

az CLI — the core operations

# Variables
RG=rg-disks-lab
LOC=eastus
VM=vm-disklab

# Create a standalone Premium SSD v1 data disk (256 GiB)
az disk create -g $RG -n data-premium-256 \
  --size-gb 256 --sku Premium_LRS --location $LOC

# Create a Premium SSD v2 disk with independent performance dials
az disk create -g $RG -n data-v2 \
  --sku PremiumV2_LRS --size-gb 100 \
  --disk-iops-read-write 5000 --disk-mbps-read-write 200 \
  --location $LOC --zone 1

# Attach the existing Premium data disk to a VM with ReadOnly caching, on LUN 0
az vm disk attach -g $RG --vm-name $VM \
  --name data-premium-256 --caching ReadOnly --lun 0

# Create an Ultra-enabled VM (needs a supported size and zone; the flag can
# also be set later with az vm update on a deallocated VM)
az vm create -g $RG -n vm-ultra --image Ubuntu2204 --zone 1 \
  --size Standard_D4s_v5 --ultra-ssd-enabled true \
  --generate-ssh-keys

# Change caching on an attached data disk (detach/reattach under the hood).
# Quote the path so the shell does not treat [0] as a glob pattern.
az vm update -g $RG -n $VM \
  --set "storageProfile.dataDisks[0].caching=None"

# Resize (grow) a data disk to 512 GiB (disks can grow but never shrink)
az disk update -g $RG -n data-premium-256 --size-gb 512

# Set a Premium v1 performance tier without growing capacity (run at P30 performance)
az disk update -g $RG -n data-premium-256 --set tier=P30

# Enable encryption at host on a VM (register the feature once per subscription;
# the VM must be deallocated while the setting is changed)
az feature register --namespace Microsoft.Compute --name EncryptionAtHost
az vm deallocate -g $RG -n $VM
az vm update -g $RG -n $VM --set securityProfile.encryptionAtHost=true
az vm start -g $RG -n $VM

Bicep — a Premium SSD v2 data disk and a CMK disk

// Premium SSD v2 data disk with independent IOPS/throughput
resource dataV2 'Microsoft.Compute/disks@2024-03-02' = {
  name: 'data-v2'
  location: resourceGroup().location
  zones: ['1']
  sku: {
    name: 'PremiumV2_LRS'
  }
  properties: {
    creationData: { createOption: 'Empty' }
    diskSizeGB: 100
    diskIOPSReadWrite: 5000
    diskMBpsReadWrite: 200
    publicNetworkAccess: 'Disabled'
  }
}

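// A disk encrypted with a customer-managed key via a Disk Encryption Set.
// Assumes the Disk Encryption Set already exists in this resource group;
// the parameter name and its default value are placeholders.
param diskEncryptionSetName string = 'des-disks-lab'

resource diskEncryptionSet 'Microsoft.Compute/diskEncryptionSets@2024-03-02' existing = {
  name: diskEncryptionSetName
}

resource cmkDisk 'Microsoft.Compute/disks@2024-03-02' = {
  name: 'data-cmk'
  location: resourceGroup().location
  sku: { name: 'Premium_LRS' }
  properties: {
    creationData: { createOption: 'Empty' }
    diskSizeGB: 128
    encryption: {
      type: 'EncryptionAtRestWithCustomerKey'
      diskEncryptionSetId: diskEncryptionSet.id
    }
  }
}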

After creation: what you can (and can’t) change

| Operation | Possible after creation? | Notes |
|---|---|---|
| Grow capacity | Yes | Never shrink; extend the in-guest filesystem afterwards; online resize where supported. |
| Change disk SKU/type | Yes (most pairs) | e.g. Standard → Premium; usually requires the disk unattached or VM deallocated. Migrating to Premium v2/Ultra often means create-new + copy, not in-place. |
| Change host caching | Yes | Triggers detach/reattach; brief I/O interruption on a live data disk. |
| Change performance tier (Premium v1) | Yes | No capacity change, no downtime; billed at the higher tier while active. |
| Change provisioned IOPS/MB/s (v2/Ultra) | Yes | Ultra can change live without reboot; v2 supported with constraints. |
| Change encryption (PMK ↔ CMK) | Yes | Attach/detach or update via Disk Encryption Set. |
| Enable on-demand bursting (Premium v1 ≥P30) | Yes | Paid; can disable again (cooldown applies). |
| Convert to/from shared (maxShares) | Yes (supported types) | Disk must be detached from all VMs to change maxShares. |
| Detach / attach | Yes | Data disks hot-attach/detach on most VM sizes; OS disk requires VM stopped (swap). |
| Swap the OS disk | Yes | VM deallocated; same OS type/generation. |
| Move zones | No (not in place) | Zonal placement is fixed at create; to move, snapshot → create in new zone. |

The two hard "no"s to remember: you cannot shrink a disk, and you cannot move a disk to a different availability zone in place — both require create-new + copy/snapshot.

Hands-on lab

In this lab you attach a Premium data disk to a VM, set its caching, take an incremental snapshot, then clean everything up. Uses az CLI (Cloud Shell or local). The smallest disks/VM keep cost to a few rupees if you delete promptly.

0. Setup variables and a tiny VM.

RG=rg-disks-lab
LOC=eastus
VM=vm-disklab
az group create -n $RG -l $LOC
az vm create -g $RG -n $VM --image Ubuntu2204 \
  --size Standard_B1s --generate-ssh-keys --no-wait
az vm wait -g $RG -n $VM --created

1. Create and attach a Premium SSD data disk with ReadOnly caching.

az disk create -g $RG -n lab-data --size-gb 32 --sku Premium_LRS -l $LOC
az vm disk attach -g $RG --vm-name $VM --name lab-data \
  --caching ReadOnly --lun 0

Expected: the disk shows diskState: Attached and caching: ReadOnly.

2. Validate the attachment and caching.

az vm show -g $RG -n $VM \
  --query "storageProfile.dataDisks[].{name:name, lun:lun, caching:caching, gb:diskSizeGb}" -o table

Expected output:

Name      Lun    Caching    Gb
--------  -----  ---------  ----
lab-data  0      ReadOnly   32

3. Change caching to None (e.g. this will become a log disk).

az vm update -g $RG -n $VM --set "storageProfile.dataDisks[0].caching=None"
az vm show -g $RG -n $VM \
  --query "storageProfile.dataDisks[0].caching" -o tsv

Expected: None.

4. Take an incremental snapshot of the data disk.

DISK_ID=$(az disk show -g $RG -n lab-data --query id -o tsv)
az snapshot create -g $RG -n lab-data-snap \
  --source "$DISK_ID" --incremental true -l $LOC
az snapshot show -g $RG -n lab-data-snap \
  --query "{name:name, incremental:incremental, state:provisioningState}" -o table

Expected: the table shows Incremental as True and State as Succeeded.

5. (Optional) Prove restore works — create a new disk from the snapshot.

SNAP_ID=$(az snapshot show -g $RG -n lab-data-snap --query id -o tsv)
az disk create -g $RG -n lab-data-restored \
  --source "$SNAP_ID" --sku Premium_LRS -l $LOC

6. Cleanup — delete everything.

az group delete -n $RG --yes --no-wait

Validation: az group exists -n $RG eventually returns false. If you only want to remove the disk artifacts: detach the disk (az vm disk detach -g $RG --vm-name $VM --name lab-data), then az disk delete/az snapshot delete.

Cost note (INR-aware): a 32 GiB Premium SSD (P4-class) is only a few hundred rupees per month if left running; for a lab measured in minutes the cost is a rupee or two. The two things that quietly cost money here are the incremental snapshot (cheap — only the delta — but real if you forget it) and the data disk staying provisioned after you stop the VM (a deallocated VM stops compute charges but disks keep billing). Deleting the resource group removes the VM, both disks, and the snapshot in one shot — always finish the lab with step 6.

Common mistakes & troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| VM not hitting the disk’s rated IOPS/throughput | VM size’s disk limit is below the disk’s, or VM doesn’t support Premium | Right-size the VM (Premium-capable, higher disk cap); check the VM-size disk limits, not just the disk. |
| Data corruption after a host failure on a data disk | ReadWrite (write-back) caching on a disk whose app assumed durable writes | Set caching to None (logs) or ReadOnly (read-heavy data); reserve ReadWrite for the OS disk. |
| “Disk size still old” after resize | Grew the managed disk but didn’t extend the in-guest filesystem | resize2fs/xfs_growfs (Linux) or Disk Management/Resize-Partition (Windows). |
| Can’t shrink a disk | Disks can only grow | Create a smaller disk and copy data; you cannot shrink in place. |
| Ultra/Premium v2 option greyed out | VM size/zone doesn’t support it, or Ultra compatibility is not enabled on the VM | Use a supported size in a supported zone; set --ultra-ssd-enabled at VM create, or on a deallocated VM with az vm update. |
| Surprise bill after deallocating VMs | Disks (and snapshots) bill even when the VM is stopped/deallocated | Delete unused disks/snapshots; deallocate stops compute, not storage. |
| Wrong device mounted after reboot/resize (Linux) | Mounted data disk by /dev/sdX, which can shift | Mount by UUID in /etc/fstab. |
| Shared disk filesystem corrupted | Two VMs wrote without cluster coordination | Shared disks need SCSI PR / cluster manager; caching must be None; or use Azure Files for shared files. |

Best practices

Security notes

Cost & sizing

The levers that move the disk bill, in order of impact:

Sizing heuristic: estimate peak IOPS and MB/s and steady-state capacity; on classic types pick the smallest tier that meets both performance and capacity; on v2/Ultra provision capacity for data and dial IOPS/MB/s to peak (plus headroom). Then confirm the VM size can actually deliver those numbers.
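For the classic types, the "smallest tier that meets both performance and capacity" step can be sketched as a table lookup. The helper name pick_premium_tier is hypothetical, and the tier figures (GiB, IOPS, MB/s) are an assumed excerpt of the published Premium SSD v1 table for P10 to P50; check them against current Azure documentation before relying on them.

```shell
# pick_premium_tier is a hypothetical helper.
# Arguments: required GiB, required IOPS, required MB/s.
# Prints the smallest tier in the embedded table that satisfies all three.
pick_premium_tier() {
  local need_gib=$1 need_iops=$2 need_mbps=$3
  local tier gib iops mbps
  # Columns: tier, capacity GiB, baseline IOPS, baseline MB/s
  # (assumed values; verify before use).
  while read -r tier gib iops mbps; do
    if [ "$gib" -ge "$need_gib" ] && [ "$iops" -ge "$need_iops" ] \
       && [ "$mbps" -ge "$need_mbps" ]; then
      echo "$tier"
      return 0
    fi
  done <<'EOF'
P10 128 500 100
P15 256 1100 125
P20 512 2300 150
P30 1024 5000 200
P40 2048 7500 250
P50 4096 7500 250
EOF
  echo "no tier in this table fits" >&2
  return 1
}

# 200 GiB of data needing 2000 IOPS lands on P20 (512 GiB):
# capacity is over-provisioned purely to buy IOPS.
pick_premium_tier 200 2000 100   # prints P20
```

When the answer is a tier far larger than the data, as in the example, that is usually the signal to price the same workload on Premium SSD v2 instead.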

Interview & exam questions

1. What’s the difference between a managed disk and an unmanaged disk, and why did managed win? Unmanaged disks are page blobs you place in storage accounts you manage yourself, with per-account IOPS caps and manual placement; managed disks are first-class ARM resources where Azure manages the backing storage, enforces provisioned performance, spreads availability-set disks across fault domains automatically, and supports RBAC/locks/Policy/tags. Managed is the default; unmanaged is deprecated.

2. Walk me through the five disk types and when you’d pick each. Standard HDD (cheapest, dev/test/backup, magnetic), Standard SSD (light prod/web, consistent-ish), Premium SSD v1 (production/DB, low ms latency, qualifies for the 99.9% single-VM SLA), Premium SSD v2 (best price/performance, sub-ms, independent IOPS/MB/s; the new default for most production), Ultra (extreme: up to 400k IOPS/10 GB/s, live-tunable, SAP HANA/top-tier SQL).

3. A single VM with no availability set or zone: what disks does it need for the best SLA, and what is that SLA? All OS and data disks must be Premium SSD or Ultra to get the 99.9% single-instance VM SLA. With Standard SSD the single-instance SLA drops to 99.5%, and with Standard HDD to 95%.

4. Explain host caching modes and when each is correct. None (no cache, all I/O straight through — logs/write-heavy, and forced on v2/Ultra), ReadOnly (cache reads, write-through — read-heavy DB data files), ReadWrite (write-back — the OS disk default; dangerous for data because the host can fail before cached writes flush). OS=ReadWrite, data-read-heavy=ReadOnly, logs=None.

5. Why can ReadWrite caching cause data loss? ReadWrite is write-back: the write is acknowledged from the host cache before reaching the durable disk. If the host fails before the cache flushes and the application believed the write was committed (e.g. a DB log), that data is lost — so logs must use None.

6. Difference between credit-based and on-demand bursting? Credit-based: free, automatic, accrues credits while idle and spends them to burst to a low ceiling (e.g. 3,500 IOPS) for a limited time — on Premium v1 (small) and Standard SSD. On-demand: paid, no duration limit, much higher ceiling, only on larger Premium v1 disks, billed per transaction above baseline.

7. How does disk size relate to performance, and how do Premium SSD v2/Ultra change that? On classic types (P/E/S), performance is fixed by the size tier — bigger disk = more IOPS/throughput. Premium SSD v2 and Ultra decouple capacity from performance: you provision GiB, IOPS, and MB/s independently and pay for each.

8. What is the temp disk and what’s the number-one rule about it? A local SSD on the host (D: on Windows, /mnt on Linux), free and fast, used for page files/tempdb/scratch. It is not persisted: it’s wiped on deallocate, resize, or host maintenance. Rule: never store durable data on it.

9. Full vs incremental snapshot — which and why? Incremental: stores only the delta since the last snapshot (cheaper), each is independently restorable, Azure manages the chain. Prefer incremental; use full only for legacy. For app-consistent backups use Azure Backup, not raw snapshots.

10. Name the encryption options and which one also encrypts the temp disk. SSE/PMK (default, free), SSE/CMK (your key via a Disk Encryption Set), encryption at host (the one that also covers the temp disk and host caches), Azure Disk Encryption (in-guest BitLocker/DM-Crypt), double encryption at rest, and confidential disk encryption (key sealed to the VM’s TEE).

11. What’s an ephemeral OS disk and its trade-off? The OS disk lives on the host’s local storage — free and very low latency — but not persisted: a deallocate/host failure/reimage wipes it, and you can’t snapshot or back it up. Ideal for stateless, reimage-on-the-fly fleets (VMSS/AKS); wrong for stateful single VMs.

12. How do you roll a corrupted VM back to a known-good OS state without rebuilding the VM? Create a new managed disk from a known-good OS snapshot, deallocate the VM, swap the OS disk (az vm update --os-disk), and start — the VM keeps its name, NICs, IP, size, and data disks.

Quick check

  1. Which disk type is required on all disks for a single VM (no AS/zone) to get the highest single-instance SLA, and what is that SLA?
  2. What host caching mode belongs on a database transaction log disk, and why?
  3. True/false: you can shrink a managed disk in place.
  4. Which encryption option also encrypts the temporary disk?
  5. You deallocate a VM to save money but the bill barely drops. Why?

Answers

  1. Premium SSD (or Ultra) on all OS and data disks → 99.9% single-instance SLA. Standard SSD lowers it to 99.5% and Standard HDD to 95%.
  2. None — a transaction log must be durable on every commit, and ReadWrite/write-back caching could lose acknowledged writes if the host fails before flush.
  3. False — disks can only grow; to shrink you create a smaller disk and copy data.
  4. Encryption at host (SSE PMK/CMK don’t cover the temp disk; ADE/in-guest also can, but encryption at host is the agentless platform answer).
  5. Deallocating stops compute charges, but managed disks (and snapshots) keep billing regardless of VM power state — you must delete unused disks to stop their cost.

Exercise

Take a workload you actually run (or invent a small e-commerce app: a web tier, a SQL Server, and a nightly analytics job). For each VM, produce a one-page disk plan that states, per disk: (a) the role (OS/data/temp), (b) the disk type and why, (c) the size and, for v2/Ultra, the provisioned IOPS and MB/s, (d) the host caching mode and the justification, (e) the encryption choice, and (f) the snapshot/backup cadence. Then write the az disk create / az vm disk attach commands that would build it, and estimate the monthly INR cost of the disks alone (capacity + any provisioned performance + snapshots). Bonus: identify one disk where Premium SSD v2 would be cheaper and faster than v1.

Certification mapping

AZ-104 (Azure Administrator):

AZ-305 (Azure Solutions Architect Expert):

Glossary

Next steps
