Your Azure bill is creeping up and you suspect waste, but you do not know where it is hiding. Somewhere in that subscription is a Standard_D8s_v5 VM sized for a launch that never came, a managed disk left behind when its VM was deleted, and an App Service plan with no apps still billing by the hour. Finding them by hand means clicking through hundreds of resources comparing CPU graphs — so the waste persists. Azure Advisor is the free, built-in service that does this hunt for you: it watches your resources, learns their real utilization, and produces a ranked list of concrete cost cuts on its Cost tab — resize this VM, shut down that one, delete this orphaned disk — each with an estimated monthly saving in your own currency.
The catch is that a list of recommendations is not money saved. Advisor tells you a VM is underutilized; it does not press the button. And some advice is wrong for your situation — a VM idle for seven days might be a disaster-recovery standby you must keep, or a batch box that only wakes at month-end. Acting well means understanding what Advisor actually measured, deciding resize versus shut down versus dismiss, and doing it safely. This article teaches all of that, end to end, in the portal and with az CLI and Bicep, so you turn the list into a lower invoice without breaking anything. It is the cheapest cost tool in Azure, and most teams never open it.
What problem this solves
Cloud waste is rarely one dramatic mistake. It is the steady accumulation of resources once right-sized and no longer: the VM provisioned three sizes too big “to be safe”, the test database left running over a weekend, the disk that outlived its VM, the plan emptied during a migration but never deleted. Each is individually small and invisible; together they can be 20-35% of an untended bill, and finding them by hand does not scale.
What breaks without Advisor is not an outage — it is money. Finance asks why the bill grew and the team has no answer, because the spend is spread across dozens of slightly-oversized resources that each look reasonable alone. The instinct is to guess — downsize something and hope — which either does nothing or causes an incident (downsized the busy one). Advisor replaces the guess with evidence: it has watched each resource for days and can tell you its 95th-percentile CPU never crossed a few percent.
It hits everyone running Azure beyond a trivial footprint, hardest on teams without a FinOps practice — dev/test sprawl, lift-and-shift migrations where on-prem VM sizes were copied verbatim, and “we’ll right-size it later” that became “we never did”. The fix is not a third-party tool — it is to open the Cost tab you already have and act on it on a schedule.
Here is what the Cost tab surfaces, what each recommendation means, and the first thing to check before acting:
| Recommendation class | What Advisor is saying | Typical resource | Verify before acting |
|---|---|---|---|
| Rightsize underutilized VM/VMSS | “Bigger than its real load” | Standard_D*/E* VM or scale set | Low usage by design (DR, batch, seasonal)? |
| Shut down underutilized VM/VMSS | “Barely used at all for 7 days” | A near-idle VM | Anything depend on it being up but quiet? |
| Delete unattached disk | “Attached to no VM” | microsoft.compute/disks | Data still needed? Snapshot first |
| Delete empty App Service plan | “Runs zero apps” | microsoft.web/serverfarms | A slot/app about to deploy onto it? |
| Buy a reservation / savings plan | “Steady usage qualifies for a discount” | Subscription-scoped | Workload stable for 1-3 years? |
| Idle Cosmos DB container | “No activity for 30 days” | microsoft.documentdb/databaseaccounts | Rarely-used but required? |
Learning objectives
By the end of this article you can:
- Open the Azure Advisor Cost tab and read a recommendation: the resource, the action (resize / shut down / delete), and the estimated saving.
- Explain what Advisor measured for a VM recommendation — the 7-day lookback, the CPU and outbound-network thresholds, and why memory is used for resize but not shutdown.
- Decide whether to resize, shut down (deallocate), delete, postpone, or dismiss — and why “low utilization by design” is a legitimate reason to dismiss.
- Action a VM rightsizing safely in the portal and az CLI, with a pre-resize checklist (allowed sizes, region/SKU availability) and post-resize validation.
- Find and safely delete an orphaned resource (an unattached disk) by snapshotting first, confirming, then deleting, without losing data you still need.
- Pull recommendations with az advisor recommendation list, and configure the VM rightsizing rule per subscription (CPU filter, lookback) so they match your environment.
- Understand the savings caveats (retail-rate estimates that ignore your existing reservations) so you never over-promise a number to finance.
Prerequisites & where this fits
You need an Azure subscription with a resource to experiment on (the lab spins up a tiny VM and a disk you will delete), and the Azure CLI or Cloud Shell (the >_ icon in the portal — zero local setup). Be comfortable with a resource group (a folder holding related resources) and a VM size/SKU (like Standard_B2s — a name encoding vCPUs and memory). No prior cost-tooling knowledge is assumed.
On permissions: reading recommendations needs only Reader; acting needs write on the resource (Contributor on the VM to resize, on the disk to delete). Changing Advisor’s configuration needs subscription rights — if you lack them, the portal greys the option out.
Where this sits: Advisor is the detection layer. Azure Cost Management for Beginners: Budgets, Alerts and Cost Analysis tells you how much you spend; Advisor tells you what to cut. An Azure Tagging Strategy for cost allocation attributes each recommendation to a team, and Azure FinOps: Cost Management at Scale wraps Advisor into a practice across subscriptions. Advisor is broader than cost (it has four other categories), but this article is strictly the Cost tab.
A quick map of where Advisor fits among the tools you already have:
| Tool | Question it answers | Relationship to Advisor |
|---|---|---|
| Cost Analysis | “How much did I spend, sliced how?” | Shows the spend Advisor helps cut |
| Budgets & alerts | “Tell me when I cross a line” | Independent; pairs well |
| Azure Advisor (Cost) | “What specifically should I cut?” | This article |
| Azure Policy | “Stop bad resources being created” | Enforces; Advisor recommends |
| Reservations / Savings Plans | “Commit for a discount” | Advisor recommends these |
Core concepts
A few mental models make every recommendation obvious to read.
Advisor is a recommendation engine, not an enforcer. It reads metrics and configuration and emits advice with an estimated impact, but never changes a resource — you act, or wire automation to. Recommendations span five categories — Cost, Security, Reliability, Operational excellence, Performance — and this is only the Cost one.
Underutilization is measured in percentiles, not a vibe. When Advisor calls a VM underutilized, it has sampled the VM’s metrics (every 30 seconds, aggregated to 30-minute buckets) over a lookback window (7 days by default) and found them under fixed thresholds. That is a statistical read over days, not a glance at one graph. Knowing those thresholds (below) tells you whether to trust each call.
Resize, shut down, and delete are different actions with different risk. A resize moves the VM to a cheaper SKU — the workload keeps running, but the VM reboots. A shutdown here means deallocate (stop and release the compute) — for a barely-used VM, offline until restarted. A delete (orphaned disk, empty plan) is irreversible, which is why you snapshot first. Picking the right action is the whole skill.
Stopping is not deallocating — and only one saves money. “Shut down” from inside the guest OS leaves the VM Stopped but still allocated, and compute keeps billing. Only Stopped (deallocated) — via az vm deallocate or the portal Stop button — stops the compute bill. Beginners shut down from inside Windows, the bill does not move, and they conclude Advisor lied — it did not, they stopped the wrong way.
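The stopped-versus-deallocated distinction can be written down as a tiny lookup. This is a minimal sketch, not an Azure API: `compute_is_billing` and the table name are hypothetical, and the strings are assumed to be the power-state display values that `az vm get-instance-view` reports.

```python
# Hypothetical helper: maps a VM's power-state display status to whether
# compute is still being billed. Only the three states discussed here are covered.
BILLING_BY_POWER_STATE = {
    "VM running": True,
    "VM stopped": True,        # shut down inside the guest OS, still allocated
    "VM deallocated": False,   # compute released, so compute billing stops
}


def compute_is_billing(display_status: str) -> bool:
    """Return True if the VM's compute is still billed in this power state."""
    try:
        return BILLING_BY_POWER_STATE[display_status]
    except KeyError:
        raise ValueError(f"unrecognised power state: {display_status!r}")
```

Note that this covers compute only; attached disks keep billing for their provisioned size even when the VM is deallocated.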
Savings figures are honest estimates with two asterisks. They use public retail (pay-as-you-go) rates and ignore any reservations or savings plans you own. With a reservation covering that VM the real saving may be smaller — or a cross-series resize of a reserved VM can even raise effective cost. Treat the number as an upper-bound signal, not a figure to quote to finance verbatim.
How Advisor decides a VM is underutilized
This section turns Advisor from a black box into a tool you trust. It produces two VM cost recommendations — shut down and resize — each with a rule you can use to sanity-check the advice in seconds.
The shutdown rule — “barely used at all”
A shutdown recommendation appears when a VM was almost entirely idle across the window. Advisor looks at CPU and Outbound Network only; memory is ignored, because Microsoft found those two sufficient to identify a truly idle box. All three triggers must hold (the lookback is 7 days by default, and the second check looks only at the most recent 3 days):
| Signal | Threshold for “shut down” | In plain English |
|---|---|---|
| P95 of max CPU (across all cores) | < 3% | Peak CPU is almost nothing |
| P100 of avg CPU, last 3 days (all cores) | ≤ 2% | Even the busiest moment averaged ~zero |
| Outbound Network utilization | < 2% over 7 days | Barely talking to anything |
A VM that clears all three is doing essentially nothing — a forgotten test box, a decommissioned service nobody turned off. The right action is deallocate (if you might need it) or delete (if you are sure); the stated saving is the full compute cost.
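To sanity-check a shutdown recommendation against your own metrics, the three triggers in the table can be expressed as a single predicate. This is a sketch of the rule as stated above, not Advisor's implementation; the `VmUtilization` type and its field names are illustrative assumptions, and you would have to compute the percentiles yourself from exported metrics.

```python
from dataclasses import dataclass


@dataclass
class VmUtilization:
    """Pre-computed utilization figures for one VM (all values in percent)."""
    p95_max_cpu_pct: float        # P95 of max CPU across all cores, over the lookback
    p100_avg_cpu_3d_pct: float    # P100 of average CPU over the last 3 days
    outbound_network_pct: float   # outbound network utilization over the lookback


def qualifies_for_shutdown(u: VmUtilization) -> bool:
    """True only if all three shutdown thresholds from the table hold."""
    return (
        u.p95_max_cpu_pct < 3.0
        and u.p100_avg_cpu_3d_pct <= 2.0
        and u.outbound_network_pct < 2.0
    )
```

For example, `qualifies_for_shutdown(VmUtilization(1.2, 0.8, 0.5))` is true, while a VM whose P95 CPU reaches 3% fails the first check and would not be flagged for shutdown.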
The resize rule — “right-sized smaller”
A resize is subtler: the VM is doing work, but a cheaper SKU would carry it comfortably. Advisor uses CPU, Memory, and Outbound Network — memory matters now, to fit the load onto less hardware without starving it — then finds a target SKU that keeps headroom:
| Headroom target on the recommended SKU | User-facing workload | Non-user-facing workload |
|---|---|---|
| P95 CPU and Outbound Network | ≤ 40% | ≤ 80% |
| P99 Memory | ≤ 60% | ≤ 80% |
User-facing workloads (which Advisor infers from the CPU pattern) get more headroom so a spike does not peg the smaller box; batch can run hotter. The candidate SKU must also match on Accelerated Networking and Premium Storage, be available in-region, and be cheaper. Advisor will cross family lines to save: the target can be in the same family (D4s_v5 → D2s_v5), a newer version (D3v2 → D2v3), a different family (D3v2 → E3v2), or a burstable B-series.
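The headroom table reduces to two limits per workload type. The sketch below assumes you have already projected the VM's utilization onto a candidate SKU (that projection is not shown, and Advisor's own method for it is not documented here); `fits_headroom` is a hypothetical name.

```python
def fits_headroom(
    p95_cpu_pct: float,
    p95_outbound_net_pct: float,
    p99_memory_pct: float,
    user_facing: bool,
) -> bool:
    """Check projected utilization on a candidate SKU against the headroom targets."""
    # User-facing workloads get more headroom than non-user-facing ones.
    cpu_net_limit = 40.0 if user_facing else 80.0
    memory_limit = 60.0 if user_facing else 80.0
    return (
        p95_cpu_pct <= cpu_net_limit
        and p95_outbound_net_pct <= cpu_net_limit
        and p99_memory_pct <= memory_limit
    )
```

The same projected load can pass as a batch workload and fail as a user-facing one: 55% P95 CPU is fine under the 80% limit but too hot under the 40% limit.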
When the resize target is a burstable SKU
The B-series is Advisor’s favourite target for the “low average, occasional spike” VM. B-series VMs run at a reduced baseline CPU and bank credits while idle, spending them to burst, which is much cheaper than a SKU sized for the peak. Advisor recommends one only when the average CPU is under the burstable SKU’s baseline, the P95 CPU is under twice that baseline, the current SKU does not use Accelerated Networking (B-series lacks it), and the banked credits would cover your spikes. That credit check is why a burstable recommendation is usually trustworthy. See Azure VM Series & Families explained (D, E, F, L, N, M) for the family map.
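The four burstable conditions combine with a plain logical AND. A minimal sketch follows, with the caveat that the baseline percentage depends on the specific B-series size and the credit check is treated here as a boolean you supply; both the function name and parameters are illustrative.

```python
def burstable_is_candidate(
    avg_cpu_pct: float,
    p95_cpu_pct: float,
    baseline_pct: float,
    uses_accelerated_networking: bool,
    credits_cover_spikes: bool,
) -> bool:
    """Apply the four conditions for recommending a burstable target SKU."""
    return (
        avg_cpu_pct < baseline_pct              # average sits under the baseline
        and p95_cpu_pct < 2 * baseline_pct      # spikes stay under twice the baseline
        and not uses_accelerated_networking     # current SKU must not rely on it
        and credits_cover_spikes                # banked credits are sufficient
    )
```

With an assumed 20% baseline, a VM averaging 8% CPU with a 30% P95 qualifies, while the same VM with Accelerated Networking enabled does not.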
The lookback window — and why 7 days can mislead
By default Advisor analyzes the last 7 days — fine for steady workloads, misleading for periodic ones: a payroll VM idle for 27 days and slammed on the 28th looks “shut me down” all month, and acting would be a disaster. Widen the lookback so Advisor sees a full cycle. Available windows are 7, 14, 21, 30, 60, or 90 days; recommendations refresh within about 48 hours. For any monthly, seasonal, or batch workload, push it to 30 or 90 days before acting on a shutdown — though a longer window can also hide a recently idled VM, so it is a trade-off.
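One way to choose a window is to take the smallest available lookback that spans a full workload cycle. The sketch below encodes that rule of thumb; `pick_lookback` is a hypothetical helper, and capping at 90 days for longer cycles is an assumption of this example, not Advisor behaviour.

```python
LOOKBACK_OPTIONS_DAYS = (7, 14, 21, 30, 60, 90)


def pick_lookback(cycle_days: int) -> int:
    """Return the smallest available lookback window covering one workload cycle."""
    if cycle_days <= 0:
        raise ValueError("cycle_days must be positive")
    for window in LOOKBACK_OPTIONS_DAYS:
        if window >= cycle_days:
            return window
    # Cycles longer than 90 days cannot be fully covered; fall back to the maximum.
    return LOOKBACK_OPTIONS_DAYS[-1]
```

A payroll job on a 28-day cycle maps to the 30-day window; a quarterly job longer than 90 days is never fully visible, which is a reason to dismiss rather than trust a shutdown recommendation for it.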
Idle and orphaned resources Advisor finds
Beyond VMs, the Cost tab surfaces pure waste — resources that cost money while doing nothing, the easiest wins because there is rarely a reason to keep them.
Unattached managed disks are the classic. When you delete a VM, its disks are not always deleted with it; the disk lingers, attached to no VM, billing for its full provisioned size — an orphaned 1 TB Premium SSD is real money for zero value. Advisor flags these under “Review disks that aren’t attached to a VM”, but deletion is irreversible, so the pattern is snapshot first, confirm, then delete. Azure VM Disk Types explained (Standard, Premium, Ultra) covers what each costs.
Empty App Service plans are next. A plan is the VMs you rent to run web apps, billed whether or not any app runs; during a migration teams delete the apps but forget the plan. Advisor flags “Unused/Empty App Service plan” — delete it after confirming no app or slot is about to land on it (pricing in Azure App Service Plans & Tiers explained).
Reservations and savings plans are a different shape: for steady usage Advisor suggests a 1- or 3-year commitment that discounts pay-as-you-go. It does not shrink anything, it asks you to pre-pay — so commit only for workloads stable for the term, and right-size first, or you lock in the waste.
Here is how the main idle/commitment recommendations differ and what to check before acting:
| Recommendation | What it means | The action | Reversible? | Verify first |
|---|---|---|---|---|
| Unattached disk | Bound to no VM | Snapshot, then delete | No | Data truly unneeded? |
| Empty App Service plan | Zero apps | Delete the plan | No | No app/slot about to deploy? |
| Idle Cosmos DB container | No activity 30 days | Lower throughput or delete | Throughput yes; delete no | Rarely-used but required? |
| Buy a reservation | Steady usage qualifies | Commit 1 or 3 years | Limited | Workload stable for the term? |
| Buy a savings plan | Steady compute spend | Commit hourly $ for 1-3 yrs | Limited | Right-sized first? |
Acting on a recommendation: resize vs shut down vs dismiss
Reading a recommendation is half the job; choosing the response is the other half. Every recommendation offers the same responses, and picking the right one separates a saved rupee from an incident.
| Response | What it does | Use it when | Caution |
|---|---|---|---|
| Resize | Move VM to a cheaper SKU | Works but is oversized | Reboots; verify allowed sizes |
| Shut down (deallocate) | Stop + release compute | Near-idle | Offline until restarted |
| Delete | Remove the resource | Orphaned disk / empty plan | Irreversible — snapshot first |
| Postpone | Hide for a period | “Revisit in 30 days” | Comes back; not a fix |
| Dismiss | Hide indefinitely | Low utilization by design | You stop seeing a real cost too |
Dismiss is a feature, not a cop-out — for the genuine reasons Microsoft lists: the VM is pre-provisioned for upcoming traffic, relies on metrics Advisor doesn’t see (GPU, local IO), is kept on a SKU for testing, must stay homogeneous with the fleet, or is a disaster-recovery standby that must stay idle-but-ready. There, dismissing is correct. What you must not do is dismiss because resizing is inconvenient — that is how waste becomes permanent.
The whole decision flow:
- Is the low usage by design? Yes → dismiss with a note.
- If not, is the VM doing real work? Essentially none → deallocate (delete if it is dead).
- Some work, on too big a box → resize in a maintenance window.
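The decision flow above is small enough to encode directly, which is handy if you script a monthly review. This is a sketch under assumed inputs: the `workload` labels ("none", "some") and the `confirmed_dead` flag are illustrative, and a human still has to supply them.

```python
def choose_response(
    low_usage_by_design: bool,
    workload: str,
    confirmed_dead: bool = False,
) -> str:
    """Map the review answers for one VM recommendation to a response."""
    if low_usage_by_design:
        return "dismiss with a note"
    if workload == "none":
        # Deallocate is recoverable; delete only when the VM is confirmed dead.
        return "delete" if confirmed_dead else "deallocate"
    if workload == "some":
        return "resize in a maintenance window"
    raise ValueError(f"unknown workload label: {workload!r}")
```

A DR standby returns "dismiss with a note" regardless of its workload label, because the by-design check comes first.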
Architecture at a glance
Hold this mental model and every step in the lab makes sense. Three layers. At the bottom, your live resources — VMs, scale sets, disks, plans — emit platform metrics (CPU, memory, network) plus configuration, like whether a disk is attached to anything, sampled every 30 seconds. In the middle sits Advisor’s analysis engine: on a recurring cadence it reads each resource’s metrics across the lookback window, runs the percentile rules above, cross-references retail pricing, and writes recommendation objects — one per qualifying resource, carrying the resource ID, action, and estimated saving. This engine is read-only; it observes and advises, never touching your resources.
At the top is where you live — the Cost tab, az advisor recommendation list, or the REST/ARM API. You read the recommendations and take the action: a resize, deallocate, disk delete, reservation purchase. The arrow from Advisor to your resources is not automatic — Advisor proposes, you dispose. Configuration changes (lookback, CPU filter) flow back into the engine and change next cycle’s output. The loop is: resources emit → Advisor analyzes → you act → resources change → next cycle re-measures. Your job is to close that loop on a schedule instead of letting recommendations pile up unread.
Real-world scenario
Northwind Retail runs a mid-sized e-commerce platform on Azure in Central India. Eighteen months of “ship fast, clean up later” left a production subscription with 140-odd resources and a bill drifting from ₹6.8 lakh to ₹9.2 lakh with no obvious cause — no new product, no traffic surge. Finance asked the platform lead, Anjali, where is the extra ₹2.4 lakh going? She did not know either.
She opened the Advisor Cost tab for the first time: 31 cost recommendations, combined estimated saving about ₹1.9 lakh/month. The big three: nine VMs flagged for resize (mostly Standard_D8s_v5 nodes copied from an over-provisioned on-prem spec, at 6-9% P95 CPU), four for shutdown (old staging boxes untouched for weeks), and eleven unattached disks from a Kubernetes migration, including two 2 TB Premium SSDs at ~₹18,000/month each.
She did not act blindly. For the four “shutdown” VMs she first widened the lookback to 30 days — confirming they were dead, not batch boxes — then deallocated three and deleted one. Of the nine resize candidates, seven were genuinely oversized and got resized (D8s_v5 → D4s_v5, two to B-series), but two she dismissed — the Black Friday buffer, pre-scaled for a sale six weeks out, the textbook “provisioned for upcoming traffic” case. For the eleven orphaned disks she snapshotted the two large ones, confirmed the rest were dead, and deleted all eleven.
Over the next cycle the bill dropped from ₹9.2 lakh to about ₹7.4 lakh — roughly ₹1.8 lakh/month recovered, close to Advisor’s estimate but not identical, because three resized VMs were covered by an existing reservation, so their real saving was smaller than the retail-rate figure (the caveat to expect). Anjali set the rightsizing CPU filter to 10% and put a monthly “review Advisor” reminder on her calendar. The lesson: the waste had been visible the whole time in a free tool nobody had opened, and acting carefully — not blindly — turned a list into ₹21 lakh a year.
Advantages and disadvantages
| Advantages | Disadvantages |
|---|---|
| Free, built in — ships with every subscription | Read-only — you must act (or build automation) |
| Evidence-based — real multi-day percentiles | Default 7-day lookback misleads on periodic workloads |
| Estimates a concrete saving per recommendation | Savings are retail-rate, ignore reservations — can overstate |
| Covers the big wins: oversized VMs, orphaned disks, empty plans | Doesn’t see every resource type or form of waste |
| Tunable per subscription (CPU filter, lookback) | Acting blindly can cause outages (DR, spike buffers) |
| Available via portal, CLI, PowerShell, REST/ARM, workbook | Refresh lags config changes by 24-48h |
These matter differently by maturity. For a small team starting on cost discipline the free, evidence-based, concrete-saving advantages are transformative — “no idea” to a ranked action list in five minutes. As you scale, the disadvantages become things you engineer around: widen lookbacks for batch fleets, reconcile savings against reservations, and treat the list as a well-informed first draft, not gospel. Advisor is excellent at finding waste and only as good as your judgment at acting on it.
Hands-on lab
This is the heart of the article. You will create a wasteful pair of resources, then action a rightsizing and an idle-disk cleanup safely — in the portal and az CLI — plus a Bicep snapshot pattern, with validation at each step and a full teardown. It is free-tier-friendly: a tiny B1s VM and a small Standard disk you delete.
Caveat: Advisor needs days of telemetry before it generates a real rightsizing recommendation, so the lab won’t produce one in ten minutes. Instead you (a) read whatever recommendations exist, (b) perform the exact actions one asks for with the right safety checks, and (c) configure the rules. That skill is the point; the recommendation is just the trigger.
Step 0 — Prerequisites and variables
Open Cloud Shell (Bash) or a local terminal with the Azure CLI, confirm you are logged in, and set variables.
az account show --query "{sub:name, id:id}" -o table # confirm the right subscription
az upgrade --yes 2>/dev/null # ensure a recent CLI (optional)
RG=rg-advisor-lab
LOC=centralindia
VM=vm-advisor-demo
Expected output: your subscription name and ID. If it errors with “Please run az login”, run az login first.
Step 1 — Create a resource group and a small VM
az group create --name $RG --location $LOC -o table
az vm create \
--resource-group $RG \
--name $VM \
--image Ubuntu2204 \
--size Standard_B1s \
--admin-username azureuser \
--generate-ssh-keys \
--public-ip-sku Standard \
-o table
Expected output: az group create returns "provisioningState": "Succeeded". az vm create takes 1-2 minutes and prints JSON with a publicIpAddress and "powerState": "VM running". You now have a running VM — and, for later, an OS managed disk attached to it.
Validate the VM and capture its current size:
az vm show -g $RG -n $VM --query "{name:name, size:hardwareProfile.vmSize, state:provisioningState}" -o table
az vm get-instance-view -g $RG -n $VM --query "instanceView.statuses[?starts_with(code,'PowerState')].displayStatus" -o tsv
Expect size Standard_B1s and power state VM running.
Step 2 — Read existing Advisor cost recommendations (portal)
- In the portal search bar, type Advisor and open it; in the left menu select Cost.
- Each row shows the recommendation (e.g. Right-size or shutdown underutilized virtual machines), the impact (High/Medium/Low), the resource(s), and a potential savings figure.
- Click one to open it — note the affected resources, the recommended action per resource (resize to a target SKU, or shut down), and the Postpone/Dismiss buttons.
Expected result: you can identify, for one resource, the exact action proposed and the money attached. A brand-new or tiny subscription may show no cost recommendations yet — normal, not an error.
Step 3 — Read the same recommendations via az CLI
The CLI is how you script reviews. Refresh the results and list Cost in one call:
# The 'advisor' commands are part of the core CLI; --refresh asks Advisor to generate fresh results
# List only Cost-category recommendations, projected to the useful fields
az advisor recommendation list --category Cost --refresh \
--query "[].{resource:impactedValue, problem:shortDescription.problem, impact:impact}" -o table
Expected output: a table of cost recommendations (empty if you have none). impactedValue is the resource, shortDescription.problem the summary (e.g. “Right-size or shutdown underutilized virtual machines”), impact is High/Medium/Low.
To see the estimated savings and target, drill into one recommendation’s raw object:
az advisor recommendation list --category Cost \
--query "[0].extendedProperties" -o json
Expected output: a JSON bag of properties — for a rightsizing recommendation, fields like savingsAmount, savingsCurrency, annualSavingsAmount, targetSku, and regionId (the same numbers the portal shows). If you have zero cost recommendations it returns null — skip to Step 4.
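If you save the CLI output as JSON, totalling the estimates takes a few lines. The sketch below assumes the field names mentioned above (`impactedValue`, `extendedProperties.savingsAmount`, `savingsCurrency`) and that amounts may arrive as strings; the sample payload is invented for illustration and is not real Advisor output.

```python
import json

# Invented sample shaped like `az advisor recommendation list -o json` output.
SAMPLE_CLI_OUTPUT = """
[
  {"impactedValue": "vm-a",
   "extendedProperties": {"savingsAmount": "1200.50", "savingsCurrency": "INR"}},
  {"impactedValue": "disk-b",
   "extendedProperties": {"savingsAmount": "300", "savingsCurrency": "INR"}},
  {"impactedValue": "plan-c", "extendedProperties": null}
]
"""


def total_savings_by_currency(cli_json: str) -> dict:
    """Sum savingsAmount per currency, skipping recommendations without one."""
    totals = {}
    for rec in json.loads(cli_json):
        props = rec.get("extendedProperties") or {}
        amount = props.get("savingsAmount")
        if amount is None:
            continue
        currency = props.get("savingsCurrency", "unknown")
        totals[currency] = totals.get(currency, 0.0) + float(amount)
    return totals
```

Treat the total as an upper bound: it inherits the retail-rate caveat of every individual estimate.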
Step 4 — Action a rightsizing safely (the pre-flight checks)
Suppose a recommendation says resize vm-advisor-demo to a smaller SKU. Before any resize, run the checklist — a resize reboots the VM, and not every SKU is on the current cluster.
4a. Check sizes the VM can move to in place:
az vm list-vm-resize-options -g $RG -n $VM --query "[].name" -o table
Expected output: SKUs the VM can resize to in place. If your target is not listed, you must deallocate first — a bigger maintenance action.
4b. Confirm the target SKU isn’t region-restricted:
az vm list-skus --location $LOC --size Standard_B1 --query "[].{name:name, restrictions:restrictions[].reasonCode}" -o table
Expected output: the region’s B-series sizes with an empty [] restrictions column for available ones. A NotAvailableForSubscription reason means pick another SKU.
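The two pre-flight checks combine into a three-way outcome. A minimal sketch, assuming you have collected the in-place resize options from 4a and the set of restricted SKU names from 4b; `plan_resize` and its return strings are hypothetical.

```python
def plan_resize(target_sku: str, in_place_options: set, restricted_skus: set) -> str:
    """Decide how to proceed with a resize given the two pre-flight results."""
    if target_sku in restricted_skus:
        # Region or subscription restriction: no resize path will work.
        return "pick another SKU"
    if target_sku in in_place_options:
        return "resize in place"
    # Not offered on the current cluster, but not restricted either.
    return "deallocate first, then resize"
```

Checking the restriction first matters: a restricted SKU stays unavailable even after deallocating, so it should never fall through to the deallocate path.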
Step 5 — Perform the resize (CLI and portal)
For the lab, resize to Standard_B1ms purely to exercise the operation (in production you resize down per the recommendation).
# This REBOOTS the VM. In production, do it in a maintenance window.
az vm resize -g $RG -n $VM --size Standard_B1ms -o table
Expected output: runs ~30-90 seconds and returns VM JSON with "vmSize": "Standard_B1ms"; the VM restarts as part of it.
Portal equivalent: VM → Size → pick the new size → Resize; the portal warns it will restart the VM.
Validate the new size and a healthy restart:
az vm show -g $RG -n $VM --query "hardwareProfile.vmSize" -o tsv # → Standard_B1ms
az vm get-instance-view -g $RG -n $VM \
--query "instanceView.statuses[?starts_with(code,'PowerState')].displayStatus" -o tsv # → VM running
Both must reflect the change before you call it done; in production you’d then watch the app’s metrics for a day to confirm the smaller SKU carries the load.
Step 6 — Create and clean up an idle disk (the safe delete pattern)
Now the idle-resource workflow: create an empty managed disk attached to nothing — the orphan Advisor flags — and delete it safely.
DISK=disk-orphan-demo
az disk create -g $RG -n $DISK --size-gb 8 --sku Standard_LRS -o table
Expected output: a disk with "diskState": "Unattached" — attached to no VM, it bills for its 8 GB every month for zero value, a miniature of the real problem.
Confirm it is unattached — never delete a disk showing Attached:
az disk show -g $RG -n $DISK --query "{name:name, state:diskState, managedBy:managedBy}" -o table
Expected output: diskState = Unattached, managedBy = blank/null. A populated managedBy means a VM owns it — stop and investigate.
Snapshot first — the non-negotiable step before deleting any disk that might hold data:
az snapshot create -g $RG -n ${DISK}-snap --source $DISK --sku Standard_LRS -o table
Expected output: a snapshot with "provisioningState": "Succeeded". It costs pennies and is your insurance — for a real data disk, it lets you delete with confidence.
Now delete the orphaned disk and confirm it is gone:
az disk delete -g $RG -n $DISK --yes -o table
az disk list -g $RG --query "[].name" -o table # disk-orphan-demo should be absent
Portal equivalent: Disks → select the unattached disk → confirm Disk state: Unattached → optionally Create snapshot → Delete.
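The safe-delete pattern in this step can be captured as a guard that a cleanup script runs before calling delete. This sketch assumes `disk` is the parsed JSON from `az disk show` and that you track snapshot success yourself; `safe_to_delete` is a hypothetical name.

```python
def safe_to_delete(disk: dict, snapshot_succeeded: bool) -> bool:
    """Allow deletion only for an unattached, unowned disk with a good snapshot."""
    unattached = disk.get("diskState") == "Unattached" and not disk.get("managedBy")
    return unattached and snapshot_succeeded
```

Any disk with a populated `managedBy` fails the guard even if its state string looks right, which matches the "stop and investigate" rule above.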
Step 7 — Bicep: the snapshot-before-delete pattern as code
For repeatable cleanups, express the safe half — the snapshot — as code, then delete the source disk via pipeline. Save as snapshot.bicep:
@description('Resource ID of the unattached disk to snapshot before deletion')
param sourceDiskId string

@description('Location for the snapshot')
param location string = resourceGroup().location

resource snap 'Microsoft.Compute/snapshots@2023-10-02' = {
  name: 'orphan-disk-snapshot'
  location: location
  sku: {
    name: 'Standard_LRS' // cheap, redundant-enough for a safety snapshot
  }
  properties: {
    creationData: {
      createOption: 'Copy'
      sourceResourceId: sourceDiskId
    }
  }
}

output snapshotId string = snap.id
Deploy it, passing the disk ID to protect:
DISK_ID=$(az disk show -g $RG -n some-orphan-disk --query id -o tsv)
az deployment group create -g $RG -f snapshot.bicep -p sourceDiskId="$DISK_ID" -o table
Expected output: "provisioningState": "Succeeded" and a snapshotId. A pipeline can now safely az disk delete the source. (Skip the deploy if you already removed the demo disk in Step 6 — this shows the production pattern.)
Step 8 — Configure Advisor’s VM rightsizing rule (portal)
Tune Advisor to stop flagging boxes you keep on purpose.
- Open Advisor → Configuration (it opens on the Resources tab).
- Select the VM/Virtual Machine Scale Sets right sizing tab.
- Tick the subscription(s) to tune, then click Edit.
- Set the Average CPU utilization filter — e.g. 10% — so only VMs under it are surfaced, and Apply.
Expected result: a saved-settings confirmation; the page notes it can take up to 24 hours to reflect. This filter changes which recommendations you see, not how they are computed. To change the lookback window (e.g. 30 days for a batch-heavy subscription), use the same settings; recommendations refresh within about 48 hours.
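The effect of the CPU filter can be pictured as a simple threshold over the VMs Advisor has already evaluated. This is a mental-model sketch only: the fleet data and the `surfaced_by_cpu_filter` helper are invented, and Advisor's actual filtering is not exposed as code.

```python
def surfaced_by_cpu_filter(vm_avg_cpu: dict, threshold_pct: float) -> list:
    """Return the names of VMs whose average CPU is under the filter threshold."""
    return sorted(name for name, cpu in vm_avg_cpu.items() if cpu < threshold_pct)


# Invented fleet: VM name -> average CPU utilization in percent.
fleet = {"vm-web": 22.0, "vm-staging": 4.0, "vm-batch": 9.5}
```

With this fleet, a 10% filter surfaces `vm-batch` and `vm-staging`, while tightening to 5% leaves only `vm-staging`.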
Step 9 — Teardown
Remove everything so the lab costs nothing ongoing. Deleting the resource group takes the VM, disks, public IP, NIC, and any leftover snapshot.
az group delete --name $RG --yes --no-wait
Expected output: the command returns immediately (--no-wait) and the group deletes in the background. Confirm after a minute or two:
az group exists --name $RG # → false once teardown completes
Validate no stragglers remain: in Cost Analysis, filter to rg-advisor-lab over the last day — it should trend to zero. Delete any snapshot outside this group with az snapshot delete.
You have now read recommendations two ways, performed the resize and idle-cleanup with pre- and post-flight checks, captured the snapshot pattern in Bicep, tuned the rules, and cleaned up — the loop you repeat on real recommendations every month.
Common mistakes & troubleshooting
The failure modes that turn a cost-cleanup into an incident or a wasted afternoon. Symptom → root cause → confirm → fix.
| # | Symptom | Root cause | Confirm (exact check) | Fix |
|---|---|---|---|---|
| 1 | Shut down VM, bill did not drop | Stopped but still allocated (stopped from inside the OS) | get-instance-view shows VM stopped, not VM deallocated | az vm deallocate -g <rg> -n <vm> |
| 2 | Resized per Advisor, app erroring under load | Acted without checking it was a spike buffer / DR box | App metrics post-resize; was low usage by design? | Resize back up; dismiss with a note |
| 3 | Deleted a disk, lost needed data | Deleted an orphaned disk without snapshotting first | The disk is gone; no snapshot exists | Restore from snapshot if any; always snapshot first |
| 4 | Monthly batch VM flagged “shut down” | 7-day lookback misses the month-end spike | VM idle 27/30 days | Widen lookback to 30/90 days |
| 5 | Saving did not materialize fully | Estimate is retail-rate, ignores your reservation | Active RI/savings plan covers the VM | Reconcile against reservations before reporting |
| 6 | No cost recommendations at all | Subscription too new/small, or scope filtered | Reader on wrong scope; resources <7 days old | Wait for data; check the subscription filter |
| 7 | Resize fails: “size not available” | Target SKU not on the VM’s cluster/region | Absent from az vm list-vm-resize-options | Deallocate first, or pick an available SKU |
| 8 | Can’t change Advisor rules — greyed out | Insufficient permissions on the subscription | Configuration control is disabled | Get the required subscription role, retry |
| 9 | Recommendation reappears after “fixing” | You postponed instead of acting | Returns after the postpone window | Action it properly, or dismiss if intentional |
| 10 | Deleted an “empty” plan, a deploy broke | A slot/app was about to deploy onto it | Pipeline targets the deleted plan | Recreate; verify zero intended apps first |
The biggest is #1 — shutting a VM down the wrong way and concluding cost tooling is broken; only Stopped (deallocated) stops the compute bill. The next is #2/#3 — acting without context or a snapshot. Advisor gives the signal; you supply the judgment and the safety net.
Best practices
- Treat Advisor as a scheduled ritual, not a one-off. Hold a recurring monthly review; waste re-accumulates, so a single cleanup decays.
- Always snapshot before you delete a disk or anything stateful — a pennies-a-month snapshot is cheap insurance against an irreversible mistake.
- Widen the lookback for periodic workloads before acting on shutdowns. The default 7-day window will tell you to kill your month-end batch VM; use 30 or 90 days for seasonal workloads.
- Right-size before you reserve, and reconcile savings against reservations. A reservation on an oversized VM locks in waste, and Advisor’s retail-rate figure ignores commitments, so treat it as an upper bound, not a number to quote finance verbatim.
- Resize in a maintenance window. A resize reboots the VM; schedule it rather than surprising production traffic.
- Dismiss with a recorded reason, and tune the CPU filter. When low utilization is by design (DR, spike buffer, batch), dismiss and note why; set the per-subscription average-CPU filter (5-10%) so Advisor stops surfacing already-reasonable VMs.
- Prefer deallocate over delete when unsure. A deallocated VM stops billing compute but is recoverable; deletion is not. Deallocate, observe, then delete.
- Tag resources so recommendations are attributable. A tagging strategy maps each recommendation to an owning team who can confirm the box is really idle.
- Combine prevention with detection. Advisor finds existing waste; Azure Policy effects (deny/audit/modify) stop new oversized or untagged resources at creation.
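The snapshot-before-delete rule is easy to skip under time pressure, so it helps to make it the only path. The following is a minimal sketch, assuming the Azure CLI is installed and logged in and that you have already confirmed the data is unneeded. The function name and the "-pre-delete" snapshot suffix are illustrative choices, and the guard checks only diskState.

```shell
# Snapshot an unattached managed disk, then delete it.
# $1 = resource group, $2 = disk name (placeholders you supply).
# Refuses to act unless diskState is exactly "Unattached", and never
# deletes if the snapshot step fails.
snapshot_then_delete_disk() {
  local rg="$1" disk="$2" state
  state=$(az disk show --resource-group "$rg" --name "$disk" \
            --query diskState --output tsv) || return 1
  if [ "$state" != "Unattached" ]; then
    echo "refusing: $disk is '$state', not Unattached" >&2
    return 1
  fi
  # The "-pre-delete" suffix is just a naming convention chosen for this sketch.
  az snapshot create --resource-group "$rg" --name "${disk}-pre-delete" \
    --source "$disk" --output none || return 1
  az disk delete --resource-group "$rg" --name "$disk" --yes
}
```

Because the delete runs only after the snapshot command succeeds, a failed snapshot leaves the disk untouched. Remember to remove the snapshot itself once the data is confirmed unneeded.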
Security notes
Cost cleanup touches real resources, so apply least-privilege discipline:
- Reading vs acting are different privileges. Grant reviewers Reader so they see Advisor without the power to change anything; reserve Contributor (or a tighter custom role) for those resizing and deleting.
- Deletion is destructive, so gate it. Protect critical resources with resource locks (CanNotDelete) so a hasty cleanup can’t remove something load-bearing; see Azure Resource Locks to prevent accidental deletion.
- Snapshots inherit the data’s sensitivity. A snapshot of a data disk contains that data, so store it with the same controls, and delete it once the original is confirmed unneeded, not as a lingering unmanaged copy.
- Scope automation tightly. Automation that acts on recommendations should get the minimum role on the minimum scope — Contributor on one resource group, not Owner on the subscription.
- Audit who acted. Resizes, deallocations, and deletions land in the Activity Log — keep it flowing to a Log Analytics workspace so every cost action is attributable.
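The role and lock controls above can be expressed as three small wrappers. This is a sketch under assumptions: the Azure CLI is installed and logged in, the function names and the "keep-" lock-name prefix are invented for illustration, and the built-in Reader and Contributor roles stand in for whatever tighter custom role you actually use.

```shell
# Give a reviewer read-only access, enough to see Advisor recommendations.
# $1 = user, group, or service principal; $2 = subscription ID.
grant_advisor_reviewer() {
  az role assignment create --assignee "$1" --role Reader \
    --scope "/subscriptions/$2"
}

# Put a CanNotDelete lock on a resource group so cleanup cannot remove it.
# $1 = resource group; $2 = free-text note explaining why it is protected.
# The "keep-" lock-name prefix is a convention chosen for this sketch.
lock_resource_group() {
  az lock create --name "keep-$1" --lock-type CanNotDelete \
    --resource-group "$1" --notes "$2"
}

# Scope cleanup automation to one resource group, not the subscription.
# $1 = automation identity; $2 = subscription ID; $3 = resource group.
grant_cleanup_automation() {
  az role assignment create --assignee "$1" --role Contributor \
    --scope "/subscriptions/$2/resourceGroups/$3"
}
```

Keeping the scope string inside the function makes it harder to grant subscription-wide rights by accident: the automation wrapper cannot produce anything broader than a single resource group.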
Cost & sizing
Advisor itself is free; the cost story is what it helps you save, minus the trivial cost of safety. Rightsizing saves the price difference between SKUs (D8s_v5 → D4s_v5 roughly halves the compute; a B-series move saves more for spiky-but-low load). Deallocating a near-idle VM saves its entire compute cost — but you still pay for its disks until you delete them. Deleting an orphaned disk saves its full provisioned charge (Premium SSDs bill on provisioned, not used, size); deleting an empty plan saves its tier rate. The one cost you add — a snapshot — is a few rupees a month, worth it every time.
A rough picture of typical monthly savings per action (INR; varies by SKU, region, discounts):
| Action | What you save | Rough monthly saving (INR) | Watch-out |
|---|---|---|---|
| Resize D8s_v5 → D4s_v5 | Half the VM compute | ~₹15,000-20,000/VM | Reboots; verify load fits |
| Resize to B-series burstable | Big drop for spiky-low load | ~₹10,000-25,000/VM | Only if credits cover spikes |
| Deallocate a near-idle VM | Full VM compute | ~₹8,000-40,000/VM | Disks still bill until deleted |
| Delete 1 TB orphaned Premium disk | Full disk cost | ~₹9,000-10,000/disk | Snapshot first (pennies) |
| Delete empty App Service plan | Full plan tier rate | ~₹4,000-15,000/plan | Confirm zero apps/slots |
The headline: do not pay Advisor — let it pay you. The biggest mistake is leaving the recommendations unread; even acting on just the orphaned disks and empty plans (the zero-risk wins) pays back an engineer’s afternoon many times over.
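Before reporting a figure to finance, it is worth doing the arithmetic explicitly. The sketch below is a deliberately simple model, not Advisor's own calculation: it assumes a commitment covers a fixed percentage of the resource's usage and that the covered share yields no saving. Function names are illustrative, and amounts are whole rupees because shell arithmetic is integer-only.

```shell
# Scale an Advisor retail-rate estimate by the share of the resource's
# usage that a reservation or savings plan already covers.
# $1 = Advisor estimate (whole INR), $2 = covered share in percent (0-100).
realized_saving() {
  echo $(( $1 * (100 - $2) / 100 ))
}

# Net monthly saving from deleting an orphaned disk after snapshotting it.
# $1 = disk monthly cost, $2 = snapshot monthly cost (whole INR).
net_disk_saving() {
  echo $(( $1 - $2 ))
}
```

For example, `realized_saving 50000 40` prints 30000: an estimate of ₹50,000 with 40% of the usage already covered by a commitment leaves ₹30,000 of real saving under this model.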
Interview & exam questions
1. What is Azure Advisor, and which category does cost optimization live under? A free, built-in service that analyzes resources and recommends improvements across five categories — Cost, Security, Reliability, Operational excellence, Performance. Cost recommendations live on the Cost tab.
2. What does Advisor measure to call a VM “underutilized”, and over what window? For shutdown, CPU and Outbound Network over a default 7-day lookback (roughly P95 max CPU < 3%, network < 2%); for resize it adds Memory and targets headroom (P95 CPU ≤ 40% on the new SKU for user-facing). Lookback is configurable to 7/14/21/30/60/90 days.
3. A VM is idle 27 days a month, busy on day 28. Default recommendation, and the fix? The default 7-day lookback recommends shutdown — wrongly, since it can’t see the month-end spike. Widen the lookback to 30 or 90 days; do not act on the 7-day view.
4. “Stopped” vs “Stopped (deallocated)”, and why it matters? Stopped from inside the OS is Stopped but still allocated — compute keeps billing. Stopped (deallocated), via az vm deallocate or the portal Stop button, releases compute so billing stops. A “shut down” recommendation means deallocate.
5. Advisor says ₹50,000/month but the bill drops ₹30,000. Why? Savings use retail rates and ignore reservations/savings plans you own; if a commitment covered that VM the real saving is smaller. Reconcile before reporting.
6. When is it correct to dismiss a rightsizing recommendation? When low utilization is by design: pre-provisioned for upcoming traffic, a DR standby, using metrics Advisor doesn’t see (GPU, local IO), kept on a SKU for testing, or needing homogeneous fleet SKUs. Dismiss with a recorded reason — not because acting is inconvenient.
7. Safe procedure for an “unattached disk” recommendation? Confirm it is Unattached (diskState, blank managedBy), snapshot it, verify the data is unneeded, then delete — deletion is irreversible. Never delete a disk showing Attached.
8. Two checks before resizing a production VM? (a) az vm list-vm-resize-options — is the target on the VM’s current cluster (else deallocate first); (b) az vm list-skus — is it restricted in the region. And a resize reboots the VM.
9. How do you reduce noisy VM recommendations across a subscription? In Advisor → Configuration → VM/VMSS right sizing, set a per-subscription average-CPU filter (e.g. 10%) and adjust the lookback for batch subscriptions. Changes take ~24h (filter) to ~48h (lookback).
10. Advisor recommends a reservation. What first, and the risk of acting blindly? Right-size first — a reservation on an oversized VM locks in the waste. Commit (1/3 years) only for workloads stable for the term; blindly, you risk paying upfront for capacity you’ll stop using.
11. Which role lets someone review Advisor without changing resources? Reader on the scope sees all recommendations but cannot act. Resizing/deleting needs Contributor on the resource; changing Advisor’s configuration needs subscription rights, or the portal greys it out.
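The first pre-resize check from question 8 can be sketched as a guard around the resize itself. This assumes the Azure CLI is installed and logged in; the function names are illustrative, and the sketch covers only the cluster check, not regional restrictions.

```shell
# Succeed if the target size is offered for an in-place resize of this VM.
# $1 = resource group, $2 = VM name, $3 = target size (e.g. Standard_D4s_v5).
can_resize_in_place() {
  az vm list-vm-resize-options --resource-group "$1" --name "$2" \
    --query "[].name" --output tsv | grep -Fxq "$3"
}

# Resize only when the pre-check passes; otherwise explain the next step.
resize_if_available() {
  if can_resize_in_place "$1" "$2" "$3"; then
    az vm resize --resource-group "$1" --name "$2" --size "$3"
  else
    echo "$3 is not offered on the current cluster; deallocate first or pick another size" >&2
    return 1
  fi
}
```

The exact-line match (`grep -Fx`) matters because size names share prefixes, so a substring match could pass on the wrong SKU. For the second check, `az vm list-skus --location <region> --size <size> --output table` lists any restrictions on the size in that region.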
These questions map most directly to AZ-900 (Azure Fundamentals), whose objectives include describing cost management tools such as Azure Advisor, and to the cost-optimization and operational-excellence pillars of the Azure Well-Architected Framework. The hands-on resize/deallocate/disk operations also appear in AZ-104 (Administrator) under managing VMs and cost. A compact cert map:
| Question theme | Primary cert | Objective area |
|---|---|---|
| What Advisor is, its categories, cost tools | AZ-900 | Describe cost management and governance |
| Rightsizing rules, lookback, savings caveats | AZ-900 / WAF | Cost optimization pillar |
| Resize / deallocate / disk cleanup operations | AZ-104 | Manage VMs; optimize cost |
| Reservations vs savings plans | AZ-900 / AZ-104 | Cost management; commitments |
Quick check
- Advisor flags a VM for shutdown. Which two metric types did it analyze, and over what default window?
- You “shut down” a VM from inside Windows and the bill doesn’t move. Why, and what state actually stops the compute charge?
- A VM is idle 27 days a month and busy on day 28. What’s the default recommendation, and the one configuration change that fixes the false positive?
- Before deleting an “unattached disk” recommendation, what is the one safety step you must take — and why?
- Advisor estimates ₹50,000/month savings but you only see ₹30,000. Give the most likely reason.
Answers
- CPU and Outbound Network (memory is not used for shutdown), over the default 7-day lookback (configurable up to 90 days).
- The VM is Stopped but still allocated, so compute keeps billing. Only Stopped (deallocated), via az vm deallocate or the portal Stop button, stops the charge.
- It recommends shutdown (the 7-day window can’t see the month-end spike). Fix by widening the lookback to 30 or 90 days so it sees a full cycle.
- Snapshot the disk first: deletion is irreversible, so the snapshot is cheap insurance if the data turns out to be needed. Also confirm diskState is Unattached.
- Savings are at retail rates and ignore your existing reservations/savings plans; if a commitment covered that VM, the real saving is smaller.
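The lookback answer reduces to one rule: the window must span at least one full cycle of the workload. A minimal sketch of that rule follows, using the window values listed in this article; the function name is illustrative, and you supply the cycle length from your own knowledge of the workload.

```shell
# Print the smallest Advisor lookback window (in days) that spans one full
# workload cycle. The candidate values are the ones listed in this article.
# $1 = length of the workload's cycle in days (e.g. 30 for a monthly batch).
# Returns non-zero if even the longest window is shorter than the cycle.
pick_lookback_days() {
  local window
  for window in 7 14 21 30 60 90; do
    if [ "$window" -ge "$1" ]; then
      echo "$window"
      return 0
    fi
  done
  echo "cycle of $1 days exceeds the 90-day maximum; review this VM by hand" >&2
  return 1
}
```

A VM that wakes once every 28 days gets `pick_lookback_days 28`, which prints 30, so the month-end spike always falls inside the window Advisor analyzes.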
Glossary
- Azure Advisor — a free, built-in service that analyzes resource usage and configuration and recommends improvements across Cost, Security, Reliability, Operational excellence, and Performance.
- Recommendation — a single piece of advice tied to a resource, with a recommended action and (for cost) an estimated saving.
- Rightsize — resize a VM/scale set to a cheaper SKU that still carries the load with headroom; the workload keeps running but the VM reboots.
- Shut down / deallocate — stop a VM and release its compute (Stopped-deallocated), which stops the compute bill; offline until restarted.
- Stopped (allocated) — a VM stopped from inside the OS that still reserves compute and keeps billing — not what saves money.
- Lookback window — the days of utilization data Advisor analyzes (default 7; configurable to 14/21/30/60/90); widen it for periodic workloads.
- Unattached (orphaned) disk — a managed disk bound to no VM, billing for its provisioned size for no benefit; safe to delete after a snapshot.
- Empty App Service plan — a plan running zero apps that still bills at its tier rate.
- Burstable (B-series) — a low-cost VM family that runs at a reduced baseline and banks credits to burst; Advisor’s common resize target for spiky-but-low load.
- Reservation / Savings plan — a 1- or 3-year commitment (to capacity, or to a fixed hourly compute spend) that discounts steady usage versus pay-as-you-go.
- Postpone / Dismiss — hide a recommendation temporarily (returns later) or indefinitely (when the behavior is intentional).
- Snapshot — a point-in-time copy of a managed disk, stored cheaply, used as a safety net before an irreversible delete.
- Retail rate — the public pay-as-you-go price Advisor uses for savings estimates, which ignores your reservations/discounts.
Next steps
You can now open Advisor’s Cost tab, understand why each recommendation appeared, and action it safely. Build outward:
- Next: Azure Cost Management for Beginners: Budgets, Alerts and Cost Analysis — see how much you spend and get alerted before crossing a line, the partner to Advisor’s what to cut.
- Related: Azure Tagging Strategy for Cost Allocation — tag resources so every recommendation is attributable to a team.
- Related: Azure VM Series & Families explained (D, E, F, L, N, M) — the SKU map behind every resize target.
- Related: Azure Policy Effects explained (deny, audit, modify) — prevent new oversized or untagged resources so there’s less for Advisor to flag.
- Related: Azure FinOps: Cost Management at Scale — wrap Advisor into a repeatable practice across many subscriptions.