Azure Backup Hardening: Immutable Vaults, Multi-User Authorization, Soft Delete, and Cross-Region Restore

Every ransomware tabletop I have run ends at the same uncomfortable question: when the attacker has Backup Contributor on your subscription, what actually stops them from stopping backups, dropping retention to one day, and waiting you out? The honest answer for most tenants is “nothing.” Backups are the last line of defense, which makes the backup control plane the highest-value target in the blast radius. Modern attackers know this – they delete recovery points before they encrypt. Azure Backup ships four independent controls that together make the vault tamper-resistant even against an admin-level compromise: immutability locks the retention floor, multi-user authorization (MUA) puts destructive operations behind a second tenant’s approval, enhanced soft delete keeps deleted backups recoverable, and cross-region restore gives you an out-of-region copy when the primary region is gone. This is how to wire all four correctly, in the right order, and prove they work.

1. Recovery Services vault vs Backup vault, and what each protects

Azure has two vault resource types and they are not interchangeable. Picking the wrong one means re-onboarding workloads later, so get this right on day one.

Capability                                    | Recovery Services vault           | Backup vault
----------------------------------------------+-----------------------------------+--------------------------------------
Resource type                                 | Microsoft.RecoveryServices/vaults | Microsoft.DataProtection/backupVaults
Azure VMs                                     | Yes (snapshot + vault)            | No
SQL/SAP HANA in Azure VM                      | Yes                               | No
Azure Files (snapshot)                        | Yes                               | Yes (vaulted, via Backup vault)
Azure Blobs                                   | No                                | Yes (operational + vaulted)
Azure Disks                                   | No                                | Yes
Azure Database for PostgreSQL Flexible Server | No                                | Yes
AKS                                           | No                                | Yes
Immutability                                  | Yes                               | Yes
MUA via Resource Guard                        | Yes                               | Yes
Cross-region restore                          | Yes                               | Yes (selected workloads)

The rule of thumb: Recovery Services vault for the classic IaaS and in-guest workloads (VMs, SQL-in-VM, SAP HANA-in-VM), Backup vault for the newer managed-data-store estate (Blobs, Disks, PostgreSQL Flexible Server, AKS, vaulted Azure Files). Many platform teams run both, and that is expected – they are governed the same way for immutability and MUA, which is the whole point of this article.
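
As a rough illustration, the support table above can be encoded as a lookup that an onboarding script consults before it creates anything. This is only a sketch: the dictionary transcribes the table in this article, and the helper names `vaults_for` and `needs_both_vault_types` are hypothetical, not part of any Azure SDK.

```python
# Workload -> vault types that can protect it, transcribed from the table above.
RSV = "Recovery Services vault"
BV = "Backup vault"

VAULTS_BY_WORKLOAD = {
    "Azure VMs": [RSV],
    "SQL/SAP HANA in Azure VM": [RSV],
    "Azure Files": [RSV, BV],
    "Azure Blobs": [BV],
    "Azure Disks": [BV],
    "Azure Database for PostgreSQL Flexible Server": [BV],
    "AKS": [BV],
}


def vaults_for(workload: str) -> list[str]:
    """Return the vault types listed for a workload, or raise if it is not in the table."""
    if workload not in VAULTS_BY_WORKLOAD:
        raise ValueError(f"workload not in the support table: {workload!r}")
    return VAULTS_BY_WORKLOAD[workload]


def needs_both_vault_types(workloads: list[str]) -> bool:
    """True when the estate has at least one RSV-only and one Backup-vault-only workload."""
    options = [vaults_for(w) for w in workloads]
    return [RSV] in options and [BV] in options
```

An estate of VMs plus Blobs needs both vault types, while VMs plus Azure Files can, per the table, stay on one.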

Create a Recovery Services vault and immediately set redundancy. Storage redundancy is only changeable while the vault has zero protected items, so this is a day-zero decision:

az backup vault create \
  --resource-group rg-backup-prod \
  --name rsv-prod-weu \
  --location westeurope

# GeoRedundant + CrossRegionRestore enabled is the prerequisite for CRR.
az backup vault backup-properties set \
  --resource-group rg-backup-prod \
  --name rsv-prod-weu \
  --backup-storage-redundancy GeoRedundant \
  --cross-region-restore-flag true

Cross-region restore requires GeoRedundant storage. It does not work with LocallyRedundant or ZoneRedundant. If you need both zone resilience and CRR, that is not a single setting – ZRS protects you within the region, CRR uses the geo-paired region. Decide which failure mode dominates your risk model before you onboard anything.

2. Immutable vaults: unlocked vs locked, and the operational tradeoff

Vault immutability prevents operations that would reduce the protection of existing recovery points: deleting backup data before its retention expires, or shortening retention in a policy. It does not block creating new backups or extending retention – only the destructive direction is gated. Disabling soft delete is a separate concern, handled by MUA and the AlwaysON setting covered in the sections that follow.
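
The "only the destructive direction is gated" rule can be pictured with two small predicates. This is a toy model of the behaviour described above, not how the service is implemented; the function names are made up for illustration.

```python
def retention_change_blocked(current_days: int, requested_days: int) -> bool:
    """A policy change is gated only when it would shorten retention."""
    return requested_days < current_days


def early_delete_blocked(age_days: int, retention_days: int) -> bool:
    """Deleting a recovery point is gated while it is still inside its retention."""
    return age_days < retention_days
```

Extending retention from 30 to 90 days passes, while cutting it from 30 to 7 is refused.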

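There are two states, and the difference is whether you can ever go back: Unlocked enforces the restrictions but can still be disabled, while Locked enforces them permanently and cannot be reversed.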

Enable it unlocked first via the vault’s securitySettings. With the CLI you patch the vault property:

# Step 1: enable immutability in the "Unlocked" state for a soak period.
az resource update \
  --resource-group rg-backup-prod \
  --name rsv-prod-weu \
  --resource-type Microsoft.RecoveryServices/vaults \
  --set properties.securitySettings.immutabilitySettings.state=Unlocked

Run unlocked for a release cycle or two. Confirm no automation breaks – the usual offenders are decommissioning pipelines that delete backups early, or policy-as-code that lowers retention. Once you are confident, lock it. In Bicep the locked state is explicit and intentional:

resource vault 'Microsoft.RecoveryServices/vaults@2024-04-01' = {
  name: 'rsv-prod-weu'
  location: 'westeurope'
  sku: {
    name: 'RS0'
    tier: 'Standard'
  }
  properties: {
    securitySettings: {
      immutabilitySettings: {
        // 'Locked' is irreversible. Deploy this only after soaking on 'Unlocked'.
        state: 'Locked'
      }
    }
  }
}

The operational tradeoff is real: once locked, you cannot shorten retention even for a legitimate cost-cutting exercise. If you set a 10-year policy by mistake and lock the vault, you pay for 10 years. Treat the lock like a production change-freeze decision – review every active policy’s retention before you flip it.
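
One way to make that pre-lock review mechanical is to walk each policy file and flag any schedule whose retention exceeds a ceiling you have agreed on. The sketch below assumes the policy JSON shape used for the IaaS policy in section 5; the day conversions for months and years are deliberate approximations, and `retention_days` and `over_ceiling` are hypothetical helper names.

```python
# Approximate conversions; months and years are rounded for a coarse ceiling check.
DAYS_PER_UNIT = {"Days": 1, "Weeks": 7, "Months": 30, "Years": 365}


def retention_days(policy: dict) -> dict[str, int]:
    """Map each schedule under retentionPolicy to its retention in approximate days."""
    result = {}
    for name, schedule in policy.get("retentionPolicy", {}).items():
        # Skip scalar fields such as retentionPolicyType.
        if isinstance(schedule, dict) and "retentionDuration" in schedule:
            duration = schedule["retentionDuration"]
            result[name] = duration["count"] * DAYS_PER_UNIT[duration["durationType"]]
    return result


def over_ceiling(policy: dict, max_days: int) -> dict[str, int]:
    """Return only the schedules whose retention exceeds max_days."""
    return {name: days for name, days in retention_days(policy).items() if days > max_days}
```

Running `over_ceiling` with a ceiling of 7 years (2555 days) against every active policy before locking would surface an accidental 10-year schedule while it can still be shortened.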

3. Multi-user authorization with Resource Guard across tenants

Immutability stops you reducing protection on existing data. MUA stops the other class of attack: disabling soft delete, deleting the protection entirely, or removing immutability while it is still unlocked. It does this by requiring that destructive vault operations be authorized through a Resource Guard – a separate Microsoft.DataProtection/resourceGuards resource that you deliberately place where the backup admin has no permissions.

The architecture that actually resists insider compromise puts the Resource Guard in a different tenant (or at minimum a different subscription governed by a different team):

Tenant A (workload)                         Tenant B (security)
+-----------------------+                   +------------------------+
| Recovery Services     |   protected by    | Resource Guard         |
| vault                 |------------------>| (no Backup Operator     |
|                       |                   |  for Tenant A admins)   |
| Backup admin: full    |                   | Security admin: owns    |
| rights EXCEPT the     |                   | the guard, approves     |
| guard-protected ops   |                   | critical operations     |
+-----------------------+                   +------------------------+

Create the guard in the security tenant/subscription:

az dataprotection resource-guard create \
  --resource-group rg-security-guards \
  --name rg-prod-resourceguard \
  --location westeurope

By default the guard protects a set of critical operations: disabling MUA itself, modifying or deleting protection, disabling soft delete, and reducing the soft-delete retention period. You can inspect and tune which operations are gated:

az dataprotection resource-guard list-protected-operations \
  --resource-group rg-security-guards \
  --name rg-prod-resourceguard \
  --resource-type Microsoft.RecoveryServices/vaults

Now associate the vault with the guard. The backup admin in Tenant A needs Reader on the guard (cross-tenant) to create the association, and after this is in place they can no longer perform the protected operations without a just-in-time approval from Tenant B:

# Run as the backup admin, authenticated to BOTH tenants.
# Commands in the "az backup vault" group take the vault as --name.
az backup vault resource-guard-mapping update \
  --resource-group rg-backup-prod \
  --name rsv-prod-weu \
  --resource-guard-id "/subscriptions/<security-sub>/resourceGroups/rg-security-guards/providers/Microsoft.DataProtection/resourceGuards/rg-prod-resourceguard"

The operating model after association: when the backup team genuinely needs to perform a protected operation (say, retire a workload), the security team grants the backup operator’s identity a time-bound Backup Operator role on the Resource Guard via Microsoft Entra Privileged Identity Management (PIM, formerly Azure AD PIM), the operation is performed within the activation window, and the role expires. An attacker who has only compromised Tenant A cannot self-approve – they lack any standing access to the guard. That separation of duties is the entire value of MUA; if you put the guard in the same subscription the backup admin owns, you have built a speed bump, not a control.
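
The operating model is easier to reason about as a tiny state machine: protected operations need a live, time-bound grant on the guard, and everything else passes. The sketch below is a toy model, not the Azure API; the operation names and the `ResourceGuardModel` class are invented for illustration.

```python
from datetime import datetime, timedelta, timezone

# Operations treated as guard-protected in this toy model (names are illustrative).
PROTECTED_OPS = {"disable_soft_delete", "stop_protection_delete_data", "disable_mua"}


class ResourceGuardModel:
    """Toy model of time-bound approvals on a guard; not the Azure API."""

    def __init__(self) -> None:
        self._grants: dict[str, datetime] = {}  # identity -> grant expiry

    def grant(self, identity: str, now: datetime, minutes: int) -> None:
        """Security team approves a time-bound activation for one identity."""
        self._grants[identity] = now + timedelta(minutes=minutes)

    def is_authorized(self, identity: str, operation: str, now: datetime) -> bool:
        """Unprotected operations always pass; protected ones need an unexpired grant."""
        if operation not in PROTECTED_OPS:
            return True
        expiry = self._grants.get(identity)
        return expiry is not None and now < expiry
```

In this model a compromised identity with no grant fails every protected operation, and even an approved operator is locked out again once the activation window closes.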

4. Enhanced soft delete and recovering from deletion

Soft delete keeps backup data retrievable after someone deletes a backup item or stops protection with “delete data.” Enhanced soft delete (the current model for Recovery Services vaults) makes the feature always-on and configurable: you set a retention between 14 and 180 days, and you can optionally make soft delete itself immutable (non-disablable). Basic soft delete was a fixed 14 days and could be turned off – enhanced is the one you want.

# Configure enhanced soft delete to 30 days. AlwaysON makes it non-disablable.
# The retention flag is --soft-delete-duration (14 to 180 days).
az backup vault backup-properties set \
  --resource-group rg-backup-prod \
  --name rsv-prod-weu \
  --soft-delete-feature-state AlwaysON \
  --soft-delete-duration 30

AlwaysON is irreversible in the same spirit as locked immutability – you can extend the retention but never disable the feature. Combined with immutability and MUA, you now have three controls that an admin-level attacker cannot individually defeat: they cannot delete inside retention (immutability), cannot turn off soft delete (AlwaysON), and cannot disable any of it without the guard (MUA).

When a backup is deleted – maliciously or by a fat-fingered decommission script – the item moves to a soft-deleted state. Recovery is undelete-then-resume:

# List soft-deleted items.
az backup item list \
  --resource-group rg-backup-prod \
  --vault-name rsv-prod-weu \
  --backup-management-type AzureIaasVM \
  --query "[?properties.isScheduledForDeferredDelete].name" -o tsv

# Undelete and re-enable protection for a specific VM.
az backup protection undelete \
  --resource-group rg-backup-prod \
  --vault-name rsv-prod-weu \
  --container-name <container> \
  --item-name <vm-name> \
  --backup-management-type AzureIaasVM \
  --workload-type VM

For Backup vaults (Blobs, Disks, PostgreSQL), the equivalent is configured through the vault’s softDeleteSettings with the same 14–180 day window, set via the az dataprotection backup-vault update command or the portal.
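
For runbooks it helps to know how long a deleted item stays recoverable. A minimal sketch of that arithmetic follows, assuming the 14 to 180 day window stated above and treating the purge date as simply the deletion date plus the retention; the service's exact cut-off time may differ, and both helper names are hypothetical.

```python
from datetime import date, timedelta

MIN_DAYS, MAX_DAYS = 14, 180  # window described in this section


def validate_soft_delete_days(days: int) -> int:
    """Reject retention values outside the supported window."""
    if not MIN_DAYS <= days <= MAX_DAYS:
        raise ValueError(f"soft-delete retention must be {MIN_DAYS}-{MAX_DAYS} days, got {days}")
    return days


def recoverable_until(deleted_on: date, retention_days: int) -> date:
    """Approximate date after which a soft-deleted item can no longer be undeleted."""
    return deleted_on + timedelta(days=validate_soft_delete_days(retention_days))
```

With 30 days of retention, an item deleted on 1 January would be recoverable until roughly 31 January under this model.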

5. Backup policies, retention, and instant-restore snapshots

Policy is where retention lives, and retention is what immutability and MUA enforce. Build the policy deliberately. For Azure VMs, the instant restore tier keeps local snapshots (1–5 days) for fast restores that never touch vault storage, while GRS-replicated recovery points serve long-term and cross-region needs.
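
To make the two tiers concrete, the sketch below picks which tier would serve a restore for a recovery point of a given age. It is a simplification: it assumes a snapshot is available strictly within the instant-restore window and ignores the fact that recent points typically exist in both tiers. The function name is hypothetical.

```python
def restore_source(age_days: int, instant_rp_days: int) -> str:
    """Return 'snapshot' while the point is inside the instant-restore window, else 'vault'."""
    if not 1 <= instant_rp_days <= 5:
        raise ValueError("instant-restore retention is 1-5 days in this article's policy")
    if age_days < 0:
        raise ValueError("age_days cannot be negative")
    return "snapshot" if age_days < instant_rp_days else "vault"
```

A two-day-old point with a five-day window restores from the local snapshot; a ten-day-old point has to come from vault storage.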

A defensible IaaS policy template – daily plus a weekly/monthly/yearly grandfather-father-son ladder:

{
  "schedulePolicy": {
    "schedulePolicyType": "SimpleSchedulePolicy",
    "scheduleRunFrequency": "Daily",
    "scheduleRunTimes": ["2026-06-08T01:00:00Z"]
  },
  "retentionPolicy": {
    "retentionPolicyType": "LongTermRetentionPolicy",
    "dailySchedule":  { "retentionDuration": { "count": 30,  "durationType": "Days"   } },
    "weeklySchedule": { "daysOfTheWeek": ["Sunday"], "retentionDuration": { "count": 12, "durationType": "Weeks" } },
    "monthlySchedule": { "retentionScheduleFormatType": "Weekly", "retentionScheduleWeekly": { "daysOfTheWeek": ["Sunday"], "weeksOfTheMonth": ["First"] }, "retentionDuration": { "count": 36, "durationType": "Months" } },
    "yearlySchedule": { "retentionScheduleFormatType": "Weekly", "monthsOfYear": ["January"], "retentionScheduleWeekly": { "daysOfTheWeek": ["Sunday"], "weeksOfTheMonth": ["First"] }, "retentionDuration": { "count": 7, "durationType": "Years" } }
  },
  "instantRpRetentionRangeInDays": 5,
  "timeZone": "UTC"
}

az backup policy set \
  --resource-group rg-backup-prod \
  --vault-name rsv-prod-weu \
  --name policy-iaas-gfs \
  --policy @iaas-policy.json

Two retention facts that catch people:

6. Cross-region restore and zone-redundant storage

CRR lets you restore a VM, SQL-in-VM, or SAP HANA backup into the Azure-paired region without waiting for a regional failover or a Microsoft-declared outage – you choose to restore in the secondary on demand. It is the control that turns “the region is down” from an outage into a runbook. The prerequisites, in order:

  1. Vault redundancy = GeoRedundant (set at section 1, before onboarding).
  2. crossRegionRestore flag enabled on the vault.
  3. The workload type supports CRR (Azure VM, SQL in Azure VM, SAP HANA in Azure VM).
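
The three prerequisites can be checked in the same order with a small helper that reports the first one that fails. This is a sketch under the assumptions listed above; the function name and the message strings are illustrative.

```python
# Workload types listed as CRR-capable in the prerequisites above.
CRR_WORKLOADS = {"Azure VM", "SQL in Azure VM", "SAP HANA in Azure VM"}


def first_unmet_crr_prerequisite(redundancy: str, crr_flag: bool, workload: str):
    """Return a message for the first failed prerequisite, or None if all three hold."""
    if redundancy != "GeoRedundant":
        return "vault redundancy must be GeoRedundant"
    if not crr_flag:
        return "crossRegionRestore flag is not enabled"
    if workload not in CRR_WORKLOADS:
        return "workload type does not support CRR"
    return None
```

Checking in this order matters because redundancy is the one setting you cannot fix after workloads are onboarded.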

CRR and zone-redundant storage solve different problems and you cannot have both on one vault: ZoneRedundant protects within the primary region against a single-AZ loss but has no secondary-region copy; GeoRedundant gives you the paired-region copy that CRR reads from. For a production vault whose dominant risk is regional or ransomware, choose GeoRedundant + CRR. List the secondary-region recovery points and restore:

# Enumerate recovery points available in the SECONDARY (paired) region.
az backup recoverypoint list \
  --resource-group rg-backup-prod \
  --vault-name rsv-prod-weu \
  --container-name <container> \
  --item-name <vm-name> \
  --backup-management-type AzureIaasVM \
  --workload-type VM \
  --use-secondary-region \
  --query "[].{name:name, time:properties.recoveryPointTime}" -o table

# Restore disks into the secondary region from a secondary recovery point.
az backup restore restore-disks \
  --resource-group rg-backup-prod \
  --vault-name rsv-prod-weu \
  --container-name <container> \
  --item-name <vm-name> \
  --rp-name <recovery-point-id> \
  --use-secondary-region \
  --target-resource-group rg-dr-northeurope \
  --storage-account <staging-sa-in-secondary>

The restore lands disks in the secondary region; you then build the VM from those disks (or use the full-VM restore flow). Note the staging storage account and target resource group must already exist in the secondary region – pre-provision them as part of your DR landing zone, not during the incident.

Verify

Do not trust the configuration blade. Prove each control with a command and, for restore, with an actual recovery.

# 1. Immutability state is Locked.
az resource show \
  --resource-group rg-backup-prod --name rsv-prod-weu \
  --resource-type Microsoft.RecoveryServices/vaults \
  --query "properties.securitySettings.immutabilitySettings.state" -o tsv
# Expect: Locked

# 2. Soft delete is AlwaysON with your retention.
az backup vault backup-properties show \
  --resource-group rg-backup-prod --name rsv-prod-weu \
  --query "{softDelete:softDeleteFeatureState, days:softDeleteRetentionPeriodInDays, redundancy:storageModelType, crr:crossRegionRestoreFlag}"
# Expect: AlwaysON / 30 / GeoRedundant / true

# 3. Resource Guard mapping exists.
az backup vault resource-guard-mapping show \
  --resource-group rg-backup-prod --name rsv-prod-weu \
  --query "properties.resourceGuardOperationDetails" -o table

// 4. In Log Analytics (vault diagnostics -> CoreAzureBackup), confirm a
// successful secondary-region restore in the last 7 days.
AddonAzureBackupJobs
| where TimeGenerated > ago(7d)
| where BackupItemUniqueId != ""
| where JobOperation == "Restore"
| project TimeGenerated, JobStatus, JobOperation, BackupManagementType, JobUniqueId
| order by TimeGenerated desc

The fourth check is the one that matters. A green config and an untested restore is exactly the trap from the ASR world: “protected” is not “recoverable” until you have booted from a secondary-region recovery point and timed it.
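
If you collect the outputs of checks 1 to 3 into one record per vault, the comparison against the expected values can be automated. The sketch below assumes a flat dictionary that a wrapper script has assembled; the keys `immutability` and `guardMapped` are hypothetical names for the results of checks 1 and 3, while the others mirror the projection in check 2.

```python
EXPECTED = {
    "immutability": "Locked",
    "softDelete": "AlwaysON",
    "redundancy": "GeoRedundant",
    "crr": True,
}


def posture_gaps(observed: dict, min_soft_delete_days: int = 30) -> list[str]:
    """Return human-readable gaps; an empty list means checks 1-3 all pass."""
    gaps = []
    for key, want in EXPECTED.items():
        got = observed.get(key)
        if got != want:
            gaps.append(f"{key}: expected {want!r}, got {got!r}")
    days = observed.get("days") or 0
    if days < min_soft_delete_days:
        gaps.append(f"days: expected at least {min_soft_delete_days}, got {days}")
    if not observed.get("guardMapped"):
        gaps.append("guardMapped: no Resource Guard mapping recorded")
    return gaps
```

This covers configuration only; it says nothing about check 4, which still needs a real restore.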

7. Monitoring with Backup center, alerts, and Backup reports

Backup center is the single pane across every vault in the tenant – jobs, alerts, policy compliance, and security posture in one place. Two monitoring layers matter:

az monitor diagnostic-settings create \
  --name backup-diag \
  --resource "/subscriptions/<sub>/resourceGroups/rg-backup-prod/providers/Microsoft.RecoveryServices/vaults/rsv-prod-weu" \
  --workspace "/subscriptions/<sub>/resourceGroups/rg-obs/providers/Microsoft.OperationalInsights/workspaces/law-platform" \
  --logs '[{"categoryGroup":"allLogs","enabled":true}]'

Alert on the security-relevant operations specifically:

CoreAzureBackup
| where TimeGenerated > ago(1d)
| where OperationName has_any ("StopProtectionWithRetainData", "StopProtectionWithDeleteData", "DisableSoftDelete")
| project TimeGenerated, OperationName, BackupItemUniqueId, State
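
If the same records are exported for offline triage, the filter can be approximated in a few lines. This is a rough analogue only: it uses a case-insensitive substring match, which is slightly looser than KQL's term-based has_any, and the record shape (a list of dictionaries with an OperationName key) is an assumption.

```python
SECURITY_OPS = (
    "StopProtectionWithRetainData",
    "StopProtectionWithDeleteData",
    "DisableSoftDelete",
)


def security_events(records: list[dict]) -> list[dict]:
    """Keep records whose OperationName mentions one of the security-relevant operations."""
    wanted = [op.lower() for op in SECURITY_OPS]
    return [
        record
        for record in records
        if any(op in str(record.get("OperationName", "")).lower() for op in wanted)
    ]
```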

Enterprise scenario

A European fintech platform team ran ~900 production VMs across two GeoRedundant Recovery Services vaults, governed by a central backup squad with Backup Contributor on the landing-zone subscriptions. A red-team exercise compromised a CI service principal that, through role inheritance, held Backup Contributor. The red team’s playbook was textbook ransomware: before touching workloads, disable soft delete and stop-protection-with-delete-data on the crown-jewel VMs to destroy the recovery path. With only immutability enabled (unlocked), the audit showed the attacker could have disabled immutability first and then deleted – the lock had never been flipped because nobody wanted to lose the ability to shorten retention.

The fix was sequencing, not new technology. They (1) audited every policy’s retention and trimmed three over-long yearly schedules, (2) flipped immutability to Locked on both vaults, (3) set soft delete to AlwaysON / 30 days, and (4) stood up a Resource Guard in a separate security tenant and mapped both vaults to it, so the backup squad’s standing rights no longer included disabling soft delete or stopping protection – those now required a PIM-activated Backup Operator role on the guard, approved by the security team. The re-run red team, holding the same Backup Contributor, was fully blocked:

# Re-run attacker, holding Backup Contributor in the workload tenant, tries to
# strip protection. With the Resource Guard mapped, this FAILS authorization
# because the identity has no Backup Operator role on the guard in Tenant B.
az backup protection disable \
  --resource-group rg-backup-prod --vault-name rsv-prod-weu \
  --container-name <c> --item-name <vm> \
  --backup-management-type AzureIaasVM --delete-backup-data true
# -> ResourceGuard: operation requires authorization on the Resource Guard.

The lesson the team wrote into their platform standard: these four controls are only worth anything combined and locked. Any single one, left unlocked or co-located with the admin who would be the attacker, is theatre.

Hardening checklist
