Blob Storage Data Protection: Lifecycle Tiering, Immutability, and Recovery

Most Blob Storage incidents are not ransomware or region outages. They are a service principal with Storage Blob Data Contributor running a del against the wrong prefix, a lifecycle rule that tiered hot data to archive because someone got a filter wrong, or a compliance auditor asking for the WORM evidence that nobody actually enabled. Data protection on Blob Storage is a stack of independent features that each cover a different failure mode, and they interact in ways that bite you if you enable them in the wrong order. This is how to engineer the full stack: tiering for cost, versioning and soft delete for recovery, point-in-time restore for bulk rollback, and immutable storage for compliance — with the prerequisites and the gotchas.

The whole stack assumes a general-purpose v2 or premium block blob account. Hierarchical namespace (HNS) considerations are noted where they matter: most of the recovery features below require an account without HNS enabled (that is, not Data Lake Storage Gen2), so check that first if you are on ADLS.

1. Access tiers and the cost/retrieval tradeoff

Blob Storage has four access tiers, three online (Hot, Cool, Cold) and one offline (Archive), and the entire economic model is a trade between storage cost on one side and access cost plus retrieval latency on the other. Get this backwards and you either overpay for cold data or pay rehydration penalties on data you read weekly.

Tier      Storage cost   Access (read) cost   Min retention   Retrieval latency
Hot       Highest        Lowest               None            Milliseconds
Cool      Lower          Higher               30 days         Milliseconds
Cold      Lower still    Higher still         90 days         Milliseconds
Archive   Lowest         Highest              180 days        Hours (rehydrate)

The rules that actually trip teams up:

Rule of thumb: tier down only data you are confident you will not read before the minimum retention elapses. For data you might restore, archive’s rehydration latency means it belongs in your DR plan, not your hot path.
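To make the rule of thumb concrete, here is a minimal sketch that reports how many days of minimum retention a blob still owes if it leaves its tier today. The helper names (min_retention_days, early_deletion_days) are hypothetical, and the sketch assumes early-deletion charges are prorated over the remaining days, with the retention values taken from the tier table above.

```shell
# Hypothetical helpers: not part of the Azure CLI.
# Minimum retention per tier in days, as listed in the tier table.
min_retention_days() {
  case "$1" in
    hot)     echo 0 ;;
    cool)    echo 30 ;;
    cold)    echo 90 ;;
    archive) echo 180 ;;
    *)       echo "unknown tier: $1" >&2; return 1 ;;
  esac
}

# Days of early-deletion exposure if a blob that has spent
# $2 days in tier $1 is deleted or moved out of that tier now.
early_deletion_days() {
  local tier="$1" age_days="$2" min
  min=$(min_retention_days "$tier") || return 1
  if [ "$age_days" -ge "$min" ]; then
    echo 0
  else
    echo $((min - age_days))
  fi
}

early_deletion_days archive 45   # prints 135
early_deletion_days cool 30      # prints 0
```

A non-zero result is a signal to leave the blob where it is, or to accept the charge knowingly.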

2. Authoring lifecycle management rules

Lifecycle management is a JSON policy on the account that the platform evaluates roughly once per day. It moves or deletes blobs based on age (last modified, last accessed, or creation time) and filters (prefix, blob type, index tags). One policy per account, up to 100 rules.

Here is a production-shaped policy: tier logs down to cool then cold then archive, expire them, and clean up old versions and snapshots independently.

{
  "rules": [
    {
      "enabled": true,
      "name": "tier-and-expire-logs",
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ],
          "prefixMatch": [ "logs/app/" ]
        },
        "actions": {
          "baseBlob": {
            "tierToCool":    { "daysAfterModificationGreaterThan": 30 },
            "tierToCold":    { "daysAfterModificationGreaterThan": 90 },
            "tierToArchive": { "daysAfterModificationGreaterThan": 180 },
            "delete":        { "daysAfterModificationGreaterThan": 2555 }
          },
          "snapshot": {
            "delete": { "daysAfterCreationGreaterThan": 90 }
          },
          "version": {
            "tierToCool": { "daysAfterCreationGreaterThan": 30 },
            "delete":     { "daysAfterCreationGreaterThan": 365 }
          }
        }
      }
    }
  ]
}

Apply it with the CLI:

az storage account management-policy create \
  --account-name kvstgprod \
  --resource-group rg-storage-prod \
  --policy @lifecycle-policy.json
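Before applying a policy like this, it can help to sanity-check the thresholds. The sketch below is a hypothetical pre-flight check (check_lifecycle_thresholds is not an Azure CLI command): it takes the four daysAfterModificationGreaterThan values and verifies that they are strictly increasing and that each tier is held at least as long as its minimum retention, so the rule itself does not trigger early-deletion charges.

```shell
# Hypothetical pre-flight check for a tier-down rule.
# Arguments: tierToCool, tierToCold, tierToArchive, delete thresholds (days).
check_lifecycle_thresholds() {
  local cool="$1" cold="$2" archive="$3" delete="$4" rc=0

  # Each action must fire strictly later than the one before it.
  if ! { [ "$cool" -lt "$cold" ] && [ "$cold" -lt "$archive" ] && [ "$archive" -lt "$delete" ]; }; then
    echo "order: expected cool < cold < archive < delete" >&2
    rc=1
  fi

  # Time spent in each tier should cover that tier's minimum retention.
  [ $((cold - cool)) -ge 30 ]       || { echo "cool tier held for less than 30 days" >&2; rc=1; }
  [ $((archive - cold)) -ge 90 ]    || { echo "cold tier held for less than 90 days" >&2; rc=1; }
  [ $((delete - archive)) -ge 180 ] || { echo "archive tier held for less than 180 days" >&2; rc=1; }

  return "$rc"
}

# The values from the policy above pass the check.
check_lifecycle_thresholds 30 90 180 2555 && echo "thresholds ok"
```

The policy above passes: 60 days in cool, 90 in cold, and well over 180 in archive before the delete fires.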

Key behaviors to internalize:

3. Versioning and change feed: the recovery foundation

Blob versioning automatically preserves the previous state of a blob as an immutable, read-only version every time the blob is overwritten or deleted; each version is identified by a version ID. This is the bedrock of recovery: with versioning on, an overwrite never destroys the prior content, it just demotes it to a previous version.

az storage account blob-service-properties update \
  --account-name kvstgprod \
  --resource-group rg-storage-prod \
  --enable-versioning true

The change feed is a complementary, ordered, durable transaction log of every create/update/delete in the account, written as Avro into a system container ($blobchangefeed). It is the audit trail and the input to point-in-time restore.

az storage account blob-service-properties update \
  --account-name kvstgprod \
  --resource-group rg-storage-prod \
  --enable-change-feed true \
  --change-feed-retention-days 90

What to know:

# Find versions of a blob
az storage blob list \
  --account-name kvstgprod --container-name app-data \
  --prefix config.json --include v \
  --query "[].{name:name, versionId:versionId, current:isCurrentVersion}" -o table

# Restore a specific version by copying it over the current blob
az storage blob copy start \
  --account-name kvstgprod \
  --destination-container app-data --destination-blob config.json \
  --source-uri "https://kvstgprod.blob.core.windows.net/app-data/config.json?versionid=2026-06-01T10:15:30.1234567Z"

4. Soft delete for blobs and containers

Soft delete is the seatbelt. With it enabled, a deleted blob (or an overwritten one, if versioning is off) is retained in a recoverable state for a retention window instead of being purged. There are two independent soft-delete features, and you want both:

# Blob-level soft delete: recovers individual deleted/overwritten blobs
az storage account blob-service-properties update \
  --account-name kvstgprod \
  --resource-group rg-storage-prod \
  --enable-delete-retention true \
  --delete-retention-days 14

# Container-level soft delete: recovers an entire deleted container
az storage account blob-service-properties update \
  --account-name kvstgprod \
  --resource-group rg-storage-prod \
  --enable-container-delete-retention true \
  --container-delete-retention-days 14

The distinction matters: blob soft delete does not save you if someone deletes the whole container — the blobs go with it. Container soft delete covers that, but only the container as a unit (you cannot restore a single blob from a soft-deleted container; you restore the container, then recover blobs).

Choosing a retention window:

Recover a soft-deleted blob:

az storage blob undelete \
  --account-name kvstgprod \
  --container-name app-data \
  --name important.parquet

Soft delete protects against deletion, not against a malicious actor with permission to change the retention setting. Lock down Microsoft.Storage/storageAccounts/blobServices/write with Azure RBAC and a deny assignment or resource lock so the seatbelt cannot be quietly unbuckled.

5. Point-in-time restore (PITR)

PITR restores a set of block blobs (by container/prefix) to their state at a chosen timestamp in the past. It is your “undo the last hour across thousands of objects” button — exactly what you reach for after a bad bulk job. It works by reading the change feed and reverting via versions, which is why both are hard prerequisites.

The dependency chain, enabled in this order:

  1. Blob versioning on.
  2. Change feed on.
  3. Blob soft delete on.
  4. PITR on, with restore-days strictly less than the soft-delete retention.

az storage account blob-service-properties update \
  --account-name kvstgprod \
  --resource-group rg-storage-prod \
  --enable-restore-policy true \
  --restore-days 13

The constraint that fails deployments: restore-days must be less than delete-retention-days. Set soft delete to 14 and PITR to 13, not 14. PITR cannot restore past the soft-delete horizon because the deleted blobs it needs would already be purged.
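The constraint is cheap to enforce in a deployment script before the az call ever runs. A minimal sketch, assuming you hold both values in variables; check_pitr_window is a hypothetical helper, not an Azure CLI command.

```shell
# Hypothetical guard: fail fast if the PITR window does not fit
# strictly inside the blob soft-delete retention window.
check_pitr_window() {
  local restore_days="$1" delete_retention_days="$2"
  if [ "$restore_days" -ge 1 ] && [ "$restore_days" -lt "$delete_retention_days" ]; then
    return 0
  fi
  echo "restore-days ($restore_days) must be at least 1 and less than delete-retention-days ($delete_retention_days)" >&2
  return 1
}

# 13 inside 14 is valid; 14 inside 14 would be rejected.
check_pitr_window 13 14 && echo "PITR window ok"
```

Running the guard in CI turns a failed deployment into a failed check with a readable message.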

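Run a restore (this reverts every blob under the orders/ prefix to its state at the given UTC timestamp). Each range is a start and end path of the form container/blob; the start is inclusive and the end is exclusive, so orders0 (0 is the character after / in sort order) bounds the whole prefix:

az storage blob restore \
  --account-name kvstgprod \
  --resource-group rg-storage-prod \
  --time-to-restore "2026-06-08T08:00:00Z" \
  --blob-range app-data/orders/ app-data/orders0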
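A restore range is a pair of blob paths compared lexicographically, with the end treated as exclusive, so restoring "everything under a prefix" needs an upper bound that sorts just past the prefix. One way to derive it is to increment the last character of the prefix. The sketch below does that with a hypothetical helper (prefix_range_end); it assumes a non-empty ASCII prefix whose last character is not the highest ASCII code.

```shell
# Hypothetical helper: print the exclusive upper bound for a prefix range
# by bumping the prefix's last character to the next ASCII code.
prefix_range_end() {
  local prefix="$1" head last code next
  [ -n "$prefix" ] || { echo "prefix must not be empty" >&2; return 1; }
  head="${prefix%?}"                  # everything except the last character
  last="${prefix#"$head"}"            # the last character
  code=$(printf '%d' "'$last")        # its ASCII code
  next=$(printf "\\$(printf '%03o' $((code + 1)))")   # next character, via octal escape
  printf '%s%s\n' "$head" "$next"
}

prefix_range_end "orders/"    # prints orders0
```

Prefix the result with the container name to form the end of the range, for example app-data/orders0.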

Operational limits worth knowing:

6. Immutable storage: time-based retention and legal hold

Immutability is a different concern from recovery. It is WORM (Write Once, Read Many): once set, data cannot be modified or deleted by anyone — not an admin, not the subscription owner — until the policy releases it. This is what satisfies SEC 17a-4, FINRA, CFTC, and similar regulatory retention mandates.

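There are two policy types, and they compose: a time-based retention policy keeps data immutable for a fixed interval, and a legal hold keeps it immutable until the hold is explicitly cleared. A container can carry both at once.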

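Immutability policies are scoped to a container or, with version-level immutability (version-level WORM), to individual blob versions. Version-level immutability requires blob versioning on the account; support for it is then enabled per container, or for the whole account, but only at account creation. For per-blob control, turn on versioning and create the container with version-level support:

# Prerequisite for version-level immutability: blob versioning on the account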
az storage account blob-service-properties update \
  --account-name kvstgprod \
  --resource-group rg-storage-prod \
  --enable-versioning true

# Container with version-level immutability support enabled
az storage container-rm create \
  --storage-account kvstgprod \
  --resource-group rg-storage-prod \
  --name compliance-archive \
  --enable-vlw true

Set an unlocked time-based policy (5 years) on the container so you can test before committing:

az storage container immutability-policy create \
  --account-name kvstgprod \
  --resource-group rg-storage-prod \
  --container-name compliance-archive \
  --period 1825 \
  --allow-protected-append-writes true

Apply a legal hold:

az storage container legal-hold set \
  --account-name kvstgprod \
  --resource-group rg-storage-prod \
  --container-name compliance-archive \
  --tags "litigation-2026-0481"
# Clear it when released:
az storage container legal-hold clear \
  --account-name kvstgprod --resource-group rg-storage-prod \
  --container-name compliance-archive --tags "litigation-2026-0481"

allow-protected-append-writes is the pragmatic flag for log/append workloads: it lets you keep appending to existing append blobs while still blocking overwrites and deletes of committed data. Without it, even an append is rejected once the policy is on.

7. Locking policies and the implications for deletion

An unlocked time-based policy can be edited or deleted by an authorized user — it is for testing and ramp-up. A locked policy is the real compliance control, and locking is irreversible.

# Lock the policy. This requires the policy's current etag and CANNOT be undone.
ETAG=$(az storage container immutability-policy show \
  --account-name kvstgprod --resource-group rg-storage-prod \
  --container-name compliance-archive --query etag -o tsv)

az storage container immutability-policy lock \
  --account-name kvstgprod --resource-group rg-storage-prod \
  --container-name compliance-archive \
  --if-match "$ETAG"

What locking means in practice — read this twice before you run it in production:

Treat locking like signing a contract. In code review, a PR that locks an immutability policy should require the same scrutiny as one that deletes a database — because it is just as irreversible, in the opposite direction.

8. Validating recovery and estimating cost

Protection you have not exercised is a hypothesis. Two things to actually do.

Validate the recovery paths in a non-prod account that mirrors prod settings:

# 1) Soft delete works: delete then undelete, confirm content matches.
az storage blob delete --account-name kvstgnonprod -c test -n probe.txt
az storage blob undelete --account-name kvstgnonprod -c test -n probe.txt

# 2) Versioning works: overwrite, list versions, restore prior, diff.
az storage blob list --account-name kvstgnonprod -c test \
  --prefix probe.txt --include v -o table

# 3) PITR works: write garbage, restore to a pre-garbage timestamp, verify.
# The range test/ .. test0 covers every blob in the "test" container (end is exclusive).
az storage blob restore --account-name kvstgnonprod -g rg-nonprod \
  --time-to-restore "2026-06-08T08:00:00Z" \
  --blob-range test/ test0

# 4) Immutability works: confirm a delete is REJECTED (this should fail).
az storage blob delete --account-name kvstgnonprod -c compliance-archive -n locked.bin

The fourth check is the important one: a passing immutability test is a failed delete. If that delete succeeds, your WORM control is not actually protecting anything.
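Because the passing outcome is a non-zero exit code, the check is easy to get backwards in a script. A small wrapper that inverts the result keeps the intent explicit. This is a sketch with a hypothetical helper name (expect_failure); note that it treats any failure as a pass, including an auth error or a typo, so it is worth confirming once by hand that the rejection really comes from the immutability policy.

```shell
# Hypothetical helper: succeed only if the wrapped command fails.
expect_failure() {
  if "$@" >/dev/null 2>&1; then
    echo "FAIL: command succeeded but should have been rejected: $*" >&2
    return 1
  fi
  echo "PASS: command was rejected as expected"
}

# Intended use against the non-prod account from the checks above:
#   expect_failure az storage blob delete \
#     --account-name kvstgnonprod -c compliance-archive -n locked.bin
expect_failure false   # prints the PASS line, because `false` always fails
```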

Estimate the protection cost before you flip everything on. The cost drivers, in rough order of impact:

Audit current write and delete activity with a log query in Log Analytics (this assumes a diagnostic setting on the account is sending StorageBlobLogs to the workspace):

StorageBlobLogs
| where TimeGenerated > ago(7d)
| where OperationName in ("PutBlob", "PutBlock", "DeleteBlob", "CopyBlob")
| summarize Operations = count(), Bytes = sum(RequestBodySize)
    by OperationName, bin(TimeGenerated, 1d)
| order by TimeGenerated desc

Use the operation/byte profile to model what versioning and soft delete will add: roughly, extra storage ≈ (overwrite + delete volume) × retention/version lifetime. If the number is uncomfortable, tighten the lifecycle version.delete threshold rather than disabling protection.
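As a worked instance of that estimate, the sketch below multiplies daily churn by the retention window to get the steady-state extra storage. The helper name (estimate_extra_gb) and the input figures are illustrative assumptions, not measurements, and whole gigabytes are used so plain shell arithmetic is enough.

```shell
# Hypothetical estimator: steady-state extra storage held by versions
# and soft-deleted data, in GB.
#   $1 = average overwritten + deleted volume per day (GB)
#   $2 = retention window or version lifetime (days)
estimate_extra_gb() {
  local daily_churn_gb="$1" retention_days="$2"
  echo $((daily_churn_gb * retention_days))
}

estimate_extra_gb 12 30    # prints 360
estimate_extra_gb 12 365   # prints 4380
```

With the same illustrative churn, moving a version.delete threshold from 365 days to 30 takes the retained volume from 4380 GB to 360 GB, which is the lever the paragraph above recommends pulling first.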

Verify

Confirm the full stack with read-only commands. This is the post-deploy gate.

# Service-level protection settings in one shot
az storage account blob-service-properties show \
  --account-name kvstgprod --resource-group rg-storage-prod \
  --query "{ versioning: isVersioningEnabled, \
             changeFeed: changeFeed.enabled, \
             blobSoftDelete: deleteRetentionPolicy.enabled, \
             blobSoftDeleteDays: deleteRetentionPolicy.days, \
             containerSoftDelete: containerDeleteRetentionPolicy.enabled, \
             pitr: restorePolicy.enabled, \
             pitrDays: restorePolicy.days }" -o json

# Lifecycle policy present and enabled
az storage account management-policy show \
  --account-name kvstgprod --resource-group rg-storage-prod \
  --query "policy.rules[].{name:name, enabled:enabled}" -o table

# Immutability policy state on the compliance container
az storage container immutability-policy show \
  --account-name kvstgprod --resource-group rg-storage-prod \
  --container-name compliance-archive \
  --query "{ state: state, period: immutabilityPeriodSinceCreationInDays }" -o json

Expected results: versioning/change feed/both soft deletes/PITR all true; pitrDays strictly less than blobSoftDeleteDays; the lifecycle rule enabled; the immutability policy state reading Locked (not Unlocked) for anything in production compliance scope.

Enterprise scenario

A capital-markets platform team running trade-confirmation archives on Blob Storage had a clean SEC 17a-4 design: a compliance-archive container with a locked, 7-year time-based immutability policy. Months later their FinOps automation flagged the account: storage was growing ~4% a month with no corresponding business growth. The cause was a lifecycle rule meant to expire confirmations after 7 years — it was firing daily, attempting delete on blobs that were still inside their 7-year immutable window, silently failing on every run, and doing nothing while the data accumulated. Worse, a parallel team had pointed a high-churn manifest writer at the same account with versioning on and no version.delete rule, so every manifest rewrite was minting a permanent version.

The constraint was hard: the immutable policy was locked, so they could not shorten retention or delete anything early — that was non-negotiable and, legally, exactly correct. The fix was twofold. First, they reconciled the lifecycle delete threshold with the WORM period so the expiry rule only targeted blobs past their 7-year interval, ending the daily no-op failures and letting genuinely-expired data clear. Second, they isolated the high-churn manifests into a separate, non-immutable account and added a tight version-cleanup rule. The version rule that stopped the bleed:

{
  "rules": [
    {
      "enabled": true,
      "name": "cap-manifest-versions",
      "type": "Lifecycle",
      "definition": {
        "filters": { "blobTypes": [ "blockBlob" ], "prefixMatch": [ "manifests/" ] },
        "actions": {
          "version": {
            "tierToCool": { "daysAfterCreationGreaterThan": 7 },
            "delete":     { "daysAfterCreationGreaterThan": 30 }
          }
        }
      }
    }
  ]
}

The lessons that stuck: a lifecycle delete and a locked WORM policy will collide, and the collision is silent — the delete just fails and your bill keeps climbing; and immutable compliance data and high-churn operational data do not belong in the same account, because the protection settings you want for one are wrong for the other.
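Because the collision is silent at runtime, it is better caught at review time by comparing the two numbers directly. A minimal sketch, assuming both periods are expressed in days; check_delete_vs_worm is a hypothetical helper. The comparison is approximate, since the lifecycle rule may count from last modification while the immutability interval counts from creation.

```shell
# Hypothetical review-time check: a lifecycle delete should not fire
# while blobs are still inside the immutability interval.
#   $1 = lifecycle delete threshold (days)
#   $2 = immutability (WORM) period (days)
check_delete_vs_worm() {
  local lifecycle_delete_days="$1" worm_days="$2"
  if [ "$lifecycle_delete_days" -ge "$worm_days" ]; then
    echo "ok"
    return 0
  fi
  echo "lifecycle delete at ${lifecycle_delete_days}d fires inside the ${worm_days}d immutable window" >&2
  return 1
}

check_delete_vs_worm 2555 2555   # prints ok (7 years against 7 years)
```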

Checklist
