Standard Key Vault gives you a multi-tenant, HSM-backed key store that Microsoft operates for you. That is the right tool for most workloads, and the wrong tool when your threat model says Microsoft operators must never be able to touch your key material, when a regulator demands a single-tenant HSM at FIPS 140-3 Level 3, or when you want a key to be physically unusable outside a CPU you have cryptographically verified. That is the territory of Azure Managed HSM with Secure Key Release (SKR): a key that exists only inside dedicated HSM hardware and is released, wrapped, only to a confidential VM whose AMD SEV-SNP report you have attested.
This is an expert path involving a quorum key ceremony, an offline security domain, and an attestation pipeline. Read it end to end before provisioning anything - some decisions here are irreversible.
1. When you actually need Managed HSM
Reach for Managed HSM, not the Premium SKU of Key Vault, when at least one of these is true:
- Single-tenant isolation. Managed HSM gives you a pool of dedicated HSM partitions. No other Azure customer shares the cryptographic boundary. The Premium vault is multi-tenant HSM-backed.
- FIPS 140-3 Level 3. Managed HSM is validated at FIPS 140-3 Level 3. (Key Vault Premium is FIPS 140-2 Level 3.) If a control framework names Level 3 explicitly, this is the line.
- You own the security domain. Activation produces a security domain encrypted to your RSA keys under a quorum (M of N). Microsoft cannot decrypt it. This is the property that lets you assert operator exclusion.
- Secure Key Release. Both Premium Key Vault and Managed HSM support SKR, but when SKR is the whole point and the keys are high-value, the single-tenant boundary is what auditors want to see.
Cost and operational weight are real. A Managed HSM pool is billed per hour regardless of key count and obligates you to custody of the security domain backup. Do not provision one to hold three secrets. Provision one to be the root of trust for an estate.
2. Provision and activate the pool
Managed HSM has two phases: the resource is provisioned (created in ARM, but cryptographically inert) and then activated (the security domain is generated and downloaded). Between those two steps you supply the quorum of RSA public keys.
Create the pool. The --administrators flag takes Entra object IDs that become the initial local RBAC administrators. --retention-days sets the soft-delete window and cannot be lowered after creation.
az keyvault create --hsm-name kv-mhsm-prod \
--resource-group rg-security-core \
--location eastus2 \
--retention-days 90 \
--administrators "$(az ad signed-in-user show --query id -o tsv)"
Provisioning returns once the pool exists, but the pool reports its security domain as not activated and will not serve key operations until it is. Generate at least three RSA key pairs (the service accepts up to ten) for the quorum holders. In production these private keys live on separate hardware tokens held by separate humans; here we generate them for illustration.
for i in 1 2 3; do
openssl req -newkey rsa:2048 -nodes -keyout sd-key-$i.key \
-x509 -days 3650 -subj "/CN=sd-holder-$i" -out sd-cert-$i.cer
done
Activate with a quorum. --sd-quorum 2 means any two of the three certificate holders must cooperate to ever decrypt the security domain.
az keyvault security-domain download --hsm-name kv-mhsm-prod \
--sd-wrapping-keys sd-cert-1.cer sd-cert-2.cer sd-cert-3.cer \
--sd-quorum 2 \
--security-domain-file kv-mhsm-prod-SD.json
The downloaded kv-mhsm-prod-SD.json is the security domain, encrypted such that 2-of-3 private keys are required to recover it. This file plus the private keys are the only way to recover the HSM into a new pool after a disaster. Lose them and the keys are gone permanently - that is the design, not a bug.
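Before moving on, it is worth confirming that activation actually took effect. The sketch below wraps the status query in two small functions. The helper names are hypothetical, and the property path properties.securityDomainProperties.activationStatus and the value Active are assumptions based on the ARM resource shape, so check both against the output of your CLI version.

```bash
# Hypothetical helpers (not part of the Azure CLI).

# Print the security-domain activation status of a Managed HSM pool.
# Assumption: the status lives at properties.securityDomainProperties.activationStatus.
mhsm_activation_status() {
  az keyvault show --hsm-name "$1" \
    --query "properties.securityDomainProperties.activationStatus" -o tsv
}

# Succeed only when the pool reports "Active" (assumed value for an activated pool).
mhsm_is_activated() {
  [ "$(mhsm_activation_status "$1")" = "Active" ]
}

# Usage:
#   mhsm_is_activated kv-mhsm-prod && echo "security domain is active"
```

Failing closed on any value other than Active keeps a pipeline from creating keys against a pool that is still inert.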
3. Design the quorum and key ceremony
The quorum is a governance decision, not a technical default. Get it wrong and you either cannot recover (quorum too high, holders unavailable) or you have weak custody (quorum too low). A workable pattern for a regulated estate:
| Role | Count (N) | Quorum (M) | Custody |
|---|---|---|---|
| Security domain holders | 5 | 3 | FIPS-validated tokens, 3 sites |
| HSM administrators (RBAC) | 3-4 | n/a | Break-glass + PIM-elevated |
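The trade-off behind the holder row is simple arithmetic: with N holders and a quorum of M, recovery survives the loss of N - M holders, while up to M - 1 holders acting together still cannot decrypt the security domain. A minimal sketch for sanity-checking a proposed split before the ceremony follows. quorum_check is a hypothetical helper, and the bounds it enforces (3 to 10 keys, quorum of at least 2) are assumed service limits that you should confirm against current documentation.

```bash
# Hypothetical helper: validate an M-of-N split and print what it tolerates.
# Assumed limits: 3 <= N <= 10 and 2 <= M <= N.
quorum_check() {
  local n="$1" m="$2"
  if [ "$n" -lt 3 ] || [ "$n" -gt 10 ]; then
    echo "invalid: N must be between 3 and 10"
    return 1
  fi
  if [ "$m" -lt 2 ] || [ "$m" -gt "$n" ]; then
    echo "invalid: M must be between 2 and N"
    return 1
  fi
  echo "ok: tolerates $((n - m)) unavailable holders; up to $((m - 1)) colluding holders cannot decrypt"
}

quorum_check 5 3
quorum_check 3 2
```

For the 3-of-5 row above, two holders can be unreachable and recovery still works, which is the margin to weigh against site outages and staff turnover.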
Run the activation as a formal ceremony: witnessed, scripted, with each private key generated on its holder’s token and never exported. Record certificate thumbprints in your CMDB. The output you protect forever is the SD file plus the holders’ private keys, stored separately so no single location can both decrypt and reach the file.
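Recording thumbprints is easy to script so the ceremony log and the CMDB entry come from the same source. The sketch below assumes PEM-encoded certificates like the ones generated earlier; sd_thumbprints is a hypothetical helper name, and SHA-256 is a choice made here rather than a requirement of the service.

```bash
# Hypothetical helper: print "<SHA-256 thumbprint>  <file>" for each certificate.
sd_thumbprints() {
  local cert fp
  for cert in "$@"; do
    # openssl prints e.g. "sha256 Fingerprint=AA:BB:..."; fail if the file is unreadable.
    fp=$(openssl x509 -in "$cert" -noout -fingerprint -sha256) || return 1
    # Keep only the hex digits after "=" and drop the colons.
    printf '%s  %s\n' "$(printf '%s' "$fp" | cut -d= -f2 | tr -d ':')" "$cert"
  done
}

# Usage:
#   sd_thumbprints sd-cert-1.cer sd-cert-2.cer sd-cert-3.cer > sd-thumbprints.txt
```

Comparing these values again at restore time is a cheap way to confirm that the certificates presented are the ones used at activation.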
After activation, authorize the data plane through Managed HSM local RBAC. Managed HSM does not use Azure RBAC for data operations - it has its own built-in role model evaluated by the HSM itself. Assign the narrowest role that works. Managed HSM Crypto User covers day-to-day key management and use (create, import, sign, wrap) but cannot purge keys or change role assignments; Managed HSM Crypto Service Release User can only release keys under a release policy; managing role assignments is reserved for the Administrator, Crypto Officer, and Policy Administrator roles.
# Workload identity that only releases keys under SKR, never manages them
az keyvault role assignment create --hsm-name kv-mhsm-prod \
--role "Managed HSM Crypto Service Release User" \
--assignee "<app-managed-identity-object-id>" \
--scope /keys
# Separate identity allowed to create and import keys (BYOK)
az keyvault role assignment create --hsm-name kv-mhsm-prod \
--role "Managed HSM Crypto User" \
--assignee "<key-import-pipeline-object-id>" \
--scope /keys
4. Import on-prem keys with BYOK and verify provenance
If a key was generated on your on-premises HSM and policy says it must never have existed in software, you import it under wrap (BYOK) so the plaintext key never transits Azure unencrypted. The flow:
- Download the Key Exchange Key (KEK) public key from the target HSM. The KEK is an RSA-HSM key generated inside the Managed HSM with import in its key operations.
- On your on-prem HSM, wrap the target key to that KEK using the vendor’s BYOK tool, producing a Key Transfer Blob.
- Upload the blob. The Managed HSM unwraps it inside the cryptographic boundary.
Create the KEK inside the HSM first - note it is hardware-backed (RSA-HSM) and marked for import:
az keyvault key create --hsm-name kv-mhsm-prod \
--name byok-kek --kty RSA-HSM --size 4096 \
--ops import
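The first step of the flow, downloading the KEK public key, can be sketched as follows. download_kek_public and the output file name are hypothetical; the underlying command is az keyvault key download, which exports only the public half of an HSM-backed key.

```bash
# Hypothetical wrapper: export the KEK's public key as PEM for the vendor BYOK tool.
download_kek_public() {
  local hsm="$1" kek="$2" out="$3"
  az keyvault key download --hsm-name "$hsm" --name "$kek" \
    --encoding PEM --file "$out"
}

# Usage:
#   download_kek_public kv-mhsm-prod byok-kek byok-kek.publickey.pem
```

Carry the resulting PEM file to the on-premises HSM over whatever channel your ceremony allows; it is public material, but its integrity matters because the target key will be wrapped to it.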
Your HSM vendor’s BYOK tooling consumes the KEK public key and emits a transfer blob (key-transfer.byok). Import it:
az keyvault key import --hsm-name kv-mhsm-prod \
--name payments-wrap-key \
--byok-file key-transfer.byok
To prove provenance to an auditor, request the key’s attestation - Managed HSM can return a signed statement, chained to the Microsoft HSM vendor root, asserting the key is non-exportable and resident in the HSM. Pull it and validate the certificate chain offline:
az keyvault key get-attestation --hsm-name kv-mhsm-prod \
--name payments-wrap-key \
--file payments-wrap-key.attest
The attestation bundle contains the certificate chain plus the attestation blobs. Validate the chain against the vendor root certificate published by Microsoft; that is what turns “we promise it’s in the HSM” into a verifiable claim.
5. Configure the secure key release policy
This is the heart of the design. A key marked exportable with a release policy can be exported - but only when the caller presents an attestation token from Microsoft Azure Attestation (MAA) whose claims satisfy the policy. The key leaves the HSM only as a wrapped blob bound to that attested environment.
Write the policy as a JSON document in the key release policy grammar. The authority must match your MAA instance, and allOf/anyOf express the claim conditions. A minimal policy that gates on a SEV-SNP confidential VM:
{
"version": "1.0.0",
"anyOf": [
{
"authority": "https://sharedeus2.eus2.attest.azure.net",
"allOf": [
{
"claim": "x-ms-isolation-tee.x-ms-attestation-type",
"equals": "sevsnpvm"
},
{
"claim": "x-ms-isolation-tee.x-ms-compliance-status",
"equals": "azure-compliant-cvm"
}
]
}
]
}
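Because a mismatched authority is an easy mistake to make, a quick local check before attaching the policy can save a failed release later. The sketch below uses plain grep and sed so it runs without jq; both function names are hypothetical, and it assumes the policy file is formatted like the example above, with each authority given as a simple quoted string.

```bash
# Hypothetical helpers for checking a release policy file locally.

# List every "authority" value found in the policy file.
policy_authorities() {
  grep -o '"authority"[[:space:]]*:[[:space:]]*"[^"]*"' "$1" |
    sed 's/.*"\([^"]*\)"$/\1/'
}

# Succeed only if the given MAA endpoint (trailing slash ignored) is listed.
policy_trusts_maa() {
  local want="${2%/}"
  policy_authorities "$1" | sed 's:/$::' | grep -Fxq "$want"
}

# Usage:
#   policy_trusts_maa skr-policy.json https://sharedeus2.eus2.attest.azure.net \
#     || echo "policy does not name this MAA instance"
```

This only checks the text of the file; the HSM remains the component that decides whether a token from that authority satisfies the policy.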
Create the releasable key with that policy attached. The --exportable flag is meaningless without --policy; together they mean “exportable only under attestation.”
az keyvault key create --hsm-name kv-mhsm-prod \
--name cvm-data-key --kty RSA-HSM --size 3072 \
--exportable true \
--policy @skr-policy.json
The x-ms-compliance-status claim with value azure-compliant-cvm is doing heavy lifting: it asserts MAA validated the SEV-SNP report against Azure’s baseline (genuine AMD silicon, expected firmware, secure boot state). Pin additional claims when you need to tie release to one exact workload: x-ms-sevsnpvm-hostdata to bind to a specific guest image measurement, or x-ms-sevsnpvm-bootloader-svn to require a specific firmware version. The conditions shown here match with equals only, so a version pin is an exact value rather than a minimum and has to be updated when the platform moves forward.
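Before pinning extra claims, it helps to look at what a real token from your environment contains. The sketch below decodes the payload segment of a JWT using only coreutils. jwt_payload is a hypothetical helper, and it does not verify the signature, so treat it as an inspection aid rather than a trust decision.

```bash
# Hypothetical helper: print the decoded JSON payload of a JWT (no signature check).
jwt_payload() {
  local seg pad
  # Take the middle segment and map base64url characters back to standard base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the "=" padding that base64url encoding strips.
  pad=$(( (4 - ${#seg} % 4) % 4 ))
  while [ "$pad" -gt 0 ]; do
    seg="${seg}="
    pad=$((pad - 1))
  done
  printf '%s' "$seg" | base64 -d
}

# Usage (MAA_TOKEN holds a token obtained inside the guest):
#   jwt_payload "$MAA_TOKEN"
```

Copy claim values for the policy from this output rather than computing them yourself, since the policy is matched against what MAA reports.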
6. Release the key inside a confidential VM
On the confidential VM (a DCasv5/ECasv5-series SEV-SNP guest), the workload obtains an MAA token and exchanges it for the wrapped key. The clean path uses the Azure SKR tooling, which talks to the in-guest attestation client to fetch a fresh SEV-SNP report and have MAA sign it.
Boot a confidential VM with a Microsoft-defaulted attestation configuration:
az vm create --resource-group rg-confidential \
--name cvm-payments-01 \
--image "Canonical:ubuntu-24_04-lts:cvm:latest" \
--size Standard_DC4as_v5 \
--security-type ConfidentialVM \
--enable-vtpm true --enable-secure-boot true \
--os-disk-security-encryption-type DiskWithVMGuestState \
--admin-username azureuser --generate-ssh-keys
Inside the guest, perform the release. The AzureAttestSKR helper (shipped in Microsoft’s confidential-computing CVM guest-attestation repo) fetches the report, gets the MAA token, calls the /keys/{name}/release endpoint, and returns the key:
sudo ./AzureAttestSKR \
-a https://sharedeus2.eus2.attest.azure.net \
-k https://kv-mhsm-prod.managedhsm.azure.net/keys/cvm-data-key \
-c imds \
-u # unwrap into the guest TEE
If you are wiring this into your own service, the data-plane call is the REST release operation with the MAA JWT in the body as target. The caller still authenticates to the HSM with an Entra access token and needs a role that permits release; the HSM then re-validates the MAA token’s signature and claims against the key’s policy before responding with the wrapped key:
POST https://kv-mhsm-prod.managedhsm.azure.net/keys/cvm-data-key/release?api-version=7.4
Authorization: Bearer <entra-access-token>
Content-Type: application/json
{ "target": "<MAA-attestation-jwt>" }
The response is the key wrapped to the attested environment. A request from a non-attested host, or one whose SEV-SNP report fails MAA validation, never gets a usable key - the HSM refuses the release.
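For a service that makes the call itself, the release request can be sketched with curl. release_key is a hypothetical helper that takes the key URL, the MAA JWT, and an Entra access token for the HSM as arguments; how you obtain that access token (managed identity in the guest, for example) is outside this sketch.

```bash
# Hypothetical helper: call the release operation and print the raw JSON response.
#   $1 = key URL, e.g. https://kv-mhsm-prod.managedhsm.azure.net/keys/cvm-data-key
#   $2 = MAA attestation JWT
#   $3 = Entra access token for the Managed HSM data plane
release_key() {
  local key_url="$1" maa_jwt="$2" access_token="$3"
  # A JWT contains only base64url characters and dots, so it is safe to embed
  # in the JSON body without further escaping.
  curl -sS -X POST "${key_url}/release?api-version=7.4" \
    -H "Authorization: Bearer ${access_token}" \
    -H "Content-Type: application/json" \
    -d "{\"target\":\"${maa_jwt}\"}"
}

# Usage:
#   release_key https://kv-mhsm-prod.managedhsm.azure.net/keys/cvm-data-key \
#     "$MAA_TOKEN" "$HSM_ACCESS_TOKEN"
```

Log the HTTP error body on failure; it is usually the fastest way to tell an authorization problem from a policy mismatch.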
7. Lock down the perimeter
The cryptographic controls above are worthless if the management and data planes are reachable from anywhere. Three controls, all mandatory for production:
Private endpoint. Put the HSM behind Private Link and set public network access to disabled so the data plane is only reachable from your VNets.
az network private-endpoint create --name pe-mhsm \
--resource-group rg-security-core \
--vnet-name vnet-hub --subnet snet-privatelink \
--private-connection-resource-id "$(az keyvault show --hsm-name kv-mhsm-prod --query id -o tsv)" \
--group-id managedhsm \
--connection-name mhsm-conn
az keyvault update-hsm --hsm-name kv-mhsm-prod \
--resource-group rg-security-core \
--public-network-access Disabled
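One way to confirm the perimeter from a VM inside the VNet is to check that the HSM hostname resolves to a private address. The sketch below is a rough check under stated assumptions: the helper names are hypothetical, it only recognizes RFC 1918 IPv4 ranges, and it assumes name resolution for the private endpoint is already wired up (typically through a private DNS zone linked to the VNet).

```bash
# Hypothetical helpers for a quick private-endpoint resolution check.

# Succeed if the argument is an RFC 1918 private IPv4 address.
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeed if the first address returned for the hostname is private.
resolves_privately() {
  local ip
  ip=$(getent hosts "$1" | awk '{print $1; exit}')
  [ -n "$ip" ] && is_private_ipv4 "$ip"
}

# Usage (run from a VM in the VNet):
#   resolves_privately kv-mhsm-prod.managedhsm.azure.net || echo "not private"
```

A public address here means clients are bypassing the private endpoint, which with public access disabled shows up as connection failures rather than a security hole, but it is still worth catching early.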
Purge protection. With purge protection on, even a soft-deleted HSM cannot be permanently removed before the retention window elapses - this defeats a ransomware-style “delete the keys” attack. It is irreversible once enabled.
az keyvault update-hsm --hsm-name kv-mhsm-prod \
--resource-group rg-security-core \
--enable-purge-protection true
Least-privilege RBAC. Keep the administrator role count tiny and behind PIM. Crypto Users create, import, and rotate keys; Crypto Officers manage role assignments and purge or recover deleted keys; the SKR workload needs only Crypto Service Release User. No identity should hold both Administrator and Crypto Officer in steady state.
8. Operational runbook
Treat the HSM as a system with a lifecycle, not a vault you fill and forget.
Full backup. Back up the entire HSM (all keys, versions, and RBAC) on a schedule to a storage container the HSM identity can write to:
az keyvault backup start --hsm-name kv-mhsm-prod \
--blob-container-name mhsm-backups \
--storage-account-name stsecbackups \
--use-managed-identity true
Disaster recovery. Recovery into a brand-new pool in another region requires the security domain file and the quorum of private keys - this is the only path, which is why custody of those artifacts is the single most important control you own. Provision a fresh pool, then upload the SD with the quorum to reconstitute it.
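The security domain upload can be sketched as two CLI steps against the freshly provisioned, not yet activated pool: request an exchange key from the new pool, then upload the SD file together with a quorum of holder private keys. recover_security_domain and the exchange-key file name are hypothetical, and passing key files on the command line assumes the illustrative software keys from earlier rather than token-resident keys.

```bash
# Hypothetical helper: reconstitute a security domain in a new pool.
#   $1 = name of the new (not yet activated) pool
#   $2 = security domain file downloaded at original activation
#   $3.. = at least a quorum of the holders' private key files
recover_security_domain() {
  local hsm="$1" sd_file="$2"
  shift 2
  local exchange_key="${hsm}-SDE.cer"

  # Step 1: the new pool issues an exchange key used to protect the upload.
  az keyvault security-domain init-recovery --hsm-name "$hsm" \
    --sd-exchange-key "$exchange_key" || return 1

  # Step 2: upload the SD, re-encrypted to that exchange key with the quorum keys.
  az keyvault security-domain upload --hsm-name "$hsm" \
    --sd-exchange-key "$exchange_key" \
    --sd-file "$sd_file" \
    --sd-wrapping-keys "$@"
}

# Usage:
#   recover_security_domain kv-mhsm-dr kv-mhsm-prod-SD.json sd-key-1.key sd-key-2.key
```

After the upload succeeds, the keys themselves still have to come back from a full backup, so rehearse both halves together in a non-production pool.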
Key lifecycle. SKR keys should rotate on a defined cadence. A new version inherits the release policy; old versions stay readable for in-flight unwraps until you disable them. Drive this with a rotation policy and watch the audit log.
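A rotation policy can be expressed as a small JSON document and applied with the CLI. The sketch below is illustrative only: the function names, the 90-day rotation interval, and the one-year expiry are assumptions chosen for the example, and you should confirm that your CLI version accepts this policy shape for Managed HSM keys.

```bash
# Hypothetical helpers for defining and applying a rotation policy.

# Write an example policy: rotate 90 days after creation, expire versions after 1 year.
write_rotation_policy() {
  cat > "$1" <<'EOF'
{
  "lifetimeActions": [
    {
      "trigger": { "timeAfterCreate": "P90D", "timeBeforeExpiry": null },
      "action": { "type": "Rotate" }
    }
  ],
  "attributes": { "expiryTime": "P1Y" }
}
EOF
}

# Apply a policy file to a key in the pool.
apply_rotation_policy() {
  local hsm="$1" key="$2" policy_file="$3"
  az keyvault key rotation-policy update --hsm-name "$hsm" \
    --name "$key" --value "$policy_file"
}

# Usage:
#   write_rotation_policy rotation-policy.json
#   apply_rotation_policy kv-mhsm-prod cvm-data-key rotation-policy.json
```

Keep the policy file in source control next to the release policy so that changes to either go through the same review.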
Enterprise scenario
A European payments platform had to satisfy a banking supervisor’s requirement that the data-encryption keys for cardholder data be (a) in a single-tenant FIPS 140-3 Level 3 HSM and (b) provably unusable by anyone other than the production tokenization service - explicitly including the cloud provider’s own operators. Their first design used Key Vault Premium with RBAC, but the audit failed on two points: the HSM was multi-tenant, and “operators cannot use the key” could not be demonstrated, only asserted.
They moved to Managed HSM and inverted the trust model with SKR. The tokenization service runs on DCasv5 SEV-SNP confidential VMs from a hardened, measured image. They generated the master key in the HSM as exportable with a release policy pinned not just to sevsnpvm and azure-compliant-cvm but to the exact image measurement via x-ms-sevsnpvm-hostdata, so only that one build of the tokenization service could ever obtain the key:
{
"claim": "x-ms-isolation-tee.x-ms-sevsnpvm-hostdata",
"equals": "9c8e...the-expected-image-measurement...f1"
}
The result satisfied the supervisor cleanly. The single-tenant Level 3 boundary covered requirement (a). For (b), the security domain (3-of-5 quorum, tokens held by the bank’s own officers across three sites) meant Microsoft mathematically could not decrypt the HSM, and the host-data-pinned release policy meant the key could be unwrapped only inside the attested, measured tokenization VMs - not by an operator, not by a rebuilt image, not from any non-confidential host. The one painful lesson: their first hostdata value was wrong because they hashed the image artifact instead of using the measurement MAA actually reports, and every release silently failed with a policy mismatch until they read the rejected token’s claims and corrected the pin.
Verify
Confirm the system behaves as designed, not just as configured.
- Activation and SD custody. az keyvault show --hsm-name kv-mhsm-prod --query properties.securityDomainProperties.activationStatus reports the pool as active; the SD file and quorum keys are in separate custody and a restore has been rehearsed in a non-prod pool.
- Release succeeds only when attested. From inside the confidential VM, AzureAttestSKR returns the key. From any non-confidential host or a CVM with a tampered image, the release call is rejected.
- Policy actually gates. Temporarily set a bogus authority in a test key’s policy and confirm release fails - proving the claims are evaluated, not ignored.
- Perimeter closed. az keyvault show --hsm-name kv-mhsm-prod --query properties.publicNetworkAccess returns Disabled; the data plane resolves only over the private endpoint.
- Provenance verifiable. The BYOK key’s attestation bundle validates against the published vendor root chain offline.