Building Enterprise PAM: Credential Vaulting, Session Brokering, and Automatic Rotation

Entra PIM activates a role for a window of time. That is identity governance, and it is necessary - but it is not privileged access management. PIM never sees the actual Domain Admin password on a domain controller, never brokers the RDP session into a tier-0 jump host, never rotates the root SSH key on a database appliance, and never records what the operator typed once they were in. A real PAM layer owns the credential and the session, not just the directory role. This guide builds that layer end to end: an HSM-backed vault with dual-control checkout, a session broker that records every keystroke and frame, just-in-time ephemeral accounts for tier-0 assets, and automated rotation with break-detection - and then hardens the PAM tier itself so it doesn’t become the single richest target in your estate.

I’ll use HashiCorp Vault Enterprise (or its open-source fork OpenBao) for the vault and Teleport for the session broker, because both are API-driven and run the same on Azure, AWS, on-prem, and air-gapped. The patterns transfer to CyberArk or Delinea if that’s your stack.

1. PAM vs PIM vs PEDM: scope a zero-standing-privilege target

Three distinct controls get conflated constantly. Naming them correctly is what makes the architecture coherent.

| Control | What it governs | Example product | What it does NOT do |
|---|---|---|---|
| PIM | Time-bound activation of a role | Entra PIM | Hold or rotate the underlying credential; record the session |
| PAM | The credential and the session | Vault, CyberArk, Teleport | Decide who is eligible (that’s PIM’s job upstream) |
| PEDM | Privilege on the endpoint (sudo/UAC elevation) | sudo policy, EPM, Defender | Broker remote access or vault a shared secret |

The target state is zero standing privilege (ZSP): no human account holds a usable tier-0 credential at rest. Every privileged action follows the same loop - request -> approve -> issue an ephemeral credential or check out a vaulted one -> broker and record the session -> revoke and rotate on exit. PIM (or your ITSM) answers “is this person eligible right now”; PAM answers “give them a credential that exists only for this session, and watch what they do with it.”
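
The loop can be read as a small state machine in which no step may be skipped. The sketch below is only an illustration of that ordering; the `State` names and the `PrivilegedRequest` class are hypothetical and are not part of Vault, Teleport, or PIM.

```python
from enum import Enum, auto


class State(Enum):
    REQUESTED = auto()
    APPROVED = auto()
    ISSUED = auto()
    IN_SESSION = auto()
    REVOKED = auto()


# Each state may only advance to the next step of the loop.
NEXT = {
    State.REQUESTED: State.APPROVED,
    State.APPROVED: State.ISSUED,
    State.ISSUED: State.IN_SESSION,
    State.IN_SESSION: State.REVOKED,
}


class PrivilegedRequest:
    """Hypothetical model of one pass through the ZSP loop."""

    def __init__(self, actor: str, asset: str) -> None:
        self.actor = actor
        self.asset = asset
        self.state = State.REQUESTED
        self.history = [State.REQUESTED]

    def advance(self, to: State) -> None:
        # Refuse to skip steps, e.g. issuing a credential before approval.
        if NEXT.get(self.state) is not to:
            raise ValueError(f"illegal transition {self.state.name} -> {to.name}")
        self.state = to
        self.history.append(to)


if __name__ == "__main__":
    req = PrivilegedRequest("alice", "dc1")
    for step in (State.APPROVED, State.ISSUED, State.IN_SESSION, State.REVOKED):
        req.advance(step)
    print([s.name for s in req.history])
```

The useful property is the refusal: a credential cannot be issued for a request that was never approved, and a session cannot end without passing through revocation.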

The mental shift: stop managing who knows the password and start managing who can ask for a credential that will not exist five minutes from now. A shared admin password known by twelve people is twelve standing risks. An ephemeral certificate valid for one 60-minute session is zero.

Define tiers before anything else (the classic Microsoft model): tier 0 = identity control plane (DCs, ADFS, PKI, the PAM system itself), tier 1 = servers and apps, tier 2 = workstations. Credentials and broker sessions must never cross tiers. A tier-2 admin workstation that can RDP to a domain controller has already collapsed the model.
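
The no-crossing rule is easy to check mechanically once you have an inventory of access paths. A minimal sketch, assuming a hypothetical `AccessPath` record that you would populate from firewall rules, broker role mappings, or logon data:

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class AccessPath:
    """Hypothetical record: one way a source machine can reach a target."""
    source: str
    source_tier: int
    target: str
    target_tier: int


def tier_violations(paths: Iterable[AccessPath]) -> List[AccessPath]:
    """Return every path whose source and target sit in different tiers."""
    return [p for p in paths if p.source_tier != p.target_tier]


if __name__ == "__main__":
    paths = [
        AccessPath("paw-t0-01", 0, "dc1", 0),        # tier-0 workstation -> DC: fine
        AccessPath("laptop-jsmith", 2, "dc1", 0),    # daily driver -> DC: collapse
    ]
    for p in tier_violations(paths):
        print(f"tier collapse: {p.source} (t{p.source_tier}) -> {p.target} (t{p.target_tier})")
```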

2. Design the vault: secret stores, dual-control checkout, HSM-backed encryption

The vault is the root of trust, so its own master key cannot live in software. Initialize Vault with an HSM (or cloud KMS / managed HSM) as the auto-unseal and seal-wrap provider. On Azure that’s Managed HSM via PKCS#11 or the azurekeyvault seal; the principle is identical on AWS CloudHSM.

# vault.hcl - HSM-backed seal (PKCS#11), Raft storage
seal "pkcs11" {
  lib            = "/usr/lib/softhsm/libsofthsm2.so"  # vendor PKCS#11 lib in prod
  slot           = "0"
  pin            = "env://HSM_PIN"
  key_label      = "vault-unseal-key"
  hmac_key_label = "vault-hmac-key"
  generate_key   = "true"
}

storage "raft" {
  path    = "/opt/vault/data"
  node_id = "vault-tier0-1"
}

listener "tcp" {
  address       = "0.0.0.0:8200"
  tls_cert_file = "/etc/vault/tls/vault.crt"
  tls_key_file  = "/etc/vault/tls/vault.key"
  tls_min_version = "tls13"
}

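# Seal wrapping adds an HSM-backed encryption layer around critical values
# (tokens, recovery keys). Vault Enterprise enables it by default when the seal
# supports it; the server-level switch is disable_sealwrap, set explicitly here
# so nobody turns it off by accident.
disable_sealwrap = false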

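Run the secret engines that match your asset classes. The rest of this guide uses four: KV v2 (mounted at kv-tier0) for static secrets that cannot be made ephemeral, the SSH engine in CA mode (ssh-ca) for Linux hosts, the database engine for just-in-time database roles, and the AD engine for Windows service accounts.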

The control that turns a vault into a PAM vault is dual-control checkout - no single operator can retrieve a tier-0 secret alone. In Vault Enterprise this is a Control Group: the read returns a wrapping token, not the secret, until a second authorizer approves.

# Policy requiring two approvers from the platform-leads group
# before a tier-0 KV secret can be unwrapped
path "kv-tier0/data/domain-admin" {
  capabilities = ["read"]
  control_group = {
    factor "two_leads" {
      identity {
        group_names = ["platform-leads"]
        approvals   = 2
      }
    }
  }
}

The requester gets back an accessor; an approver authorizes it explicitly:

# Requester reads -> gets a wrapping token + accessor, NOT the secret
vault read kv-tier0/data/domain-admin
# -> wrapping_token: hvs.CAES..., wrapping_accessor: <accessor>

# A platform lead approves that specific request
vault write sys/control-group/authorize accessor="<accessor>"

# Once approvals are met, the requester unwraps the real value
vault unwrap hvs.CAES...

Cap the wrapping-token TTL so an un-approved checkout self-destructs:

vault write sys/config/control-group max_ttl=15m
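
To make the dual-control semantics concrete, here is a toy model of the checkout in Python. It is not Vault’s implementation and calls no Vault API; the `ControlGroupRequest` class is hypothetical. It also rejects self-approval, which is a design choice of this sketch and not a statement about how Vault behaves.

```python
from datetime import datetime, timedelta, timezone


class ControlGroupRequest:
    """Toy model of a dual-control checkout; not Vault's implementation."""

    def __init__(self, requester, eligible, required, ttl, now):
        self.requester = requester
        self.eligible = set(eligible)      # who may approve (e.g. platform-leads)
        self.required = required           # distinct approvals needed
        self.expires_at = now + ttl        # mirrors the wrapping-token TTL cap
        self.approvers = set()

    def authorize(self, approver, now):
        if now >= self.expires_at:
            raise PermissionError("request expired")
        if approver == self.requester:
            raise PermissionError("requester cannot approve their own request")
        if approver not in self.eligible:
            raise PermissionError(f"{approver} is not an eligible approver")
        self.approvers.add(approver)       # a set, so repeat approvals count once

    def can_unwrap(self, now):
        # Both conditions must hold: enough distinct approvers, and not expired.
        return now < self.expires_at and len(self.approvers) >= self.required


if __name__ == "__main__":
    t0 = datetime.now(timezone.utc)
    req = ControlGroupRequest("alice", {"bob", "carol"}, 2, timedelta(minutes=15), t0)
    req.authorize("bob", t0)
    print(req.can_unwrap(t0))              # one approval is not enough
    req.authorize("carol", t0)
    print(req.can_unwrap(t0))
```

Two properties matter here: the same approver clicking twice does not count as two approvals, and an approved request still becomes useless once the TTL passes.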

3. Deploy the session broker for RDP/SSH with full recording

Direct network paths to privileged hosts are the failure. Operators must traverse a broker that authenticates them, injects the credential, and records the session - keystrokes for SSH, full frame capture for RDP. Teleport gives you SSH, Kubernetes, database, and (in its desktop service) RDP through one proxy, with recording handled by the Auth Service rather than the endpoint.

Set recording to node-sync so the recording streams node -> auth -> storage synchronously and is never written to disk on the host being administered. An attacker who compromises the target cannot reach back and alter or delete the evidence that has already been streamed off the host.

# teleport.yaml on the Auth Service
auth_service:
  enabled: true
  cluster_name: "pam.kloudvin.internal"
  # Synchronous recording, streamed off-host to durable storage
  session_recording: "node-sync"
  proxy_checks_host_keys: yes
  authentication:
    type: local
    second_factor: webauthn   # phishing-resistant MFA at the broker
    webauthn:
      rp_id: pam.kloudvin.internal

proxy_service:
  enabled: true
  public_addr: "pam.kloudvin.internal:443"
  https_keypairs:
    - cert_file: /etc/teleport/tls/proxy.crt
      key_file: /etc/teleport/tls/proxy.key

For Windows targets, enable the desktop service so RDP is brokered and recorded the same way SSH is - the operator never possesses the host’s local credentials, and the bitmap stream is captured for replay.

# teleport.yaml on the Windows Desktop Service host
windows_desktop_service:
  enabled: true
  ldap:
    addr:        "dc1.corp.kloudvin.internal:636"
    domain:      "corp.kloudvin.internal"
    username:    "CORP\\svc-teleport"
    server_name: "dc1.corp.kloudvin.internal"
  discovery:
    base_dn: "*"

Recordings are replayable per session and tied to the actor identity:

# List recorded sessions; replay one keystroke-for-keystroke / frame-for-frame
tsh recordings ls
tsh play <session-id>

4. Implement JIT account creation and ephemeral credentials for tier-0

This is where you actually delete standing privilege. Two mechanisms, depending on protocol.

SSH / Linux tier-0 - sign a certificate, don’t hand out a key. Configure the SSH CA engine; the target hosts trust the CA, and operators receive a certificate scoped to a single principal and TTL. The operator still holds a private key, but it is worthless on its own: hosts only accept it alongside a CA-signed certificate, and that certificate expires within the hour.

# Enable and configure the SSH CA
vault secrets enable -path=ssh-ca ssh
vault write ssh-ca/config/ca generate_signing_key=true

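# A role: max 1h certs, principal must be an approved admin account.
# allowed_users is an exact comma-separated list ("*" alone means any user);
# it is not glob-matched, so list each admin account explicitly.
# permit-pty is needed for an interactive shell.
vault write ssh-ca/roles/tier0-admin -<<'EOF'
{
  "key_type": "ca",
  "algorithm_signer": "rsa-sha2-256",
  "allowed_users": "adm-break-glass",
  "default_user": "adm-break-glass",
  "default_extensions": { "permit-pty": "" },
  "ttl": "60m",
  "max_ttl": "60m",
  "allow_user_certificates": true
}
EOF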

# Operator signs THEIR public key; gets a 60-minute cert, no shared secret
vault write -field=signed_key ssh-ca/sign/tier0-admin \
  public_key=@$HOME/.ssh/id_ed25519.pub valid_principals="adm-break-glass" \
  > $HOME/.ssh/id_ed25519-cert.pub

On every tier-0 host, trust the CA’s public key:

# /etc/ssh/sshd_config on the target
TrustedUserCAKeys /etc/ssh/vault_ca.pub

Windows / database tier-0 - create the account on checkout, delete it on return. Vault’s AD library and database engines generate a real, randomized credential per request and reclaim it when the lease ends.

# Database engine: a Postgres admin role that exists only for the lease
vault write database/roles/dba-jit \
  db_name=prod-pg \
  creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}' IN ROLE pg_read_all_data, pg_write_all_data;" \
  revocation_statements="DROP ROLE IF EXISTS \"{{name}}\";" \
  default_ttl="30m" max_ttl="1h"

# Checkout: a brand-new user, alive for 30 minutes, then DROPped automatically
vault read database/creds/dba-jit

The result: query pg_roles or AD an hour later and the account is gone. There is no persistent tier-0 identity for an attacker to phish, spray, or kerberoast.
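
That claim is worth checking continuously, not just once. A minimal sketch of the comparison, assuming you can export lease expiries from the vault and the current role list from the target; the function name and input shapes are hypothetical:

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Set


def lingering_accounts(
    leases: Dict[str, datetime], live_roles: Set[str], now: datetime
) -> List[str]:
    """Return JIT usernames whose lease has expired but that still exist.

    leases:     username -> lease expiry (timezone-aware datetime)
    live_roles: role or account names currently present on the target
    """
    return sorted(
        user
        for user, expires_at in leases.items()
        if expires_at <= now and user in live_roles
    )


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    leases = {
        "v-dba-jit-abc": now - timedelta(minutes=5),   # expired
        "v-dba-jit-def": now + timedelta(minutes=20),  # still valid
    }
    print(lingering_accounts(leases, {"v-dba-jit-abc", "v-dba-jit-def"}, now))
```

Any name this returns means revocation failed for that lease and a tier-0 account has outlived its session.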

5. Automated rotation with reconciliation and break-detection

Some credentials can’t be ephemeral - the local administrator on a SAN controller, a vendor appliance with one service account. For those, rotation is the control, and rotation without reconciliation is a liability: if the real password drifts from the vaulted value, the next checkout fails at exactly the wrong moment.

Use the vault to own rotation so the password is changed on the device and stored atomically:

# AD: enroll a service account; Vault rotates it on a TTL and serves checkouts
vault write ad/config \
  binddn="CN=svc-vault,OU=Service,DC=corp,DC=kloudvin,DC=internal" \
  bindpass="${VAULT_AD_BIND}" \
  url="ldaps://dc1.corp.kloudvin.internal" \
  ttl=24h max_ttl=24h

vault write ad/rotate-root          # rotate Vault's own bind account first

# Map a Vault role to the AD account it manages (UPN form)
vault write ad/roles/svc-appliance \
  service_account_name="svc-appliance@corp.kloudvin.internal" ttl=24h

vault read  ad/creds/svc-appliance  # serves current pwd; rotates per TTL

Break-detection is the piece teams skip. A reconciliation job periodically tests that the vaulted credential still authenticates against the target; a failure means out-of-band tampering or drift, and should page, not silently retry.

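#!/usr/bin/env bash
# reconcile.sh - verify vaulted creds still authenticate; alert on break
set -euo pipefail
: "${ALERT_WEBHOOK:?ALERT_WEBHOOK must be set}"   # fail loudly if alerting is unconfigured

PW=$(vault read -field=current_password ad/creds/svc-appliance)

# Simple bind as the managed account. -y reads the password from a file
# descriptor, so it never appears in the process list the way -w would.
if ! ldapwhoami -x -H ldaps://dc1.corp.kloudvin.internal \
     -D "CORP\\svc-appliance" -y <(printf '%s' "$PW") >/dev/null 2>&1; then
  echo "RECONCILE BREAK: svc-appliance vaulted credential rejected" >&2
  curl -fsS -X POST -H 'Content-Type: application/json' "$ALERT_WEBHOOK" \
    -d '{"severity":"critical","msg":"PAM reconcile break: svc-appliance"}'
  exit 1
fi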
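
The same check generalizes to a fleet of accounts. The sketch below shows one possible shape in Python, with the vault lookup, the authentication test, and the pager all passed in as callables; every name in it is hypothetical, and the real implementations would wrap your vault client, an LDAP or SSH bind, and your alerting system.

```python
from typing import Callable, Iterable, List, NamedTuple


class Account(NamedTuple):
    name: str
    target: str


def reconcile(
    accounts: Iterable[Account],
    fetch_secret: Callable[[Account], str],
    authenticate: Callable[[Account, str], bool],
    page: Callable[[str], None],
) -> List[str]:
    """Test each vaulted credential against its target; page on every break.

    Returns the names of accounts whose vaulted credential was rejected.
    There is deliberately no retry: a rejection is treated as drift or tampering.
    """
    breaks = []
    for acct in accounts:
        secret = fetch_secret(acct)
        if not authenticate(acct, secret):
            page(f"PAM reconcile break: {acct.name}@{acct.target}")
            breaks.append(acct.name)
    return breaks


if __name__ == "__main__":
    vault = {"svc-appliance": "vaulted-pw", "svc-san": "vaulted-pw-2"}
    real = {"svc-appliance": "vaulted-pw", "svc-san": "changed-out-of-band"}
    accounts = [Account("svc-appliance", "appliance01"), Account("svc-san", "san01")]
    broken = reconcile(
        accounts,
        fetch_secret=lambda a: vault[a.name],
        authenticate=lambda a, s: real[a.name] == s,
        page=print,
    )
    print(broken)
```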

For SSH keys, rotation means re-signing the CA-trusted set or rotating the CA itself on a cadence; because hosts trust the CA and not individual keys, rotating an operator’s key is a no-op on the fleet - you simply stop signing it.

6. Secure the PAM tier itself: isolation, jump hosts, tamper-proof logs

The vault and broker now hold the keys to the kingdom, so the PAM tier is tier-0 by definition and must be the hardest target you run.

# Azure NSG: only the broker subnet may reach the Vault data plane
resource "azurerm_network_security_rule" "vault_from_broker_only" {
  name                        = "allow-broker-to-vault-8200"
  priority                    = 100
  direction                   = "Inbound"
  access                      = "Allow"
  protocol                    = "Tcp"
  source_address_prefix       = "10.40.2.0/24"   # broker subnet ONLY
  source_port_range           = "*"
  destination_port_range      = "8200"
  destination_address_prefix  = "10.40.1.0/24"   # vault subnet
  resource_group_name         = azurerm_resource_group.pam.name
  network_security_group_name = azurerm_network_security_group.vault.name
}

Make the audit trail tamper-proof by shipping it off-host as it is written:

# Enable a file audit device; ship the file off-host in real time
vault audit enable file file_path=/var/log/vault/audit.log

# Forward to immutable, WORM-backed storage (append-only)
# e.g. fluent-bit -> object store with object-lock / immutability policy

Pair this with break-glass: a sealed, offline copy of recovery keys (Shamir shares split across custodians) so that losing the HSM or the cluster never means losing the estate - and so no single person can unilaterally unseal.
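
For readers who have not seen why a share split stops any single custodian from acting alone, here is a minimal sketch of Shamir’s scheme over a prime field. It is illustrative only and must not be used to protect real keys: Vault ships its own Shamir implementation, and the `split` and `combine` functions below are hypothetical and unrelated to it.

```python
import secrets
from typing import List, Tuple

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def split(secret: int, threshold: int, shares: int) -> List[Tuple[int, int]]:
    """Split secret into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 < threshold <= shares:
        raise ValueError("need 1 < threshold <= shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):       # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def combine(points: List[Tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    shares = split(123456789, threshold=3, shares=5)
    print(combine(shares[:3]) == 123456789)   # any three custodians suffice
    print(combine(shares[2:]) == 123456789)
```

With a threshold of three, any three custodians can reconstruct the value, while fewer than three learn nothing useful about it.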

7. Session analytics: anomaly detection, four-eyes, SIEM forwarding

Recording is evidence; analytics is detection. Forward broker and vault audit events to your SIEM and alert on the privileged-access-specific signals that generic logging misses.

# Moderated sessions are declared on the operator's role: a tier-0 SSH session
# starts only once a platform lead has joined it live as moderator.
# (The platform-lead role needs a matching allow.join_sessions rule.)
kind: role
version: v7
metadata:
  name: tier0-admin
spec:
  allow:
    require_session_join:
      - name: "peer-moderator"
        filter: 'contains(user.roles, "platform-lead")'
        kinds: ["ssh"]
        count: 1
        modes: ["moderator"]
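
Then alert on tier-0 checkouts that fall outside every approved change window for that actor:

VaultAudit_CL
| where type_s == "response" and request_path_s startswith "kv-tier0/"
| extend actor = auth_display_name_s, ts = TimeGenerated
| join kind=leftouter (
    ChangeRequests_CL
    | project change_actor_s, window_start_t, window_end_t
  ) on $left.actor == $right.change_actor_s
// An actor can hold several windows; flag the checkout only if none covers it
| extend in_window = isnotnull(window_start_t) and ts between (window_start_t .. window_end_t)
| summarize covered = countif(in_window) by ts, actor, request_path_s
| where covered == 0
| project ts, actor, request_path_s, AlertReason = "tier0 checkout outside change window"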
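
The same rule can be expressed outside the SIEM, which is handy for testing the logic against sample data. A minimal Python sketch, assuming checkouts and change windows are available as plain tuples; the function name and tuple shapes are hypothetical:

```python
from datetime import datetime
from typing import Dict, Iterable, List, Tuple

Checkout = Tuple[datetime, str, str]          # (timestamp, actor, secret path)
Window = Tuple[str, datetime, datetime]       # (actor, window start, window end)


def outside_change_window(
    checkouts: Iterable[Checkout], windows: Iterable[Window]
) -> List[Checkout]:
    """Return checkouts not covered by any change window for the same actor."""
    by_actor: Dict[str, List[Tuple[datetime, datetime]]] = {}
    for actor, start, end in windows:
        by_actor.setdefault(actor, []).append((start, end))
    return [
        (ts, actor, path)
        for ts, actor, path in checkouts
        if not any(start <= ts <= end for start, end in by_actor.get(actor, []))
    ]


if __name__ == "__main__":
    windows = [("alice", datetime(2024, 5, 1, 9), datetime(2024, 5, 1, 11))]
    checkouts = [
        (datetime(2024, 5, 1, 10), "alice", "kv-tier0/data/domain-admin"),  # covered
        (datetime(2024, 5, 1, 23), "alice", "kv-tier0/data/domain-admin"),  # late
        (datetime(2024, 5, 1, 10), "bob", "kv-tier0/data/domain-admin"),    # no window
    ]
    for hit in outside_change_window(checkouts, windows):
        print(hit)
```

Note the `any(...)`: an actor with several change windows is flagged only when none of them covers the checkout.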

8. Rollout playbook: discovery, onboarding, measuring coverage

A PAM program fails on rollout, not technology. Sequence it.

  1. Discover. You cannot vault what you cannot see. Inventory every privileged account - local admins, domain admins, service accounts, SSH keys, appliance logins, cloud break-glass. Scan AD for adminCount=1, enumerate local admins fleet-wide, and grep config/IaC for embedded keys.
  2. Onboard read-only first. Bring accounts under vault management (rotation + checkout) before you cut the standing paths. Let operators keep working while you prove the broker path.
  3. Cut standing access by tier, tier-0 first. Remove the standing credential only once the JIT/broker path is proven for that asset class. Reverse the usual instinct - tier 0 is highest risk, so it gets ZSP first, with break-glass as the safety net.
  4. Measure coverage relentlessly. The KPIs are the percentage of privileged sessions that went through the broker and the percentage of privileged accounts with no standing usable credential. A deployment that reports 95% coverage but leaves a back door open is worth less than an honest 60%.

# Coverage signal: privileged accounts NOT yet onboarded to the vault
# Cross-reference your privileged-account inventory against vaulted accounts.
comm -23 \
  <(sort privileged_accounts.txt) \
  <(vault list -format=json ad/roles | jq -r '.[]' | sort) \
  | tee unmanaged_privileged_accounts.txt | wc -l
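
Both KPIs can be computed from the same inputs. The sketch below is one possible calculation in Python; the function name and inputs are hypothetical, and it approximates "no standing usable credential" as "onboarded to the vault", which is an assumption you should tighten once standing paths are actually cut.

```python
from typing import Dict, Iterable, List, Union


def coverage_report(
    inventory: Iterable[str],
    vaulted: Iterable[str],
    sessions_total: int,
    sessions_brokered: int,
) -> Dict[str, Union[float, List[str]]]:
    """Compute account and session coverage as percentages."""
    inventory_set, vaulted_set = set(inventory), set(vaulted)
    managed = inventory_set & vaulted_set
    account_pct = 100.0 * len(managed) / len(inventory_set) if inventory_set else 100.0
    session_pct = 100.0 * sessions_brokered / sessions_total if sessions_total else 100.0
    return {
        "account_coverage_pct": account_pct,
        "session_coverage_pct": session_pct,
        "unmanaged": sorted(inventory_set - vaulted_set),
    }


if __name__ == "__main__":
    report = coverage_report(
        inventory=["da-admin", "svc-appliance", "svc-backup", "root@san01"],
        vaulted=["da-admin", "svc-appliance", "svc-backup"],
        sessions_total=200,
        sessions_brokered=150,
    )
    print(report)
```

Track the `unmanaged` list as closely as the percentages: each name on it is a standing credential the program has not yet reached.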

Enterprise scenario

A regional bank ran Entra PIM for cloud roles and believed it had “PAM covered.” During an audit, the examiners asked a simple question: who used the Domain Admin account last Tuesday, and what did they do? Nobody could answer. PIM had governed the cloud roles, but the on-prem Domain Admin was a shared password in a spreadsheet vault, known to nine engineers, with no session record. Worse, the same engineers RDP’d to domain controllers directly from their daily-driver laptops - a tier-2-to-tier-0 collapse.

The constraint: a 90-day remediation window from the regulator, a legacy core-banking appliance whose vendor refused to support certificate auth or ephemeral accounts (it only accepts a static local admin password), and a hard requirement for moderated four-eyes on every tier-0 session.

The platform team’s solution split the problem cleanly. For everything that could go ephemeral - Windows admins, Postgres DBAs, Linux hosts - they moved to Vault-issued JIT credentials and SSH certificates with no standing accounts, brokered and recorded through Teleport in node-sync mode. For the one stubborn appliance, they accepted a vaulted static secret but wrapped it in the strongest possible controls: Control Group dual-approval on checkout, mandatory rotation after every single use (not on a timer), and reconciliation break-detection so any out-of-band change paged immediately. The four-eyes requirement was met with moderated sessions - a second engineer had to join live before the appliance session could start.

# The appliance: static secret, but rotate-after-every-use + dual control
path "kv-appliance/data/core-banking-admin" {
  capabilities = ["read"]
  control_group = {
    factor "two_leads" {
      identity {
        group_names = ["platform-leads"]
        approvals   = 2
      }
    }
  }
}

Rotation-after-use was wired as a post-session hook: the session-close event from the broker triggered ad/rotate-role (or a vendor API call) so the password that just got used was already dead before the operator’s terminal closed. They passed re-audit at 100% tier-0 session coverage and a full replayable record of every privileged action. The lesson: PIM answered “who is eligible,” but only PAM answered “who used the credential, when, and what did they type” - and the one credential they couldn’t make ephemeral became the most tightly controlled, not the exception that was waved through.
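
A post-session hook of this kind can be very small. The sketch below shows one possible shape in Python and is not the bank’s actual code: the event field names (`event`, `server_hostname`), the `ROTATE_ON_CLOSE` mapping, and the `rotate` callable are all assumptions for illustration, and the real `rotate` would call the vault’s rotation endpoint or the vendor API.

```python
from typing import Callable, Mapping

# Hypothetical mapping from brokered host to the vaulted account used on it.
ROTATE_ON_CLOSE = {"core-banking-01": "core-banking-admin"}


def on_session_event(event: Mapping[str, str], rotate: Callable[[str], None]) -> bool:
    """Rotate the vaulted credential when a brokered session ends.

    Returns True if a rotation was triggered. Exceptions from rotate() are
    left to propagate so a failed rotation is loud rather than silent.
    """
    if event.get("event") != "session.end":
        return False
    account = ROTATE_ON_CLOSE.get(event.get("server_hostname", ""))
    if account is None:
        return False
    rotate(account)
    return True


if __name__ == "__main__":
    rotated = []
    on_session_event(
        {"event": "session.end", "server_hostname": "core-banking-01"},
        rotated.append,
    )
    print(rotated)
```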

Verify

Confirm the controls actually hold before declaring victory.

# 1. Dual-control is enforced: a single operator gets a wrapping token, not the secret
vault read kv-tier0/data/domain-admin
# EXPECT: wrapping_token + accessor, NOT the password field

# 2. JIT account truly disappears after lease expiry (run >max_ttl later)
vault read database/creds/dba-jit         # note the username
# ...wait past max_ttl, then on the DB:
psql -h prod-pg -c "SELECT rolname FROM pg_roles WHERE rolname = '<that-user>';"
# EXPECT: 0 rows

# 3. Recording is off-host and replayable
tsh recordings ls && tsh play <session-id>
# EXPECT: a session you can scrub; confirm nothing was written to the target's disk

# 4. Reconciliation break-detection fires on drift
#    Change the appliance password out-of-band, then:
./reconcile.sh
# EXPECT: non-zero exit + critical alert

# 5. Audit failing closed
#    Make the audit path unwritable, then attempt any op:
vault read kv-tier0/data/anything
# EXPECT: request blocked (audit device must succeed)

Checklist
