Deploy a Self-Hosted HashiCorp Boundary Cluster for Brokered SSH and RDP Access

A mid-size insurer has just received an audit finding on privileged access: thirty-odd engineers and three managed-service vendors reach production Linux and Windows hosts through a flat VPN, with shared bastion.pem keys passed around in chat and a domain admin RDP account whose password last rotated when the bastion was built. The auditor’s note is blunt: there is no per-user attribution on a host, no just-in-time grant, and the standing network path means a single phished laptop owns the whole estate. The mandate from the CISO is to retire the VPN-plus-jumpbox model and put a brokered, identity-gated access layer in front of every SSH and RDP target, self-hosted because the regulated workloads cannot egress to a SaaS control plane. This guide stands up exactly that with HashiCorp Boundary: a controller cluster, a tier of session workers reachable from the private subnets, Okta-federated-to-Entra SSO at the front, and HashiCorp Vault issuing short-lived credentials so the engineer who connects never handles a standing password or a private key.

Boundary’s model is worth a few sentences before the commands. A user authenticates to the controller (the control plane: API, auth, policy, session orchestration), the controller authorizes them against a target, and then a worker (the data plane) proxies the actual TCP session to the private host. For the session itself, the user’s client only ever talks to the worker; the worker only ever talks to the host. There is no standing network route from the laptop to the target, and with injected credentials no secret ever lands on the laptop.

Prerequisites

Target topology

[Figure: target topology of the self-hosted Boundary cluster]

The deployment splits cleanly into a control plane and a data plane, and keeping them separate in your head is the whole point of Boundary.

Auxiliary tooling that operates around the cluster: CrowdStrike Falcon sensors run on every controller and worker node for runtime threat detection; Wiz (with Wiz Code scanning the Terraform in the pipeline) continuously checks the cloud posture and flags any drift that would re-open a direct path to a host; Dynatrace ingests the controller/worker metrics and traces; ServiceNow holds the change record and the just-in-time access approvals; and GitHub Actions with Terraform applies the whole estate as code while Ansible bakes the node images.

1. Provision the database, KMS keys, and node images

Boundary needs a Postgres database and KMS keys it can reach before the first controller boots. Provision these with Terraform so the estate is reproducible and Wiz Code can scan the plan in the pipeline before anything lands.

# kms.tf: three purposes, never share one key across roles
resource "aws_kms_key" "boundary_root"     { description = "boundary-root" }
resource "aws_kms_key" "boundary_recovery" { description = "boundary-recovery" }
resource "aws_kms_key" "boundary_worker"   { description = "boundary-worker-auth" }
resource "aws_kms_alias" "root" {
  name          = "alias/boundary-root"
  target_key_id = aws_kms_key.boundary_root.id
}
resource "aws_kms_alias" "recovery" {
  name          = "alias/boundary-recovery"
  target_key_id = aws_kms_key.boundary_recovery.id
}
resource "aws_kms_alias" "worker" {
  name          = "alias/boundary-worker"
  target_key_id = aws_kms_key.boundary_worker.id
}

Create the database and a dedicated role that can access only that database:

CREATE DATABASE boundary;
CREATE ROLE boundary WITH LOGIN PASSWORD 'set-via-vault-not-here';
GRANT ALL PRIVILEGES ON DATABASE boundary TO boundary;

Bake the controller and worker VM images with Ansible so the boundary binary, the CrowdStrike Falcon sensor, and the Dynatrace OneAgent are baked in rather than installed at boot. A minimal play:

- hosts: boundary_nodes
  become: true
  tasks:
    - name: Install Boundary binary
      ansible.builtin.unarchive:
        src: "https://releases.hashicorp.com/boundary/0.16.2+ent/boundary_0.16.2+ent_linux_amd64.zip"
        dest: /usr/local/bin
        remote_src: true
    - name: Install CrowdStrike Falcon sensor
      ansible.builtin.apt: { deb: /opt/falcon-sensor.deb }
    - name: Set Dynatrace OneAgent host group (agent assumed already installed)
      ansible.builtin.command: /opt/dynatrace/oneagentctl --set-host-group=boundary

2. Initialize the controller and run the database migration

On the first controller node, write the controller config. The kms stanzas point at the KMS aliases from step 1; the database.url points at Postgres.

# /etc/boundary/controller.hcl
disable_mlock = true

controller {
  name        = "boundary-controller-1"
  description = "KloudVin Boundary controller"
  database {
    url = "env://BOUNDARY_PG_URL"   # postgres://boundary:...@db:5432/boundary
  }
}

listener "tcp" {
  purpose     = "api"
  address     = "0.0.0.0:9200"
  tls_disable = true   # TLS terminated at Akamai/LB
}
listener "tcp" {
  purpose = "cluster"
  address = "0.0.0.0:9201"
}

kms "awskms" {
  purpose    = "root"
  kms_key_id = "alias/boundary-root"
}
kms "awskms" {
  purpose    = "recovery"
  kms_key_id = "alias/boundary-recovery"
}
kms "awskms" {
  purpose    = "worker-auth"
  kms_key_id = "alias/boundary-worker"
}

Run the one-time schema migration, then start the service. Run database init on exactly one node:

export BOUNDARY_PG_URL="postgres://boundary:$(vault kv get -field=password secret/boundary/db)@db.internal:5432/boundary?sslmode=verify-full"

# First node only — creates schema + bootstrap org/auth/role; capture the output
boundary database init -config /etc/boundary/controller.hcl

# All controller nodes
systemctl enable --now boundary-controller

database init prints a bootstrap auth method, an initial admin login name and password, and a generated org/project scope. Store these in Vault immediately and never in the repo. Bring up controllers 2 and 3 with the same config (changing only controller.name) — they share the database and KMS, so they form an HA set behind the internal load balancer automatically.
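Before running database init, a small pre-flight check can catch a controller config that is missing one of the three KMS purposes. This is a minimal sketch, not part of Boundary: the function name check_kms_purposes is hypothetical, and it only greps the file for the purpose strings used in this guide rather than parsing HCL.

```shell
#!/bin/sh
# Hypothetical helper: report any KMS purpose from this guide that is
# absent from a Boundary controller config file.
# Usage: check_kms_purposes /etc/boundary/controller.hcl
check_kms_purposes() {
  config_file="$1"
  missing=""
  for purpose in root recovery worker-auth; do
    # Match lines such as: purpose = "root"
    if ! grep -Eq "purpose[[:space:]]*=[[:space:]]*\"$purpose\"" "$config_file"; then
      missing="$missing $purpose"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing kms purposes:$missing"
    return 1
  fi
  echo "all kms purposes present"
}
```

Running it on the first controller before boundary database init gives a fast failure instead of a half-initialized cluster; it does not verify that the keys themselves are reachable.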

3. Enroll workers into the data plane

Workers are what actually reach your private hosts, so they live in the target subnets and register upstream to the controllers. Use worker-led (PKI) registration: the worker generates an auth request token at startup and an admin approves it on the controller, so no static secret has to be pre-shared.

# /etc/boundary/worker.hcl
disable_mlock = true

listener "tcp" {
  purpose = "proxy"
  address = "0.0.0.0:9202"
}

worker {
  public_addr       = "worker-1.private.kloudvin.internal:9202"   # what clients are told to dial
  initial_upstreams = ["controller-lb.internal:9201"]
  auth_storage_path = "/var/lib/boundary/worker"   # example path; holds the worker's PKI credentials
  tags {                                           # used by target worker filters
    region = ["ap-south-1"]
    zone   = ["app-private"]
  }
}

# No worker-auth kms stanza here: a worker that shares the worker-auth key
# authenticates via KMS and does not emit the registration token used below.

Start the worker, grab its auth request token from the log, and approve it on the controller:

systemctl enable --now boundary-worker
journalctl -u boundary-worker | grep -m1 "Worker Auth Registration Request"
# -> copy the token string, then on an admin client:

boundary workers create worker-led \
  -worker-generated-auth-token "<token-from-log>" \
  -name "worker-app-private-1" \
  -description "ap-south-1 / app-private zone"

boundary workers list   # confirm the new worker is listed with a recent last status time

Repeat for the second worker (and one per additional network zone). Workers heartbeat to the controllers; if a zone has no healthy worker, targets there are simply unreachable — which is the safe failure direction.
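When several workers are enrolled, copying the token out of the log by hand gets tedious. The sketch below pulls it out of the log stream instead. It assumes the token follows the text "Worker Auth Registration Request:" on the same line; the function name extract_worker_token is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read worker log lines on stdin and print the first
# registration token found. Assumes the token is the first
# whitespace-delimited word after "Worker Auth Registration Request:".
extract_worker_token() {
  sed -n 's/.*Worker Auth Registration Request:[[:space:]]*\([^[:space:]]*\).*/\1/p' | head -n 1
}
```

A possible use is journalctl -u boundary-worker | extract_worker_token, with the result passed to -worker-generated-auth-token. Treat the token as a secret until it has been approved.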

4. Federate identity through Okta and Entra ID (OIDC)

Engineers authenticate with their corporate identity, not a Boundary-local password. Okta is the workforce IdP; it federates to Microsoft Entra ID, and Boundary trusts the Entra-issued OIDC token. Register Boundary as an app in Entra (redirect URI https://boundary.kloudvin.com/v1/auth-methods/oidc:authenticate:callback), then create the OIDC auth method:

boundary auth-methods create oidc \
  -scope-id <org_id> \
  -issuer "https://login.microsoftonline.com/<tenant-id>/v2.0" \
  -client-id "<entra-app-client-id>" \
  -client-secret "$(vault kv get -field=secret secret/boundary/oidc)" \
  -signing-algorithm "RS256" \
  -api-url-prefix "https://boundary.kloudvin.com" \
  -claims-scopes "groups" \
  -name "entra-sso" -description "Okta->Entra workforce SSO"

# Make it usable, then set it as the primary auth method for the org
boundary auth-methods change-state oidc -id <amoidc_id> -state "active-public"
boundary scopes update -id <org_id> -primary-auth-method-id <amoidc_id>

Map Entra group claims to Boundary principals with managed groups, so membership is driven by the directory, not maintained by hand:

boundary managed-groups create oidc \
  -auth-method-id <amoidc_id> \
  -filter '"prod-ssh-admins" in "/token/groups"' \
  -name "prod-ssh-admins"

Now an engineer who is in the prod-ssh-admins Entra group (sourced from Okta) automatically lands in the matching Boundary managed group — the directory is the single source of truth.
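With more than a couple of directory groups, the filter strings are easy to mistype. The sketch below generates them from group names and prints the matching commands for review rather than running them. The function names and the second group name in the usage note are hypothetical; the filter syntax is the one used above.

```shell
#!/bin/sh
# Hypothetical helper: build the managed-group filter used in this guide
# for a single group name taken from the token's "groups" claim.
managed_group_filter() {
  printf '"%s" in "/token/groups"\n' "$1"
}

# Print (do not run) one create command per group so the output can be
# reviewed first. $1 is the OIDC auth method ID; the rest are group names.
print_managed_group_commands() {
  auth_method_id="$1"
  shift
  for group in "$@"; do
    printf "boundary managed-groups create oidc -auth-method-id %s -filter '%s' -name %s\n" \
      "$auth_method_id" "$(managed_group_filter "$group")" "$group"
  done
}
```

For example, print_managed_group_commands amoidc_1234567890 prod-ssh-admins prod-rdp-admins emits two commands. Whether the claim carries group names or object IDs depends on how the Entra app is configured, so check a real token before relying on names.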

5. Broker SSH credentials from Vault for Linux targets

This is the step that retires shared .pem files. Rather than store keys in Boundary, configure a Vault credential store and have Vault’s ssh secrets engine sign a short-lived certificate per session. On the Vault side, enable the SSH CA:

vault secrets enable -path=ssh-client-signer ssh
vault write ssh-client-signer/config/ca generate_signing_key=true
vault write ssh-client-signer/roles/boundary - <<EOF
{ "key_type":"ca","algorithm_signer":"rsa-sha2-256","allow_user_certificates":true,
  "allowed_users":"*","default_extensions":{"permit-pty":""},"ttl":"5m" }
EOF

Copy the Vault CA public key to each Linux host and reference it from sshd_config with the TrustedUserCAKeys directive (push both with Ansible) so hosts accept Vault-signed certificates; disable the other authentication methods if hosts should accept nothing else. Then wire Boundary to that Vault path:

# A periodic, orphan, renewable token whose policy allows signing on ssh-client-signer
boundary credential-stores create vault \
  -scope-id <project_id> \
  -vault-address "https://vault.kloudvin.internal:8200" \
  -vault-token "$(vault token create -policy=boundary-ssh -orphan=true -period=20m -renewable=true -field=token)" \
  -name "vault-ssh-ca"

boundary credential-libraries create vault-ssh-certificate \
  -credential-store-id <cs_id> \
  -vault-path "ssh-client-signer/sign/boundary" \
  -username "ec2-user" \
  -key-type "ecdsa" -key-bits 256 \
  -name "linux-ssh-cert"
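This step leans on two pieces that are not spelled out above: the Vault policy named boundary-ssh and the host-side CA trust. The sketch below shows one way both could look. The function names are hypothetical, the policy is an assumption about what the credential store token needs (self-management of its own token and leases, plus signing on the SSH role), and the sshd_config helper assumes the file has no trailing Match block.

```shell
#!/bin/sh
# Print a Vault policy for the token used by Boundary's credential store.
# $1 = SSH secrets engine mount, $2 = role name (both from this guide).
render_boundary_ssh_policy() {
  mount="$1"
  role="$2"
  cat <<EOF
# Token and lease self-management for the credential store token
path "auth/token/lookup-self" { capabilities = ["read"] }
path "auth/token/renew-self"  { capabilities = ["update"] }
path "auth/token/revoke-self" { capabilities = ["update"] }
path "sys/leases/renew"       { capabilities = ["update"] }
path "sys/leases/revoke"      { capabilities = ["update"] }
path "sys/capabilities-self"  { capabilities = ["update"] }

# Certificate signing on the SSH CA role
path "$mount/sign/$role" { capabilities = ["create", "update"] }
EOF
}

# Make sure sshd_config references the CA file exactly once.
# $1 = path to sshd_config, $2 = path where the Vault CA public key is stored.
ensure_trusted_ca() {
  sshd_config="$1"
  ca_file="$2"
  line="TrustedUserCAKeys $ca_file"
  if grep -Fxq "$line" "$sshd_config"; then
    echo "unchanged"
  else
    printf '%s\n' "$line" >> "$sshd_config"
    echo "added"
  fi
}
```

The policy can be loaded with render_boundary_ssh_policy ssh-client-signer boundary | vault policy write boundary-ssh -. On a host, the CA key can be fetched with vault read -field=public_key ssh-client-signer/config/ca, saved to a file, and then referenced with ensure_trusted_ca before reloading sshd.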

6. Define targets and attach the brokered credentials

A target is the host (or host set) plus the worker filter that says which worker proxies it and the credential library that injects the secret. Create an SSH target for a private Linux host, scoped to the right worker zone:

boundary targets create ssh \
  -scope-id <project_id> \
  -name "prod-db-linux" \
  -default-port 22 \
  -egress-worker-filter '"app-private" in "/tags/zone"' \
  -address "10.20.4.11"

# Inject the Vault-signed cert at connect time (handled by the worker, never visible to the user)
boundary targets add-credential-sources \
  -id <tssh_id> \
  -injected-application-credential-source <linux-ssh-cred-lib-id>

For Windows RDP, broker a Vault-issued, short-TTL local or AD account instead of a static domain admin. Create a generic TCP target on 3389 with a username/password credential library backed by Vault’s AD or kv secrets engine. Unlike an injected credential, a brokered one is returned to the connecting client, so the short TTL is what limits its exposure:

boundary targets create tcp \
  -scope-id <project_id> \
  -name "prod-win-rdp" \
  -default-port 3389 \
  -egress-worker-filter '"app-private" in "/tags/zone"' \
  -address "10.20.4.40"

boundary targets add-credential-sources \
  -id <ttcp_id> \
  -brokered-credential-source <vault-ad-cred-lib-id>
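The credential library behind the RDP target is referenced above but never created. The sketch below prints a candidate command for review instead of running it. The function name, the library name windows-rdp-cred, and the Vault path in the usage note are all hypothetical; the real path depends on which Vault secrets engine issues the Windows account.

```shell
#!/bin/sh
# Print (do not run) a credential-library command for the RDP target.
# $1 = credential store ID, $2 = Vault path that returns a username and
# password (an assumption of this sketch).
print_rdp_cred_library_command() {
  cred_store_id="$1"
  vault_path="$2"
  printf 'boundary credential-libraries create vault-generic -credential-store-id %s -vault-path "%s" -credential-type username_password -name "windows-rdp-cred"\n' \
    "$cred_store_id" "$vault_path"
}
```

For instance, print_rdp_cred_library_command csvlt_1234567890 secret/data/windows/prod-win-rdp prints a command that points at a kv path. The credential store's Vault token also needs read access to whatever path is chosen.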

Finally, grant access by binding the Entra-sourced managed group to a role on the project scope, with a grant to connect to targets:

boundary roles create -scope-id <project_id> -name "prod-ssh-admins-connect"
boundary roles add-principals -id <role_id> -principal <managed_group_id>
boundary roles add-grant-strings -id <role_id> \
  -grant "ids=*;type=target;actions=authorize-session" \
  -grant "ids=*;type=session;actions=read:self,cancel:self"
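The ids=* grant above lets the role connect to every target in the project. Where a tighter scope is wanted, the grant can name specific target IDs instead. The sketch below builds such a grant string; the function name is hypothetical, and it assumes the ID-only grant form (a comma-separated ids list with no type field).

```shell
#!/bin/sh
# Hypothetical helper: build an authorize-session grant limited to the
# target IDs passed as arguments, instead of ids=*.
target_connect_grant() {
  if [ "$#" -eq 0 ]; then
    echo "need at least one target id" >&2
    return 1
  fi
  ids=$(printf '%s,' "$@")
  ids=${ids%,}   # drop the trailing comma
  printf 'ids=%s;actions=authorize-session\n' "$ids"
}
```

The output can be passed to boundary roles add-grant-strings with -grant "$(target_connect_grant <tssh_id> <ttcp_id>)". The trade-off is that each new target then needs a grant update, which may or may not be worth it at this estate's size.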

Validation

Prove the full path end to end from an engineer’s laptop. First authenticate through Entra, then connect. The SSH certificate is injected by the worker and never shown; the brokered RDP credential is returned to the client only for the session.

# 1. SSO login via the OIDC auth method (opens browser to Okta -> Entra)
boundary authenticate oidc -auth-method-id <amoidc_id>

# 2. List what this identity is allowed to reach
boundary targets list -scope-id <project_id>

# 3. Brokered SSH — no key on disk, Vault signs a 5-min cert per session
boundary connect ssh -target-id <tssh_id>
#   you land on 10.20.4.11 as ec2-user; the Boundary session record ties the login to your identity

# 4. Brokered RDP — Boundary opens a local proxy port; point mstsc/Remmina at it
boundary connect rdp -target-id <ttcp_id>

Then confirm the controls actually hold:

# Active sessions are visible and cancellable centrally
boundary sessions list -scope-id <project_id> -recursive
# On a target host, no standing route exists outside an active session:
#   from the laptop, a direct `ssh 10.20.4.11` MUST fail (no network path)

Check that Dynatrace shows the controller :9200 and worker :9202 listeners healthy, that CrowdStrike Falcon reports both node types as protected, and that Wiz shows no path from the engineer subnet straight to the target subnet. A clean run here is the audit evidence: every connection is attributed to a federated identity, every credential is short-lived and Vault-issued, and there is no path to a host except through a worker during an authorized session.
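The "direct ssh MUST fail" check is easy to script so it can run from an engineer laptop or a CI runner in the engineer subnet. This is a minimal sketch under stated assumptions: the function name is hypothetical, and it relies on bash's /dev/tcp feature and the coreutils timeout command being installed.

```shell
#!/bin/sh
# Hypothetical check: succeed only when a direct TCP connection to the
# target is NOT possible from this machine. A refused connection and a
# 3-second timeout both count as "no path".
assert_no_direct_path() {
  host="$1"
  port="$2"
  if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "FAIL: direct path to $host:$port is open"
    return 1
  fi
  echo "OK: no direct path to $host:$port"
}
```

Calling assert_no_direct_path 10.20.4.11 22 and assert_no_direct_path 10.20.4.40 3389 outside an active session should both report OK. It only proves the negative from the machine it runs on, so it complements the Wiz path analysis rather than replacing it.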

Rollback / teardown

Tear down in reverse dependency order so you never strand a live session. Cancel sessions first, then remove grants, then infrastructure.

# 1. Drain: cancel any live sessions
for s in $(boundary sessions list -scope-id <project_id> -recursive -format json | jq -r '.items[].id'); do
  boundary sessions cancel -id "$s"
done

# 2. Remove access bindings and targets
boundary roles delete -id <role_id>
boundary targets delete -id <tssh_id>
boundary targets delete -id <ttcp_id>

# 3. Deregister workers, then stop services
boundary workers delete -id <worker_id>
systemctl disable --now boundary-worker boundary-controller

Because the estate is Terraform-managed, the durable teardown is terraform destroy of the Boundary stack, which removes the controllers, workers, load balancer, and database. Keep the KMS keys until last and schedule their deletion separately — destroying the root/recovery key before the database is gone leaves an unrecoverable cluster. Revoke the Vault token used by the credential store (vault token revoke) so no orphaned lease can sign a certificate after the cluster is gone. Re-open the legacy VPN path only as a documented break-glass during the cutover window, gated by a ServiceNow change.

Common pitfalls

Security notes

The whole point is Zero Trust for privileged access: no standing network path, identity-based authorization on every session, and credentials that are short-lived, Vault-issued, and injected wherever the protocol allows it. Keep the three KMS purposes separate; scope Boundary roles to the minimum grant (authorize-session, read:self, cancel:self) rather than admin; and source group membership from Okta → Entra so an offboarded user loses Boundary access when the directory removes them. CrowdStrike Falcon on the nodes catches runtime compromise of a worker (the one component with target reachability), Wiz + Wiz Code continuously assert that neither the running posture nor the Terraform re-opens a direct route, and every session is centrally logged and cancellable for incident response and audit. Session recording (BSR) can be enabled on the worker for the highest-sensitivity targets where compliance wants a replayable record.

Cost notes

Self-hosting Boundary’s community/enterprise binary means you pay for compute and the database, not a per-seat SaaS fee — three small controllers, two workers, and a modest managed Postgres covers a few hundred engineers comfortably. Right-size the controllers (they are control-plane only) and scale workers by session concurrency and per-zone reachability rather than over-provisioning. The real saving is indirect and larger: retiring the always-on VPN concentrators and shared jumpboxes, and replacing a standing privileged-access exposure with a just-in-time one, removes both licence cost and the far more expensive risk the auditor flagged. Pipe utilization to Dynatrace so worker scaling tracks actual session load instead of a guess.
