Zero Trust Network Access for Remote Workforce on Azure

A pharmaceutical company’s head of security gets a finding from an external red team that lands like a verdict: a contractor’s laptop, compromised through a phished session, sat on the corporate VPN for nine days and from there had flat network reachability to the clinical-trial document store, the SAP environment, and three lab automation servers running unpatched Windows. Nothing about the VPN noticed. The contractor authenticated once, got an IP on the corporate subnet, and from the network’s point of view was the corporate network for the rest of the session. The board’s instruction afterward is one sentence: “Get us off the model where a single stolen credential plus a VPN tunnel equals the keys to the building.” The constraint is that the company is GxP-regulated — clinical and manufacturing systems are validated, change-controlled, and audited — and it has 11,000 staff plus a rotating population of CROs, lab partners, and contract manufacturers who all need access to some applications and none of whom should ever see a flat network. This article is the reference architecture for replacing that VPN with Zero Trust Network Access on Azure, built so a CISO, a quality officer, and an auditor each sign it.

The pressures are the ones every regulated, distributed enterprise now feels at once. The perimeter is gone: the workforce is remote, the contractors are external, and the applications are split across Azure, on-prem datacenters, and SaaS. Lateral movement is the actual threat: breaches almost never start at the crown jewels; they start at a laptop and walk there, and a VPN’s flat reachability is the hallway they walk down. Identity is the new control plane, but identity alone is not enough — a valid token from an infected, non-compliant device is exactly the red team’s path. And the legacy VPN concentrators themselves are a liability: internet-facing virtual appliances whose CVEs are a standing item on every threat brief. Zero Trust Network Access — ZTNA — is the pattern that answers all four: never trust the network, authenticate every session against identity and device posture and real-time risk, and grant access to one application at a time instead of a subnet.

Why not the obvious shortcuts

The tempting fixes each fail in a way someone on the project will have to be talked out of.

“Just keep the VPN but add MFA” stops credential-stuffing at the front door and does nothing about the core problem: once the tunnel is up, the device has network-layer reachability to everything behind the concentrator, MFA or not. The VPN in the pharma red-team finding already had MFA. “Put everything behind a SaaS reverse proxy” works beautifully for HTTP apps and falls over the moment you hit RDP, SSH, a thick-client SAP GUI, or a lab instrument speaking a proprietary TCP protocol, and a regulated estate is full of those. “Microsegment the network with firewalls” is real Zero Trust progress but is a multi-year datacenter project that does nothing for the remote contractor on a home network, and it still trusts the network inside each segment.

ZTNA threads the needle differently. There is no network to join. A lightweight client brokers each connection through an identity-aware proxy; the user’s device makes an outbound connection to the broker, the broker checks identity, device compliance, and session risk on every request, and then stitches a connection to the specific application — never the subnet it lives on. The application’s real address is never exposed, the inbound firewall ports on the VPN concentrator disappear entirely, and an attacker on a compromised laptop can reach exactly the handful of apps that laptop’s user is currently authorized and compliant to use — not a hallway, a single locked door at a time.

Architecture overview

[Figure: architecture diagram for Zero Trust Network Access for Remote Workforce on Azure]

The platform is built on Microsoft Entra Global Secure Access (GSA) as the security service edge, with Entra Private Access providing ZTNA to private apps and Entra Internet Access governing egress to SaaS and the open web. Conceptually there are three planes, and keeping them straight is the first step to operating this well: a control plane (Entra makes the access decision), a data plane (the GSA client and connectors carry the traffic), and a signal plane (device posture and threat telemetry feed the decision). The VPN concentrators come out; nothing inbound replaces them.

The defining property of the topology is the one the red team’s finding demanded: no application is reachable by network address, and no access decision is made just once. Private apps sit behind outbound-only Private Network Connectors, lightweight agents installed on Windows Server hosts inside the Azure VNet and the on-prem datacenter. They dial out to the Entra GSA service and hold the tunnel open; nothing dials in. Every user request is re-evaluated against Conditional Access in real time, so a device that falls out of compliance mid-session loses access at the next request, not at the next login.

Access path, following the control flow:

A user — employee or contractor — signs in once through Okta, the company’s incumbent workforce IdP that fronts the HR-driven joiner/mover/leaver lifecycle. Okta is federated to Microsoft Entra ID over OIDC/SAML so that Azure and GSA see a first-class Entra token. The Global Secure Access client on the managed device (Windows, macOS, iOS, Android) silently acquires the traffic-forwarding profiles.
The user opens an application — say the clinical-trial document store. The GSA client recognizes the app’s FQDN belongs to a Private Access profile and tunnels that traffic to the GSA edge, rather than letting it hit DNS or the network directly. There is no route to the app outside this tunnel.
At the edge, Conditional Access evaluates the session as the policy decision point. It checks: is this a known user in the right group; is the device Entra-joined or Intune-compliant (disk encrypted, EDR healthy, OS patched, not jailbroken); what is the session risk from Entra ID Protection; and — the part the red team’s scenario hinges on — what does the device-posture signal say right now. Only if every condition passes does the session proceed; otherwise it is blocked or sent to step-up MFA.
Device posture is enriched by CrowdStrike Falcon through the Falcon Zero Trust Assessment (ZTA) score. Falcon’s sensor on the endpoint continuously computes a posture score and feeds it into the Conditional Access decision (via the Entra device-compliance integration): a laptop Falcon flags as compromised or low-score is treated as non-compliant, and Conditional Access denies the very access the red team abused — the access decision is now a live function of endpoint health, not a one-time login event.
The approved request crosses the GSA service to the Private Network Connector nearest the app, which completes the last hop to the application over the connector’s already-open outbound tunnel. The app sees traffic from the connector, never the user’s IP, and was never exposed to the internet.
The session is continuously authorized. Every subsequent request re-checks policy; Continuous Access Evaluation (CAE) lets Entra push a near-real-time revocation if risk spikes, the user is disabled in Okta’s lifecycle, or Falcon downgrades the device — so a token does not stay valid for its full lifetime after the device is owned.
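
The control flow above can be sketched as a small decision function. This is an illustrative model only, not Conditional Access itself: the `AccessRequest` and `AppPolicy` types, the risk labels, and the choice to treat an unknown risk label as high are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    STEP_UP = "step_up_mfa"
    BLOCK = "block"


@dataclass(frozen=True)
class AccessRequest:
    user_groups: frozenset[str]  # groups carried in the token
    device_compliant: bool       # baseline compliance plus posture signal
    sign_in_risk: str            # "low", "medium" or "high" (assumed labels)
    mfa_satisfied: bool


@dataclass(frozen=True)
class AppPolicy:
    required_group: str
    block_at_risk: str = "high"  # hypothetical threshold


RISK_ORDER = {"low": 0, "medium": 1, "high": 2}


def decide(req: AccessRequest, policy: AppPolicy) -> Decision:
    """Evaluate one request against one application's policy."""
    if policy.required_group not in req.user_groups:
        return Decision.BLOCK
    if not req.device_compliant:
        return Decision.BLOCK
    # An unrecognized risk label is treated as high, so the check fails closed.
    risk = RISK_ORDER.get(req.sign_in_risk, RISK_ORDER["high"])
    if risk >= RISK_ORDER[policy.block_at_risk]:
        return Decision.BLOCK
    if risk >= RISK_ORDER["medium"] or not req.mfa_satisfied:
        return Decision.STEP_UP
    return Decision.ALLOW
```

The function is called once per request rather than once per login, which is the property that separates this model from the VPN's authenticate-once behavior.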

Egress path, governed in parallel: traffic to SaaS and the open internet is forwarded through Entra Internet Access, which applies web content filtering, blocks known-malicious destinations, and — critically — lets sanctioned SaaS apps enforce a source-IP / Continuous Access restriction so they only accept sessions that came through the company’s secure edge. A token stolen from a managed device cannot be replayed from an attacker’s network, because the SaaS app rejects any session not originating from GSA.
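
The source-IP restriction amounts to a membership test on the session's source address. A minimal sketch, assuming the secure edge's egress ranges are known to the SaaS side; the ranges below are hypothetical and taken from the RFC 5737 documentation address space, not real GSA addresses.

```python
import ipaddress

# Hypothetical egress ranges for the secure edge (documentation addresses only).
SECURE_EDGE_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def came_through_secure_edge(source_ip: str, ranges=SECURE_EDGE_RANGES) -> bool:
    """Return True only if the session's source IP is inside a known edge range."""
    try:
        addr = ipaddress.ip_address(source_ip)
    except ValueError:
        # A malformed address is rejected rather than waved through.
        return False
    return any(addr in net for net in ranges)
```

A replayed token arriving from an attacker's network fails this test even though the token itself is valid.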

Component breakdown

Component	Service / tool	Role in the platform	Key configuration choices
Identity / SSO	Okta + Microsoft Entra ID	Okta runs workforce lifecycle + primary auth; federated to Entra for native ZTNA policy	OIDC/SAML federation; group + risk claims flow to Conditional Access; SCIM provisioning
Policy decision	Conditional Access (Entra)	The PDP: evaluates user, device, risk, posture per session	Per-app policies; require compliant device; require step-up on risk; CAE enabled
ZTNA data plane	Entra Private Access (GSA)	Identity-aware proxy to private apps; replaces VPN	Per-app segments; no network-level access; per-app Conditional Access
Secure egress	Entra Internet Access (GSA)	Web filtering + source-IP restriction for SaaS	Content categories; tenant restrictions; CA-bound SaaS access
Connectors	Private Network Connectors	Outbound-only connector agents on Windows Server hosts bridging GSA to apps in VNet/on-prem	Connector groups per site; HA pairs; no inbound ports
Endpoint posture	CrowdStrike Falcon (ZTA)	Continuous device-health score feeding the access decision	Sensor on all managed endpoints; ZTA score → Entra compliance; detections to SOC
Device compliance	Microsoft Intune	Baseline compliance signal (encryption, patch, OS, jailbreak)	Compliance policies; Falcon ZTA as a connected app; non-compliant = blocked
Secrets	HashiCorp Vault	Connector service tokens, federation signing keys, automation creds	Entra auth method; dynamic short-lived leases; Agent injection
CSPM / posture	Wiz + Wiz Code	Verifies the ZTNA + VNet config has no public-exposure drift; scans IaC pre-merge	Agentless scan of connector subnets/NSGs; Wiz Code policy-as-code in PRs
Observability	Dynatrace + Datadog	Connector health, session latency, access-decision telemetry, anomaly detection	OneAgent on connectors; Datadog log pipeline for CA + sign-in logs; SLOs on access latency
ITSM / approvals	ServiceNow	App-onboarding approvals, access requests, auto-incidents on policy breach	Change gate before an app goes live in ZTNA; auto-ticket on blocked-access spikes
CI / IaC	GitHub Actions / Jenkins + Argo CD	Build/test pipeline; GitOps rollout of connector + policy config	OIDC to Azure (no stored creds); Argo CD reconciles connector deployments
Infra automation	Terraform + Ansible	VNet, NSGs, connector VMs as code; Ansible configures the connector hosts	Terraform for Azure resources; Ansible playbooks harden + register connectors
Edge / DNS	Akamai	Protects the few remaining public web entry points + global DNS	WAF on public origins; anycast DNS; not in the private-app path

A few of these choices deserve the why, because they are the ones teams get wrong.

Why device posture must be a live input, not a checkbox. The single most important design decision is that the access decision is a function of current device health, not a property of “this device once enrolled.” Microsoft Intune supplies the baseline compliance signal — encryption on, OS patched, not jailbroken — but a device can be fully Intune-compliant and actively compromised. That gap is exactly what CrowdStrike Falcon’s Zero Trust Assessment closes: Falcon’s sensor continuously scores the endpoint and surfaces a live “is this machine healthy right now” verdict that flows into the Entra compliance state. Wire it so a Falcon detection downgrades the device, Conditional Access sees a non-compliant device on the next request, and CAE revokes the live session. In the red team’s scenario, the moment the laptop is owned, its access evaporates — the property the VPN could never provide.
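
One way to picture "live input, not a checkbox" is a compliance verdict that depends on a fresh posture reading every time it is computed. The sketch below is an assumption-laden model: the 0 to 100 score scale, the threshold of 70, and the 15-minute freshness limit are illustrative values, not settings from Falcon or Intune.

```python
from dataclasses import dataclass
from typing import Optional

ZTA_MIN_SCORE = 70          # hypothetical minimum acceptable posture score
MAX_SIGNAL_AGE_S = 15 * 60  # hypothetical freshness limit, in seconds


@dataclass(frozen=True)
class PostureSignal:
    score: int          # assumed to be on a 0 to 100 scale
    observed_at: float  # epoch seconds when the sensor reported it


def effective_compliance(
    intune_compliant: bool, posture: Optional[PostureSignal], now: float
) -> bool:
    """Combine the baseline with the live posture score, failing closed."""
    if not intune_compliant:
        return False
    if posture is None:
        return False  # a missing signal counts as non-compliant
    if now - posture.observed_at > MAX_SIGNAL_AGE_S:
        return False  # a stale signal counts as non-compliant
    return posture.score >= ZTA_MIN_SCORE
```

A device that passes the baseline but reports a low score, or stops reporting at all, comes out non-compliant, which is the gap between enrolled and healthy that the paragraph describes.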

Why federate Okta to Entra rather than rip and replace. A real enterprise does not throw away a working Okta lifecycle to adopt ZTNA. Okta stays the system of record for joiners/movers/leavers and primary authentication, and is federated to Entra so the ZTNA policy engine — which lives in Entra and consumes device compliance, ID Protection risk, and Falcon posture — gets a first-class token to reason over. The tradeoff is a token-translation hop and the discipline of keeping group and risk claims mapped across both directories; the payoff is that the leaver Okta disables is, via SCIM and CAE, also the leaver who loses every ZTNA app within minutes.

Why connectors are outbound-only, and why that is the whole point. The legacy VPN concentrator was an internet-facing virtual appliance: a listening port, a CVE magnet, the thing the threat brief worried about. Private Network Connectors invert that: they are lightweight agents on Windows Server hosts that make only outbound connections to the GSA service and hold the tunnel open, so there is no listening port to scan or exploit. You deploy them in HA pairs per site (an Azure VNet connector group, an on-prem datacenter group), and the application’s real address never leaves the connector’s side of the tunnel.

Implementation guidance

Provision with Terraform, configure with Ansible, and treat the connector network as the first deliverable. Order matters: get the connector placement and DNS wrong and apps resolve but never connect. Build the pieces in this order:

A hub/spoke topology with a dedicated connector subnet in each VNet and a hardened NSG that permits outbound 443 only — connectors need no inbound rules at all.
Private DNS so connector-side resolution reaches the private app FQDNs, with GSA’s application segments mapping those same FQDNs on the client side.
The connector VMs themselves — at least two per group for HA — deployed by Terraform and registered + hardened by Ansible playbooks (install the connector agent, pull the registration token from Vault, apply CIS baseline, join the connector group).
The Entra Private Access application segments, one per app or tight app group, each bound to its own Conditional Access policy.
Conditional Access policies requiring a compliant device, with Falcon ZTA feeding compliance and CAE enabled.

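A minimal Terraform shape for the connector subnet’s locked-down NSG communicates the intent: no inbound, and internet egress on 443 only. The two variables are placeholders for your own values.

# var.location and var.resource_group_name are placeholder inputs for this sketch.
resource "azurerm_network_security_group" "gsa_connector" {
  name                = "nsg-gsa-connector-prod-cin"
  location            = var.location
  resource_group_name = var.resource_group_name

  # Overrides the default AllowVnetInBound and AllowAzureLoadBalancerInBound rules.
  security_rule {
    name                       = "deny-all-inbound"
    priority                   = 4096
    direction                  = "Inbound"
    access                     = "Deny"
    protocol                   = "*"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
    source_port_range          = "*"
    destination_port_range     = "*"
  }

  # Connector-to-GSA traffic. Check Microsoft's connector documentation before
  # enforcing this: it also lists outbound port 80 for certificate revocation checks.
  security_rule {
    name                       = "allow-https-egress"
    priority                   = 100
    direction                  = "Outbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    destination_port_range     = "443"
    source_address_prefix      = "VirtualNetwork"
    destination_address_prefix = "Internet"
    source_port_range          = "*"
  }

  # Without this rule the default AllowInternetOutBound rule would still permit
  # every other port. VNet-internal traffic (apps, DNS) stays governed by the
  # default AllowVnetOutBound rule.
  security_rule {
    name                       = "deny-other-internet-egress"
    priority                   = 4096
    direction                  = "Outbound"
    access                     = "Deny"
    protocol                   = "*"
    source_address_prefix      = "VirtualNetwork"
    destination_address_prefix = "Internet"
    source_port_range          = "*"
    destination_port_range     = "*"
  }
}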

The pipeline that applies this runs in GitHub Actions (or Jenkins, where the company already standardizes there), authenticating to Azure via OIDC federation so there is no stored service-principal secret to leak. Connector and policy configuration is then reconciled by Argo CD in a GitOps loop, so the desired ZTNA state lives in git and drift is corrected automatically. The connector registration tokens, the Okta federation signing keys, and any automation credentials are held in HashiCorp Vault with the Entra auth method and dynamic short-lived leases, injected at deploy time — never written into a pipeline variable or a VM’s disk.

Pin the access policy to the application, not the network. The discipline that makes this Zero Trust and not “VPN with extra steps” is granularity. Define one Private Access application segment per application (or a narrowly scoped group), and give each its own Conditional Access policy. The clinical-trial store requires a compliant device and a healthy Falcon score and membership in the trial group; a low-sensitivity intranet wiki might require only a managed device. A contractor’s policy grants exactly the two apps their statement of work names and nothing adjacent. The connector can technically reach the subnet; the policy ensures the user reaches one app.

Conditional Access policy shape (illustrative):

Policy: ZTNA - Clinical Trial Document Store
  Assignments:
    Users:   grp-ClinicalTrials-Access (synced from Okta via SCIM)
    Apps:    [Private Access] clinical-docs.internal.pharma
  Conditions:
    Device platform:  Windows, macOS
    Sign-in risk:     block High
  Grant (require ALL):
    - Require compliant device          (Intune + CrowdStrike Falcon ZTA)
    - Require multifactor authentication
  Session:
    - Continuous Access Evaluation: enabled
    - Sign-in frequency: every 4 hours
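
The per-application granularity can be modeled as data: each segment names the groups allowed to reach it, and a user's reach is whatever falls out of that mapping. This is a sketch under assumptions. Only `clinical-docs.internal.pharma` and `grp-ClinicalTrials-Access` come from the policy above; the other FQDNs and group names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppSegment:
    fqdn: str
    allowed_groups: frozenset


SEGMENTS = [
    AppSegment("clinical-docs.internal.pharma",
               frozenset({"grp-ClinicalTrials-Access"})),
    AppSegment("wiki.internal.pharma",          # hypothetical app
               frozenset({"grp-AllStaff"})),    # hypothetical group
    AppSegment("lab-lims.internal.pharma",      # hypothetical app
               frozenset({"grp-CRO-Acme-SOW"})),  # hypothetical contractor group
]


def reachable_apps(user_groups, segments=SEGMENTS):
    """List the app FQDNs a user can reach; everything else stays invisible."""
    groups = set(user_groups)
    return sorted(s.fqdn for s in segments if s.allowed_groups & groups)
```

For a contractor who holds only the hypothetical `grp-CRO-Acme-SOW` group, `reachable_apps` returns a single FQDN, even though the connector serving it can technically reach the whole subnet.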

Enterprise considerations

Security & Zero Trust. The architecture is Zero Trust by construction: no network-level access, per-session and per-application authorization, least privilege to a single app instead of a subnet, and no internet-facing access appliance at all. Layer on top: (a) device-bound posture via Falcon ZTA + Intune so a compromised endpoint is denied in real time; (b) CAE so revocation is near-instant rather than waiting out a token’s lifetime; (c) Entra Internet Access source-IP restrictions so a stolen SaaS token cannot be replayed from outside the secure edge; (d) Wiz running continuous CSPM across the connector subnets and NSGs, alerting the moment a rule drifts to allow inbound or a connector is exposed, with Wiz Code scanning the Terraform/Ansible in the pull request so a misconfiguration is caught before merge, not after deploy; (e) a blocked-access spike or a Falcon high-severity detection auto-raises a ServiceNow incident so the SOC has a ticket, not just a log line. Azure Policy denies any NSG rule that opens inbound ports on a connector subnet, and Wiz independently verifies the policy is actually holding.
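
The "no inbound allow on a connector subnet" rule is simple enough to express as a pre-merge check. The sketch below is not how Azure Policy or Wiz Code implement it. It assumes the NSG rules have already been extracted (for example from a Terraform plan) into plain dictionaries with `name`, `direction`, and `access` keys.

```python
def inbound_allow_violations(rules):
    """Return the names of rules that allow any inbound traffic."""
    return [
        r["name"]
        for r in rules
        if r.get("direction", "").lower() == "inbound"
        and r.get("access", "").lower() == "allow"
    ]


# Example input shaped like the security_rule blocks of a connector NSG.
example_rules = [
    {"name": "deny-all-inbound", "direction": "Inbound", "access": "Deny"},
    {"name": "allow-https-egress", "direction": "Outbound", "access": "Allow"},
    {"name": "temp-rdp-debug", "direction": "Inbound", "access": "Allow"},
]
```

Run against `example_rules`, the check flags only the hypothetical `temp-rdp-debug` rule, the kind of drift a pull-request gate is meant to stop before it reaches production.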

Cost optimization. ZTNA’s economics are mostly a substitution, and that is the story to tell the CFO.

Lever	Mechanism	Typical effect
Retire VPN concentrators	Drop the HA VPN appliances + their licenses and patch toil	Eliminates a hardware/license line and a recurring CVE-response cost
Bundled licensing	Entra Private/Internet Access are licensed through Microsoft Entra Suite or as standalone add-ons; confirm what the org’s existing agreement already covers	Can avoid a separate third-party per-seat ZTNA SKU
Connector right-sizing	Scale connector VM count to throughput, not peak headroom	Small VNet footprint; autoscale the connector group
Egress consolidation	Web filtering at GSA replaces standalone SWG appliances	Collapses a second box into the same edge
Contractor scoping	Grant external users named apps only	Cuts the blast radius and the audit surface that costs review time

The largest saving is rarely on the invoice — it is the avoided breach and the shrunken audit scope: when a contractor can reach two named apps instead of a flat network, the population of systems an auditor must reason about for that contractor collapses.

Scalability. Each plane scales independently. The GSA edge is Microsoft-operated and scales with the service. Connectors scale horizontally — add VMs to a connector group for more throughput and resilience, place a group in every region and datacenter where apps live so the last hop stays local, and let Argo CD reconcile new connectors into the group declaratively. Conditional Access evaluation is per-request and scales with Entra. The natural ceiling is connector capacity per site and the WAN path from connector to app, which is why a global rollout plans connector placement region-by-region rather than backhauling all traffic to one datacenter the way the old VPN did.
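
Connector capacity per site reduces to simple arithmetic once throughput is measured. A rough sketch, assuming an N+1 sizing rule and a floor of two connectors for HA; the per-connector throughput figure is a placeholder that has to come from load testing in your own environment.

```python
import math


def connectors_needed(
    peak_mbps: float, per_connector_mbps: float, min_for_ha: int = 2
) -> int:
    """Size a connector group: enough for peak load, plus one spare, never below the HA floor."""
    if peak_mbps < 0 or per_connector_mbps <= 0:
        raise ValueError("peak must be >= 0 and per-connector throughput > 0")
    for_load = math.ceil(peak_mbps / per_connector_mbps)
    # N+1 so the group still carries peak traffic after losing one connector.
    return max(min_for_ha, for_load + 1)
```

With a hypothetical 900 Mbps peak and 400 Mbps per connector, three connectors carry the load and a fourth provides the spare, so the function returns 4.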

Failure modes, and what each one looks like. Name them before they page you.

All connectors in a group down: every app behind that group becomes unreachable; this is a hard outage, worse than a VPN node loss because there is no network fallback. Mitigation: minimum two connectors per group, spread across availability zones, with health alerts wired to Datadog and an on-call page.
Conditional Access misconfiguration: a too-broad policy silently grants more than intended, or a too-tight one locks out a legitimate population at 9am. Mitigation: policies in git via Argo CD with review, a report-only rollout stage before enforce, and Wiz Code checks on policy-as-code.
Falcon/Intune signal stale or failing open: if the posture feed breaks and compliance defaults permissive, the core control silently degrades. Mitigation: treat a missing posture signal as non-compliant (fail closed), and alert on posture-feed gaps in Dynatrace.
Okta-to-Entra federation outage: primary auth is down and no one can start a session. Mitigation: federation health monitoring, an emergency break-glass account path, and a documented runbook.
DNS mismatch: the GSA client tunnels an FQDN the connector cannot resolve, so sessions open and hang. Mitigation: assert connector-side private DNS in Terraform and a post-deploy smoke test per app segment.
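
The per-segment smoke test for the DNS failure mode can be very small. A minimal sketch, assuming it runs on a connector host (or anywhere that shares the connector's resolver) and that the list of segment FQDNs is supplied by the caller. The resolver is injectable so the check can be exercised without real DNS.

```python
import socket
from typing import Callable, Iterable


def resolves(fqdn: str) -> bool:
    """Return True if the local resolver can resolve the name."""
    try:
        socket.getaddrinfo(fqdn, None)
        return True
    except (OSError, UnicodeError):
        # socket.gaierror is a subclass of OSError; malformed names can raise UnicodeError.
        return False


def unresolved_segments(
    fqdns: Iterable[str], resolver: Callable[[str], bool] = resolves
) -> list[str]:
    """List the segment FQDNs that fail resolution from this host."""
    return [f for f in fqdns if not resolver(f)]
```

A post-deploy step would fail the rollout whenever `unresolved_segments` returns a non-empty list, which catches the "sessions open and hang" case before users do.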

Reliability & DR (RTO/RPO). ZTNA availability is mostly control-plane availability, so decide the numbers per dependency. The Entra/GSA control plane carries Microsoft’s SLA; your responsibility is connector redundancy (≥2 per group, multi-zone) and a break-glass access path for the handful of administrators who must get into critical systems if the federation or GSA itself is impaired — typically a tightly-scoped, heavily-audited emergency account that bypasses Okta. Because ZTNA holds no application data, there is no RPO for the access tier itself; RTO is the time to restore connector capacity or fail over to a paired region’s connector group, which a GitOps-driven Argo CD reconcile makes fast. A pragmatic target: RTO 15 minutes to restore access via a standby connector group, with the policy and connector definitions rebuildable from git in minutes. Akamai health checks drive failover for the residual public web origins that sit outside the ZTNA path.

Observability. Stream the Entra sign-in and Conditional Access logs plus connector telemetry into Datadog for the log pipeline and into Dynatrace for connector-host health and end-to-end session-latency tracing, with anomaly detection on both. Emit the metrics the business and security teams actually care about — access-decision outcomes (granted / blocked / step-up, sliced by app and by user population), connector health and saturation, per-app session latency (the thing remote users feel), compliant-vs-non-compliant device ratio, and CAE revocation events. A sustained spike in blocked access for a legitimate app is usually a policy or posture-feed regression and should page; it also auto-raises a ServiceNow incident so there is an owned ticket. New applications pass through a ServiceNow change approval before going live in ZTNA, giving quality and security a documented gate — non-negotiable in a GxP estate.
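
A "sustained spike in blocked access" needs a concrete definition before it can page anyone. One possible definition, sketched below, is that the blocked ratio exceeds a multiple of the normal baseline for every interval in a sliding window. The window length, baseline rate, and multiplier are hypothetical tuning values, and in practice this logic would live in a Datadog or Dynatrace monitor instead of application code.

```python
from collections import deque


class BlockedSpikeDetector:
    """Flags a spike when every interval in the window exceeds the threshold."""

    def __init__(self, window: int = 5, baseline_rate: float = 0.02,
                 factor: float = 5.0):
        self.window = deque(maxlen=window)
        self.threshold = baseline_rate * factor

    def observe(self, granted: int, blocked: int) -> bool:
        """Record one interval's counts and report whether the spike is sustained."""
        total = granted + blocked
        self.window.append(blocked / total if total else 0.0)
        full = len(self.window) == self.window.maxlen
        return full and all(rate > self.threshold for rate in self.window)
```

Requiring the whole window to be above threshold keeps a single noisy interval from paging, at the cost of a detection delay equal to the window length.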

Governance. Keep every Conditional Access policy, connector definition, and NSG rule in version control, reviewable and instantly revertable, reconciled by Argo CD so production matches git. Apply Azure Policy to deny inbound rules on connector subnets and require diagnostic settings on every connector, with Wiz as the independent check that the controls are real and Wiz Code enforcing the same as policy-as-code in pull requests. Log every access decision for audit and incident review. And because this is a validated environment, treat changes to a ZTNA policy that fronts a GxP system as change-controlled: the ServiceNow gate, the git history, and the report-only-then-enforce rollout together produce the evidence trail an auditor expects.

Explicit tradeoffs

Accept these or do not build it. ZTNA concentrates availability into the access control plane: when the VPN node died you lost a node, but when a connector group dies you lose every app behind it with no network-layer fallback — so connector redundancy is not optional, it is the architecture. The model only works if device posture is trustworthy: it presumes managed, enrolled, Falcon-and-Intune-covered endpoints, which means unmanaged or BYOD devices need a deliberate, separate answer (browser-isolation or a VDI landing zone), not a quiet exception that reintroduces the very risk you removed. Policy granularity is the whole value and also the whole operational burden — one segment and one policy per app is more configuration than a single VPN ACL, and it lives or dies on the discipline of keeping it in git and reviewed. The Okta-to-Entra federation adds a hop, a token-translation step, and the chore of mapping claims across two directories that single-IdP shops will not have. And the posture integration is a hard dependency on CrowdStrike + Intune health: if that feed fails open, your strongest control silently evaporates, which is why it must fail closed.

The alternatives, and when they win. If your applications are all modern HTTP and you have no thick clients, a pure identity-aware reverse proxy (Entra Application Proxy alone, or a SaaS ZTNA that only does web) is simpler and you can skip the connector-for-arbitrary-TCP complexity. If you are predominantly an on-prem datacenter shop with little remote workforce, firewall microsegmentation may be the higher-leverage Zero Trust investment first. If a population genuinely needs unmanaged-device access, VDI / remote browser isolation is the right tool and composes with this — ZTNA for managed endpoints, an isolated landing zone for the rest. And if you are a small team optimizing for speed, Entra Private Access on its own, without the Okta federation, the Falcon posture feed, and the full GitOps/Wiz governance, stands up working VPN-replacement ZTNA quickly; graduate to this full posture-aware, federated, change-controlled platform when regulation, scale, contractor populations, or an audit demand it.

The shape of the win

For the pharma security team, the payoff is not “a faster VPN.” It is that the next time a contractor’s laptop is phished, Falcon scores the device unhealthy, Conditional Access marks it non-compliant, CAE revokes the live session within minutes, and the device can reach exactly nothing — no flat subnet, no lateral hallway to the clinical-trial store, no nine-day dwell. The auditor’s scope for that contractor is two named applications instead of a network. And the VPN concentrators that were a standing line item on the threat brief are simply gone, replaced by outbound-only connectors with no port to attack. Everything upstream — the Okta-to-Entra federation, the per-app Conditional Access, the Falcon and Intune posture feeds, the Vault-held connector tokens, the Wiz posture scanning, the Argo CD GitOps, the Dynatrace and Datadog telemetry, the ServiceNow change gate — exists to make a CISO, a quality officer, and an auditor each say yes. The architecture here is the destination; retire the VPN in waves if you must, but this is where a regulated, distributed, contractor-heavy workforce’s access has to land.

Written by Vinod
