A Munich-based HR-tech SaaS vendor (payroll, performance reviews, and a learning portal for mid-market European employers) lost a seven-figure renewal with a German hospital group because it could not answer one question on the security questionnaire in writing: “Prove that our employees’ data has never left Germany, and show me the audit log.” The vendor’s platform was a single deployment in West Europe (Amsterdam), which is inside the EU but in the wrong country. It ran everyone’s data in one shared database and had no way to demonstrate where any given tenant’s records physically lived on any given day. The works council escalated, the DPO wrote a memo, and the deal evaporated. The mandate that came down from the CEO afterward was unambiguous: every tenant’s data must be provably pinned to a chosen EU region, a deletion request must complete in days with evidence, and a regulator must be able to verify residency without taking us at our word. This article is the reference architecture for that: a pan-European, multi-tenant SaaS built so that data residency is a structural guarantee, not a promise in a PDF.
The pressures here are specific to European B2B SaaS and they compound. Regulation is GDPR plus a thicket of sector and national rules — German Mitbestimmung (works-council co-determination) over employee data, French CNIL guidance, and increasingly data sovereignty clauses that demand not just “in the EU” but “in this member state, on EU-controlled infrastructure.” The data subject has enforceable rights — access, portability, and erasure — that a regulator expects you to honor inside a one-month statutory window, at scale, across every system that touched the person. Trust is commercial: the DPO and the works council are now blockers on the buying committee, and “trust us” loses to “here is the cryptographic and audit evidence.” And cost matters because the budget-conscious answer of “one big shared deployment” is exactly what failed. The architecture’s job is to make residency, erasability, and provability properties of the system rather than operational heroics.
Why the obvious shortcuts fail
Three shortcuts will be proposed in the first design meeting, and each fails a regulator in a predictable way.
“We’re in West Europe, that’s the EU, we’re fine.” Region is not country, and residency clauses increasingly name the member state. West Europe is the Netherlands; a German hospital’s works council may contractually require Germany (Azure Germany West Central, Frankfurt). One region cannot satisfy a German and a French and a Nordic sovereignty clause simultaneously.
“One shared multi-tenant database with a tenant_id column.” This is operationally cheapest and a residency dead end: every tenant’s rows sit on the same storage in one region, so you can never pin tenant A to Frankfurt and tenant B to Paris, you cannot cryptographically isolate one customer, and a single DSAR erasure risks touching shared infrastructure. It also makes “prove tenant A’s data never left Germany” unanswerable, because tenant A’s data was never separable in the first place: it lived next to everyone else’s, in whichever single region the shared database ran.
“We’ll handle deletion requests manually when they come in.” At a handful of tenants and a few requests a year, a human running SQL works. At hundreds of EU employers with tens of thousands of employees each, manual DSAR fulfilment misses the one-month deadline, misses backups and the search index and the audit copy and the third-party learning system, and produces no evidence that the deletion actually happened. Manual is how you get a fine.
The architecture below replaces all three with structure: per-tenant region pinning as a first-class routing decision, per-tenant cryptographic isolation with EU-held keys, and DSAR-as-a-pipeline that fans out across every data store and emits proof.
Architecture overview
The platform is a set of region-pinned tenant stamps sitting behind one global control plane. The single most important design decision is the split between a thin global control plane that holds only non-personal routing metadata, and per-region data planes that hold all personal data and never replicate it across a border. Get that boundary right and residency follows; blur it and you will leak personal data into a “global” table that defeats the whole exercise.
Control plane (no personal data, deliberately). A global front door routes tenants to their home region; it stores only a tenant directory — tenantId → homeRegion, plan, and routing config — and explicitly no employee data. This is what lets the system be global without any person’s data being global.
Data plane (per EU region, the only place personal data lives). Each EU region a customer can choose — Germany West Central (Frankfurt), France Central (Paris), Sweden Central (Gävle), West Europe (Amsterdam) — runs an identical, independently deployed regional stamp: an AKS workload, an Azure SQL / PostgreSQL database, Blob Storage for documents, and an Azure AI Search index, all with publicNetworkAccess = Disabled and reached only through Private Endpoints. A tenant’s data is created, stored, encrypted, indexed, backed up, and audited entirely within its home region. There is no cross-region replication of personal data, by construction.
Request path, following the flow:
- An employee or HR admin hits Akamai at the edge for TLS termination, anycast routing, and WAF/bot protection. Critically, Akamai is configured so that requests for an EU tenant are only ever steered to an EU origin — the edge is the first residency control, not just a performance layer.
- Identity federates through Okta as the customer-facing workforce IdP for tenants that bring their own SSO, brokered to Microsoft Entra ID so the platform’s own Azure resources see first-class Entra tokens and conditional-access policies apply. The token carries the tenantId claim that every downstream decision keys off.
- The request reaches the global control plane (a lightweight Azure Front Door + a small routing service), which looks up the tenant’s homeRegion and forwards to that region’s ingress. The control plane reads routing metadata only; it never sees a salary or a review.
- Inside the home region, the request lands on Azure API Management in internal VNet mode as the regional front door. APIM validates the Entra JWT, asserts that the token’s tenantId matches a tenant homed in this region (a request for a Frankfurt tenant arriving at the Paris stamp is rejected; a cross-region request is treated as a defect), and rate-limits per tenant.
- The request reaches the application on a private AKS cluster. It authenticates to data services via Workload Identity and pulls the few residual secrets (third-party payroll-connector tokens, the Okta introspection secret) from HashiCorp Vault, with each region running its own Vault namespace so secrets, like data, stay regional.
- The app reads and writes the regional Azure SQL/PostgreSQL and Blob Storage, and queries the regional Azure AI Search index for in-app search. Every one of these is encrypted with a customer-managed key (CMK) held in that region’s Azure Key Vault, so the cryptographic root of the data also lives in-country.
- Every data-access and administrative action emits a structured event to the region’s immutable audit store (append-only, WORM Blob with a legal-hold policy), tagged with tenant, region, actor, and purpose — the raw material for proving residency later.
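The audit event emitted at the last step of the request path can be sketched as a small typed record. This is a minimal illustration, assuming field names (`tenant_id`, `region`, `actor`, `purpose`, `action`, `resource`) that mirror the tags named above; the real schema, and the write to the WORM store, are not shown and would be the platform's own.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """One append-only record; field names are illustrative, not a fixed schema."""
    tenant_id: str
    region: str
    actor: str      # opaque principal ID, deliberately not a name or email
    purpose: str    # why the access happened, e.g. "payroll-run"
    action: str     # what was done, e.g. "read" or "delete"
    resource: str   # which store or object was touched
    timestamp: str  # UTC, ISO 8601


def build_audit_event(tenant_id, region, actor, purpose, action, resource, now=None):
    """Build an event, refusing to emit one with an empty field."""
    now = now or datetime.now(timezone.utc)
    event = AuditEvent(tenant_id, region, actor, purpose, action, resource,
                       now.isoformat())
    missing = [name for name, value in asdict(event).items() if not value]
    if missing:
        raise ValueError(f"audit event missing fields: {missing}")
    return event


def to_json_line(event):
    """Serialise with sorted keys so identical events produce identical bytes."""
    return json.dumps(asdict(event), sort_keys=True)
```

Rejecting incomplete events at build time matters more than it looks: an audit line without a purpose or region is exactly the record a regulator will ask about.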
Classification & governance plane. Microsoft Purview scans each regional data store, classifies columns and files (national ID, salary, health, performance rating) against EU-specific classifiers, and maintains the data map that tells the DSAR pipeline where personal data for a person actually lives. Wiz runs continuous cloud posture and sensitive-data-exposure scanning across every region, alerting the instant any store drifts to public exposure or any personal data lands in the control plane where it must never be.
Component breakdown
| Component | Service / tool | Role in the platform | Key residency choices |
|---|---|---|---|
| Edge | Akamai | TLS, anycast, WAF, bot mitigation | EU tenants steered to EU origins only; geo-aware routing as a residency control |
| Identity / SSO | Okta + Microsoft Entra ID | Tenant SSO (Okta) federated to Entra for native Azure RBAC | tenantId claim drives region routing; conditional access on Entra |
| Control plane | Azure Front Door + routing service | tenantId → homeRegion lookup, global ingress | Holds routing metadata only; no personal data, ever |
| Regional ingress | Azure API Management (internal) | Per-region front door, JWT validation, anti-cross-region check | Rejects any request whose tenant is not homed in this region |
| Application | AKS (private cluster, per region) | Business logic: payroll, reviews, learning | Identical stamp per region; data never leaves the stamp |
| Tenant data | Azure SQL / PostgreSQL (per region) | Structured personal data | CMK-encrypted with EU-held key; geo-redundancy pinned to EU-paired region |
| Documents | Blob Storage (per region) | Contracts, payslips, uploads | CMK-encrypted; infrastructure encryption on; EU-paired GRS |
| In-app search | Azure AI Search (per region) | Search over the tenant’s own data | CMK-encrypted index; one index never spans regions |
| Secrets | HashiCorp Vault (per-region namespace) | Connector tokens, introspection secrets | Regional namespaces; dynamic leases; Vault Agent injection |
| Encryption keys | Azure Key Vault (Managed HSM, per region) | Customer-managed keys, BYOK | Key material EU-held; per-tenant key option for crypto-shredding |
| Classification | Microsoft Purview | Discover + classify personal data, maintain the data map | EU classifiers; the map that powers DSAR fan-out |
| Audit trail | WORM Blob + legal hold (per region) | Immutable residency & access evidence | Append-only, time-locked; the regulator-facing proof |
| CSPM / data posture | Wiz / Wiz Code | Posture, sensitive-data exposure, IaC scanning | Alerts on public-exposure drift and personal data in the control plane |
| Runtime security | CrowdStrike Falcon | Workload runtime threat detection | Sensors on every regional node pool; detections to the SOC |
| Observability | Dynatrace / Datadog | Tracing, residency SLOs, anomaly detection | Telemetry scrubbed of personal data; per-region dashboards |
| ITSM / DSAR workflow | ServiceNow | DSAR intake, approvals, SLA tracking, evidence record | The system of record for the one-month clock and proof |
| Learning portal | Moodle (per region) | Employee training / LMS | Region-pinned per tenant; DSAR connector for course/PII data |
| CI / IaC | GitHub Actions / Jenkins + Argo CD + Terraform / Ansible | Build/test, GitOps deploy, infra & config as code | One stamp module deployed identically to every region |
A few choices carry the whole design and deserve the why.
Why per-tenant region pinning, and how the routing decision is made. The control plane’s tenant directory is the single source of truth for homeRegion, set at onboarding from the contract’s residency clause and immutable thereafter without a formal data-migration process. Every layer re-enforces it: Akamai steers EU tenants to EU origins only, the control plane routes by homeRegion, regional APIM rejects any token whose tenant is homed elsewhere, and Azure Policy denies deployments that would break the boundary, such as public network access or replication outside the EU. Residency is checked at these four independent layers so a bug in one does not silently move data across a border.
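The directory lookup and the regional re-check can be sketched in a few lines. This is an illustration under stated assumptions: the `TenantRecord` shape, the in-memory `DIRECTORY`, and the tenant IDs are hypothetical, and in the platform itself the regional check lives in APIM policy rather than application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantRecord:
    """Control-plane directory entry: routing metadata only, no personal data."""
    tenant_id: str
    home_region: str
    plan: str


class CrossRegionRequest(Exception):
    """Raised when a token's tenant is not homed in the stamp that received it."""


# Hypothetical directory contents, for illustration only.
DIRECTORY = {
    "t-hospital-de": TenantRecord("t-hospital-de", "germanywestcentral", "enterprise"),
    "t-retail-fr": TenantRecord("t-retail-fr", "francecentral", "standard"),
}


def route(tenant_id: str) -> str:
    """Control plane: resolve the home region from the validated tenantId claim."""
    record = DIRECTORY.get(tenant_id)
    if record is None:
        raise KeyError(f"unknown tenant: {tenant_id}")
    return record.home_region


def assert_homed_here(tenant_id: str, stamp_region: str) -> None:
    """Regional ingress: re-check independently instead of trusting the upstream hop."""
    home = route(tenant_id)
    if home != stamp_region:
        raise CrossRegionRequest(
            f"{tenant_id} is homed in {home}, not {stamp_region}"
        )
```

The second function repeats work the first already did, on purpose: the regional stamp must be able to refuse a misrouted request even when the control plane got it wrong.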
Why EU-held customer-managed keys, and the crypto-shredding payoff. Azure encrypts everything at rest by default, but the keys are Microsoft-managed. For a sovereignty clause, the customer wants the key material to be EU-held and, ideally, customer-controlled. Each region’s Azure Key Vault Managed HSM holds the CMKs; SQL, Blob, and AI Search are configured to encrypt with them. Going further, issuing a per-tenant key turns erasure into crypto-shredding: destroy the tenant’s key and every backup, snapshot, and archived copy encrypted under it becomes unrecoverable once the key is permanently purged. Soft-delete and purge protection on the vault, which CMK configurations generally require, keep a deleted key recoverable for the configured retention period, so the erasure evidence should record the purge date rather than the delete date. This solves the hardest DSAR problem: deleting data from immutable backups you cannot selectively edit.
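The crypto-shredding idea is easier to see in miniature. The sketch below is a toy, assuming an in-memory `TenantKeyStore` standing in for the regional key vault and a SHA-256 counter-mode keystream standing in for real encryption. It is not production cryptography and not the Key Vault API; it only shows that ciphertext copies outlive the key but cannot be read without it.

```python
import hashlib
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


class TenantKeyStore:
    """Hypothetical stand-in for a per-region vault holding one key per tenant."""

    def __init__(self):
        self._keys = {}

    def create(self, tenant_id: str) -> None:
        self._keys[tenant_id] = secrets.token_bytes(32)

    def encrypt(self, tenant_id: str, plaintext: bytes) -> bytes:
        nonce = secrets.token_bytes(16)
        stream = _keystream(self._keys[tenant_id], nonce, len(plaintext))
        return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))

    def decrypt(self, tenant_id: str, blob: bytes) -> bytes:
        key = self._keys[tenant_id]  # raises KeyError once the key is shredded
        nonce, body = blob[:16], blob[16:]
        stream = _keystream(key, nonce, len(body))
        return bytes(a ^ b for a, b in zip(body, stream))

    def shred(self, tenant_id: str) -> None:
        """Destroy the key; every surviving ciphertext copy becomes unreadable."""
        del self._keys[tenant_id]
```

A backup taken before `shred` still exists byte for byte afterwards, which is the point: the immutable copy is never edited, it simply stops being decryptable.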
Why Purview is not optional. You cannot fulfil a DSAR you cannot scope. Microsoft Purview discovers and classifies personal data across every regional store and keeps the data map that answers “for employee X, which tables, blobs, search indexes, and the Moodle LMS hold their data?” Without that map, DSAR fan-out is guesswork and you will miss a store.
DSAR automation — the deletion and access pipeline
A data-subject request (access, portability, or erasure) is the operation a regulator audits hardest, so it is engineered as a pipeline with evidence at every step, not a manual SQL session.
- Intake — a request arrives (employee self-service, or HR on their behalf) and opens a ServiceNow DSAR case that starts the one-month statutory clock and tracks the SLA. ServiceNow is the system of record the regulator is shown.
- Scope — the pipeline queries the Purview data map to enumerate every store in the tenant’s home region holding that person’s data: SQL tables, Blob containers, the AI Search index, and the Moodle LMS (enrolments, grades, forum posts).
- Fan-out — for an access/portability request, the pipeline extracts and packages the person’s data into a portable archive (machine-readable JSON/CSV). For an erasure request, it deletes or irreversibly anonymises across every enumerated store, including the search index and the LMS, and tombstones the records.
- Backups — the hard part. Immutable, time-locked backups cannot be selectively edited, so erasure relies on crypto-shredding (destroy the per-tenant or per-subject key so backup ciphertext is unrecoverable) plus a documented backup-retention expiry, and the case records which mechanism applied.
- Evidence — every action writes to the region’s WORM audit store and back to the ServiceNow case: what was deleted, from which stores, by which mechanism, at what time. That evidence package is the deliverable that closes the case and satisfies the regulator.
```yaml
# DSAR fan-out (illustrative orchestration over a tenant's home-region stamp)
dsar:
  subjectId: "emp-83f1"
  type: "erasure"
  homeRegion: "germanywestcentral"
  scope_from: purview_data_map          # enumerate stores holding this subject
  targets:
    - store: azuresql.employees         # delete rows + dependents
    - store: blob.payslips              # delete objects + versions
    - store: aisearch.index             # delete documents from index
    - store: moodle.lms                 # delete enrolments, grades, posts
  backups:
    # destroy the per-subject key -> backup ciphertext dead
    # (the per-tenant CMK is destroyed only when the whole tenant is offboarded)
    mechanism: crypto_shred
  evidence:
    write_to: [worm_audit, servicenow_case]  # immutable proof + SLA record
```
The pipeline itself runs entirely within the tenant’s home region — a Frankfurt tenant’s DSAR never executes against, or copies data to, another region.
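The scope, fan-out, and reconcile steps can be sketched as one function. This is a minimal illustration, assuming plain dictionaries as stand-ins for the Purview data map and for each regional store; `run_erasure` and its argument shapes are hypothetical, not a Purview or ServiceNow API.

```python
from datetime import datetime, timezone


def run_erasure(subject_id, data_map, stores, evidence_log):
    """Erase a subject from every store the data map lists, then reconcile.

    data_map: {subject_id: [store_name, ...]}    stand-in for the Purview map
    stores:   {store_name: {subject_id: record}} stand-ins for SQL, Blob, Search, LMS
    Returns "closed" only if no listed store can still hold the subject.
    """
    targets = data_map.get(subject_id, [])
    for name in targets:
        store = stores.get(name)
        if store is None:
            # A store in the map that cannot be reached is recorded, never skipped.
            evidence_log.append({"store": name, "result": "unreachable"})
            continue
        removed = store.pop(subject_id, None) is not None
        evidence_log.append({
            "store": name,
            "result": "deleted" if removed else "not_found",
            "at": datetime.now(timezone.utc).isoformat(),
        })

    # Reconcile against the map: an unreachable store, or one that still holds
    # the subject, counts as a leftover and fails the case.
    leftovers = [n for n in targets if n not in stores or subject_id in stores[n]]
    status = "failed" if leftovers else "closed"
    evidence_log.append(
        {"subject": subject_id, "status": status, "leftovers": leftovers}
    )
    return status
```

The design choice worth copying is the final reconciliation: the case status is derived from what remains in the stores, not from whether the delete calls returned without error.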
Proving residency to a regulator
“Trust us” is what lost the original deal, so the architecture produces three independent classes of evidence a DPO can hand an auditor.
| Evidence type | Source | What it proves |
|---|---|---|
| Where data lives | Purview data map + resource inventory (Terraform state) | Every store holding a tenant’s data is in the declared region |
| Who touched it | WORM audit trail (append-only, legal-hold) | Every access/admin action, tamper-evident, with actor + purpose |
| That keys are EU-held | Key Vault Managed HSM region + RBAC logs | Cryptographic root of the data is in-country and access-controlled |
The audit store is WORM (write-once-read-many) Blob with an immutability/legal-hold policy, so records cannot be altered or deleted even by an administrator within the retention window — exactly the tamper-evidence a regulator wants. Wiz independently and continuously verifies the posture the policies claim: that no regional store is publicly exposed, that personal data has not leaked into the control plane, and that CMK encryption is actually enabled — turning “we configured it correctly” into “an independent scanner confirms it is correct, today.”
Implementation guidance
One stamp, deployed identically to every region. The regional stamp is a single Terraform module — VNet, private AKS, SQL/PostgreSQL, Blob, AI Search, Key Vault Managed HSM, Private Endpoints, private DNS zones — parameterised only by region. A new member-state region is a new instantiation of the same module, which is what keeps four (and someday eight) regions operable by a small team. Ansible handles in-stamp configuration management for the VM-based pieces (the Moodle tier, any virtual appliances such as a third-party data-loss-prevention or secure-gateway appliance a sovereignty customer mandates in-region).
A minimal Terraform shape for the regional SQL database communicates the intent — EU-held CMK, no public access:
```hcl
resource "azurerm_mssql_server" "tenant" {
  name                          = "sql-hrsaas-${var.region_short}"
  location                      = var.region   # e.g. germanywestcentral
  public_network_access_enabled = false        # private endpoint only
  azuread_administrator { ... }                # Entra-only auth
  # resource_group_name, version, and the server identity that reads the key
  # are omitted from this sketch
}

resource "azurerm_mssql_server_transparent_data_encryption" "tde" {
  server_id        = azurerm_mssql_server.tenant.id
  key_vault_key_id = azurerm_key_vault_key.tenant_cmk.id  # EU-held CMK in-region
}
```
The pipeline that deploys it runs in GitHub Actions (or Jenkins for teams already standardised there) authenticating to Azure via OIDC federation — no stored service-principal secret to leak. Application rollout to the regional AKS clusters is GitOps via Argo CD: a single desired-state repo reconciled into every region, so all stamps stay byte-identical and drift is detected and reverted automatically. Wiz Code scans the Terraform in the pipeline so a misconfiguration — a public endpoint, a missing CMK, geo-replication pointed outside the EU — is caught before it deploys, not after a scanner finds it in production.
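The kind of pre-deploy check described here can be approximated with a small script over the JSON form of a Terraform plan (`terraform show -json`), which exposes a `resource_changes` list with the planned attributes under `change.after`. This sketch is not Wiz Code and does not replace it; the allow-list contents and the two rules checked are assumptions chosen to match the stamp described in this article.

```python
import json

# Assumed allow-list: the regions a stamp or its in-EU pair may be deployed to.
EU_ALLOWED_REGIONS = frozenset({
    "germanywestcentral", "germanynorth",
    "francecentral", "francesouth",
    "swedencentral", "westeurope",
})


def _normalise(location: str) -> str:
    """Treat 'Germany West Central' and 'germanywestcentral' as the same region."""
    return location.replace(" ", "").lower()


def check_plan(plan: dict) -> list:
    """Return human-readable violations found in a Terraform plan document."""
    violations = []
    for rc in plan.get("resource_changes", []):
        after = (rc.get("change") or {}).get("after") or {}  # None on delete
        address = rc.get("address", "<unknown>")
        location = after.get("location")
        if location is not None and _normalise(location) not in EU_ALLOWED_REGIONS:
            violations.append(f"{address}: location {location!r} is outside the allow-list")
        if after.get("public_network_access_enabled") is True:
            violations.append(f"{address}: public network access is enabled")
    return violations


def load_plan(path: str) -> dict:
    """Read the output of `terraform show -json <planfile>` saved to disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

In a pipeline, a non-empty result from `check_plan(load_plan("plan.json"))` would fail the job before `terraform apply` runs.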
Identity: federate the humans, key off the tenant. Tenant SSO flows Okta → Entra for customers who bring their own IdP; the platform’s own access is Entra with Workload Identity on AKS and least-privilege RBAC scoped per regional resource. The tenantId claim is load-bearing — it drives the control-plane routing decision and the regional anti-cross-region check — so it is validated, never trusted from a header. Residual secrets live in the region’s Vault namespace, leased dynamically and injected by the Vault Agent sidecar so nothing sensitive sits in a Kubernetes Secret.
Enterprise considerations
Security & Zero Trust. Zero Trust by construction: identity-based access only, least-privilege RBAC per regional resource, no public data-plane surface. Layered on top: Wiz for continuous CSPM, sensitive-data-exposure, and attack-path analysis across every region (and the control-plane leak check that matters most here); CrowdStrike Falcon sensors on every regional AKS node pool and the Moodle/appliance compute for runtime threat detection feeding the SOC; and Azure Policy denying any SQL, Blob, or Search resource created with public network access or without CMK encryption, with Wiz as the independent check that the policy actually holds. A posture breach auto-raises a ServiceNow incident so security gets a ticket, not just a log line.
Cost optimization. Per-region stamps cost more than one shared deployment — that is the deliberate price of residency — so engineer to keep the premium bounded.
| Lever | Mechanism | Typical effect |
|---|---|---|
| Stamp consolidation | Co-home several small tenants of the same country on one regional stamp | Avoids a half-idle stamp per tiny customer |
| Right-sized regions | Run popular regions (DE, FR) hot; smaller ones at minimum viable scale | Pays for demand, not for symmetry |
| Shared global control plane | One thin, stateless routing tier serves all regions | No per-region duplication of routing infra |
| AKS autoscaling | KEDA/cluster autoscaler per stamp on real concurrency | No peak-sized clusters sitting idle |
| Reserved + spot mix | Reservations for steady stamps, spot for batch/ingestion | Cuts compute bill on predictable load |
Meter cost per region and per tenant and pipe it to Dynatrace (or Datadog) for the per-customer margin view the CFO needs — residency-as-a-premium-tier only works if you can see its cost.
Scalability. Each regional stamp scales independently: AKS on concurrency via KEDA and the cluster autoscaler, SQL/PostgreSQL by tier and read replicas within the same region, AI Search by replicas (QPS) and partitions (index size) sized separately. The platform scales horizontally by adding stamps — a new member-state region is a Terraform instantiation, not a re-architecture. The natural ceiling is operational: each new region multiplies the surface to patch and monitor, which is exactly why the single-stamp module and GitOps reconciliation are non-negotiable.
Failure modes, and what each looks like. Name them before they page you — or before they become a breach notification.
- Personal data leaks into the control plane. A well-meaning feature caches a name or email in the global routing tier; suddenly personal data is “global” and residency is broken. Mitigation: control plane holds routing metadata only, enforced by code review, by Wiz sensitive-data scanning of the control-plane stores, and by a contract test that fails the build if a personal-data field appears there.
- Cross-region request slips through. A misrouted call reads a Frankfurt tenant against the Paris stamp. Mitigation: APIM’s anti-cross-region check rejects any token whose tenant is not homed in this region, treated as a hard defect with an alert.
- Geo-replication misconfigured outside the EU. Someone enables GRS to a non-EU paired region. Mitigation: Wiz Code scans Terraform for any replication target outside an EU allow-list and blocks the merge; Azure Policy denies it at deploy.
- DSAR misses a store. Erasure runs but skips the search index or the Moodle LMS, leaving recoverable personal data. Mitigation: scope strictly from the Purview data map and reconcile the deletion result against it; a leftover triggers a failed case, not a closed one.
- Backup erasure gap. Immutable backups still hold a “deleted” person. Mitigation: crypto-shredding via a per-subject key (or the per-tenant key when the whole tenant is offboarded) plus documented retention expiry, both recorded in the case.
- Regional outage — see DR below.
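The contract test mentioned for the first failure mode can be sketched as an allow-list check over control-plane records. The field names in `ALLOWED_CONTROL_PLANE_FIELDS` are assumptions based on the tenant directory described earlier, and an allow-list is used instead of a deny-list of personal-data names so that an unanticipated field fails the build by default.

```python
# Assumed schema of a tenant-directory record; anything else is suspect.
ALLOWED_CONTROL_PLANE_FIELDS = frozenset(
    {"tenantId", "homeRegion", "plan", "routingConfig"}
)


def control_plane_violations(records):
    """Return (record_index, unexpected_field_names) for each offending record."""
    violations = []
    for index, record in enumerate(records):
        extra = sorted(set(record) - ALLOWED_CONTROL_PLANE_FIELDS)
        if extra:
            violations.append((index, extra))
    return violations


def test_control_plane_holds_routing_metadata_only():
    """Fails the build if any field outside the allow-list appears."""
    # In a real suite this sample would come from the control-plane store or
    # its schema definition; the literal below is illustrative only.
    sample = [
        {"tenantId": "t-1", "homeRegion": "francecentral", "plan": "standard"},
    ]
    assert control_plane_violations(sample) == []
```

This only inspects top-level keys, so a nested `routingConfig` would need its own allow-list before the test could be called complete.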
Reliability & DR (RTO/RPO) — within residency limits. DR must respect the residency boundary: a German tenant cannot fail over to Ireland. So DR is intra-region or EU-paired-region only. SQL/PostgreSQL use zone-redundancy within the region and geo-redundant backup to the EU-paired region only (Germany West Central ↔ Germany North; France Central ↔ France South). Blob is EU-paired GRS. AI Search has no native cross-region replication, so DR means a warm index re-ingested in the paired EU region. A pragmatic target per stamp: RTO 30 minutes, RPO 5 minutes, with the constraint that every recovery path stays inside the EU and, where the contract demands it, inside the member state. Akamai health checks drive ingress failover to the surviving in-EU origin.
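The residency constraint on failover can be made explicit in code, so that a recovery runbook cannot pick a target the contract forbids. This is a sketch under assumptions: the two German and French pairs come from the text above, while the Sweden and West Europe pairs and the `residency_scope` values are additions for illustration and should be checked against current Azure region-pair documentation.

```python
# Country of each region, used to decide whether a pair crosses a border.
REGION_COUNTRY = {
    "germanywestcentral": "DE", "germanynorth": "DE",
    "francecentral": "FR", "francesouth": "FR",
    "swedencentral": "SE", "swedensouth": "SE",
    "westeurope": "NL", "northeurope": "IE",
}

# Paired region per home region (assumption beyond the DE and FR pairs).
PAIRED_REGION = {
    "germanywestcentral": "germanynorth",
    "francecentral": "francesouth",
    "swedencentral": "swedensouth",
    "westeurope": "northeurope",
}


def failover_target(home_region: str, residency_scope: str):
    """Return the permitted failover region, or None if only in-region recovery is allowed.

    residency_scope: "eu" allows any in-EU pair; "member_state" requires the
    pair to be in the same country as the home region.
    """
    if residency_scope not in ("eu", "member_state"):
        raise ValueError(f"unknown residency scope: {residency_scope}")
    pair = PAIRED_REGION.get(home_region)
    if pair is None:
        return None
    if residency_scope == "member_state":
        same_country = REGION_COUNTRY[pair] == REGION_COUNTRY[home_region]
        return pair if same_country else None
    return pair
```

A `None` result is a meaningful answer: for a member-state contract homed in West Europe, it says recovery has to rely on zone redundancy inside the region, because the pair sits in another country.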
Observability. Trace requests end to end in Dynatrace (or Datadog), but with a residency twist: telemetry and logs are scrubbed of personal data before leaving the stamp, because an observability pipeline that ships salaries to a global APM backend re-creates the exact leak you are preventing. Emit the SLOs the business and the DPO care about: residency-violation count (must be zero), DSAR cycle time against the one-month clock, percentage of stores covered by the Purview map, CMK-encryption coverage, and per-region availability. ServiceNow tracks the DSAR SLA and is the system of record that compliance and auditors are shown.
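Scrubbing before export can be as simple as a recursive pass over each telemetry record. The sketch below assumes a hypothetical set of sensitive key names and a basic email pattern; it is neither a Dynatrace nor a Datadog feature, and a real deployment would drive the key list from the classification data instead of a hard-coded set.

```python
import re

# Assumed key names; in practice these would follow the platform's classifiers.
SENSITIVE_KEYS = frozenset({"name", "email", "salary", "national_id", "iban"})
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
REDACTED = "[REDACTED]"


def scrub(value, key=None):
    """Return a copy of a telemetry value with personal data redacted.

    Values under a sensitive key are replaced outright; free-text strings have
    email addresses masked; dicts and lists are walked recursively.
    """
    if key is not None and str(key).lower() in SENSITIVE_KEYS:
        return REDACTED
    if isinstance(value, dict):
        return {k: scrub(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [scrub(item) for item in value]
    if isinstance(value, str):
        return EMAIL_PATTERN.sub(REDACTED, value)
    return value
```

Opaque identifiers such as a tenant ID or an internal employee ID pass through untouched, which keeps traces joinable without carrying the personal data itself.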
Governance. Pin every regional stamp to its declared member state and make homeRegion immutable without a formal, audited migration. Keep the stamp Terraform and Argo CD desired-state in version control, reviewable and instantly revertable. Apply Azure Policy to deny public network access and require CMK encryption and diagnostic settings on every relevant resource, with Wiz as the independent verifier. Retain the WORM audit trail for the contractually required period under legal hold. And maintain a Record of Processing Activities (RoPA) that the Purview data map keeps honest: the data map is the live, machine-verified version of the RoPA spreadsheet a regulator asks for.
Explicit tradeoffs
Accept these or do not build it. Per-tenant region pinning multiplies operational surface — four regions is four of everything to patch, monitor, and pay for, and the cost premium over one shared deployment is real and permanent; it is the price of a sovereignty guarantee and it only pencils out as a premium pricing tier. Crypto-shredding makes erasure tractable but couples deletion to key management: lose a tenant’s key by accident and you have “deleted” a live customer, so key lifecycle becomes safety-critical. The strict control-plane/data-plane split costs you the convenience of a single global table and forces discipline forever — one cached email in the wrong tier breaks the model. Scrubbing personal data out of telemetry costs you some debugging fidelity. And the four-layer residency enforcement (edge, control plane, regional APIM, policy) is redundant on purpose — overhead you can skip for a single-region MVP and absolutely cannot skip when a works council is auditing you.
The alternatives, and when they win. If you serve one country only, a single in-country region with strong encryption and DSAR tooling is far simpler — skip the global control plane entirely until a second member state demands it. If your data is genuinely non-personal or fully anonymised, residency relaxes and a single multi-region-replicated deployment is fine. If you need member-state guarantees stronger than commercial Azure offers — air-gapped or government-cloud sovereignty — a sovereign-cloud offering (e.g. an EU “sovereign cloud” SKU or a local partner) is the right tool, at higher cost and lower service breadth. And if you are pre-product-market-fit, do single-region first and graduate to this stamped architecture the moment a deal hinges on residency — which, for European HR-tech, is sooner than founders expect.
The shape of the win
For the HR-tech vendor, the payoff is not “we moved to the cloud.” It is that the next German hospital’s security questionnaire asks “prove our employees’ data has never left Germany, and show me the audit log,” and the answer is a Purview data map showing every store in Frankfurt, a WORM audit trail of every access, a Key Vault attestation that the encryption keys are EU-held, and a ServiceNow record of a DSAR fulfilled in four days with evidence — handed over in writing, before the works council even asks twice. That packet is what wins the renewal the single shared deployment lost. Everything upstream — the per-region stamps, the EU-held CMKs and crypto-shredding, the control-plane/data-plane split, the DSAR pipeline, the Wiz posture verification, the Okta-to-Entra federation — exists to turn “trust us” into “here is the proof.” Start single-region if you must; this is where pan-European, residency-bound SaaS has to land.