A multinational health-insurance carrier has a data problem that has finally become a board problem. Claims, eligibility, provider networks, and actuarial models live in eleven separate systems across three regions, and every quarter the analytics team spends two weeks stitching extracts together by hand to produce the loss-ratio numbers the regulator and the reinsurers both demand. Worse, the last audit flagged that a long-departed contractor’s credentials still had read access to a member-level claims extract sitting on a file share — exactly the kind of finding that, in a HIPAA-and-Solvency-II world, turns into a remediation order and a line item in the annual report. The mandate from the CFO and the CISO together is blunt: one governed data platform, no member data on the public internet, no standing human credentials with direct data access, and a defensible audit trail for every query. Snowflake on Azure is the answer this article builds — not the marketing version, but the version where every connection rides Private Link, every human logs in through Entra, every service account uses a short-lived key issued by Vault, and the whole thing is watched by Datadog.
The pressures here are the ones that define regulated data engineering. Regulation means data residency per region and an auditable trail of who touched what. Consolidation means a single source of truth that the actuarial, fraud, and finance teams can all trust instead of eleven divergent extracts. Security means the network can never be the soft underbelly — no public endpoint, no shared password in a config file. And cost means a platform whose compute bill scales with adoption and therefore has to be metered, attributed, and capped before it surprises the CFO. Snowflake fits because it separates storage from compute, lets you size and isolate workloads independently, and — critically for this carrier — supports Azure Private Link, Entra federation, and key-pair service auth as first-class features rather than bolt-ons.
Why Snowflake on Azure, and why not the obvious alternatives
Three shortcuts will be proposed in the first planning meeting, and each fails for a reason worth naming.
“Just use a big SQL warehouse on a VM” couples storage and compute, so the month-end actuarial run starves the daily fraud queries, and scaling means a painful resize with downtime. “Point Power BI directly at the eleven source systems” leaves you with no governed layer, no single security model, and eleven sets of credentials to audit — it reproduces the exact mess the project exists to kill. “Use Snowflake but over the public endpoint with passwords” technically works and is how most proofs-of-concept start, but it puts member data on an internet-reachable endpoint and leaves password rotation to human discipline, which is precisely what the last audit caught.
Snowflake’s architecture — a cloud-services layer, independent virtual warehouses for compute, and centralized storage on Azure Blob underneath — lets the actuarial team run a four-extra-large warehouse for month-end while the fraud team runs a small one continuously, with neither contending for the other’s resources. Put that behind Private Link, federate the humans through Entra, and broker the machine credentials through Vault, and you have a platform whose security posture is structural, not procedural.
Architecture overview
The platform has three planes that are useful to hold separately in your head: a network/identity plane that decides who and what can connect, an ingestion plane that lands and transforms data, and a consumption plane where analysts and applications query. They share the Snowflake account but operate on different schedules and have different threat models.
The defining property — the one the CISO signs on — is that Snowflake’s account URL resolves to a private IP inside the carrier’s Azure VNet, and public access to the account is blocked by network policy. Analysts in the office, the ingestion jobs running on Azure, and the BI service all reach Snowflake over Azure Private Link via a private endpoint; nothing traverses the public internet, which is what makes the residency and member-data story defensible.
Connectivity and identity, following the path a request takes:
- A human analyst opens the BI tool or the Snowflake web UI. Authentication is SSO through Microsoft Entra ID configured as the SAML/OIDC identity provider for the Snowflake account. The analyst never types a Snowflake password — they authenticate against Entra (with the carrier’s conditional-access policies and MFA), and Entra asserts the identity to Snowflake. Because the carrier’s directory is mastered in Okta for some legacy SaaS, Okta federates into Entra so there is one human identity fabric, and Entra is the single IdP Snowflake trusts.
- Entra SCIM provisioning pushes users and groups into Snowflake automatically. When HR marks the departed contractor as a leaver, Entra de-provisions them in Snowflake within the sync window — closing the exact gap the audit found. Entra groups map to Snowflake roles, so access is granted by group membership, never by hand-editing grants.
- The connection itself rides Private Link: a private endpoint in the VNet maps to Snowflake’s Azure Private Link service, and a private DNS zone (
privatelink.snowflakecomputing.com) makes the account URL resolve to that private IP. A Snowflake network policy additionally restricts the account to the carrier’s private link integration, so even a leaked credential cannot connect from the open internet. - Service accounts — the ingestion jobs, the dbt runner, the BI gateway — do not use passwords at all. They authenticate with key-pair auth, and the private keys are issued and rotated by HashiCorp Vault, never written to disk in a job definition. More on this below; it is the heart of the “no standing credential” requirement.
Ingestion plane, event-driven and batch: source data lands in Azure Data Lake Storage (ADLS) Gen2, written by extract jobs from the eleven source systems. Snowflake reads it through an external stage backed by a storage integration — Snowflake assumes an Entra-managed identity to read the container, so there is no storage account key embedded anywhere. Snowpipe auto-ingests new files on Azure Event Grid notifications for near-real-time loads (eligibility changes, daily claims), while heavier historical loads run as scheduled COPY INTO batches. Once raw data lands in Snowflake, dbt transformations (run by GitHub Actions CI on a schedule and on merge) build the medallion layers — raw to conformed to curated marts — with tests gating every model so a broken loss-ratio calculation never reaches a mart.
Consumption plane: governed marts are exposed to Power BI, to the actuarial modeling tools, and to a fraud-scoring service, each through its own right-sized virtual warehouse so workloads never contend. Row-access policies and dynamic data masking ensure an analyst in the EU region cannot see US member identifiers, and a fraud analyst sees masked SSNs unless their role explicitly unmasks.
Component breakdown
| Component | Service / tool | Role in the platform | Key configuration choices |
|---|---|---|---|
| Connectivity | Azure Private Link | Private-only path from VNet to Snowflake | Private endpoint + privatelink.snowflakecomputing.com DNS zone; network policy locks account to the link |
| Human identity / SSO | Microsoft Entra ID (Okta-fed) | SSO and SCIM into Snowflake; no human passwords | SAML/OIDC IdP; SCIM auto-provision; Entra groups → Snowflake roles; conditional access + MFA |
| Machine identity | HashiCorp Vault | Issues and rotates key-pair private keys for service accounts | Dynamic/short-lease key material; no key on disk; per-account scoping |
| Lake storage | ADLS Gen2 | Landing zone for source extracts; external stage backing | Hierarchical namespace; storage integration via Entra managed identity (no account key) |
| Ingestion (streaming) | Snowpipe + Event Grid | Auto-load new files with low latency | Event Grid notification integration; per-table pipes |
| Ingestion (batch) | COPY INTO jobs |
Bulk historical and daily batch loads | Scheduled tasks; file-format objects; load metadata for idempotency |
| Transformation | dbt + GitHub Actions | Medallion modeling, tests, documentation | OIDC to Azure (no stored creds); test gate; key-pair auth to Snowflake |
| Compute | Snowflake virtual warehouses | Isolated, right-sized compute per workload | Auto-suspend; auto-resume; multi-cluster for concurrency; resource monitors |
| Governance | Snowflake RBAC + policies | Roles, row-access, masking, object tagging | Role hierarchy; masking on PII; row policy by region; tag-based classification |
| Observability | Datadog | Query, warehouse, cost, and freshness monitoring | Snowflake integration; credit-spend monitors; data-freshness SLOs; alert routing |
| Secrets posture | HashiCorp Vault + Wiz | Credential brokering + posture/exposure scanning | Vault for keys; Wiz scans ADLS and config drift to public exposure |
| ITSM / approvals | ServiceNow | Access requests, role-grant change records, incidents | Change gate for new role grants; auto-ticket on guardrail breach |
A few choices carry the weight of the design and deserve the why.
Why Private Link instead of an IP allow-list. A Snowflake network policy with allowed IP ranges is better than nothing, but it still leaves the account on a public endpoint and depends on every office and egress IP being correctly enumerated forever. Private Link removes the public attack surface entirely: the account is reachable only through the private endpoint inside the VNet, so a stolen credential used from anywhere else simply cannot establish a connection. For member data under HIPAA, that structural guarantee is worth the extra DNS and endpoint plumbing.
Why key-pair auth brokered by Vault, not passwords or long-lived keys. Snowflake service accounts can use passwords, but passwords get committed to repos and shared in chat. Key-pair auth is far stronger, but a long-lived private key sitting in a CI variable is itself a standing credential — exactly what the mandate forbids. The pattern here is Vault as the broker: a job authenticates to Vault (via its Azure managed identity), Vault hands back a short-lived private key for the Snowflake service user, the job uses it for the run, and it expires. No human ever holds the key, and there is nothing durable to steal. Snowflake supports two registered public keys per user precisely to allow zero-downtime rotation — Vault rotates one slot while the other stays valid.
-- Snowflake side: a service user that can ONLY auth by key pair
CREATE USER svc_ingest
DEFAULT_ROLE = ingest_loader
DEFAULT_WAREHOUSE = wh_ingest
RSA_PUBLIC_KEY = 'MIIBIjANBgkq...' -- primary, rotated by Vault
MUST_CHANGE_PASSWORD = FALSE;
ALTER USER svc_ingest UNSET PASSWORD; -- no password path at all
Why storage integration over a SAS token or account key. An external stage could be wired with a SAS token, but that token is a bearer credential with an expiry someone has to manage. A storage integration instead lets Snowflake assume an Entra-managed identity that you grant Storage Blob Data Reader on the container — no key, no token, and access governed by Azure RBAC you already audit. It is the same “kill the standing credential” principle applied to the lake.
Implementation guidance
Provision with Terraform, and stand the network up first. As with any private-endpoint design, private DNS is the part that silently breaks: get it wrong and the Snowflake URL resolves to a public IP that the network policy then refuses, and connections hang. Order of operations:
- The VNet and subnets (a subnet for private endpoints, subnets for the ingestion compute and the BI gateway).
- The private endpoint to Snowflake’s Azure Private Link service, plus the private DNS zone
privatelink.snowflakecomputing.comlinked to the VNet, with the account’s privatelink URL records. - The Snowflake account-level network policy that allows the private link integration and blocks public access.
- The storage integration and external stage against ADLS Gen2, with the managed identity granted reader on the container.
- Entra enterprise application for Snowflake (SSO + SCIM), with group-to-role mappings.
A minimal Terraform shape for the warehouse and a resource monitor communicates the cost-and-isolation intent — auto-suspend on, a hard credit cap:
resource "snowflake_warehouse" "actuarial" {
name = "WH_ACTUARIAL"
warehouse_size = "XLARGE"
auto_suspend = 60 # seconds idle before suspend — pay only when querying
auto_resume = true
initially_suspended = true
min_cluster_count = 1
max_cluster_count = 3 # multi-cluster: scale out for month-end concurrency
}
resource "snowflake_resource_monitor" "actuarial_cap" {
name = "RM_ACTUARIAL"
credit_quota = 4000 # monthly ceiling
notify_triggers = [75, 90] # warn finance early
suspend_trigger = 100 # stop the warehouse at the cap, not the invoice
}
Identity: federate the humans, broker the machines. Configure Entra as the Snowflake IdP for SAML SSO and turn on SCIM provisioning so joiner-mover-leaver flows from the directory land in Snowflake automatically — the leaver case is the one that closes the audit finding. Map Entra groups to Snowflake roles (ANALYST_EU, ACTUARY, FRAUD_ANALYST) so a grant is a group membership, reviewable in the carrier’s existing access-certification process, not a hand-typed GRANT. For machines, register each service user with key-pair auth and let Vault issue and rotate the keys; the dbt runner and Snowpipe loader authenticate to Vault with their Azure managed identity and never see a durable secret.
Transformation wiring. Run dbt in GitHub Actions, authenticating to Azure via OIDC (no stored service-principal secret) and to Snowflake via the Vault-issued key. Structure the project as medallion layers — raw (Snowpipe lands here), staging (typed and deduplicated), marts (the governed loss-ratio, eligibility, and fraud-feature tables) — and make dbt test a required gate so a null member ID or a broken join fails the build instead of corrupting a mart. Tag PII columns at the model level so masking policies attach automatically by tag rather than by remembering to mask each new column.
Enterprise considerations
Security and Zero Trust. The platform is Zero Trust by construction: no public data-plane endpoint, no human passwords, no standing machine credentials. Layer on top: (a) dynamic data masking so SSNs and member IDs are masked unless the querying role is explicitly entitled, and row-access policies so EU analysts cannot see US member rows and vice versa, satisfying residency; (b) object tagging to classify PII/PHI columns and drive both masking and discovery; © Wiz running continuous posture scanning across ADLS and the Azure resources, alerting the moment a storage container drifts toward public exposure or an integration’s RBAC widens — the backstop behind the policy controls; (d) HashiCorp Vault as the single broker for every machine credential so there is nothing durable to steal; (e) a guardrail breach — a failed login storm, a masking-policy change, a resource-monitor cap hit — auto-raises a ServiceNow incident so security and finance get a ticket, not just a log entry. Snowflake’s own ACCESS_HISTORY and LOGIN_HISTORY views give the per-query audit trail the regulator asked for.
Cost optimization. Compute (measured in credits) dominates the bill and grows with adoption, so engineer for it from day one.
| Lever | Mechanism | Typical effect |
|---|---|---|
| Auto-suspend | Suspend idle warehouses after 60s | Pay only for active query seconds, not idle time |
| Right-sizing | Match warehouse size to the workload, not the biggest job | Avoids paying XL rates for small queries |
| Multi-cluster scaling | Scale out only for concurrency spikes (month-end) | Burst capacity without a permanently large warehouse |
| Resource monitors | Hard credit caps per warehouse with early alerts | Stops runaway spend at the warehouse, not the invoice |
| Result cache | Re-serve identical queries from cache for free | Repetitive dashboards cost zero compute |
| Storage tiering | Time-travel/retention tuned per table | Trims storage on high-churn staging tables |
Pipe Snowflake’s credit and query metrics into Datadog so the FinOps team sees spend per warehouse — which maps to per-team chargeback — and so a runaway query that ignores the warehouse ceiling triggers an alert before month-end.
Scalability. Each workload scales independently because each owns its warehouse. The actuarial month-end run uses a multi-cluster XL warehouse that scales out for the concurrency burst and back in afterward; the fraud service runs a small warehouse continuously; ingestion uses its own loader warehouse so a heavy backfill never slows analyst queries. Storage scales transparently on Azure Blob underneath. The practical ceilings are the per-account concurrency and the regional Snowflake capacity, both of which a regulated multi-region rollout should size with Snowflake support early rather than discovering at quarter-end.
Failure modes, and what each one looks like. Name them before they page you.
- A missing or wrong private DNS record — the account URL resolves to a public IP, the network policy refuses it, and every connection hangs until timeout. Mitigation: assert the
privatelink.snowflakecomputing.comzone and records in Terraform plus a post-deploy connection smoke test. - Snowpipe lag or stalls — Event Grid notifications back up or a malformed file blocks a pipe, and “near-real-time” eligibility silently goes stale. Mitigation: monitor pipe lag and
COPYerrors in Datadog with a freshness SLO, and route bad files to a quarantine path. - An expired or unrotated service key — Vault fails to rotate, the ingestion job’s key expires, and loads stop. Mitigation: dual-key registration so one slot is always valid during rotation, plus an alert on key age.
- Warehouse cost runaway — an unbounded analyst query or a forgotten auto-suspend burns credits. Mitigation: resource monitors with hard suspend triggers and auto-suspend enforced by policy on every warehouse.
- Regional outage — see DR below.
Reliability and DR (RTO/RPO). Snowflake stores data redundantly within a region, but cross-region resilience is a deliberate design choice. For this carrier, use Snowflake replication and failover groups to replicate databases, account objects, and the relevant integrations to a paired Azure region on a schedule that meets the RPO; ADLS Gen2 is geo-redundant (GZRS) so the lake — the durable source of truth — survives a regional loss and ingestion can be replayed. A pragmatic target for the analytics platform: RPO 15 minutes via replication for the curated marts, RTO under an hour by promoting the secondary failover group and repointing the Private Link endpoint and Entra app in the paired region. Time Travel and Fail-safe additionally protect against accidental deletes and bad transformations within a region, which is the far more common recovery scenario than a full outage.
Observability. Use Datadog as the single pane: the Snowflake integration surfaces query performance, warehouse utilization, credit spend, queue depth, and login activity, and you layer data-freshness SLOs on top so a stalled Snowpipe or a late dbt run alerts before a business user notices a stale loss-ratio. Track the metrics the business cares about — credits per team, p95 query latency on the marts, Snowpipe lag, dbt run success and duration, and failed-login rate — and route critical alerts to on-call while cost alerts go to FinOps. dbt’s own test and freshness results feed the same dashboards so a data-quality regression is as visible as a latency one.
Governance. Maintain a clear role hierarchy (functional roles granted to access roles granted to users via Entra groups) so privilege is auditable end to end, and review it through the carrier’s existing access-certification cycle. Classify every PII/PHI column with object tags and attach masking by tag so new sensitive columns are protected by default. Keep dbt models and policy definitions in version control, reviewable and revertable, and gate new role grants through a ServiceNow change record so compliance has a documented approval. Pin warehouse sizes and retention as code so cost and behavior do not drift silently.
Explicit tradeoffs
Accept these or do not build it this way. Private Link, Entra SCIM, Vault-brokered keys, and storage integrations all add setup complexity over the password-and-public-endpoint quick start — five-plus moving parts whose payoff is structural security, not features a demo would show. The price of a misconfigured DNS zone or network policy is a silent hang, not a clear error, so the first deploy must include connection smoke tests. The Okta-to-Entra federation adds a hop the single-IdP shops will not need. And the per-warehouse isolation that prevents workload contention means more warehouses to size, monitor, and cap than a single shared cluster — discipline you trade for the guarantee that month-end never starves the fraud queries.
The alternatives, and when they win. If your data is modest, single-region, and the security bar is low, Snowflake over the public endpoint with key-pair auth is genuinely simpler and fine for a pilot — graduate to Private Link when residency or member data demands it. If your team already lives in the Azure-native stack and wants Spark notebooks alongside SQL, Azure Databricks with Unity Catalog is the stronger fit and overlaps heavily on governance; Snowflake wins here for the carrier because the workload is overwhelmingly SQL analytics with strict warehouse isolation and the simplest possible operational model. And if you need a cheap, append-only landing layer for raw files before any warehouse touches them, the ADLS lake itself is the right first stop — which is exactly why it sits in front of Snowflake in this design rather than being replaced by it.
The shape of the win
For the carrier, the payoff is not “a faster warehouse.” It is that the quarterly loss-ratio number now comes from one governed mart that the actuarial, fraud, and finance teams all trust, produced in hours instead of two weeks of manual stitching — and that when the next auditor asks “show me every credential with access to member-level claims and prove none of them is a standing human secret,” the answer is a clean one: humans authenticate through Entra with MFA and are de-provisioned by SCIM the day they leave, machines hold no durable key because Vault issues short-lived ones, nothing is reachable from the public internet because everything rides Private Link, and ACCESS_HISTORY shows exactly who queried what. That audit answer is what funds the platform. Everything upstream — the private endpoint, the Entra SCIM sync, the Vault-brokered keys, the storage integration, the masking and row policies, the Datadog freshness SLOs — exists so a regulator, a CISO, and a CFO each say yes. Start narrower if you must, but for regulated data at multi-region scale, this is where it has to land.