Quick take — A reusable Terraform module for azurerm_web_pubsub on hashicorp/azurerm ~> 4.0: SKU-driven capacity, hubs with event handlers, managed identity and local-auth controls for secure real-time WebSocket apps. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.
Quickstart (copy-paste)
Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):
provider "azurerm" {
features {}
}
module "web_pubsub" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-web-pubsub?ref=v1.0.0"
name = "..." # Globally unique Web PubSub name (3-63 chars, starts wit…
resource_group_name = "..." # Resource group to deploy into.
location = "..." # Azure region.
}
Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.
What this module is
Azure Web PubSub is a fully managed service for building real-time messaging over WebSockets (and Server-Sent Events) at scale. It handles the connection fan-out, group/room semantics, and the publish-subscribe broker so your app server doesn’t have to hold thousands of sticky sockets. Think live dashboards, collaborative editing, multiplayer state sync, IoT telemetry streams, and chat — anything where the server needs to push to many clients instantly.
Wrapping azurerm_web_pubsub in a reusable Terraform module matters because a “real” Web PubSub deployment is never just the one resource. In production you almost always pair it with hubs (azurerm_web_pubsub_hub) that route client events to your backend via event handlers, a system-assigned managed identity so those handlers can reach Azure Functions or Event Grid without secrets, and tight control over local (access-key) authentication, TLS client-cert behaviour, and public network access. This module bundles those decisions into a single, opinionated, variable-driven unit so every environment (dev/test/prod) provisions the same secure shape — only the SKU, capacity, and hub wiring change.
When to use it
- You’re standing up real-time WebSocket messaging and want consistent, reviewable infrastructure instead of click-ops in the portal.
- You run multiple environments and need identical security posture (managed identity on, local auth off where possible) with only capacity differing.
- You want hub-level event handlers wired declaratively so client events (the connect, connected, and disconnected system events, plus user message events) flow to an Azure Function or HTTP backend.
- You need the Web PubSub identity’s principal ID as an output so a downstream module can grant it RBAC (e.g., on a Function App or Event Grid topic).
- Reach for a different tool if you only need raw pub/sub between backend services with no browser/WebSocket clients — Service Bus or Event Grid may fit better. Web PubSub shines when the client is a browser or device holding a live socket.
Module structure
terraform-module-azure-web-pubsub/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf
versions.tf
terraform {
required_version = ">= 1.5.0"
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = "~> 4.0"
}
}
}
main.tf
resource "azurerm_web_pubsub" "this" {
name = var.name
resource_group_name = var.resource_group_name
location = var.location
sku = var.sku
capacity = var.capacity
# Disable access-key (local) auth so clients and servers must authenticate with Microsoft Entra ID (AAD).
local_auth_enabled = var.local_auth_enabled
# Block access from the public internet when the service runs behind a private endpoint.
public_network_access_enabled = var.public_network_access_enabled
# TLS client certificate handling for mutual-TLS scenarios.
tls_client_cert_enabled = var.tls_client_cert_enabled
# AAD auth toggle (kept enabled so managed identities / SPs can call the API).
aad_auth_enabled = var.aad_auth_enabled
dynamic "identity" {
for_each = var.identity_type == null ? [] : [1]
content {
type = var.identity_type
identity_ids = var.identity_type == "UserAssigned" ? var.identity_ids : null
}
}
dynamic "live_trace" {
for_each = var.live_trace == null ? [] : [var.live_trace]
content {
enabled = live_trace.value.enabled
messaging_logs_enabled = live_trace.value.messaging_logs_enabled
connectivity_logs_enabled = live_trace.value.connectivity_logs_enabled
http_request_logs_enabled = live_trace.value.http_request_logs_enabled
}
}
tags = var.tags
}
resource "azurerm_web_pubsub_hub" "this" {
for_each = var.hubs
name = each.key
web_pubsub_id = azurerm_web_pubsub.this.id
anonymous_connections_enabled = each.value.anonymous_connections_enabled
dynamic "event_handler" {
for_each = each.value.event_handlers
content {
url_template = event_handler.value.url_template
user_event_pattern = event_handler.value.user_event_pattern
system_events = event_handler.value.system_events
dynamic "auth" {
for_each = event_handler.value.managed_identity_id == null ? [] : [1]
content {
managed_identity_id = event_handler.value.managed_identity_id
}
}
}
}
dynamic "event_listener" {
for_each = each.value.event_listeners
content {
system_event_name_filter = event_listener.value.system_event_name_filter
user_event_name_filter = event_listener.value.user_event_name_filter
eventhub_namespace_name = event_listener.value.eventhub_namespace_name
eventhub_name = event_listener.value.eventhub_name
}
}
}
variables.tf
variable "name" {
description = "Globally unique name of the Web PubSub resource (3-63 chars, letters/numbers/hyphens)."
type = string
validation {
condition = can(regex("^[a-zA-Z][a-zA-Z0-9-]{1,61}[a-zA-Z0-9]$", var.name))
error_message = "name must be 3-63 chars, start with a letter, and contain only letters, numbers, and hyphens."
}
}
variable "resource_group_name" {
description = "Name of the resource group to deploy into."
type = string
}
variable "location" {
description = "Azure region (e.g. eastus, westeurope, centralindia)."
type = string
}
variable "sku" {
description = "Pricing tier: Free_F1, Standard_S1, or Premium_P1."
type = string
default = "Standard_S1"
validation {
condition = contains(["Free_F1", "Standard_S1", "Premium_P1"], var.sku)
error_message = "sku must be one of Free_F1, Standard_S1, or Premium_P1."
}
}
variable "capacity" {
description = "Number of units. Free_F1 must be 1; Standard/Premium support 1,2,5,10,20,50,100."
type = number
default = 1
validation {
condition = contains([1, 2, 5, 10, 20, 50, 100], var.capacity)
error_message = "capacity must be one of 1, 2, 5, 10, 20, 50, 100."
}
}
variable "local_auth_enabled" {
description = "Allow access-key (local) auth. Set false to require AAD-only access."
type = bool
default = false
}
variable "public_network_access_enabled" {
description = "Whether the service is reachable from the public internet."
type = bool
default = true
}
variable "tls_client_cert_enabled" {
description = "Request a TLS client certificate from connecting clients (mutual TLS)."
type = bool
default = false
}
variable "aad_auth_enabled" {
description = "Allow Azure AD (Entra ID) authentication to the service API."
type = bool
default = true
}
variable "identity_type" {
description = "Managed identity type: SystemAssigned, UserAssigned, or null to disable."
type = string
default = "SystemAssigned"
validation {
condition = var.identity_type == null || contains(["SystemAssigned", "UserAssigned"], var.identity_type)
error_message = "identity_type must be SystemAssigned, UserAssigned, or null."
}
}
variable "identity_ids" {
description = "User-assigned identity resource IDs (required when identity_type is UserAssigned)."
type = list(string)
default = []
}
variable "live_trace" {
description = "Live Trace tool settings; null to leave at service defaults."
type = object({
enabled = optional(bool, true)
messaging_logs_enabled = optional(bool, true)
connectivity_logs_enabled = optional(bool, true)
http_request_logs_enabled = optional(bool, false)
})
default = null
}
variable "hubs" {
description = <<-EOT
Map of hub name => hub configuration. Each hub can declare event handlers
(to route client events to a backend) and optional Event Hub listeners.
EOT
type = map(object({
anonymous_connections_enabled = optional(bool, false)
event_handlers = optional(list(object({
url_template = string
user_event_pattern = optional(string, "*")
system_events = optional(list(string), [])
managed_identity_id = optional(string)
})), [])
event_listeners = optional(list(object({
system_event_name_filter = optional(list(string), [])
user_event_name_filter = optional(list(string), [])
eventhub_namespace_name = string
eventhub_name = string
})), [])
}))
default = {}
validation {
condition = alltrue([
for h in values(var.hubs) : alltrue([
for eh in h.event_handlers : alltrue([
for se in eh.system_events :
contains(["connect", "connected", "disconnected"], se)
])
])
])
error_message = "system_events entries must be one of: connect, connected, disconnected."
}
}
variable "tags" {
description = "Tags applied to the Web PubSub resource."
type = map(string)
default = {}
}
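The capacity description says Free_F1 must be 1, and identity_ids is described as required for UserAssigned, but the per-variable validations above do not enforce either rule (a validation block can only reference other variables on Terraform 1.9 and later, while this module allows 1.5). One possible way to close the gap is sketched below. It is not part of the original module: the terraform_data resource and its name input_guards are assumptions made for illustration, and the block would live alongside the variables it references.

```hcl
# Hypothetical addition to main.tf: cross-variable guards that fail the plan
# early. terraform_data is built into Terraform (1.4+), so no extra provider
# is needed.
resource "terraform_data" "input_guards" {
  lifecycle {
    # Free_F1 is limited to a single unit.
    precondition {
      condition     = var.sku != "Free_F1" || var.capacity == 1
      error_message = "Free_F1 supports exactly 1 unit; set capacity = 1 or choose a paid SKU."
    }

    # A user-assigned identity needs at least one identity resource ID.
    precondition {
      condition     = var.identity_type != "UserAssigned" || length(var.identity_ids) > 0
      error_message = "identity_ids must not be empty when identity_type is UserAssigned."
    }
  }
}
```

If the module's minimum Terraform version were raised to 1.9, the same conditions could instead sit directly in the capacity and identity_ids validation blocks.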
outputs.tf
output "id" {
description = "Resource ID of the Web PubSub instance."
value = azurerm_web_pubsub.this.id
}
output "name" {
description = "Name of the Web PubSub instance."
value = azurerm_web_pubsub.this.name
}
output "hostname" {
description = "FQDN of the Web PubSub service (e.g. <name>.webpubsub.azure.com)."
value = azurerm_web_pubsub.this.hostname
}
output "public_port" {
description = "Publicly accessible port of the service."
value = azurerm_web_pubsub.this.public_port
}
output "server_port" {
description = "Server-side port of the service."
value = azurerm_web_pubsub.this.server_port
}
output "external_ip" {
description = "Public IP address of the service."
value = azurerm_web_pubsub.this.external_ip
}
output "primary_connection_string" {
description = "Primary connection string. Empty when local_auth_enabled = false."
value = azurerm_web_pubsub.this.primary_connection_string
sensitive = true
}
output "primary_access_key" {
description = "Primary access key. Empty when local_auth_enabled = false."
value = azurerm_web_pubsub.this.primary_access_key
sensitive = true
}
output "identity_principal_id" {
description = "Principal ID of the system-assigned identity (null if not enabled). Use this to grant RBAC to downstream resources."
value = try(azurerm_web_pubsub.this.identity[0].principal_id, null)
}
output "identity_tenant_id" {
description = "Tenant ID of the system-assigned identity (null if not enabled)."
value = try(azurerm_web_pubsub.this.identity[0].tenant_id, null)
}
output "hub_ids" {
description = "Map of hub name => hub resource ID."
value = { for k, h in azurerm_web_pubsub_hub.this : k => h.id }
}
How to use it
module "web_pubsub" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-web-pubsub?ref=v1.0.0"
name = "wps-collab-prod"
resource_group_name = azurerm_resource_group.app.name
location = azurerm_resource_group.app.location
sku = "Standard_S1"
capacity = 2
# Production posture: AAD-only management, identity on for handler auth.
local_auth_enabled = false
public_network_access_enabled = true
identity_type = "SystemAssigned"
live_trace = {
enabled = true
messaging_logs_enabled = true
}
hubs = {
"collab" = {
event_handlers = [
{
url_template = "https://${azurerm_linux_function_app.events.default_hostname}/api/eventhandler"
user_event_pattern = "*"
system_events = ["connect", "connected", "disconnected"]
managed_identity_id = azurerm_user_assigned_identity.wps.id
}
]
}
}
tags = {
environment = "prod"
workload = "collab-editor"
owner = "platform-team"
}
}
# Downstream: grant the Web PubSub system identity rights to invoke the Function.
resource "azurerm_role_assignment" "wps_to_function" {
scope = azurerm_linux_function_app.events.id
role_definition_name = "Contributor"
principal_id = module.web_pubsub.identity_principal_id
}
# Downstream: surface the hostname to the SPA build so the client can connect.
output "web_pubsub_hostname" {
  # Illustrative only: typically you'd pass this value into app settings
  # or a config map consumed by the front end.
  value = module.web_pubsub.hostname
}
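The example above references a resource group and a user-assigned identity that it does not define. A minimal sketch of those supporting resources might look like the following. The names and region are placeholders chosen for illustration, and the Function App (azurerm_linux_function_app.events) is left out because its configuration depends on your workload.

```hcl
# Hypothetical supporting resources for the usage example.
# All names and the region below are placeholders, not values from the module.
resource "azurerm_resource_group" "app" {
  name     = "rg-collab-prod"
  location = "westeurope"
}

# Identity referenced by the hub's event handler via managed_identity_id.
resource "azurerm_user_assigned_identity" "wps" {
  name                = "id-wps-collab-prod"
  resource_group_name = azurerm_resource_group.app.name
  location            = azurerm_resource_group.app.location
}
```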
With Terragrunt
Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.
1. Root config — live/terragrunt.hcl (inherited by every module):
remote_state {
backend = "azurerm"
generate = { path = "backend.tf", if_exists = "overwrite" }
config = {
# ...azurerm state bucket/container + key per path...
}
}
2. Module config — live/prod/web_pubsub/terragrunt.hcl:
include "root" {
path = find_in_parent_folders()
}
terraform {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-web-pubsub?ref=v1.0.0"
}
inputs = {
name = "..."
resource_group_name = "..."
location = "..."
}
3. Deploy one environment, or roll out all modules together:
cd live/prod/web_pubsub && terragrunt apply # this module
cd live/prod && terragrunt run-all apply # every module under live/prod
Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
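To make the per-environment override concrete, a dev counterpart to the prod config could look roughly like this. The path live/dev/web_pubsub/terragrunt.hcl and every input value are assumptions that mirror the layout above, not files from the original repository.

```hcl
# live/dev/web_pubsub/terragrunt.hcl (hypothetical path mirroring the prod layout)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-web-pubsub?ref=v1.0.0"
}

inputs = {
  # Placeholder values; only the inputs that differ from prod are interesting.
  name                = "wps-collab-dev"
  resource_group_name = "rg-collab-dev"
  location            = "westeurope"

  # Dev runs a single unit with verbose Live Trace for debugging.
  sku      = "Standard_S1"
  capacity = 1

  live_trace = {
    enabled                   = true
    http_request_logs_enabled = true
  }
}
```

The module source and ref stay identical across environments, so a diff between the dev and prod files shows only the intended differences.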
Inputs
| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| name | string | n/a | Yes | Globally unique Web PubSub name (3-63 chars, starts with a letter). |
| resource_group_name | string | n/a | Yes | Resource group to deploy into. |
| location | string | n/a | Yes | Azure region. |
| sku | string | "Standard_S1" | No | Pricing tier: Free_F1, Standard_S1, or Premium_P1. |
| capacity | number | 1 | No | Units: one of 1, 2, 5, 10, 20, 50, 100. |
| local_auth_enabled | bool | false | No | Allow access-key auth; false forces AAD-only. |
| public_network_access_enabled | bool | true | No | Reachable from the public internet. |
| tls_client_cert_enabled | bool | false | No | Request a TLS client certificate (mTLS). |
| aad_auth_enabled | bool | true | No | Allow Azure AD (Entra ID) authentication. |
| identity_type | string | "SystemAssigned" | No | SystemAssigned, UserAssigned, or null. |
| identity_ids | list(string) | [] | No | User-assigned identity IDs (when UserAssigned). |
| live_trace | object | null | No | Live Trace logging settings. |
| hubs | map(object) | {} | No | Hubs with event handlers and Event Hub listeners. |
| tags | map(string) | {} | No | Tags applied to the resource. |
Outputs
| Name | Description |
|---|---|
| id | Resource ID of the Web PubSub instance. |
| name | Name of the Web PubSub instance. |
| hostname | Service FQDN (<name>.webpubsub.azure.com). |
| public_port | Publicly accessible port. |
| server_port | Server-side port. |
| external_ip | Public IP address of the service. |
| primary_connection_string | Primary connection string (sensitive; empty if local auth disabled). |
| primary_access_key | Primary access key (sensitive; empty if local auth disabled). |
| identity_principal_id | System-assigned identity principal ID for RBAC grants. |
| identity_tenant_id | System-assigned identity tenant ID. |
| hub_ids | Map of hub name to hub resource ID. |
Enterprise scenario
A SaaS company runs a browser-based collaborative document editor where many users edit the same document simultaneously. They deploy this module per region with sku = "Standard_S1", capacity = 5, and local_auth_enabled = false, defining a single collab hub whose event handler points at an Azure Function. When a client connects, the connect system event hits the Function (authenticated via the Web PubSub managed identity, no shared keys), which validates the user’s session and authorizes them into the right document group — keeping presence, cursors, and edits in sync with sub-second latency without the app servers managing any sticky WebSocket state.
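The per-region deployment described in this scenario could be expressed with a module-level for_each, as in the sketch below. The regions variable, its fields, and the naming pattern are all hypothetical; the scenario does not say how the company actually structures its configuration.

```hcl
# Hypothetical input: one entry per region the editor is deployed to.
variable "regions" {
  description = "Map of short region key => settings for that region's deployment."
  type = map(object({
    location            = string
    resource_group_name = string
    handler_url         = string # URL of the region's Azure Function event handler
    handler_identity_id = string # identity used to authenticate handler calls
  }))
}

module "web_pubsub" {
  source   = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-web-pubsub?ref=v1.0.0"
  for_each = var.regions

  # Assumed naming pattern; the name must still be globally unique.
  name                = "wps-collab-${each.key}"
  resource_group_name = each.value.resource_group_name
  location            = each.value.location

  sku                = "Standard_S1"
  capacity           = 5
  local_auth_enabled = false

  hubs = {
    collab = {
      event_handlers = [
        {
          url_template        = each.value.handler_url
          system_events       = ["connect"]
          managed_identity_id = each.value.handler_identity_id
        }
      ]
    }
  }
}

# Hostname per region, e.g. for the front-end build to pick the nearest endpoint.
output "web_pubsub_hostnames" {
  value = { for region, m in module.web_pubsub : region => m.hostname }
}
```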
Best practices
- Prefer AAD over access keys. Set local_auth_enabled = false and authenticate the control/data plane via the system-assigned managed identity; this leaves primary_access_key/primary_connection_string empty so there’s no long-lived secret to leak or rotate.
- Authenticate event handlers with managed identity. Always set managed_identity_id on event handlers (rather than an unauthenticated URL) so your Function/HTTP backend can verify the caller token and reject spoofed event POSTs.
- Right-size the SKU and capacity for cost. Free_F1 is hard-capped (1 unit, 20 connections) and is for dev only; scale Standard_S1 by capacity (each unit adds ~1,000 concurrent connections and ~1M messages/day) and reserve Premium_P1 for autoscale, availability zones, and private endpoints.
- Lock down the network for sensitive workloads. Use Premium_P1 with public_network_access_enabled = false plus a private endpoint when the service must not be internet-reachable; combine with tls_client_cert_enabled = true for mutual-TLS device scenarios.
- Turn on Live Trace and diagnostics in non-prod. Enable live_trace with messaging and connectivity logs to debug connection/auth failures, then dial http_request_logs_enabled down in production to control log volume and cost.
- Name and tag deterministically. Use a region/env-suffixed convention like wps-<workload>-<env> (globally unique, DNS-safe) and apply consistent environment/owner/workload tags so cost allocation and ownership stay clear across regions.
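As a rough illustration of the network lock-down practice, the module can be paired with an azurerm_private_endpoint in the calling configuration. Everything outside the module's documented inputs and outputs is an assumption here: the variable names, resource names, and the "webpubsub" subresource name should be checked against current Azure Private Link documentation, and the private DNS zone that clients need to resolve the endpoint is not shown.

```hcl
# Hypothetical inputs for the calling configuration.
variable "resource_group_name" {
  type = string
}

variable "location" {
  type = string
}

variable "private_endpoint_subnet_id" {
  description = "ID of the subnet that hosts the private endpoint."
  type        = string
}

module "web_pubsub_private" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-web-pubsub?ref=v1.0.0"

  name                = "wps-collab-prod-private" # placeholder name
  resource_group_name = var.resource_group_name
  location            = var.location

  sku                           = "Premium_P1"
  local_auth_enabled            = false
  public_network_access_enabled = false
}

resource "azurerm_private_endpoint" "wps" {
  name                = "pe-wps-collab-prod" # placeholder name
  resource_group_name = var.resource_group_name
  location            = var.location
  subnet_id           = var.private_endpoint_subnet_id

  private_service_connection {
    name                           = "psc-wps-collab-prod"
    private_connection_resource_id = module.web_pubsub_private.id
    is_manual_connection           = false
    # Assumed subresource (group ID) for Web PubSub; verify before use.
    subresource_names = ["webpubsub"]
  }
}
```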