Terraform Module: Azure SignalR Service — Serverless-ready real-time hub with upstreams and locked-down networking

Quick take — A reusable hashicorp/azurerm ~> 4.0 Terraform module for Azure SignalR Service: service_mode selection, Serverless upstream endpoints, CORS, managed identity, network ACLs, and live-trace diagnostics. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "azurerm" {
  features {}
}

module "signalr" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-signalr?ref=v1.0.0"

  name                = "..."  # SignalR Service name (3-63 chars; becomes `<name>.service.signalr.net`).
  resource_group_name = "..."  # Resource group to create the service in.
  location            = "..."  # Azure region (e.g. `centralindia`).
}

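Then run terraform init && terraform apply. Every other input has a sensible default; see Inputs below to override behaviour. Because azurerm 4.x requires a subscription ID, export ARM_SUBSCRIPTION_ID (or set subscription_id in the provider block) before applying.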

What this module is

Azure SignalR Service is a fully managed real-time messaging backplane: it handles the WebSocket (and fallback) connections, fan-out, and connection scale-out for ASP.NET Core SignalR apps and for serverless push scenarios driven by Azure Functions. Instead of your app server holding tens of thousands of sticky WebSocket connections, the service brokers them, so your backend stays stateless and horizontally scalable. Its behaviour hinges on one critical choice — service_mode: Default (your own SignalR hub server is in the loop), Serverless (no hub server; clients talk to the service and messages arrive via Azure Functions over upstream endpoints), or Classic (legacy auto-detect, not recommended for new builds).

Hand-built, a SignalR Service tends to go wrong in mode-specific ways: someone picks Serverless but forgets to register an upstream_endpoint, so client messages have nowhere to go; CORS is left at the wildcard * so any origin can negotiate; public_network_access_enabled stays open with no network ACL; or live_trace_enabled is off, leaving you blind when a negotiate handshake fails in production. Wrapping azurerm_signalr_service in a module makes these choices once: it validates sku_name and capacity against allow-lists (each variable is checked on its own, so the module as written does not reject a mismatched pair such as Free_F1 with capacity 2), attaches a system-assigned managed identity by default (so upstreams can use ManagedIdentity auth instead of a shared key in the URL), defaults the network ACL to deny public traffic with explicit allow-lists, and turns on connectivity/messaging logs and live trace. Callers consume one variable surface and inherit the same hardened, mode-aware baseline. The module also exposes hostname, server_port, and primary_connection_string so downstream Function Apps and App Services can bind to it without anyone copying keys out of the portal.

When to use it

You are building ASP.NET Core SignalR apps (chat, live dashboards, collaborative editing) and want the connection scale-out offloaded from your web tier (Default mode).
You are pushing real-time messages from Azure Functions with no persistent hub server, using the SignalR bindings and upstream endpoints (Serverless mode).
You need a managed identity so upstream calls and Key Vault references authenticate via Entra ID rather than a key embedded in the connection string.
You run multiple environments or regions and need an identical, policy-friendly SignalR resource with consistent CORS, network ACLs, and diagnostics in each.
You want live trace and connectivity/messaging logs captured by default so negotiate/handshake failures are debuggable after the fact.
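
As a rough illustration of the first case in the list (Default mode), a call might look like the sketch below. The module label, resource names, and origin are hypothetical placeholders; only the input names come from this module.

```hcl
# Sketch: Default mode, where your own ASP.NET Core hub server stays in the loop.
# All names and the origin below are placeholders, not values from a real deployment.
module "signalr_chat" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-signalr?ref=v1.0.0"

  name                = "sigr-chat-dev-cin"
  resource_group_name = "rg-chat-dev"
  location            = "centralindia"

  service_mode    = "Default"                    # hub server connects to the service
  allowed_origins = ["https://chat.example.com"] # explicit CORS allow-list

  # No upstream_endpoints here: those only apply to Serverless mode.
}
```

Because the module defaults disable public access and access-key auth, the hub server in this sketch would need private connectivity and an Entra ID identity with a suitable SignalR role, or explicit overrides of those inputs.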

Skip it if a simple request/response API is enough. If you need raw WebSocket or MQTT pub-sub without the SignalR client protocol, reach for Azure Web PubSub; if your “real-time” need is really server-to-server eventing, use Event Grid or Service Bus instead. SignalR Service is specifically for the SignalR client protocol and its negotiate/transport handshake.

Module structure

terraform-module-azure-signalr/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf

versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}

main.tf

locals {
  # Module-managed tags; caller-supplied var.tags are merged last, so they win on key collisions.
  base_tags = merge(
    {
      "managed-by" = "terraform"
      "module"     = "terraform-module-azure-signalr"
    },
    var.tags
  )
}

resource "azurerm_signalr_service" "this" {
  name                = var.name
  resource_group_name = var.resource_group_name
  location            = var.location

  sku {
    name     = var.sku_name
    capacity = var.capacity
  }

  # Default = your hub server is in the loop; Serverless = Functions + upstreams; Classic = legacy.
  service_mode = var.service_mode

  # Diagnostics: surface negotiate/transport problems instead of guessing.
  connectivity_logs_enabled = var.connectivity_logs_enabled
  messaging_logs_enabled    = var.messaging_logs_enabled
  live_trace_enabled        = var.live_trace_enabled

  # Lock the data-plane endpoint down by default; callers opt into public access.
  public_network_access_enabled = var.public_network_access_enabled
  local_auth_enabled            = var.local_auth_enabled
  aad_auth_enabled              = var.aad_auth_enabled
  tls_client_cert_enabled       = var.tls_client_cert_enabled

  identity {
    type         = var.user_assigned_identity_ids == null ? "SystemAssigned" : "UserAssigned"
    identity_ids = var.user_assigned_identity_ids
  }

  # Explicit CORS allow-list instead of the wildcard "*" default.
  dynamic "cors" {
    for_each = length(var.allowed_origins) == 0 ? [] : [1]
    content {
      allowed_origins = var.allowed_origins
    }
  }

  # Serverless mode: where the service forwards client invocations (Function App, etc.).
  dynamic "upstream_endpoint" {
    for_each = var.upstream_endpoints
    content {
      url_template              = upstream_endpoint.value.url_template
      category_pattern          = upstream_endpoint.value.category_pattern
      event_pattern             = upstream_endpoint.value.event_pattern
      hub_pattern               = upstream_endpoint.value.hub_pattern
      user_assigned_identity_id = upstream_endpoint.value.user_assigned_identity_id
    }
  }

  tags = local.base_tags
}

# --- Common production sub-resources ---------------------------------------

# Default-deny public network, then explicitly allow named Private Endpoint connections.
resource "azurerm_signalr_service_network_acl" "this" {
  count              = var.configure_network_acl ? 1 : 0
  signalr_service_id = azurerm_signalr_service.this.id
  default_action     = var.network_acl_default_action

  public_network {
    allowed_request_types = var.public_allowed_request_types
    denied_request_types  = var.public_denied_request_types
  }

  dynamic "private_endpoint" {
    for_each = var.private_endpoint_acls
    content {
      id                    = private_endpoint.value.connection_id
      allowed_request_types = private_endpoint.value.allowed_request_types
      denied_request_types  = private_endpoint.value.denied_request_types
    }
  }
}

# Optional: ship connectivity/messaging logs and metrics to Log Analytics.
resource "azurerm_monitor_diagnostic_setting" "this" {
  count = var.log_analytics_workspace_id == null ? 0 : 1

  name                       = "diag-to-law"
  target_resource_id         = azurerm_signalr_service.this.id
  log_analytics_workspace_id = var.log_analytics_workspace_id

  enabled_log {
    category = "AllLogs"
  }

  metric {
    category = "AllMetrics"
  }
}
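
The variable validations in this module check each input in isolation, so a mismatched combination (for example Free_F1 with capacity 2) only fails when Azure rejects it at apply time. A minimal sketch of cross-variable guards that could be added to main.tf follows. The resource and check names are illustrative and are not part of the module as published. It relies on terraform_data and check blocks, both available under the module's required_version of >= 1.5.0.

```hcl
# Sketch only: cross-variable guards that could sit alongside the resources in main.tf.
# The resource and check names are illustrative.

# Hard failure at plan time: Free_F1 cannot be scaled past one unit.
resource "terraform_data" "sku_capacity_guard" {
  lifecycle {
    precondition {
      condition     = var.sku_name != "Free_F1" || var.capacity == 1
      error_message = "Free_F1 supports capacity 1 only; choose Standard_S1 or a Premium SKU to scale out."
    }
  }
}

# Warning (not a failure): Serverless mode with no upstream means client-to-server
# invocations are not forwarded anywhere.
check "serverless_has_upstream" {
  assert {
    condition     = var.service_mode != "Serverless" || length(var.upstream_endpoints) > 0
    error_message = "service_mode is Serverless but upstream_endpoints is empty."
  }
}
```

The precondition blocks the plan, while the check block only reports a warning. That split is a design choice: a Serverless service that only pushes server-to-client messages can legitimately run without upstreams. On Terraform 1.9 or later, the same pairing rule could instead live in a variable validation block that references the other variable.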

variables.tf

variable "name" {
  description = "Name of the SignalR Service. 3-63 chars, alphanumerics and hyphens, must start with a letter and end alphanumeric. Forms the <name>.service.signalr.net hostname."
  type        = string

  validation {
    condition     = can(regex("^[A-Za-z][A-Za-z0-9-]{1,61}[A-Za-z0-9]$", var.name))
    error_message = "name must be 3-63 chars: start with a letter, end alphanumeric, only letters/digits/hyphens."
  }
}

variable "resource_group_name" {
  description = "Resource group in which to create the SignalR Service."
  type        = string
}

variable "location" {
  description = "Azure region for the SignalR Service (e.g. centralindia, eastus)."
  type        = string
}

variable "sku_name" {
  description = "SignalR SKU. Free_F1 for dev (capacity 1 only), Standard_S1 for general production, Premium_P1 for zone redundancy + autoscale, Premium_P2 for higher scale."
  type        = string
  default     = "Standard_S1"

  validation {
    condition     = contains(["Free_F1", "Standard_S1", "Premium_P1", "Premium_P2"], var.sku_name)
    error_message = "sku_name must be one of: Free_F1, Standard_S1, Premium_P1, Premium_P2."
  }
}

variable "capacity" {
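  description = "Number of SignalR units. On Standard and Premium SKUs each unit covers 1,000 concurrent connections and 1,000,000 messages per day. Free_F1 supports only 1 unit, limited to 20 concurrent connections and 20,000 messages per day."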
  type        = number
  default     = 1

  validation {
    condition     = contains([1, 2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100], var.capacity)
    error_message = "capacity must be one of: 1, 2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100."
  }
}

variable "service_mode" {
  description = "Operating mode: Default (your hub server is in the loop), Serverless (Functions + upstream endpoints, no hub server), or Classic (legacy)."
  type        = string
  default     = "Default"

  validation {
    condition     = contains(["Default", "Serverless", "Classic"], var.service_mode)
    error_message = "service_mode must be one of: Default, Serverless, Classic."
  }
}

variable "connectivity_logs_enabled" {
  description = "Log client connect/disconnect events (essential for diagnosing negotiate/transport failures)."
  type        = bool
  default     = true
}

variable "messaging_logs_enabled" {
  description = "Log message traffic through the service."
  type        = bool
  default     = true
}

variable "live_trace_enabled" {
  description = "Enable the Live Trace tool for real-time connection/message tracing during debugging."
  type        = bool
  default     = true
}

variable "public_network_access_enabled" {
  description = "Whether the data-plane endpoint is reachable over the public network. Defaults to false (use Private Link)."
  type        = bool
  default     = false
}

variable "local_auth_enabled" {
  description = "Allow access-key (local) authentication. Defaults to false to force Entra ID auth; set true only if clients negotiate with the access key."
  type        = bool
  default     = false
}

variable "aad_auth_enabled" {
  description = "Allow Entra ID (Azure AD) authentication to the data plane. Keep true so managed identities can negotiate."
  type        = bool
  default     = true
}

variable "tls_client_cert_enabled" {
  description = "Require a client certificate on the TLS handshake (mutual TLS). Not supported on Free_F1."
  type        = bool
  default     = false
}

variable "user_assigned_identity_ids" {
  description = "Optional list of user-assigned managed identity resource IDs. When null, a system-assigned identity is used instead."
  type        = list(string)
  default     = null
}

variable "allowed_origins" {
  description = "CORS allowed origins for the negotiate endpoint. Empty list means the provider applies the '*' wildcard default; supply explicit origins in production."
  type        = list(string)
  default     = []
}

variable "upstream_endpoints" {
  description = "Serverless-mode upstream endpoints the service forwards client invocations to. Use ManagedIdentity-authenticated URL templates where possible."
  type = list(object({
    url_template              = string
    category_pattern          = optional(list(string), ["*"])
    event_pattern             = optional(list(string), ["*"])
    hub_pattern               = optional(list(string), ["*"])
    user_assigned_identity_id = optional(string)
  }))
  default = []

  validation {
    condition = alltrue([
      for u in var.upstream_endpoints : can(regex("^https://", u.url_template))
    ])
    error_message = "Each upstream url_template must be an https:// URL."
  }
}

variable "configure_network_acl" {
  description = "Whether to create a network ACL (azurerm_signalr_service_network_acl) for this service."
  type        = bool
  default     = true
}

variable "network_acl_default_action" {
  description = "Default action for the network ACL: Allow or Deny. Defaults to Deny (default-deny posture)."
  type        = string
  default     = "Deny"

  validation {
    condition     = contains(["Allow", "Deny"], var.network_acl_default_action)
    error_message = "network_acl_default_action must be 'Allow' or 'Deny'."
  }
}

variable "public_allowed_request_types" {
  description = "Request types permitted from the public network when default_action is Deny (e.g. [\"ClientConnection\"]). Empty list denies all public request types."
  type        = list(string)
  default     = []
}

variable "public_denied_request_types" {
  description = "Request types explicitly denied from the public network when default_action is Allow. Mutually exclusive with public_allowed_request_types."
  type        = list(string)
  default     = []
}

variable "private_endpoint_acls" {
  description = "Per-Private-Endpoint network ACL rules (connection_id from the private endpoint connection)."
  type = list(object({
    connection_id         = string
    allowed_request_types = optional(list(string), ["ClientConnection", "ServerConnection", "RESTAPI", "Trace"])
    denied_request_types  = optional(list(string), [])
  }))
  default = []
}

variable "log_analytics_workspace_id" {
  description = "Optional Log Analytics workspace resource ID for diagnostic settings (AllLogs + AllMetrics). Set null to skip diagnostics."
  type        = string
  default     = null
}

variable "tags" {
  description = "Additional tags merged onto the SignalR Service."
  type        = map(string)
  default     = {}
}
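
To make the upstream_endpoints shape concrete, here is a sketch of a value a caller might build and pass as upstream_endpoints = local.signalr_upstreams. Both hostnames and the local name are hypothetical. The {hub}, {category}, and {event} tokens are the URL-template parameters the service substitutes per request, and any pattern left out falls back to the ["*"] default declared above.

```hcl
# Sketch: routing different traffic to different upstreams. Hostnames are placeholders.
locals {
  signalr_upstreams = [
    {
      # Only the "chat" hub; category and event patterns fall back to ["*"].
      url_template = "https://func-chat.example.net/runtime/webhooks/signalr"
      hub_pattern  = ["chat"]
    },
    {
      # Connection lifecycle events for any hub, with the hub/category/event
      # substituted into the path by the service.
      url_template     = "https://func-audit.example.net/api/{hub}/{category}/{event}"
      category_pattern = ["connections"]
      event_pattern    = ["connected", "disconnected"]
    },
  ]
}
```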

outputs.tf

output "id" {
  description = "Resource ID of the SignalR Service."
  value       = azurerm_signalr_service.this.id
}

output "name" {
  description = "Name of the SignalR Service."
  value       = azurerm_signalr_service.this.name
}

output "hostname" {
  description = "FQDN of the SignalR Service (e.g. <name>.service.signalr.net)."
  value       = azurerm_signalr_service.this.hostname
}

output "ip_address" {
  description = "Public IP address of the SignalR Service."
  value       = azurerm_signalr_service.this.ip_address
}

output "server_port" {
  description = "Server port used by the SignalR Service (typically 443)."
  value       = azurerm_signalr_service.this.server_port
}

output "public_port" {
  description = "Public port used by the SignalR Service for client connections."
  value       = azurerm_signalr_service.this.public_port
}

output "principal_id" {
  description = "Object (principal) ID of the system-assigned managed identity, when one is used. Grant this RBAC on upstream/Key Vault scopes."
  value       = try(azurerm_signalr_service.this.identity[0].principal_id, null)
}

output "primary_access_key" {
  description = "Primary access key for the SignalR Service (sensitive)."
  value       = azurerm_signalr_service.this.primary_access_key
  sensitive   = true
}

output "primary_connection_string" {
  description = "Primary connection string for SignalR SDK / AzureSignalRConnectionString app setting (sensitive)."
  value       = azurerm_signalr_service.this.primary_connection_string
  sensitive   = true
}
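
As an example of consuming these outputs, the sketch below grants the service's system-assigned identity read access to Key Vault secrets via principal_id. It assumes the caller already defines azurerm_key_vault.app and instantiates the module as module.signalr; both names are hypothetical. It applies only when the module is using its system-assigned identity, since principal_id is not meaningful for user-assigned identities.

```hcl
# Sketch: let the service's system-assigned identity read secrets from a Key Vault.
# azurerm_key_vault.app and the module label "signalr" are assumed to exist in the caller.
resource "azurerm_role_assignment" "signalr_kv_secrets" {
  scope                = azurerm_key_vault.app.id
  role_definition_name = "Key Vault Secrets User"
  principal_id         = module.signalr.principal_id
}
```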

How to use it

module "signalr_service" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-signalr?ref=v1.0.0"

  name                = "sigr-livedash-prod-cin"
  resource_group_name = azurerm_resource_group.app.name
  location            = azurerm_resource_group.app.location

  sku_name = "Premium_P1" # zone-redundant
  capacity = 2            # ~2,000 concurrent connections headroom

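  # Serverless: real-time push driven by an Azure Functions app, no hub server.
  service_mode = "Serverless"

  # Attach the identity to the service so the upstream below can authenticate with it.
  user_assigned_identity_ids = [azurerm_user_assigned_identity.signalr_upstream.id]

  upstream_endpoints = [
    {
      # {token} is a placeholder for the Function App's signalr_extension key, not a
      # SignalR URL-template parameter; replace it before applying.
      url_template              = "https://func-livedash-prod.azurewebsites.net/runtime/webhooks/signalr?code={token}"
      category_pattern          = ["messages", "connections"]
      hub_pattern               = ["dashboardHub"]
      user_assigned_identity_id = azurerm_user_assigned_identity.signalr_upstream.id
    }
  ]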

  # Lock CORS to the SPA origins that call negotiate.
  allowed_origins = [
    "https://dashboard.kloudvin.com",
    "https://staging-dashboard.kloudvin.com"
  ]

  # Default-deny public; add private_endpoint_acls to admit Private Endpoint traffic.
  public_network_access_enabled = false
  configure_network_acl         = true
  network_acl_default_action    = "Deny"

  log_analytics_workspace_id = azurerm_log_analytics_workspace.app.id

  tags = {
    environment = "prod"
    owner       = "realtime-platform"
  }
}

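# Downstream: bind the Function App to the service via its connection string.
# Note: the key-based connection string only authenticates when the module is called with
# local_auth_enabled = true; with the module default (false), use an identity-based connection.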
resource "azurerm_linux_function_app" "livedash" {
  name                       = "func-livedash-prod"
  resource_group_name        = azurerm_resource_group.app.name
  location                   = azurerm_resource_group.app.location
  service_plan_id            = azurerm_service_plan.app.id
  storage_account_name       = azurerm_storage_account.app.name
  storage_account_access_key = azurerm_storage_account.app.primary_access_key

  app_settings = {
    "AzureSignalRConnectionString" = module.signalr_service.primary_connection_string
    "SIGNALR_HOSTNAME"             = module.signalr_service.hostname
  }

  site_config {}
}
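
Since the module defaults local_auth_enabled to false, a key-free binding is the more natural fit. The sketch below is a variant of the Function App above that uses the Functions SignalR extension's identity-based connection setting (the connection name suffixed with __serviceUri) and grants the app's identity a data-plane role. The role assignment name is hypothetical, and "SignalR Service Owner" is an assumption about the access this workload needs; a narrower built-in role may be enough.

```hcl
# Sketch: identity-based binding, replacing the key-based Function App block above.
resource "azurerm_linux_function_app" "livedash" {
  name                       = "func-livedash-prod"
  resource_group_name        = azurerm_resource_group.app.name
  location                   = azurerm_resource_group.app.location
  service_plan_id            = azurerm_service_plan.app.id
  storage_account_name       = azurerm_storage_account.app.name
  storage_account_access_key = azurerm_storage_account.app.primary_access_key

  identity {
    type = "SystemAssigned"
  }

  app_settings = {
    # "<connection name>__serviceUri" takes the place of the key-based connection string.
    "AzureSignalRConnectionString__serviceUri" = "https://${module.signalr_service.hostname}"
  }

  site_config {}
}

# Allow the Function App's identity to call the SignalR data plane.
resource "azurerm_role_assignment" "func_signalr_owner" {
  scope                = module.signalr_service.id
  role_definition_name = "SignalR Service Owner"
  principal_id         = azurerm_linux_function_app.livedash.identity[0].principal_id
}
```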

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config — live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "azurerm"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...azurerm storage account/container + state key per path...
  }
}
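
The root config above only covers the backend. To define the provider once as well, a generate block along these lines could sit in the same live/terragrunt.hcl. This is a sketch: the block label and file name are arbitrary, and it assumes the subscription ID is supplied through the ARM_SUBSCRIPTION_ID environment variable.

```hcl
# live/terragrunt.hcl (continued): write a provider.tf into every module directory.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "azurerm" {
  features {}
}
EOF
}
```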

2. Module config — live/prod/signalr/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-signalr?ref=v1.0.0"
}

inputs = {
  name                = "..."
  resource_group_name = "..."
  location            = "..."
}

3. Deploy one environment, or roll out all modules together:

# Run either command from the repository root.
cd live/prod/signalr && terragrunt apply   # this module only
cd live/prod && terragrunt run-all apply   # every module under live/prod

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
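
To illustrate the per-environment override, a dev counterpart to the prod config might look like the sketch below. The path, names, and relaxed settings are hypothetical; only the input names come from this module.

```hcl
# live/dev/signalr/terragrunt.hcl (hypothetical path and values)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-signalr?ref=v1.0.0"
}

inputs = {
  name                = "sigr-livedash-dev-cin"
  resource_group_name = "rg-livedash-dev"
  location            = "centralindia"

  sku_name = "Standard_S1"
  capacity = 1

  # Dev-only relaxation: reachable publicly, but still default-deny with an
  # explicit list of permitted request types.
  public_network_access_enabled = true
  network_acl_default_action    = "Deny"
  public_allowed_request_types  = ["ClientConnection", "ServerConnection", "RESTAPI"]
}
```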

Inputs

Name	Type	Default	Required	Description
`name`	`string`	—	Yes	SignalR Service name (3-63 chars; becomes `<name>.service.signalr.net`).
`resource_group_name`	`string`	—	Yes	Resource group to create the service in.
`location`	`string`	—	Yes	Azure region (e.g. `centralindia`).
`sku_name`	`string`	`"Standard_S1"`	No	SKU: `Free_F1`, `Standard_S1`, `Premium_P1`, `Premium_P2`.
`capacity`	`number`	`1`	No	SignalR units (1/2/5/10…100); each unit = 1,000 connections.
`service_mode`	`string`	`"Default"`	No	`Default`, `Serverless`, or `Classic`.
`connectivity_logs_enabled`	`bool`	`true`	No	Log client connect/disconnect events.
`messaging_logs_enabled`	`bool`	`true`	No	Log message traffic.
`live_trace_enabled`	`bool`	`true`	No	Enable the Live Trace debugging tool.
`public_network_access_enabled`	`bool`	`false`	No	Expose the data-plane publicly; keep `false` and use Private Link.
`local_auth_enabled`	`bool`	`false`	No	Allow access-key auth; `false` forces Entra ID.
`aad_auth_enabled`	`bool`	`true`	No	Allow Entra ID auth so managed identities can negotiate.
`tls_client_cert_enabled`	`bool`	`false`	No	Require client certificate (mTLS); unsupported on `Free_F1`.
`user_assigned_identity_ids`	`list(string)`	`null`	No	User-assigned identity IDs; `null` uses a system-assigned identity.
`allowed_origins`	`list(string)`	`[]`	No	CORS allowed origins; empty = provider `*` default.
`upstream_endpoints`	`list(object)`	`[]`	No	Serverless upstream endpoints (`url_template`, patterns, optional identity).
`configure_network_acl`	`bool`	`true`	No	Create a network ACL for the service.
`network_acl_default_action`	`string`	`"Deny"`	No	Network ACL default: `Allow` or `Deny`.
`public_allowed_request_types`	`list(string)`	`[]`	No	Request types allowed from public network (when default is `Deny`).
`public_denied_request_types`	`list(string)`	`[]`	No	Request types denied from public network (when default is `Allow`).
`private_endpoint_acls`	`list(object)`	`[]`	No	Per-Private-Endpoint ACL rules keyed by connection ID.
`log_analytics_workspace_id`	`string`	`null`	No	Log Analytics workspace ID for diagnostics; `null` skips.
`tags`	`map(string)`	`{}`	No	Additional tags merged onto the service.

Outputs

Name	Description
`id`	Resource ID of the SignalR Service.
`name`	Name of the SignalR Service.
`hostname`	FQDN (`<name>.service.signalr.net`).
`ip_address`	Public IP address of the service.
`server_port`	Server port (typically 443).
`public_port`	Public port for client connections.
`principal_id`	Object ID of the system-assigned managed identity (when used).
`primary_access_key`	Primary access key (sensitive).
`primary_connection_string`	Primary connection string for `AzureSignalRConnectionString` (sensitive).

Enterprise scenario

A logistics company runs a live fleet-tracking dashboard for 1,500 concurrent dispatchers across two regions. They deploy this module in Serverless mode with a Premium_P1 SKU (capacity 2, zone-redundant) per region, register an upstream endpoint pointing at a regional Azure Functions app that authenticates with a user-assigned managed identity, and lock CORS to the dispatcher SPA origin only. Public network access is disabled with a default-deny network ACL exposing only Private Endpoint ClientConnection traffic, and connectivity/messaging logs plus live trace flow into a central Log Analytics workspace — so when a region’s negotiate latency spikes, the platform team can trace the failing handshakes in seconds rather than reproducing them live.
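
The Private Endpoint part of this scenario could be wired roughly as follows. This is a sketch under stated assumptions: the subnet, resource group, and all names are hypothetical. It also assumes the ACL rule's connection_id takes the Private Endpoint resource ID, as in the provider's documented example for azurerm_signalr_service_network_acl. The local would be passed to the module as private_endpoint_acls = local.signalr_private_endpoint_acls.

```hcl
# Sketch: Private Endpoint for the service plus a matching ACL rule.
resource "azurerm_private_endpoint" "signalr" {
  name                = "pe-sigr-fleet-prod"
  location            = azurerm_resource_group.app.location
  resource_group_name = azurerm_resource_group.app.name
  subnet_id           = azurerm_subnet.private_endpoints.id

  private_service_connection {
    name                           = "psc-sigr-fleet-prod"
    private_connection_resource_id = module.signalr_service.id
    subresource_names              = ["signalr"]
    is_manual_connection           = false
  }
}

locals {
  # Admit only client connections through this Private Endpoint.
  signalr_private_endpoint_acls = [
    {
      connection_id         = azurerm_private_endpoint.signalr.id
      allowed_request_types = ["ClientConnection"]
    }
  ]
}
```

If the Functions app also reaches the service through this endpoint to send messages, the rule would likely need RESTAPI added to allowed_request_types.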

Best practices

Pick service_mode deliberately and match the wiring to it. Serverless requires at least one upstream_endpoint or client messages have nowhere to land; Default requires your own hub server. Never ship Classic for new workloads — it only exists for legacy auto-detect compatibility.
Never leave CORS at * in production. Supply explicit allowed_origins for the exact SPA/domain that calls the negotiate endpoint; the wildcard lets any origin initiate a connection and is a common review finding.
Disable public access and use Private Link with a default-deny ACL. Set public_network_access_enabled = false and network_acl_default_action = "Deny", allowing only the Private Endpoint request types you actually need (ClientConnection, etc.).
Prefer managed identity over keys for upstreams and binding. Set local_auth_enabled = false and authenticate upstream endpoints with a user-assigned identity; treat primary_connection_string/primary_access_key as last resorts and keep them out of source by sourcing from Key Vault.
Right-size capacity and choose the SKU for the SLA, not just scale. Each unit covers ~1,000 concurrent connections and 1M messages/day; use Premium_P1/P2 when you need availability-zone redundancy and autoscale, and reserve Free_F1 (capacity 1, no mTLS) strictly for dev.
Always enable diagnostics and name consistently. Keep connectivity_logs_enabled, messaging_logs_enabled, and live_trace_enabled on and wire log_analytics_workspace_id; follow a sigr-<app>-<env>-<region> convention and tag owner/environment so cost and incident routing stay unambiguous.