Quick take — A reusable hashicorp/azurerm ~> 4.0 module for Azure Data Explorer: SKU and autoscale, system-assigned identity, double encryption, per-database hot cache and retention, RBAC database principals, and an Event Hub ingestion connection wired into clean outputs. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.
Quickstart (copy-paste)
Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):
provider "azurerm" {
  features {}
}

module "data_explorer" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-data-explorer?ref=v1.0.0"

  cluster_name        = "..." # Globally unique cluster name (4-22 lowercase alphanumer…
  resource_group_name = "..." # Resource group for the cluster and databases.
  location            = "..." # Azure region, e.g. `centralindia`.
}
Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.
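As an illustration of overriding those defaults, here is a minimal sketch of a non-production variant. It uses only inputs defined by this module; the module label data_explorer_dev and the database name scratch are hypothetical, and the Dev(No SLA) SKU string is the one quoted in the sku_name validation message. All other inputs keep their defaults, so check that they suit a dev cluster before applying.

```hcl
# Hypothetical dev/test instance of the same module.
module "data_explorer_dev" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-data-explorer?ref=v1.0.0"

  cluster_name        = "..." # same naming rules as the Quickstart
  resource_group_name = "..."
  location            = "..."

  # Single-node Dev SKU (no SLA) that stops itself when idle.
  sku_name          = "Dev(No SLA)_Standard_E2a_v4"
  capacity          = 1
  auto_stop_enabled = true

  # One short-lived database; hot cache stays within retention.
  databases = {
    "scratch" = {
      hot_cache_period   = "P7D"
      soft_delete_period = "P30D"
    }
  }
}
```

auto_scale is left unset here, so the fixed capacity applies.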
What this module is
Azure Data Explorer (ADX, internally Kusto) is a fully managed, columnar, time-series analytics engine: the store-and-query brain behind Log Analytics, Application Insights, and Microsoft Sentinel, exposed for your own data via the Kusto Query Language (KQL). You provision a cluster (the compute + SSD-cache tier, billed per VM-hour) and inside it one or more databases, each with its own hot-cache window and retention (soft-delete) period. Data lands by streaming from Event Hubs / IoT Hub, by LightIngest batch loads, or by queued ingestion, and you query it interactively over billions of rows, typically within seconds.
The raw resource graph rewards getting a handful of decisions right and punishes the rest: the cluster sku couples a VM family to a cache-disk size and a price point; auto_scale and a fixed capacity are mutually exclusive; double encryption and the public-network toggle are immutable after creation; and a database's hot_cache_period must never exceed its soft_delete_period or queries silently fall back to cold blob storage. Wrapping azurerm_kusto_cluster + azurerm_kusto_database, plus the RBAC and ingestion sub-resources every team ends up needing, in one reviewed, tagged, version-pinned module bakes those decisions in as defaults and validations, so each workload ships a correctly-sized, least-privilege cluster instead of copy-pasting a block that hot-caches 30 days into a 7-day database. The one exception is the hot-cache versus soft-delete ordering, which the module documents but does not validate.
When to use it
- You need interactive, ad-hoc analytics over telemetry, logs, metrics, or IoT time-series at a scale where a SQL database or Log Analytics workspace gets slow or expensive.
- You are building an observability or security data lake and want KQL parity with Sentinel/Log Analytics but with your own retention, cost, and schema control.
- You want to stream from Event Hubs / IoT Hub straight into queryable tables with a managed data connection rather than a custom consumer.
- You need per-database hot-cache and retention tuning — e.g. 31 days hot for the live dashboard, 2 years cold for compliance — under one cluster’s compute.
- You want every cluster to carry consistent SKU, autoscale bounds, managed-identity, encryption, and database-scoped RBAC enforced by code review, not portal clicks.
Reach for a different tool when your workload is transactional, needs row-level updates/deletes, or is sub-gigabyte and infrequently queried — that is Azure SQL, Cosmos DB, or a plain Log Analytics workspace, not ADX.
Module structure
terraform-module-azure-data-explorer/
├── versions.tf
├── main.tf
├── variables.tf
└── outputs.tf
versions.tf
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}
main.tf
locals {
  # Autoscale and a fixed node count are mutually exclusive on the cluster.
  autoscale_enabled = var.auto_scale != null

  # Build a flat map of {"db/tenant/object/role" => assignment} so RBAC assignments
  # can be created with a single for_each across every database/principal pair.
  database_principals = merge([
    for db_name, db in var.databases : {
      for p in db.principals :
      "${db_name}/${p.tenant_id}/${p.object_id}/${p.role}" => {
        database_name  = db_name
        principal_id   = p.object_id
        principal_type = p.principal_type
        role           = p.role
        tenant_id      = p.tenant_id
      }
    }
  ]...)
}

resource "azurerm_kusto_cluster" "this" {
  name                = var.cluster_name
  resource_group_name = var.resource_group_name
  location            = var.location

  sku {
    name     = var.sku_name
    capacity = local.autoscale_enabled ? null : var.capacity
  }

  dynamic "optimized_auto_scale" {
    for_each = local.autoscale_enabled ? [var.auto_scale] : []
    content {
      minimum_instances = optimized_auto_scale.value.minimum_instances
      maximum_instances = optimized_auto_scale.value.maximum_instances
    }
  }

  identity {
    type = "SystemAssigned"
  }

  # Immutable after creation; set deliberately at first apply.
  double_encryption_enabled     = var.double_encryption_enabled
  disk_encryption_enabled       = var.disk_encryption_enabled
  public_network_access_enabled = var.public_network_access_enabled
  auto_stop_enabled             = var.auto_stop_enabled
  streaming_ingestion_enabled   = var.streaming_ingestion_enabled
  purge_enabled                 = var.purge_enabled
  zones                         = var.availability_zones
  tags                          = var.tags
}

resource "azurerm_kusto_database" "this" {
  for_each = var.databases

  name                = each.key
  resource_group_name = var.resource_group_name
  location            = var.location
  cluster_name        = azurerm_kusto_cluster.this.name
  hot_cache_period    = each.value.hot_cache_period
  soft_delete_period  = each.value.soft_delete_period
}

# Database-scoped RBAC (Admin / Ingestor / Viewer / etc.) for Entra principals.
resource "azurerm_kusto_database_principal_assignment" "this" {
  for_each = local.database_principals

  name                = replace(each.key, "/", "-")
  resource_group_name = var.resource_group_name
  cluster_name        = azurerm_kusto_cluster.this.name
  database_name       = azurerm_kusto_database.this[each.value.database_name].name
  tenant_id           = each.value.tenant_id
  principal_id        = each.value.principal_id
  principal_type      = each.value.principal_type
  role                = each.value.role
}

# Optional managed ingestion from Event Hub straight into a table.
resource "azurerm_kusto_eventhub_data_connection" "this" {
  for_each = var.eventhub_connections

  name                = each.key
  resource_group_name = var.resource_group_name
  location            = var.location
  cluster_name        = azurerm_kusto_cluster.this.name
  database_name       = azurerm_kusto_database.this[each.value.database_name].name
  eventhub_id         = each.value.eventhub_id
  consumer_group      = each.value.consumer_group
  table_name          = each.value.table_name
  mapping_rule_name   = each.value.mapping_rule_name
  data_format         = each.value.data_format
  compression         = each.value.compression

  # Authenticate with the cluster's system-assigned identity, which is
  # referenced here by the cluster's own resource ID.
  identity_id = azurerm_kusto_cluster.this.id
}
variables.tf
variable "cluster_name" {
  description = "Globally unique ADX cluster name (4-22 chars, lowercase letters and numbers, must start with a letter)."
  type        = string

  validation {
    condition     = can(regex("^[a-z][a-z0-9]{3,21}$", var.cluster_name))
    error_message = "cluster_name must be 4-22 chars, start with a lowercase letter, and contain only lowercase letters and digits."
  }
}

variable "resource_group_name" {
  description = "Resource group that will contain the cluster, databases, and connections."
  type        = string
}

variable "location" {
  description = "Azure region for the cluster, e.g. centralindia."
  type        = string
}

variable "sku_name" {
  description = "Cluster SKU (VM family + cache disk). Dev/Test SKUs have no SLA. e.g. Standard_E2ads_v5, Standard_D13_v2."
  type        = string
  default     = "Standard_E2ads_v5"

  validation {
    condition     = can(regex("^(Dev\\(No SLA\\)_)?Standard_", var.sku_name))
    error_message = "sku_name must be a valid Kusto SKU such as Standard_E2ads_v5 or Dev(No SLA)_Standard_E2a_v4."
  }
}

variable "capacity" {
  description = "Fixed node count when autoscale is disabled. Ignored if auto_scale is set."
  type        = number
  default     = 2

  validation {
    condition     = var.capacity >= 1 && var.capacity <= 1000
    error_message = "capacity must be between 1 and 1000."
  }
}

variable "auto_scale" {
  description = "Optional optimized autoscale. When set, capacity is ignored and the cluster scales between the bounds."
  type = object({
    minimum_instances = number
    maximum_instances = number
  })
  default = null

  validation {
    condition = var.auto_scale == null || (
      var.auto_scale.minimum_instances >= 2 &&
      var.auto_scale.maximum_instances >= var.auto_scale.minimum_instances
    )
    error_message = "auto_scale.minimum_instances must be >= 2 and maximum_instances >= minimum_instances."
  }
}

variable "availability_zones" {
  description = "Availability zones to spread cluster nodes across, e.g. [\"1\", \"2\", \"3\"]. Empty for zone-agnostic."
  type        = list(string)
  default     = []
}

variable "double_encryption_enabled" {
  description = "Enable infrastructure (double) encryption. IMMUTABLE after creation."
  type        = bool
  default     = true
}

variable "disk_encryption_enabled" {
  description = "Encrypt the cluster's data disks."
  type        = bool
  default     = true
}

variable "public_network_access_enabled" {
  description = "Allow access over the public endpoint. IMMUTABLE; set false when fronting with Private Endpoint."
  type        = bool
  default     = false
}

variable "streaming_ingestion_enabled" {
  description = "Enable low-latency streaming ingestion (required for sub-second Event Hub ingest)."
  type        = bool
  default     = true
}

variable "purge_enabled" {
  description = "Allow hard data purges (GDPR/right-to-erasure). Off by default."
  type        = bool
  default     = false
}

variable "auto_stop_enabled" {
  description = "Auto-stop the cluster after a period of inactivity to save cost (dev/test friendly)."
  type        = bool
  default     = false
}

variable "databases" {
  description = "Map of database name => settings. hot_cache_period must be <= soft_delete_period. Use ISO 8601 durations (e.g. P31D)."
  type = map(object({
    hot_cache_period   = optional(string, "P31D")
    soft_delete_period = optional(string, "P365D")
    principals = optional(list(object({
      object_id      = string
      tenant_id      = string
      principal_type = string # User | Group | App
      role           = string # Admin | Ingestor | Monitor | User | UnrestrictedViewer | Viewer
    })), [])
  }))
  default = {}
}

variable "eventhub_connections" {
  description = "Map of connection name => Event Hub ingestion settings landing events into a database table."
  type = map(object({
    database_name     = string
    eventhub_id       = string
    consumer_group    = optional(string, "$Default")
    table_name        = optional(string)
    mapping_rule_name = optional(string)
    data_format       = optional(string, "JSON")
    compression       = optional(string, "None")
  }))
  default = {}
}

variable "tags" {
  description = "Tags applied to the cluster."
  type        = map(string)
  default     = {}
}
outputs.tf
output "cluster_id" {
  description = "Resource ID of the ADX cluster (use for RBAC, diagnostic settings, Private Endpoint)."
  value       = azurerm_kusto_cluster.this.id
}

output "cluster_name" {
  description = "Name of the ADX cluster."
  value       = azurerm_kusto_cluster.this.name
}

output "cluster_uri" {
  description = "Query endpoint URI, e.g. https://<name>.<region>.kusto.windows.net."
  value       = azurerm_kusto_cluster.this.uri
}

output "data_ingestion_uri" {
  description = "Ingestion endpoint URI for queued/batch ingestion clients."
  value       = azurerm_kusto_cluster.this.data_ingestion_uri
}

output "identity_principal_id" {
  description = "Object ID of the cluster's system-assigned identity (grant it Event Hub / Storage data roles)."
  value       = azurerm_kusto_cluster.this.identity[0].principal_id
}

output "database_ids" {
  description = "Map of database name to resource ID."
  value       = { for k, db in azurerm_kusto_database.this : k => db.id }
}

output "database_names" {
  description = "List of database names created in the cluster."
  value       = keys(azurerm_kusto_database.this)
}

output "eventhub_connection_ids" {
  description = "Map of Event Hub data-connection name to resource ID."
  value       = { for k, c in azurerm_kusto_eventhub_data_connection.this : k => c.id }
}
How to use it
module "data_explorer_kusto_observability" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-data-explorer?ref=v1.0.0"

  cluster_name        = "adxobsprodcin"
  resource_group_name = azurerm_resource_group.analytics.name
  location            = azurerm_resource_group.analytics.location
  sku_name            = "Standard_E4ads_v5"

  # Scale between 2 and 6 nodes on demand instead of a flat node count.
  auto_scale = {
    minimum_instances = 2
    maximum_instances = 6
  }

  availability_zones            = ["1", "2", "3"]
  double_encryption_enabled     = true
  public_network_access_enabled = false # fronted by a Private Endpoint
  streaming_ingestion_enabled   = true

  databases = {
    "telemetry" = {
      hot_cache_period   = "P31D"  # 31 days hot for live dashboards
      soft_delete_period = "P365D" # 1 year total retention
      principals = [
        {
          object_id      = azuread_group.observability_admins.object_id
          tenant_id      = data.azurerm_client_config.current.tenant_id
          principal_type = "Group"
          role           = "Admin"
        },
        {
          object_id      = azuread_group.dashboard_readers.object_id
          tenant_id      = data.azurerm_client_config.current.tenant_id
          principal_type = "Group"
          role           = "Viewer"
        }
      ]
    }
    "audit" = {
      hot_cache_period   = "P7D"
      soft_delete_period = "P730D" # 2 years for compliance, mostly cold
    }
  }

  # Stream raw telemetry from Event Hub straight into the RawEvents table.
  eventhub_connections = {
    "telemetry-stream" = {
      database_name     = "telemetry"
      eventhub_id       = module.event_hub.eventhub_id
      consumer_group    = "adx-ingest"
      table_name        = "RawEvents"
      mapping_rule_name = "RawEvents_mapping"
      data_format       = "JSON"
    }
  }

  tags = {
    workload    = "observability"
    environment = "prod"
    owner       = "data-platform"
  }
}

# Downstream: the cluster identity needs to read the Event Hub it ingests from.
resource "azurerm_role_assignment" "adx_eventhub_receiver" {
  scope                = module.event_hub.namespace_id
  role_definition_name = "Azure Event Hubs Data Receiver"
  principal_id         = module.data_explorer_kusto_observability.identity_principal_id
}
# Downstream: ship a Function App's logs to a Log Analytics workspace via diagnostics.
# Diagnostic settings cannot target an ADX cluster directly; to land these logs in
# the cluster, route them through an Event Hub and an eventhub_connections entry.
resource "azurerm_monitor_diagnostic_setting" "fn_to_adx" {
  name                           = "fn-to-adx"
  target_resource_id             = azurerm_linux_function_app.api.id
  log_analytics_workspace_id     = azurerm_log_analytics_workspace.hub.id
  log_analytics_destination_type = "Dedicated"

  enabled_log {
    category_group = "allLogs"
  }
}
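The usage above sets public_network_access_enabled = false and notes that the cluster is fronted by a Private Endpoint, which the module itself does not create. A minimal sketch of that endpoint, consuming the cluster_id output, might look like the following. The subnet reference azurerm_subnet.private_endpoints, the endpoint name, and the connection name are hypothetical, and the "cluster" sub-resource name should be confirmed against the Private Link documentation for your provider version. Private DNS zone wiring is omitted.

```hcl
# Sketch only: Private Endpoint in front of the ADX cluster.
resource "azurerm_private_endpoint" "adx" {
  name                = "pe-adxobsprodcin" # hypothetical name
  resource_group_name = azurerm_resource_group.analytics.name
  location            = azurerm_resource_group.analytics.location
  subnet_id           = azurerm_subnet.private_endpoints.id # hypothetical subnet

  private_service_connection {
    name                           = "adx-cluster" # hypothetical name
    private_connection_resource_id = module.data_explorer_kusto_observability.cluster_id
    subresource_names              = ["cluster"]
    is_manual_connection           = false
  }
}
```

Clients inside the virtual network also need DNS resolution for the cluster and ingestion hostnames, typically through private DNS zones attached to the endpoint.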
With Terragrunt
Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.
1. Root config — live/terragrunt.hcl (inherited by every module):
remote_state {
  backend  = "azurerm"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...azurerm state bucket/container + key per path...
  }
}
2. Module config — live/prod/data_explorer/terragrunt.hcl:
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-data-explorer?ref=v1.0.0"
}

inputs = {
  cluster_name        = "..."
  resource_group_name = "..."
  location            = "..."
}
3. Deploy one environment, or roll out all modules together:
# Run each command from the repository root.
cd live/prod/data_explorer && terragrunt apply   # this module
cd live/prod && terragrunt run-all apply         # every module under live/prod
Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
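To make the dependency orchestration concrete, here is a fuller sketch of the module config from step 2 that reads the Event Hub ID from a sibling Terragrunt module. The sibling path ../event_hub and its eventhub_id output are assumptions about your repository layout, not something this module provides. The database, table, and mapping names mirror the usage example.

```hcl
# live/prod/data_explorer/terragrunt.hcl (sketch)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-data-explorer?ref=v1.0.0"
}

# Hypothetical sibling module that creates the Event Hub.
# run-all applies it before this module because of this block.
dependency "event_hub" {
  config_path = "../event_hub"
}

inputs = {
  cluster_name        = "..."
  resource_group_name = "..."
  location            = "..."

  databases = {
    telemetry = {
      hot_cache_period   = "P31D"
      soft_delete_period = "P365D"
    }
  }

  eventhub_connections = {
    "telemetry-stream" = {
      database_name     = "telemetry"
      eventhub_id       = dependency.event_hub.outputs.eventhub_id # assumed output name
      consumer_group    = "adx-ingest"
      table_name        = "RawEvents"
      mapping_rule_name = "RawEvents_mapping"
    }
  }
}
```

Keeping the Event Hub in its own Terragrunt module means its state and lifecycle stay separate from the cluster, while the dependency block still passes the ID through without hard-coding it.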
Inputs
| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| cluster_name | string | n/a | Yes | Globally unique cluster name (4-22 lowercase alphanumeric, validated). |
| resource_group_name | string | n/a | Yes | Resource group for the cluster and databases. |
| location | string | n/a | Yes | Azure region, e.g. centralindia. |
| sku_name | string | "Standard_E2ads_v5" | No | VM family + cache disk SKU. Dev/Test SKUs have no SLA. |
| capacity | number | 2 | No | Fixed node count (1-1000) when autoscale is off. |
| auto_scale | object | null | No | Optimized autoscale bounds (minimum_instances >= 2). Overrides capacity. |
| availability_zones | list(string) | [] | No | Zones to spread nodes across, e.g. ["1","2","3"]. |
| double_encryption_enabled | bool | true | No | Infrastructure (double) encryption. Immutable after creation. |
| disk_encryption_enabled | bool | true | No | Encrypt the cluster data disks. |
| public_network_access_enabled | bool | false | No | Allow public endpoint. Immutable; pair false with Private Endpoint. |
| streaming_ingestion_enabled | bool | true | No | Enable low-latency streaming ingestion. |
| purge_enabled | bool | false | No | Allow hard data purges (GDPR erasure). |
| auto_stop_enabled | bool | false | No | Auto-stop on inactivity to save cost (dev/test). |
| databases | map(object) | {} | No | Per-DB hot cache, soft delete, and RBAC principals. |
| eventhub_connections | map(object) | {} | No | Event Hub ingestion connections into DB tables. |
| tags | map(string) | {} | No | Tags applied to the cluster. |
Outputs
| Name | Description |
|---|---|
| cluster_id | Resource ID of the cluster (RBAC, diagnostics, Private Endpoint). |
| cluster_name | Name of the cluster. |
| cluster_uri | KQL query endpoint URI (https://<name>.<region>.kusto.windows.net). |
| data_ingestion_uri | Ingestion endpoint URI for queued/batch clients. |
| identity_principal_id | Object ID of the cluster's system-assigned identity. |
| database_ids | Map of database name to resource ID. |
| database_names | List of database names in the cluster. |
| eventhub_connection_ids | Map of Event Hub connection name to resource ID. |
Enterprise scenario
A fintech runs a fraud-and-observability platform on adxobsprodcin, a 3-zone, autoscaling Standard_E4ads_v5 cluster. The module provisions a telemetry database (31 days hot for the live Grafana/ADX dashboards, 1 year retained) fed by a managed Event Hub connection that lands ~80 GB/day of transaction events into a RawEvents table, plus a separate audit database kept 2 years for the regulator but only 7 days hot to keep cache cost down. Database-scoped RBAC grants the SRE group Admin and analysts Viewer via Entra groups — no standing portal access — while the cluster’s system-assigned identity is the only principal allowed to read the source Event Hub, so the entire hot path runs without a single shared secret.
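The scenario relies on a RawEvents table and a RawEvents_mapping ingestion mapping, neither of which the module creates. One way to manage them in Terraform is an azurerm_kusto_script against the telemetry database, sketched below using the database_ids output. The column names and JSON paths are purely illustrative, since the source does not describe the event schema, and the script name is hypothetical. Ordering relative to the module's data connection is not handled here, so the table and mapping may need to exist before the connection is first created.

```hcl
# Sketch only: create the target table and JSON mapping with KQL management commands.
resource "azurerm_kusto_script" "raw_events_schema" {
  name        = "raw-events-schema" # hypothetical name
  database_id = module.data_explorer_kusto_observability.database_ids["telemetry"]

  # Fail the apply if either command fails.
  continue_on_errors_enabled = false

  # Hypothetical schema: adjust columns and paths to the real event payload.
  script_content = <<-KQL
    .create-merge table RawEvents (Timestamp: datetime, TransactionId: string, Payload: dynamic)

    .create-or-alter table RawEvents ingestion json mapping 'RawEvents_mapping' '[{"column":"Timestamp","Properties":{"Path":"$.timestamp"}},{"column":"TransactionId","Properties":{"Path":"$.transactionId"}},{"column":"Payload","Properties":{"Path":"$"}}]'
  KQL
}
```

Both commands are idempotent in intent: .create-merge extends an existing table rather than failing, and .create-or-alter replaces the mapping if it already exists.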
Best practices
- Match hot_cache_period to your real query window, not your retention. Hot cache (SSD) is the expensive part of ADX; soft_delete_period (cold blob) is cheap. Keep 7-31 days hot for dashboards and let the long tail age into cold storage. The module does not validate the relationship between the two, so set them deliberately and never let hot exceed soft-delete.
- Use optimized autoscale instead of a flat node count. Set auto_scale with a floor of 2 (for SLA) and a ceiling sized to peak ingest/query; you pay only for the nodes that are running, so a tight min/max band beats over-provisioning a fixed capacity you only need at month-end.
- Lock the cluster down and prefer the managed identity. Provision with public_network_access_enabled = false plus a Private Endpoint, and grant the cluster's identity_principal_id the Azure Event Hubs Data Receiver / Storage Blob Data Reader roles on its sources rather than embedding connection strings; the Event Hub data connection uses identity_id for exactly this.
- Right-size the SKU family to the workload. The E-series (Standard_E*ads_v5) is memory/compute-balanced for typical telemetry; pick storage-optimised L-series only for cache-heavy, query-bound workloads, and use a Dev(No SLA) SKU plus auto_stop_enabled = true for non-prod to cut idle spend.
- Set encryption and zones at creation; they are immutable. double_encryption_enabled, public_network_access_enabled, and availability_zones cannot be flipped in place; getting them right on the first apply avoids a destroy/recreate and a full data reload.
- Name and tag for cost attribution. Use a CAF-style lowercase name prefixed with adx (no hyphens; the cluster naming rules forbid them) and apply workload / environment / owner tags so per-cluster VM-hour spend lands in the right cost centre.
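The first practice above points out that the module does not check hot cache against retention. A minimal sketch of how the databases variable could enforce it is shown below. It is a suggested extension, not part of the module as published, and it assumes both periods are written in whole-day form (PnD); other valid ISO 8601 durations such as P1Y would be rejected by this check.

```hcl
variable "databases" {
  description = "Map of database name => settings. hot_cache_period must be <= soft_delete_period. Use ISO 8601 durations (e.g. P31D)."
  type = map(object({
    hot_cache_period   = optional(string, "P31D")
    soft_delete_period = optional(string, "P365D")
    principals = optional(list(object({
      object_id      = string
      tenant_id      = string
      principal_type = string
      role           = string
    })), [])
  }))
  default = {}

  # Assumption: periods use the PnD form. The regex captures the day count;
  # try() turns any non-matching value into a validation failure instead of an error.
  validation {
    condition = alltrue([
      for name, db in var.databases : try(
        tonumber(regex("^P([0-9]+)D$", db.hot_cache_period)[0]) <=
        tonumber(regex("^P([0-9]+)D$", db.soft_delete_period)[0]),
        false
      )
    ])
    error_message = "Each database needs hot_cache_period and soft_delete_period in whole-day form (e.g. P31D), with hot_cache_period <= soft_delete_period."
  }
}
```

With this in place, a database declared with hot_cache_period = "P30D" and soft_delete_period = "P7D" fails at plan time rather than after the cluster is built.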