Terraform Module: Azure Monitor Workspace (Prometheus) — managed Prometheus ingestion in one reusable block

Quick take: Provision an Azure Monitor Workspace for managed Prometheus with Terraform (azurerm ~> 4.0), with the default DCE/DCR IDs exposed as outputs, Monitoring Data Reader query access, and a public/private network toggle. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.

Quickstart (copy-paste)

Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):

provider "azurerm" {
  features {}
}

module "monitor_workspace" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-monitor-workspace?ref=v1.0.0"

  name                = "..."  # Name of the Monitor Workspace; 3-44 chars, validated ag…
  resource_group_name = "..."  # Resource group that contains the workspace.
  location            = "..."  # Azure region. Place it close to the clusters/agents tha…
}

Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.
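
To see what the Quickstart produced, you can surface a couple of the module's outputs from the root configuration. This is a minimal sketch: the root-level output names (prometheus_query_endpoint, default_dcr_id) are illustrative choices, while the module output names come from the Outputs section.

```hcl
# Root-level outputs that re-export values from the Quickstart module call.
# The output names on the left are illustrative; rename them to fit your stack.
output "prometheus_query_endpoint" {
  description = "PromQL query endpoint of the workspace created by the Quickstart."
  value       = module.monitor_workspace.query_endpoint
}

output "default_dcr_id" {
  description = "Default Data Collection Rule ID, for associating AKS clusters later."
  value       = module.monitor_workspace.default_data_collection_rule_id
}
```

After terraform apply, terraform output prometheus_query_endpoint prints the URL to paste into a Prometheus data source.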

What this module is

An Azure Monitor Workspace (Terraform resource type azurerm_monitor_workspace) is the regional, fully managed store behind Azure Monitor managed service for Prometheus. When you enable managed Prometheus on an AKS cluster, or scrape any endpoint with the Azure Monitor metrics agent, the time series land in a Monitor Workspace and become queryable with PromQL through the workspace’s query endpoint. It is the metrics counterpart to a Log Analytics workspace, and it is what Azure Managed Grafana points at as a data source.

A single azurerm_monitor_workspace resource is deceptively small, but in production it never travels alone. The moment you create one, Azure auto-provisions a default Data Collection Endpoint (DCE) and default Data Collection Rule (DCR) behind it; you almost always want to wire Grafana’s managed identity for read access, attach diagnostic settings so the workspace’s own platform metrics flow somewhere, and decide whether the query endpoint is public or locked behind a private endpoint. Wrapping all of that in a module means every team gets the same query-access RBAC, the same diagnostics, and the same naming — instead of fifteen slightly-different hand-built workspaces where half forgot to grant Grafana the Monitoring Data Reader role.

This module exposes azurerm_monitor_workspace plus the two sub-resources you reach for on day one: a role assignment that grants a Grafana (or other) principal Monitoring Data Reader on the workspace, and an optional diagnostic setting that ships the workspace’s metrics/logs to Log Analytics.
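
The Grafana relationship described above can be sketched from the Grafana side as well. The block below assumes you manage the Azure Managed Grafana instance in the same configuration; the resource names, resource group, region, and grafana_major_version value are placeholders, and module.monitor_workspace refers to the Quickstart module call. Treat it as an illustration rather than part of the module.

```hcl
# Hypothetical Grafana instance that registers the workspace as an integration.
# All literal values below are placeholders, not values used by the module.
resource "azurerm_dashboard_grafana" "example" {
  name                  = "example-grafana"
  resource_group_name   = "rg-observability-example"
  location              = "westeurope"
  grafana_major_version = "11" # assumption: pick a version your region supports

  # A system-assigned identity gives Grafana a principal that can be granted
  # Monitoring Data Reader on the workspace.
  identity {
    type = "SystemAssigned"
  }

  # Links the Monitor Workspace so Grafana can use it as a Prometheus source.
  azure_monitor_workspace_integrations {
    resource_id = module.monitor_workspace.id
  }
}
```

The integration only links the two resources; the identity still needs read access, which is what the module's query_reader_principal_ids input handles.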

When to use it

If you only need logs (KQL) rather than Prometheus time-series, use a Log Analytics workspace module instead — azurerm_monitor_workspace is specifically the Prometheus metrics store.

Module structure

terraform-module-azure-monitor-workspace/
├── versions.tf      # provider + Terraform version pins
├── main.tf          # azurerm_monitor_workspace + RBAC + diagnostics
├── variables.tf     # var-driven inputs with validation
└── outputs.tf       # id, name, query endpoint, default DCR/DCE ids

# versions.tf
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}

# main.tf

# Regional managed Prometheus metrics store.
resource "azurerm_monitor_workspace" "this" {
  name                = var.name
  resource_group_name = var.resource_group_name
  location            = var.location

  # When false, the query endpoint is reachable only via a private endpoint.
  public_network_access_enabled = var.public_network_access_enabled

  tags = var.tags
}

# Grant read access (PromQL query) to a set of principals — typically the
# Azure Managed Grafana system-assigned identity, plus any service principals
# that run dashboards or alert evaluation. "Monitoring Data Reader" is the
# least-privilege role for querying metrics from the workspace.
resource "azurerm_role_assignment" "data_reader" {
  for_each = var.query_reader_principal_ids

  scope                = azurerm_monitor_workspace.this.id
  role_definition_name = "Monitoring Data Reader"
  principal_id         = each.value
}

# Optionally ship the workspace's own platform metrics/logs to Log Analytics
# so you can alert on ingestion health and active time-series counts.
resource "azurerm_monitor_diagnostic_setting" "this" {
  count = var.diagnostic_log_analytics_workspace_id == null ? 0 : 1

  name                       = "${var.name}-diag"
  target_resource_id         = azurerm_monitor_workspace.this.id
  log_analytics_workspace_id = var.diagnostic_log_analytics_workspace_id

  enabled_metric {
    category = "AllMetrics"
  }
}

# variables.tf

variable "name" {
  description = "Name of the Azure Monitor Workspace (Prometheus metrics store)."
  type        = string

  validation {
    condition     = can(regex("^[a-zA-Z0-9][a-zA-Z0-9._-]{2,43}$", var.name))
    error_message = "name must be 3-44 chars, start alphanumeric, and use only letters, digits, '.', '_' or '-'."
  }
}

variable "resource_group_name" {
  description = "Resource group that will contain the Monitor Workspace."
  type        = string
}

variable "location" {
  description = "Azure region. Managed Prometheus is region-bound; place it near the clusters/agents that write to it."
  type        = string
}

variable "public_network_access_enabled" {
  description = "If true, the query endpoint is reachable over the public internet. Set false to require a private endpoint."
  type        = bool
  default     = true
}

variable "query_reader_principal_ids" {
  description = "Map of friendly key => principal object ID granted 'Monitoring Data Reader' (e.g. Grafana identity). Keys must be stable across applies."
  type        = map(string)
  default     = {}

  validation {
    condition = alltrue([
      for id in values(var.query_reader_principal_ids) :
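      # Enforce the 8-4-4-4-12 GUID layout, not just "36 hex-or-hyphen characters".
      can(regex("^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$", id))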
    ])
    error_message = "Every value in query_reader_principal_ids must be a 36-character GUID (principal object ID)."
  }
}

variable "diagnostic_log_analytics_workspace_id" {
  description = "Optional Log Analytics workspace resource ID to receive the Monitor Workspace's platform metrics. Null disables diagnostics."
  type        = string
  default     = null
}

variable "tags" {
  description = "Tags applied to the Monitor Workspace."
  type        = map(string)
  default     = {}
}
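
The validation rules in variables.tf can be exercised without touching Azure. The sketch below uses Terraform's native test framework and assumes Terraform 1.7 or newer (for mock_provider), which is stricter than the module's own required_version; the file path and the example values are hypothetical.

```hcl
# tests/validation.tftest.hcl (hypothetical path inside the module directory)

# Mock the provider so the tests need no Azure credentials (Terraform >= 1.7).
mock_provider "azurerm" {}

# Defaults shared by every run block; values are placeholders.
variables {
  resource_group_name = "rg-example"
  location            = "westeurope"
}

run "rejects_too_short_name" {
  command = plan

  variables {
    name = "ab" # 2 characters, below the 3-character minimum
  }

  # The plan is expected to fail on the name validation rule.
  expect_failures = [var.name]
}

run "rejects_non_guid_principal" {
  command = plan

  variables {
    name = "amw-example"
    query_reader_principal_ids = {
      grafana = "not-a-guid"
    }
  }

  expect_failures = [var.query_reader_principal_ids]
}

run "accepts_valid_inputs" {
  command = plan

  variables {
    name = "amw-example"
  }

  assert {
    condition     = length(azurerm_role_assignment.data_reader) == 0
    error_message = "No role assignments expected when query_reader_principal_ids is empty."
  }
}
```

Run it with terraform test from the module directory.
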

# outputs.tf

output "id" {
  description = "Resource ID of the Azure Monitor Workspace."
  value       = azurerm_monitor_workspace.this.id
}

output "name" {
  description = "Name of the Azure Monitor Workspace."
  value       = azurerm_monitor_workspace.this.name
}

output "query_endpoint" {
  description = "PromQL query endpoint URL — use this as the Prometheus data source URL in Grafana."
  value       = azurerm_monitor_workspace.this.query_endpoint
}

output "default_data_collection_endpoint_id" {
  description = "ID of the auto-created default Data Collection Endpoint (DCE) for this workspace."
  value       = azurerm_monitor_workspace.this.default_data_collection_endpoint_id
}

output "default_data_collection_rule_id" {
  description = "ID of the auto-created default Data Collection Rule (DCR). Associate this with AKS to start scraping."
  value       = azurerm_monitor_workspace.this.default_data_collection_rule_id
}

How to use it

# Reference the existing Grafana instance so we can grant its identity read access.
data "azurerm_dashboard_grafana" "platform" {
  name                = "kv-grafana-prod"
  resource_group_name = "rg-observability-prod"
}

module "monitor_workspace_prometheus_prod" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-monitor-workspace?ref=v1.0.0"

  name                = "kv-amw-prod-weu"
  resource_group_name = "rg-observability-prod"
  location            = "westeurope"

  # Keep the query endpoint public for now; flip to false once the
  # private endpoint + DNS zone are in place.
  public_network_access_enabled = true

  query_reader_principal_ids = {
    grafana = data.azurerm_dashboard_grafana.platform.identity[0].principal_id
  }

  diagnostic_log_analytics_workspace_id = azurerm_log_analytics_workspace.platform.id

  tags = {
    env       = "prod"
    workload  = "observability"
    managedBy = "terraform"
  }
}

# Downstream: associate the workspace's default DCR with an AKS cluster so
# the managed Prometheus agent starts shipping cluster metrics into it.
resource "azurerm_monitor_data_collection_rule_association" "aks_prometheus" {
  name                    = "amw-prometheus"
  target_resource_id      = azurerm_kubernetes_cluster.prod.id
  data_collection_rule_id = module.monitor_workspace_prometheus_prod.default_data_collection_rule_id
}
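
The comment in the example about flipping public_network_access_enabled to false presupposes a private endpoint and DNS zone. A minimal sketch of that piece follows. It is not part of the module, and it rests on assumptions you should verify against current Azure documentation: the subresource name prometheusMetrics, the regional zone name pattern, and the hypothetical variable private_endpoint_subnet_id.

```hcl
# Hypothetical input: subnet that hosts the private endpoint.
variable "private_endpoint_subnet_id" {
  description = "Resource ID of the subnet for the workspace's private endpoint."
  type        = string
}

# Assumed zone name pattern: privatelink.<region>.prometheus.monitor.azure.com
resource "azurerm_private_dns_zone" "prometheus" {
  name                = "privatelink.westeurope.prometheus.monitor.azure.com"
  resource_group_name = "rg-observability-prod"
}

resource "azurerm_private_endpoint" "amw_query" {
  name                = "pe-kv-amw-prod-weu"
  location            = "westeurope"
  resource_group_name = "rg-observability-prod"
  subnet_id           = var.private_endpoint_subnet_id

  private_service_connection {
    name                           = "amw-query"
    private_connection_resource_id = module.monitor_workspace_prometheus_prod.id
    subresource_names              = ["prometheusMetrics"] # assumed group ID
    is_manual_connection           = false
  }

  # Registers the endpoint's A record in the private DNS zone above.
  private_dns_zone_group {
    name                 = "default"
    private_dns_zone_ids = [azurerm_private_dns_zone.prometheus.id]
  }
}
```

The DNS zone also needs a virtual network link to the networks that resolve the query endpoint; that link is omitted here for brevity.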

With Terragrunt

Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.

1. Root config: live/terragrunt.hcl (inherited by every module):

remote_state {
  backend = "azurerm"
  generate = { path = "backend.tf", if_exists = "overwrite" }
  config = {
    # ...azurerm state storage account/container + key per path...
  }
}

2. Module config: live/prod/monitor_workspace/terragrunt.hcl:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-monitor-workspace?ref=v1.0.0"
}

inputs = {
  name                = "..."
  resource_group_name = "..."
  location            = "..."
}

3. Deploy one environment, or roll out all modules together:

# From the repo root: apply only this module
(cd live/prod/monitor_workspace && terragrunt apply)
# From the repo root: apply every module under live/prod
(cd live/prod && terragrunt run-all apply)

Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
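
The cross-module orchestration mentioned above relies on Terragrunt dependency blocks. The sketch below shows how this module's config might pull the Log Analytics workspace ID from a sibling unit; the live/prod/log_analytics path and its id output are hypothetical, and the input values reuse the names from the usage example.

```hcl
# live/prod/monitor_workspace/terragrunt.hcl

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-monitor-workspace?ref=v1.0.0"
}

# Hypothetical sibling unit that creates the central Log Analytics workspace
# and exposes its resource ID as an output named "id".
dependency "log_analytics" {
  config_path = "../log_analytics"
}

inputs = {
  name                = "kv-amw-prod-weu"
  resource_group_name = "rg-observability-prod"
  location            = "westeurope"

  # run-all applies log_analytics first because of the dependency block.
  diagnostic_log_analytics_workspace_id = dependency.log_analytics.outputs.id
}
```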

Inputs

| Name | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| name | string | - | Yes | Name of the Monitor Workspace; 3-44 chars, validated against the allowed character set. |
| resource_group_name | string | - | Yes | Resource group that contains the workspace. |
| location | string | - | Yes | Azure region. Place it close to the clusters/agents that write metrics. |
| public_network_access_enabled | bool | true | No | false requires the query endpoint to be reached via a private endpoint. |
| query_reader_principal_ids | map(string) | {} | No | Map of key => principal object ID granted Monitoring Data Reader (validated as GUIDs). |
| diagnostic_log_analytics_workspace_id | string | null | No | Log Analytics workspace ID for the workspace’s platform metrics; null disables diagnostics. |
| tags | map(string) | {} | No | Tags applied to the workspace. |

Outputs

| Name | Description |
| --- | --- |
| id | Resource ID of the Azure Monitor Workspace. |
| name | Name of the Monitor Workspace. |
| query_endpoint | PromQL query endpoint URL; use as the Prometheus data source in Grafana. |
| default_data_collection_endpoint_id | ID of the auto-created default Data Collection Endpoint (DCE). |
| default_data_collection_rule_id | ID of the auto-created default Data Collection Rule (DCR); associate with AKS to begin scraping. |

Enterprise scenario

A fintech platform team runs twelve AKS clusters across three landing zones and needs unified Prometheus dashboards without operating self-hosted Prometheus/Thanos. They deploy this module once per region from the platform pipeline — kv-amw-prod-weu, -neu, -eus — each granting only the shared Azure Managed Grafana identity Monitoring Data Reader, and each feeding diagnostics into the central Log Analytics workspace for ingestion-health alerts. Every cluster’s data_collection_rule_association points at the matching workspace’s default_data_collection_rule_id, so onboarding a new cluster is a two-line module reference rather than a bespoke observability project.
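
One way to express the per-region rollout in this scenario is a single module block with for_each. This is a sketch under assumptions: the region map, the two input variables, and the shared resource group name are hypothetical, and the scenario itself deploys the module from a pipeline rather than prescribing this layout.

```hcl
# Hypothetical inputs for the shared Grafana identity and central workspace.
variable "grafana_principal_id" {
  description = "Object ID of the shared Azure Managed Grafana identity."
  type        = string
}

variable "central_log_analytics_workspace_id" {
  description = "Resource ID of the central Log Analytics workspace."
  type        = string
}

locals {
  # Keys mirror the naming suffixes in the scenario; regions are assumptions.
  amw_regions = {
    weu = "westeurope"
    neu = "northeurope"
    eus = "eastus"
  }
}

module "monitor_workspace" {
  for_each = local.amw_regions
  source   = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-azure-monitor-workspace?ref=v1.0.0"

  name                = "kv-amw-prod-${each.key}"
  resource_group_name = "rg-observability-prod"
  location            = each.value

  # Only the shared Grafana identity gets read access.
  query_reader_principal_ids = {
    grafana = var.grafana_principal_id
  }

  diagnostic_log_analytics_workspace_id = var.central_log_analytics_workspace_id
}
```

A cluster in West Europe would then reference module.monitor_workspace["weu"].default_data_collection_rule_id in its rule association.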

Best practices
