Microsoft Sentinel is a cloud-native SIEM and SOAR built on Log Analytics. A workspace is cheap to enable and ruinously easy to operate badly — runaway ingestion bills, noisy detections, and incidents nobody triages. This guide stands one up the way it should run in production: deliberate table tiers, KQL detections mapped to MITRE ATT&CK, automated response, and cost under control.
1. Workspace and architecture decisions
Sentinel is an offering layered on a Log Analytics workspace (LAW). The decisions you make here are hard to reverse, so get them right first.
One workspace or many? Default to a single regional workspace per environment. Cross-workspace queries exist but add friction, and the per-GB ingestion model means consolidation rarely costs more. Split only for data-residency or strict tenant-isolation reasons. For MSSP or multi-tenant estates, keep workspaces in each tenant and manage them centrally with Azure Lighthouse — delegated access lets your SOC run cross-tenant hunting without storing customer logs in your own tenant.
Table tiers drive most of your bill. Sentinel supports three:
| Tier | Query window | Retention model | Use for |
|---|---|---|---|
| Analytics | Interactive, full | Hot, up to 2 years interactive | Tables that feed analytics rules |
| Basic / Auxiliary | Limited (KQL subset), 30 days interactive | Cheaper ingestion, long-term archive | High-volume, low-fidelity (firewall, NetFlow) |
| Archive | Restore or search jobs | Up to 12 years | Compliance retention |
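Set interactive and total retention deliberately. Log Analytics defaults to 30 days of interactive retention, and a Sentinel-enabled workspace can keep data for 90 days at no extra retention charge, so set 90 explicitly as the command below does. Per-table overrides let you keep auth logs hot for a year while sending verbose proxy logs to Basic.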
# Create the workspace and onboard Sentinel
az monitor log-analytics workspace create \
--resource-group rg-sec-sentinel \
--workspace-name law-sentinel-prod \
--location eastus \
--retention-time 90
az sentinel onboarding-state create \
--resource-group rg-sec-sentinel \
--workspace-name law-sentinel-prod \
--name default
The az sentinel commands ship in the sentinel CLI extension. Install with az extension add --name sentinel. Much of Sentinel’s advanced surface is REST/ARM only, so expect to mix CLI, Bicep, and the portal.
2. Onboarding data connectors
Detections are only as good as your telemetry. Prioritize identity and endpoint first — that is where most real attacks are visible.
Entra ID and Microsoft Defender XDR connect through the unified Microsoft Defender portal experience or the legacy data connectors. For Entra ID, enable SigninLogs, AuditLogs, and the risk tables (AADUserRiskEvents, AADRiskyUsers). For Defender XDR, the connector streams the Device*, Email*, Alert*, and Identity* tables and synchronizes incidents bidirectionally.
Azure Activity is a built-in connector backed by a Diagnostic Setting that routes the subscription activity log into the AzureActivity table:
az monitor diagnostic-settings subscription create \
--name "send-activity-to-sentinel" \
--location eastus \
--logs '[{"category":"Administrative","enabled":true},
{"category":"Security","enabled":true},
{"category":"Policy","enabled":true}]' \
--workspace "/subscriptions/<sub-id>/resourceGroups/rg-sec-sentinel/providers/Microsoft.OperationalInsights/workspaces/law-sentinel-prod"
Syslog and CEF now flow through the Azure Monitor Agent (AMA), not the retired Log Analytics agent. You deploy a Linux forwarder (or point appliances at it), install AMA, and govern what gets collected with a Data Collection Rule (DCR). CEF lands in CommonSecurityLog; plain syslog in Syslog.
# Install the AMA extension on the Linux log forwarder VM
az vm extension set \
--resource-group rg-sec-collectors \
--vm-name vm-cef-forwarder \
--name AzureMonitorLinuxAgent \
--publisher Microsoft.Azure.Monitor \
--enable-auto-upgrade true
The DCR is where you filter at the source — set facilities and minimum log levels so you are not paying to ingest debug from every appliance. Define it once and associate it with the forwarder; the agent applies the filter before data leaves the host.
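The facility and level filter described above can be sketched as a small Python helper that builds the syslog data source section of a DCR. The helper names and the chosen facilities are illustrative assumptions, and the field names (streams, facilityNames, logLevels) are believed to match the DCR syslog schema, so verify them against the current Azure Monitor reference before deploying.

```python
import json

# Syslog severities in ascending order, as named in DCR definitions.
SYSLOG_LEVELS = ["Debug", "Info", "Notice", "Warning",
                 "Error", "Critical", "Alert", "Emergency"]


def levels_at_or_above(minimum: str) -> list[str]:
    """Return every severity at or above the given minimum."""
    return SYSLOG_LEVELS[SYSLOG_LEVELS.index(minimum):]


def build_syslog_source(name: str, facilities: list[str], minimum: str) -> dict:
    """Build one entry for dataSources.syslog in a DCR (hypothetical helper)."""
    return {
        "name": name,
        "streams": ["Microsoft-Syslog"],
        "facilityNames": facilities,
        "logLevels": levels_at_or_above(minimum),
    }


# Assumption: only auth facilities at Warning and above are worth ingesting.
source = build_syslog_source("authOnly", ["auth", "authpriv"], "Warning")
print(json.dumps({"dataSources": {"syslog": [source]}}, indent=2))
```

Generating the fragment from a minimum level keeps the level list consistent across forwarders and makes it hard to leave Debug enabled by accident.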
3. Writing scheduled analytics rules in KQL
Scheduled analytics rules are the core detection engine: a KQL query on a timer that raises alerts and groups them into incidents. Two things separate a usable rule from an alert cannon — entity mapping and MITRE tagging.
Here is a rule detecting brute-force success: many failures followed by a sign-in success from the same identity.
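let failureThreshold = 10;
let lookback = 1h;
SigninLogs
| where TimeGenerated > ago(lookback)
// ResultType is a string column, so compare against "0" rather than the number 0
| summarize
    Failures = countif(ResultType != "0"),
    Successes = countif(ResultType == "0"),
    IPs = make_set(IPAddress, 50),
    FirstFailure = minif(TimeGenerated, ResultType != "0"),
    LastSuccess = maxif(TimeGenerated, ResultType == "0")
    by UserPrincipalName, AppDisplayName
| where Failures >= failureThreshold and Successes > 0
// Require the success to come after failures began, matching "failures followed by success"
| where LastSuccess > FirstFailure
| extend AccountName = tostring(split(UserPrincipalName, "@")[0])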
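Create it with ATT&CK tactic and technique tags so incidents arrive enriched and correlatable; entity mappings belong to the same rule definition. Flag names can differ between releases of the sentinel extension, so confirm them with az sentinel alert-rule create --help on your installed version: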
az sentinel alert-rule create \
--resource-group rg-sec-sentinel \
--workspace-name law-sentinel-prod \
--rule-id "bruteforce-success" \
--scheduled-alert-rule \
--display-name "Successful sign-in after repeated failures" \
--enabled true \
--severity Medium \
--query @bruteforce.kql \
--query-frequency PT1H \
--query-period PT1H \
--trigger-operator GreaterThan \
--trigger-threshold 0 \
--tactics CredentialAccess \
--techniques T1110
Entity mapping binds query columns to entities (Account, IP, Host, etc.). Without it, Sentinel cannot deduplicate, correlate across rules, or feed the investigation graph. Map AccountName to Account and the source address to IP at minimum. Mappings take scalar columns, so expand the IPs set with mv-expand (or project a single IP column) before mapping it. This is what makes “show me everything this user did” work later.
Set query period >= frequency to avoid coverage gaps, and when each result row is a distinct finding, choose the event grouping option that triggers one alert per row instead of grouping all rows into a single alert.
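To make the entity mapping and period rules concrete, here is a minimal Python sketch that assembles the properties of a scheduled rule as they would appear in an ARM or REST body. The function names are hypothetical, the property names follow the scheduled alert rule schema as commonly documented, and the IPAddress column is an assumption: it presumes the query projects a single scalar address. Treat it as a starting point, not the definitive payload.

```python
import json
import re


def iso_duration_minutes(value: str) -> int:
    """Convert a simple ISO 8601 duration (PT1H, PT30M, P1D) to minutes."""
    match = re.fullmatch(r"P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?)?", value)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value}")
    days, hours, minutes = (int(group or 0) for group in match.groups())
    return days * 1440 + hours * 60 + minutes


def build_rule_properties(query: str, frequency: str, period: str) -> dict:
    """Build scheduled-rule properties, refusing a period shorter than the frequency."""
    if iso_duration_minutes(period) < iso_duration_minutes(frequency):
        raise ValueError("query period shorter than frequency leaves coverage gaps")
    return {
        "displayName": "Successful sign-in after repeated failures",
        "enabled": True,
        "severity": "Medium",
        "query": query,
        "queryFrequency": frequency,
        "queryPeriod": period,
        "triggerOperator": "GreaterThan",
        "triggerThreshold": 0,
        "suppressionEnabled": False,
        "suppressionDuration": "PT1H",
        "tactics": ["CredentialAccess"],
        "techniques": ["T1110"],
        "entityMappings": [
            {"entityType": "Account",
             "fieldMappings": [{"identifier": "Name", "columnName": "AccountName"}]},
            # Assumption: the query exposes one scalar IPAddress column per row.
            {"entityType": "IP",
             "fieldMappings": [{"identifier": "Address", "columnName": "IPAddress"}]},
        ],
        # One alert per result row instead of one alert for the whole result set.
        "eventGroupingSettings": {"aggregationKind": "AlertPerResult"},
    }


rule = {"kind": "Scheduled",
        "properties": build_rule_properties("SigninLogs | take 1", "PT1H", "PT1H")}
print(json.dumps(rule, indent=2))
```

Keeping the period check in code means a rule with a coverage gap fails in review rather than in production.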
4. Tuning out noise
A SOC drowns in false positives before it misses a real one. Three levers, in order of preference:
- Tighten the query. Most noise is a missing where clause. Exclude known scanners, service accounts, and sanctioned automation in the KQL itself using a watchlist join rather than hardcoded values.
- Near-real-time (NRT) rules for the handful of detections that must fire within a minute (e.g., break-glass account sign-in). NRT rules run roughly every minute but carry constraints: one table, no join to other tables. Reserve them for true time-critical cases.
- Automation rules for triage logic that does not belong in a playbook: auto-close known-benign patterns, set severity, assign an owner, or add tags based on incident properties.
// Suppress sanctioned automation via a watchlist instead of inline strings
let allowed = _GetWatchlist('SanctionedServiceAccounts') | project SearchKey;
SigninLogs
| where ResultType == 0
// Case-insensitive match, because UPN casing varies between sources
| where UserPrincipalName !in~ (allowed)
Automation rules run on incident creation or update and execute conditions top-down — order them so cheap suppressions run before expensive playbook calls. Use them to auto-close, then to route, then to enrich.
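The close, then route, then enrich ordering can be sketched as a small Python helper that assigns order values before you create the rules. The phase label is purely a local convention invented for this example, not a Sentinel property, and the rule names are hypothetical.

```python
# Hypothetical triage phases; "phase" is a local label, not a Sentinel property.
PHASE_RANK = {"suppress": 0, "route": 1, "enrich": 2}


def assign_orders(rules: list[dict]) -> list[dict]:
    """Return copies of the rules with an 'order' value that respects PHASE_RANK."""
    # sorted() is stable, so rules within the same phase keep their listed order.
    ranked = sorted(rules, key=lambda rule: PHASE_RANK[rule["phase"]])
    return [{**rule, "order": position}
            for position, rule in enumerate(ranked, start=1)]


rules = [
    {"name": "Run enrichment playbook", "phase": "enrich"},
    {"name": "Auto-close sanctioned scanner incidents", "phase": "suppress"},
    {"name": "Assign identity incidents to tier 2", "phase": "route"},
]
for rule in assign_orders(rules):
    print(rule["order"], rule["name"])
```

Deriving the order from a phase keeps cheap suppressions ahead of playbook calls even as rules are added over time.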
5. Building SOAR playbooks with Logic Apps
Playbooks are Logic Apps triggered by Sentinel. The recommended pattern is the incident trigger (the playbook receives the full incident with mapped entities) wired up through an automation rule. Three high-value playbooks:
Disable a compromised user via Microsoft Graph. The Logic App’s managed identity (or a connection) needs the Graph User.ReadWrite.All permission.
{
"method": "PATCH",
"uri": "https://graph.microsoft.com/v1.0/users/@{triggerBody()?['object']?['properties']?['relatedEntities'][0]['properties']['aadUserId']}",
"headers": { "Content-Type": "application/json" },
"body": { "accountEnabled": false }
}
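Indexing relatedEntities[0] in the request above assumes the first entity is the account, which will not hold when an IP or host entity comes first. A minimal Python sketch of the safer selection logic follows; the payload shape (a kind field and properties.aadUserId on each entity) is an assumption based on the incident trigger body referenced above, and the function names are hypothetical. The same filter can be expressed in a Logic App with a Filter array action.

```python
def account_object_ids(incident_body: dict) -> list[str]:
    """Collect distinct Entra object IDs from Account entities only."""
    entities = (incident_body.get("object", {})
                .get("properties", {})
                .get("relatedEntities", []))
    ids: list[str] = []
    for entity in entities:
        if entity.get("kind") != "Account":
            continue  # skip IP, Host, and other entity kinds
        object_id = entity.get("properties", {}).get("aadUserId")
        if object_id and object_id not in ids:
            ids.append(object_id)
    return ids


def disable_user_request(object_id: str) -> dict:
    """Build the Graph PATCH request that disables one account."""
    return {
        "method": "PATCH",
        "uri": f"https://graph.microsoft.com/v1.0/users/{object_id}",
        "headers": {"Content-Type": "application/json"},
        "body": {"accountEnabled": False},
    }


# Synthetic payload: the IP entity comes first, so index 0 would be wrong.
sample = {"object": {"properties": {"relatedEntities": [
    {"kind": "Ip", "properties": {"address": "203.0.113.7"}},
    {"kind": "Account",
     "properties": {"aadUserId": "11111111-1111-1111-1111-111111111111"}},
]}}}
requests_to_send = [disable_user_request(i) for i in account_object_ids(sample)]
print(requests_to_send[0]["uri"])
```

An empty result is a useful signal too: if no Account entity carries an object ID, the playbook should stop and notify rather than call Graph with a blank identifier.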
Isolate a device through the Defender for Endpoint connector action (Isolate machine), passing the device entity’s machine ID and an isolation type of Full. Always pair containment actions with an approval step or scope them to high-severity incidents only — auto-isolating production boxes on a medium-confidence alert is how SOAR earns a bad name.
Post to Teams for human-in-the-loop. Use the Teams connector to send an adaptive card to the SOC channel with the incident title, severity, entities, and Confirm / Dismiss buttons that call back into Sentinel.
Wire any playbook to incidents with an automation rule:
az sentinel automation-rule create \
--resource-group rg-sec-sentinel \
--workspace-name law-sentinel-prod \
--automation-rule-id "run-disable-user" \
--display-name "High severity -> disable user playbook" \
--order 1 \
--triggering-logic '{
"isEnabled": true,
"triggersOn": "Incidents",
"triggersWhen": "Created",
"conditions": [{
"conditionType": "Property",
"conditionProperties": {
"propertyName": "IncidentSeverity",
"operator": "Equals",
"propertyValues": ["High"]
}
}]
}' \
--actions '[{
"order": 1,
"actionType": "RunPlaybook",
"actionConfiguration": {
"logicAppResourceId": "/subscriptions/<sub-id>/resourceGroups/rg-sec-soar/providers/Microsoft.Logic/workflows/pb-disable-user",
"tenantId": "<tenant-id>"
}
}]'
Sentinel’s automation service principal needs the Microsoft Sentinel Automation Contributor role on the resource group holding your playbooks, or the run silently fails to launch.
6. UEBA and anomaly-based detections
User and Entity Behavior Analytics profiles normal behavior and surfaces deviations no static rule would catch. Enable it from Settings -> Entity behavior; it requires the identity sources — at minimum SigninLogs, AuditLogs, SecurityEvent, and the Azure Activity log.
Once enabled, UEBA enriches events into BehaviorAnalytics with peer-group context and investigation priority scores. Query it directly in detections and hunts:
BehaviorAnalytics
| where ActivityType == "LogOn"
| where InvestigationPriority >= 7
| project TimeGenerated, UserPrincipalName, SourceIPAddress,
ActivityInsights, InvestigationPriority
| order by InvestigationPriority desc
Sentinel also ships anomaly rule templates (ML-based, customizable thresholds) for things like anomalous data egress and rare process execution. Switch the relevant ones to Production after observing them in Flighting mode. They are tuned on your own baseline, so give them data before trusting them.
7. Controlling cost
Ingestion is the bill. The discipline is simple: ingest high-fidelity data to Analytics, dump high-volume low-value data to Basic/Auxiliary, and filter the rest at the source.
- Data Collection Rules drop noise before ingestion — the cheapest GB is the one you never send.
- Basic / Auxiliary logs cut per-GB cost dramatically for tables you only query during an investigation (firewall, DNS, proxy). You trade interactive query power for price.
- Commitment tiers switch you from pay-as-you-go to a discounted daily capacity reservation once you sustain ~100 GB/day or more.
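The commitment tier decision comes down to a break-even calculation, sketched here in Python. Every price in the example is a made-up placeholder, not a published rate, so substitute the figures for your region. The sketch also assumes that ingestion above the tier is billed at the tier's effective per-GB rate.

```python
def daily_cost_payg(gb_per_day: float, payg_price_per_gb: float) -> float:
    """Pay-as-you-go cost for one day."""
    return gb_per_day * payg_price_per_gb


def daily_cost_commitment(gb_per_day: float, tier_gb: float,
                          tier_daily_price: float) -> float:
    """Commitment cost for one day, with overage at the tier's effective rate."""
    effective_rate = tier_daily_price / tier_gb
    overage_gb = max(0.0, gb_per_day - tier_gb)
    return tier_daily_price + overage_gb * effective_rate


def break_even_gb(tier_daily_price: float, payg_price_per_gb: float) -> float:
    """Daily volume above which the commitment tier is cheaper than pay-as-you-go."""
    return tier_daily_price / payg_price_per_gb


# Hypothetical prices for illustration only.
PAYG_PRICE = 5.00        # currency units per GB
TIER_GB = 100            # committed GB per day
TIER_DAILY_PRICE = 400.0  # currency units per day

print(break_even_gb(TIER_DAILY_PRICE, PAYG_PRICE))
print(daily_cost_payg(120, PAYG_PRICE),
      daily_cost_commitment(120, TIER_GB, TIER_DAILY_PRICE))
```

With these placeholder numbers the tier pays off well below its nominal 100 GB/day, which is why it is worth running the calculation instead of waiting to cross the tier size.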
# Move a verbose table to the Basic tier (first confirm the table supports Basic and that no analytics rule queries it)
az monitor log-analytics workspace table update \
--resource-group rg-sec-sentinel \
--workspace-name law-sentinel-prod \
--name CommonSecurityLog \
--plan Basic
Watch spend with the Usage table and review it weekly — ingestion creep is gradual and nobody notices until the invoice.
Usage
| where TimeGenerated > ago(30d)
| where IsBillable == true
| summarize BillableGB = sum(Quantity) / 1000 by DataType
| order by BillableGB desc
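For the weekly review, a short Python sketch can flag tables whose billable volume grew faster than a chosen threshold. The input is assumed to be weekly GB totals per table exported from the Usage query; the 20% threshold and the sample numbers are arbitrary illustrations.

```python
def flag_creep(weekly_gb: dict[str, list[float]],
               threshold: float = 0.2) -> dict[str, float]:
    """Return tables whose latest week grew by more than the threshold."""
    flagged: dict[str, float] = {}
    for table, series in weekly_gb.items():
        if len(series) < 2 or series[-2] <= 0:
            continue  # not enough history to compare
        growth = (series[-1] - series[-2]) / series[-2]
        if growth > threshold:
            flagged[table] = round(growth, 3)
    return flagged


# Synthetic weekly totals in GB, oldest first.
history = {
    "CommonSecurityLog": [700.0, 910.0],
    "SigninLogs": [40.0, 42.0],
    "NewTable_CL": [15.0],
}
print(flag_creep(history))
```

Comparing week over week per table surfaces a single noisy source long before it moves the workspace total enough to notice.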
8. Hunting, workbooks, and validating detections
Proactive hunting queries run on demand against your full data to find what scheduled rules missed; bookmark interesting results and promote recurring ones into analytics rules. Workbooks turn KQL into operational dashboards — start from the built-in templates (Security Operations Efficiency, Identity & Access) and customize.
Crucially, validate that detections fire. Do not wait for a real attacker. Generate benign, controlled signal — for example, a deliberate burst of failed sign-ins from a test account, or running attack-simulation tooling in a lab subscription — and confirm the rule produces an incident with the right entities and severity. A detection you have never seen trigger is a detection you do not have.
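Before generating live signal, it can help to check the detection logic offline against synthetic events. The Python sketch below mirrors the intent of the brute-force rule (a threshold of failures followed by a success for the same user and app). The event shape, field names, and test account are invented for the example, and it complements a live test in the workspace without replacing it.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def brute_force_hits(events: list[dict], threshold: int = 10) -> list[tuple]:
    """Return (user, app) pairs with enough failures followed by a success."""
    groups: dict[tuple, list[dict]] = defaultdict(list)
    for event in events:
        groups[(event["user"], event["app"])].append(event)
    hits = []
    for key, rows in groups.items():
        failures = [r["time"] for r in rows if r["result_type"] != "0"]
        successes = [r["time"] for r in rows if r["result_type"] == "0"]
        if (len(failures) >= threshold and successes
                and max(successes) > min(failures)):
            hits.append(key)
    return hits


start = datetime(2024, 1, 1, 9, 0, 0)
events = []
# Test account: ten failures, one per minute, then a success.
for minute in range(10):
    events.append({"user": "test.user@contoso.example", "app": "Portal",
                   "time": start + timedelta(minutes=minute),
                   "result_type": "50126"})
events.append({"user": "test.user@contoso.example", "app": "Portal",
               "time": start + timedelta(minutes=10), "result_type": "0"})
# Ordinary user: a few typos and a success, which should not match.
for minute in range(3):
    events.append({"user": "normal.user@contoso.example", "app": "Portal",
                   "time": start + timedelta(minutes=minute),
                   "result_type": "50126"})
events.append({"user": "normal.user@contoso.example", "app": "Portal",
               "time": start + timedelta(minutes=3), "result_type": "0"})

print(brute_force_hits(events))
```

If the offline check and the live rule disagree on the same scenario, the difference usually points at a threshold, time window, or column type issue in the KQL.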
Enterprise scenario
A retail platform team onboarded their Palo Alto firewalls to Sentinel and watched ingestion jump to ~280 GB/day overnight — almost all of it TRAFFIC allow logs nobody queried. The reflex was to drop the table, but the SOC still needed those flows for occasional egress investigations, and compliance required 12 months of retention. Deleting the connector was off the table.
The fix was tiering plus ingestion-time filtering, not deletion. CommonSecurityLog was already feeding two analytics rules, so blindly moving the whole table to Basic would have broken them (Basic tier can’t back scheduled rules). They split the stream: keep THREAT and deny events on Analytics where detections live, and route the high-volume allow TRAFFIC rows to Auxiliary with long-term archive. A DCR transformation did the filtering in the ingestion pipeline, before the rows were written to the Analytics table:
source
| where DeviceVendor == "Palo Alto Networks"
| where not(DeviceProduct == "PAN-OS"
and Activity == "TRAFFIC"
and DeviceAction == "allow")
Allowed-traffic rows were sent to a separate Auxiliary-tier custom table (PaloAltoTraffic_CL) via a second DCR, with total retention extended to 365 days. Net effect: billable Analytics ingestion fell roughly 70%, the analytics rules kept firing on the THREAT and deny events, and investigators could still search the archived flows when a case demanded it. The lesson: never tier or drop a table without first checking which rules depend on it; split the stream by fidelity instead.
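A split like this is worth testing against sample rows before it goes live. The Python sketch below mirrors the transformation's predicate so you can confirm which destination each row would reach. The route labels and the sample rows are illustrative assumptions, and the column names are the CommonSecurityLog fields used in the filter above.

```python
def is_allowed_traffic(row: dict) -> bool:
    """True for the high-volume PAN-OS allow TRAFFIC rows."""
    return (row.get("DeviceProduct") == "PAN-OS"
            and row.get("Activity") == "TRAFFIC"
            and row.get("DeviceAction") == "allow")


def route(row: dict) -> str:
    """Return the destination a row would reach under the stream split."""
    if row.get("DeviceVendor") != "Palo Alto Networks":
        return "other"       # not handled by these DCRs
    if is_allowed_traffic(row):
        return "auxiliary"   # PaloAltoTraffic_CL
    return "analytics"       # CommonSecurityLog, where the rules run


samples = [
    {"DeviceVendor": "Palo Alto Networks", "DeviceProduct": "PAN-OS",
     "Activity": "TRAFFIC", "DeviceAction": "allow"},
    {"DeviceVendor": "Palo Alto Networks", "DeviceProduct": "PAN-OS",
     "Activity": "TRAFFIC", "DeviceAction": "deny"},
    {"DeviceVendor": "Palo Alto Networks", "DeviceProduct": "PAN-OS",
     "Activity": "THREAT", "DeviceAction": "alert"},
    {"DeviceVendor": "Fortinet", "DeviceProduct": "Fortigate",
     "Activity": "traffic", "DeviceAction": "accept"},
]
print([route(row) for row in samples])
```

Running real exported rows through the same predicate shows how much volume each destination would receive and whether any row a rule depends on would be diverted.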
Verify
Confirm the pipeline end to end before declaring victory:
// Data is flowing on every key table
union withsource=Tbl SigninLogs, AzureActivity, CommonSecurityLog, DeviceEvents
| where TimeGenerated > ago(1h)
| summarize Events = count(), Latest = max(TimeGenerated) by Tbl
# List enabled analytics rules and their severities
az sentinel alert-rule list \
--resource-group rg-sec-sentinel \
--workspace-name law-sentinel-prod \
--query "[?enabled].{name:displayName, severity:severity}" -o table
- Connectors show Connected and their tables return recent rows.
- A test trigger produces an incident with mapped entities and MITRE tactics.
- The bound automation rule launches the playbook (check the Logic App run history).
- The Usage query matches your expected daily ingestion.
Pitfalls
- Ingesting everything to Analytics. The fastest way to a six-figure bill. Tier and filter from day one.
- Skipping entity mapping. Without it, correlation, deduplication, and the investigation graph all break — your SIEM becomes a log search box.
- Auto-remediation without guardrails. Containment playbooks on low-confidence alerts will isolate production. Gate them behind severity, approval, or both.
- Missing automation permissions. No Sentinel Automation Contributor role on the playbook resource group means rules fire but playbooks never run — and the failure is silent.
- Detections that never trigger. Validate with simulated attacks; an untested rule is a false sense of security, not coverage.
Sentinel rewards discipline. Decide your tiers, ingest the telemetry that matters, write detections that map to ATT&CK and carry entities, automate the response with guardrails, and watch the bill weekly. Do that and you have a SOC platform that scales — not a log graveyard with a security label on it.