Azure has more moving auth pieces than AWS or GCP, and that is the single thing that catches teams off guard when they pick up azure.azcollection. AWS has IAM. GCP has projects. Azure has subscriptions, tenants, directories, resource groups, service principals (the user-equivalent for automation), managed identities (system-assigned and user-assigned), role assignments at four different scopes, and the special-case auth_source: cli that uses your az login cache, all of which Ansible’s auth chain has to reason about. Once you internalise the four authentication modes and the scope hierarchy (management group → subscription → resource group → resource), the modules themselves are mechanical: azure_rm_<thing> modules with a consistent shape, a small set of states (mostly present / absent, not the seven states of the network resource modules), and dynamic inventory that maps tags and resource groups into Ansible groups with a few lines of keyed_groups configuration.
This lesson is the exhaustive tour. We start with the Azure mental model — subscriptions, resource groups, tenants, role assignments — because Ansible’s auth and inventory both rest on it. We then walk azure.azcollection module-by-module, focusing on the modules you actually use in production: networking (azure_rm_resourcegroup, azure_rm_virtualnetwork, azure_rm_subnet, azure_rm_networksecuritygroup), compute (azure_rm_virtualmachine, azure_rm_virtualmachinescaleset, azure_rm_aks), data (azure_rm_storageaccount, azure_rm_postgresqlflexibleserver, azure_rm_keyvault, azure_rm_keyvaultsecret), and identity (azure_rm_roleassignment, azure_rm_userassignedidentity). We cover the four authentication modes — Service Principal with secret, Service Principal with certificate, Managed Identity, Azure CLI cached — and the auth_source decision matrix that tells azure.azcollection which one to use. We cover multi-subscription patterns where one playbook hops across subscriptions via per-task subscription_id:, and revisit the azure.azcollection.azure_rm dynamic inventory plugin from a deeper angle than the dynamic inventory lesson, focusing on the Azure-specific knobs (auth_source, plain_host_names, include_vm_resource_groups, batch_fetch, default_host_filters). We finish on idempotency for the awkward modules (looking at you, azure_rm_virtualmachine), role-assignment automation (Azure RBAC at scale), and packaging an Azure-aware Execution Environment for AAP. Everything targets current Ansible (ansible-core 2.17+, azure.azcollection 2+, the Azure SDK for Python azure-mgmt-* packages, 2026), uses FQCN throughout, and ends with a free hands-on lab using Azurite (the official local Azure Storage emulator) plus an Azure free-tier subscription for the resource-graph tasks Azurite can’t fake.
Learning objectives
After this lesson you can:
- Explain Azure’s subscription / tenant / resource-group / role-assignment scope hierarchy and how Ansible’s auth and inventory map onto it.
- Pick the right auth_source for the host running Ansible: auto, cli, env, credential_file, msi.
- Authenticate with a Service Principal (secret or certificate) and rotate cleanly.
- Authenticate from an Azure VM with a Managed Identity (system- or user-assigned), credential-free.
- Drive azure.azcollection’s headline modules across networking, compute, data, and identity.
- Configure the azure_rm dynamic inventory plugin with the Azure-specific knobs and explain when caching helps.
- Operate a multi-subscription estate with per-task subscription_id: and per-source inventory files.
- Write a tag schema that turns the inventory into clean cross-cutting groups (tag_environment_prod, location_eastus).
- Ship an Azure-aware Execution Environment for AAP.
Prerequisites & where this fits
You should already be comfortable with playbooks and tasks, variables and the precedence rules, Jinja templating, roles and collections, and dynamic inventory in general. The companion expert lessons that compound here are Ansible for AWS (same shape, different cloud), Ansible for Windows (many Azure VMs run Windows), Ansible for Kubernetes (for AKS), and Hybrid Orchestration. In the Ansible Zero-to-Hero programme this is the Cloud expert (Azure) lesson and a textbook EX374-grade topic.
Core concepts
Five mental models carry the whole lesson.
1. Subscription is the API boundary. Every Azure REST call is scoped to a subscription. A single playbook hopping across 5 subscriptions is making 5 distinct sets of API calls, with 5 distinct authorisations. Do not rely on an implicit default for the subscription_id: parameter that every azure_rm_* module accepts: set it explicitly on every task, once at the play level, or once per inventory source. The “I forgot the subscription_id” bug is the Azure equivalent of “I forgot name: on ec2_instance.”
2. Service Principal vs Managed Identity is the AWS-IAM-User vs IAM-Role distinction. A Service Principal is an identity that Azure AD (now Microsoft Entra ID) knows about, with a secret or certificate. You use it from any host (laptop, CI runner, on-prem AAP). A Managed Identity is bound to an Azure resource (a VM, an App Service, an AKS pod via Workload Identity); the platform issues short-lived tokens automatically. Always prefer Managed Identity when the control node lives in Azure: no secrets on disk, no rotation burden, automatic token issuance.
3. auth_source is the steering wheel. azure.azcollection modules read auth_source to decide which auth chain to walk: auto (try them in order), cli (use az login cache), env (env vars), credential_file (~/.azure/credentials), msi (force managed identity). On a laptop you set cli. On a CI runner you set env. On an Azure VM with managed identity you set msi. Don’t leave it on auto in production — the explicitness is what makes auth failures legible.
4. Tags + resource groups are the inventory. With keyed_groups entries, the azure_rm plugin builds groups named tag_<key>_<value>, location_<region>, vm_size_<size>, and one group per resource group. A consistent tag schema plus disciplined resource-group naming makes every Ansible play “target group X” instead of “iterate this list of subscription_ids and resource_groups.”
5. Most azure_rm_* modules are present/absent only. Unlike the network resource modules with seven states, the Azure modules are simpler: state: present reconciles the resource to the args you passed; state: absent deletes. Idempotency is module-internal and generally good — but watch for the modules that diff fields you didn’t intend to set (resource tags being reset, say). Always read the module’s “Notes” section in ansible-doc.
Keep these terms straight: subscription (API + billing scope), tenant (Azure AD directory; one tenant has many subscriptions), resource group (logical container; the deletion unit), service principal (automation identity in Azure AD with a secret/cert), managed identity (platform-issued identity bound to a resource — system-assigned or user-assigned), role assignment (the Azure RBAC primitive: principal × role × scope), auth_source (Ansible’s auth-chain selector), azure_rm plugin (dynamic inventory), subscription_id (per-task or per-play scope).
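The tag-reset hazard from mental model 5 can be sketched as follows. This is a minimal example, assuming the collection’s shared append_tags option behaves as its documentation describes (true merges, false treats the tags you pass as the complete set); the subscription ID is a placeholder and auth_source: cli assumes a laptop with az login.

```yaml
# Sketch: add one tag to an existing resource group without wiping
# tags that other tools (or Azure Policy) have already set.
- name: Tag an existing resource group additively
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    subscription_id: "11111111-1111-1111-1111-111111111111"  # placeholder
  tasks:
    - name: Merge the owner tag into the existing tags
      azure.azcollection.azure_rm_resourcegroup:
        name: prod-eu-app
        location: eastus
        tags:
          owner: platform
        # Assumption: with append_tags false, any tag not listed above
        # would be removed from the resource group.
        append_tags: true
        subscription_id: "{{ subscription_id }}"
        auth_source: cli
        state: present
```

Running this with --check --diff first shows whether a task is about to drop tags you did not mean to touch.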
The Azure mental model
Tenant (Azure AD directory) — one per organisation
├── Subscription A (eu-prod-001)
│ ├── Resource group: prod-eu-network
│ │ ├── VirtualNetwork: prod-eu-vnet
│ │ └── NetworkSecurityGroup: web-nsg
│ ├── Resource group: prod-eu-app
│ │ ├── VM: prod-web-eu-1
│ │ └── VM: prod-web-eu-2
│ └── Resource group: prod-eu-data
│ └── PostgresFlexibleServer: prod-app-db
├── Subscription B (stg-eu-001)
└── Subscription C (sandbox-001)
RBAC role assignments live at: management-group | subscription | RG | resource
Internalise the diagram and Ansible’s auth + inventory both make sense.
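The four scope levels in the diagram correspond to resource-ID strings, which is the form role assignments take for their scope. The sketch below only prints those strings; the management group name platform-mg and the subscription ID are hypothetical placeholders, while the resource group and VM names come from the diagram.

```yaml
# Sketch: the four RBAC scope levels written as Azure resource IDs.
- name: Show the scope hierarchy as resource-ID strings
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    subscription_id: "11111111-1111-1111-1111-111111111111"  # placeholder
    scope_management_group: "/providers/Microsoft.Management/managementGroups/platform-mg"  # hypothetical name
    scope_subscription: "/subscriptions/{{ subscription_id }}"
    scope_resource_group: "{{ scope_subscription }}/resourceGroups/prod-eu-app"
    scope_resource: "{{ scope_resource_group }}/providers/Microsoft.Compute/virtualMachines/prod-web-eu-1"
  tasks:
    - name: Print each scope, widest first
      ansible.builtin.debug:
        msg: "{{ item }}"
      loop:
        - "{{ scope_management_group }}"
        - "{{ scope_subscription }}"
        - "{{ scope_resource_group }}"
        - "{{ scope_resource }}"
```

A role granted at a wider scope is inherited by everything beneath it, so the narrowest string that still covers the work is usually the right one.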
Authentication — the four modes
| Mode | When to use | What you put in env / params | What’s at risk |
|---|---|---|---|
| Service Principal + secret | CI runners, laptops, on-prem AAP without managed identity | AZURE_CLIENT_ID, AZURE_SECRET, AZURE_TENANT, AZURE_SUBSCRIPTION_ID | Secret rotation; secret leakage |
| Service Principal + certificate | Same as above when your org bans secrets | AZURE_CLIENT_ID, AZURE_TENANT, AZURE_X509_CERTIFICATE_PATH (PEM) with AZURE_THUMBPRINT, AZURE_SUBSCRIPTION_ID | Cert rotation; cert leakage |
| Managed Identity (msi) | Control node is an Azure VM / AKS pod | auth_source: msi (and optionally client_id for user-assigned) | Almost nothing: credentials never on disk |
| Azure CLI (cli) | Engineer laptops only | auth_source: cli; user has run az login | None, but never use in CI |
Service Principal with secret — the bootstrapping case
# create the SP
az ad sp create-for-rbac \
  --name ansible-automation \
  --role Contributor \
  --scopes "/subscriptions/$SUB_ID"
# returns: appId, password, tenant
Set env vars (or ~/.azure/credentials):
export AZURE_CLIENT_ID=<appId>
export AZURE_SECRET=<password>
export AZURE_TENANT=<tenant>
export AZURE_SUBSCRIPTION_ID=<subId>
Now any azure_rm_* task with auth_source: env (or auto) authenticates as that SP.
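A preflight check makes a missing variable fail loudly before any Azure call is attempted. The play below is a minimal sketch that uses only ansible.builtin modules; it checks that each of the four variables named above is non-empty and prints only the variable names, never their values.

```yaml
# Sketch: fail fast when a Service Principal env var is missing.
- name: Preflight for auth_source env
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Check that every Service Principal variable is set
      ansible.builtin.assert:
        that:
          # The env lookup returns an empty string for an unset variable.
          - lookup('ansible.builtin.env', item) | length > 0
        fail_msg: "{{ item }} is not set, so auth_source env cannot authenticate"
        quiet: true
      loop:
        - AZURE_CLIENT_ID
        - AZURE_SECRET
        - AZURE_TENANT
        - AZURE_SUBSCRIPTION_ID
```

Importing this play ahead of the provisioning plays turns a vague authentication error into a message that names the missing variable.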
Managed Identity — the production answer
Run AAP on an Azure VM with a system-assigned managed identity. Grant the identity Contributor (or finer) on the relevant resource groups. Set auth_source: msi on tasks (or as a play default), and Ansible authenticates with no secrets:
- name: Create RG (managed identity)
  azure.azcollection.azure_rm_resourcegroup:
    name: prod-eu-app
    location: eastus
    auth_source: msi
    subscription_id: "{{ subscription_id }}"
    state: present
For user-assigned managed identity (one identity attached to many VMs), pass client_id:
auth_source: msi
client_id: 11111111-2222-3333-4444-555555555555
auth_source decision matrix
| Where Ansible runs | auth_source | Why |
|---|---|---|
| Engineer laptop with az login | cli | Uses your interactive session; no secrets |
| GitHub Actions / GitLab CI | env | Federated OIDC or SP-secret env vars |
| AAP Controller on Azure VM | msi | Managed identity, credential-free |
| AAP Container Group on AKS with Workload Identity | msi | Workload Identity is MSI-flavoured |
| AAP Controller on-prem | env or credential_file | SP secret/cert via AAP credential |
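One way to keep a single playbook usable in every row of the matrix is to feed auth_source from a variable. This is a sketch, not a prescribed pattern: azure_auth_source is a hypothetical variable name, and the example assumes azure_rm_resourcegroup_info is covered by the collection’s module_defaults group like the other modules.

```yaml
# Sketch: one play, explicit auth_source per environment.
# Laptop:   ansible-playbook whoami.yml
# CI:       ansible-playbook whoami.yml -e azure_auth_source=env
# Azure VM: ansible-playbook whoami.yml -e azure_auth_source=msi
- name: Same play on a laptop, in CI, or on an Azure VM
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    azure_auth_source: cli  # hypothetical variable; default suits a laptop
  module_defaults:
    group/azure.azcollection.azure:
      auth_source: "{{ azure_auth_source }}"
  tasks:
    - name: Prove the identity works by reading one resource group
      azure.azcollection.azure_rm_resourcegroup_info:
        name: prod-eu-app
      register: rg_info

    - name: Show what the resolved identity could see
      ansible.builtin.debug:
        var: rg_info
```

The value is still explicit in every environment; it is simply chosen by whoever launches the run.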
azure.azcollection — the headline modules
# requirements.yml
collections:
  - name: azure.azcollection
    version: ">=2.0.0"
ansible-galaxy collection install -r requirements.yml
pip install -r ~/.ansible/collections/ansible_collections/azure/azcollection/requirements.txt
| Module | Purpose | Idempotent? | Notes |
|---|---|---|---|
| azure_rm_resourcegroup | Resource groups | Yes | force_delete_nonempty: true for cleanup plays |
| azure_rm_virtualnetwork | VNets | Yes | address_prefixes_cidr: is a list |
| azure_rm_subnet | Subnets | Yes | NSG attachment via security_group: |
| azure_rm_networksecuritygroup | NSGs | Yes | Rule diffs computed by name |
| azure_rm_virtualmachine | VMs | Idempotent on name: + resource_group: | Heavy module; careful with vm_size: changes (rebuilds) |
| azure_rm_virtualmachinescaleset | VMSS | Yes | upgrade_policy: Manual is safer |
| azure_rm_storageaccount | Storage accounts | Yes | Names must be globally unique |
| azure_rm_keyvault | Key Vaults | Yes | enable_rbac_authorization: true for RBAC mode (recommended) |
| azure_rm_keyvaultsecret | Secrets in KV | Yes | Use lookups to read secrets at play time |
| azure_rm_postgresqlflexibleserver | Postgres Flex servers | Yes | state: started/stopped to save costs |
| azure_rm_aks | AKS clusters | Idempotent on name: | Use Terraform for big AKS provisioning; Ansible for ops |
| azure_rm_roleassignment | RBAC role assignments | Yes (idempotent on principal_id + role_definition_id + scope) | The single most-useful Azure module; automates RBAC at scale |
| azure_rm_userassignedidentity | User-assigned MIs | Yes | The identity plays “service account” role |
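The Key Vault secrets row mentions reading secrets with a lookup at play time. A minimal sketch of that pattern follows, assuming the collection’s azure_keyvault_secret lookup with its vault_url option and a control node whose identity can read the vault; the vault URL and secret name are hypothetical, and the lookup’s own authentication options are worth confirming with ansible-doc -t lookup.

```yaml
# Sketch: read a secret from Key Vault at play time instead of storing it.
- name: Read a Key Vault secret at play time
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    vault_url: "https://prod-eu-kv.vault.azure.net"  # hypothetical vault
  tasks:
    - name: Fetch the secret value without writing it to disk
      ansible.builtin.set_fact:
        db_password: "{{ lookup('azure.azcollection.azure_keyvault_secret', 'db-password', vault_url=vault_url) }}"
      no_log: true  # keep the value out of task output and logs
```

The module in the table writes secrets into the vault; the lookup is the read path.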
A canonical play
- name: Provision a web tier in eu-prod
  hosts: localhost
  gather_facts: false
  connection: local
  vars:
    subscription_id: "11111111-1111-1111-1111-111111111111"
    rg: prod-eu-app
    location: eastus
  tasks:
    - name: Resource group
      azure.azcollection.azure_rm_resourcegroup:
        name: "{{ rg }}"
        location: "{{ location }}"
        subscription_id: "{{ subscription_id }}"
        auth_source: msi
        tags:
          environment: prod
          owner: platform
        state: present
    - name: VNet
      azure.azcollection.azure_rm_virtualnetwork:
        resource_group: "{{ rg }}"
        name: prod-eu-vnet
        address_prefixes_cidr: ["10.42.0.0/16"]
        subscription_id: "{{ subscription_id }}"
        auth_source: msi
        tags: { environment: prod }
        state: present
    - name: Web subnet
      azure.azcollection.azure_rm_subnet:
        resource_group: "{{ rg }}"
        virtual_network_name: prod-eu-vnet
        name: web
        address_prefix_cidr: "10.42.1.0/24"
        subscription_id: "{{ subscription_id }}"
        auth_source: msi
        state: present
    - name: NSG with HTTPS open
      azure.azcollection.azure_rm_networksecuritygroup:
        resource_group: "{{ rg }}"
        name: web-nsg
        rules:
          - name: allow-https
            priority: 1010
            direction: Inbound
            access: Allow
            protocol: Tcp
            destination_port_range: "443"
            source_address_prefix: "*"
        subscription_id: "{{ subscription_id }}"
        auth_source: msi
        state: present
    - name: Web VM
      azure.azcollection.azure_rm_virtualmachine:
        resource_group: "{{ rg }}"
        name: prod-web-eu-1
        vm_size: Standard_B2s
        admin_username: azureuser
        ssh_public_keys:
          - { path: /home/azureuser/.ssh/authorized_keys, key_data: "{{ ssh_pubkey }}" }
        image:
          publisher: Canonical
          offer: 0001-com-ubuntu-server-jammy
          sku: 22_04-lts-gen2
          version: latest
        managed_disk_type: Premium_LRS
        os_disk_size_gb: 64
        virtual_network_name: prod-eu-vnet
        subnet_name: web
        public_ip_allocation_method: Disabled
        tags: { environment: prod, role: web }
        subscription_id: "{{ subscription_id }}"
        auth_source: msi
        state: present
      register: vm
The repetition of subscription_id: and auth_source: per task is annoying — set them once via play-level module_defaults:
- name: Provision a web tier
  hosts: localhost
  gather_facts: false
  connection: local
  module_defaults:
    group/azure.azcollection.azure:
      subscription_id: "11111111-1111-1111-1111-111111111111"
      auth_source: msi
  tasks:
    - azure.azcollection.azure_rm_resourcegroup:
        name: prod-eu-app
        location: eastus
        state: present
    # ... no more per-task subscription_id/auth_source ...
module_defaults with the group/azure.azcollection.azure group applies to every module in the collection.
Multi-subscription patterns
Three options:
Option A — subscription_id: per task
Brutally explicit; each task carries its target subscription. Works for small playbooks that genuinely cross subscriptions per resource (rare).
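A sketch of Option A, assuming a small list of subscriptions held in a play variable: the subscriptions list, its IDs, and the platform-baseline resource group name are all hypothetical placeholders.

```yaml
# Sketch: one task, looped, with subscription_id set per iteration.
- name: Ensure a baseline resource group in every subscription
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    subscriptions:  # hypothetical list; replace the IDs with your own
      - { name: eu-prod-001, id: "11111111-1111-1111-1111-111111111111" }
      - { name: stg-eu-001, id: "22222222-2222-2222-2222-222222222222" }
  tasks:
    - name: Baseline RG, one set of API calls per subscription
      azure.azcollection.azure_rm_resourcegroup:
        name: platform-baseline
        location: eastus
        subscription_id: "{{ item.id }}"
        auth_source: msi
        state: present
      loop: "{{ subscriptions }}"
      loop_control:
        label: "{{ item.name }}"  # log the friendly name, not the GUID
```

The identity behind auth_source needs a role in every listed subscription, which is one reason this shape stays rare.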
Option B — one play per subscription
The clean shape. Each play sets subscription_id once via module_defaults. Plays are imported in order:
- import_playbook: plays/sub-prod.yml
- import_playbook: plays/sub-stg.yml
- import_playbook: plays/sub-sandbox.yml
Option C — one inventory source per subscription
If your work is mostly targeting VMs (rather than provisioning resources), one azure_rm inventory file per subscription with subscription_id: set there:
# inventory/sub-prod.azure_rm.yml
plugin: azure.azcollection.azure_rm
auth_source: msi
include_vm_resource_groups: ["prod-eu-app", "prod-eu-data"]
# each entry is a Jinja expression; a host is excluded when it is true
default_host_filters:
  - powerstate != 'running'
plain_host_names: true
keyed_groups:
  - prefix: tag
    key: tags
  - prefix: location
    key: location
  - prefix: vm_size
    key: virtual_machine_size
# inventory/sub-stg.azure_rm.yml
plugin: azure.azcollection.azure_rm
auth_source: msi
include_vm_resource_groups: ["stg-eu-app"]
plain_host_names: true
keyed_groups:
  - prefix: tag
    key: tags
Then inventory: points at the directory; merging is automatic. Note: the subscription_id for the inventory plugin comes from AZURE_SUBSCRIPTION_ID env per source, OR you point at one tenant-wide SP/MI that has access to multiple subscriptions and the plugin lists VMs across them all.
azure_rm inventory plugin — Azure-specific knobs
| Knob | Default | Purpose |
|---|---|---|
| auth_source | auto | Same modes as modules; pick msi in production |
| plain_host_names | false | If true, host name is the VM name; if false, the plugin appends a short suffix to keep names globally unique |
| include_vm_resource_groups | (all) | Pre-filter by RG list; biggest performance lever |
| batch_fetch | true | Fewer API calls; turn off only if the batch API causes problems, because the serial fallback is much slower |
| default_host_filters | powerstate != 'running', provisioning_state != 'succeeded' | Jinja expressions; a host is dropped when any expression is true (powered-off VMs etc.) |
| hostvar_expressions | (none) | Like compose:, Jinja-derived host vars |
| conditional_groups | (none) | Like groups:, named groups via Jinja |
| keyed_groups | (none) | Same as universal lever: prefix: + key: per dimension |
| fail_on_template_errors | true | Catch typos early |
| cache / cache_plugin / cache_timeout | (none) | Cache the inventory query |
A production-grade Azure inventory file:
plugin: azure.azcollection.azure_rm
auth_source: msi
plain_host_names: true
include_vm_resource_groups:
  - prod-eu-app
  - prod-eu-data
  - prod-us-app
# each entry is a Jinja expression; a host is excluded when it is true
default_host_filters:
  - powerstate != 'running'
hostvar_expressions:
  ansible_host: private_ipv4_addresses[0]
  env: tags.environment | default('unknown')
  role: tags.role | default('unknown')
keyed_groups:
  - prefix: tag
    key: tags
  - prefix: location
    key: location
  - prefix: vm_size
    key: virtual_machine_size
  - prefix: rg
    key: resource_group
conditional_groups:
  prod_eu: tags.environment == 'prod' and location.startswith('east')
  needs_patch: tags.patched is not defined or tags.patched != 'true'
cache: true
cache_plugin: jsonfile
cache_connection: /var/cache/ansible_inventory
cache_timeout: 600
Tagging strategy
Same shape as AWS — required tags first, governed via Azure Policy (Require a tag and its value policy):
| Tag | Required | Purpose |
|---|---|---|
| environment | Yes | prod/stg/dev |
| role | Yes | web/db/worker |
| owner | Yes | Team email or Slack channel |
| costcenter | Yes | Finance attribution |
| project | Recommended | Cross-cutting |
| patchgroup | Recommended | Drives Update Manager |
Azure Policy at the management-group level rejects RG creation that doesn’t have the required tags. Ansible inherits the safety.
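Once the environment and role tags are reliable, a play can target their intersection directly. The sketch below assumes an azure_rm inventory with a keyed_groups entry of prefix tag and key tags, which yields group names such as tag_environment_prod and tag_role_web.

```yaml
# Sketch: target hosts by tag-derived groups instead of hand-kept lists.
# Run with: ansible-playbook -i inventory/ patch-web.yml
- name: Reach only production web servers
  hosts: tag_environment_prod:&tag_role_web  # intersection of both groups
  gather_facts: false
  tasks:
    - name: Confirm connectivity before doing anything else
      ansible.builtin.ping:
```

An untagged VM never lands in either group, which is why enforcing the schema with policy matters more than documenting it.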
Idempotency & check-mode for awkward modules
| Module | Idempotency mechanism | Sharp edge |
|---|---|---|
| azure_rm_virtualmachine | name: + resource_group: | Changing vm_size: rebuilds; image: is read-only after creation |
| azure_rm_storageaccount | Globally unique name: | Naming collisions across the whole of Azure |
| azure_rm_aks | Idempotent on name: | Most prop changes require a new node pool, not a cluster mutation |
| azure_rm_roleassignment | Tuple of principal_id + role_definition_id + scope | The principal must exist before the assignment |
| azure_rm_keyvault | name: (globally unique) | RBAC vs access-policy modes are mutually exclusive; pick enable_rbac_authorization: true |
Always test with --check --diff first.
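The role-assignment sharp edge (the principal must exist first) combines naturally with a data-driven loop. This is a sketch under stated assumptions: the parameter names assignee_object_id, role_definition_id, and scope are my reading of the module’s options and should be confirmed with ansible-doc; the assignments table and the object ID are hypothetical; the GUID shown is the built-in Reader role.

```yaml
# Sketch: apply a "who gets which role at which scope" table, retrying
# while a freshly created identity is still propagating.
- name: Apply a role-assignment table
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    subscription_id: "11111111-1111-1111-1111-111111111111"  # placeholder
    role_defs: "/subscriptions/{{ subscription_id }}/providers/Microsoft.Authorization/roleDefinitions"
    assignments:  # hypothetical table
      - principal: "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"  # placeholder object ID
        role_guid: "acdd72a7-3385-48ef-bd42-f606fba81ae7"  # built-in Reader
        scope: "/subscriptions/{{ subscription_id }}/resourceGroups/prod-eu-app"
  tasks:
    - name: Assign each role
      azure.azcollection.azure_rm_roleassignment:
        assignee_object_id: "{{ item.principal }}"
        role_definition_id: "{{ role_defs }}/{{ item.role_guid }}"
        scope: "{{ item.scope }}"
        subscription_id: "{{ subscription_id }}"
        auth_source: msi
        state: present
      loop: "{{ assignments }}"
      register: assignment
      until: assignment is succeeded
      retries: 6
      delay: 10
```

Keeping the table in version control gives a reviewable record of every grant, and a second run should report no changes.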
Hands-on free lab — Azurite + free subscription
The lab uses two pieces:
- Azurite (free, local) for storage-account / blob testing
- Azure free-tier subscription (12 months free) for the resource-graph tasks Azurite can’t fake
# Azurite for storage testing
docker run -d --name azurite -p 10000:10000 -p 10001:10001 -p 10002:10002 \
mcr.microsoft.com/azure-storage/azurite
# Real Azure for everything else
az login
SUB_ID=$(az account show --query id -o tsv)
az ad sp create-for-rbac --name ansible-lab --role Contributor --scopes /subscriptions/$SUB_ID
# capture the output:
export AZURE_CLIENT_ID=<appId>
export AZURE_SECRET=<password>
export AZURE_TENANT=<tenant>
export AZURE_SUBSCRIPTION_ID=$SUB_ID
ansible-galaxy collection install azure.azcollection
pip install -r ~/.ansible/collections/ansible_collections/azure/azcollection/requirements.txt
Now create a tiny play:
# play.yml
- hosts: localhost
  gather_facts: false
  connection: local
  module_defaults:
    group/azure.azcollection.azure:
      auth_source: env
      subscription_id: "{{ lookup('env', 'AZURE_SUBSCRIPTION_ID') }}"
  tasks:
    - name: RG
      azure.azcollection.azure_rm_resourcegroup:
        name: ansible-lab
        location: eastus
        tags: { environment: lab, owner: me }
        state: present
    - name: VNet
      azure.azcollection.azure_rm_virtualnetwork:
        resource_group: ansible-lab
        name: lab-vnet
        address_prefixes_cidr: ["10.42.0.0/16"]
        state: present
    - name: Subnet
      azure.azcollection.azure_rm_subnet:
        resource_group: ansible-lab
        virtual_network_name: lab-vnet
        name: web
        address_prefix_cidr: "10.42.1.0/24"
        state: present
    - name: Storage account
      azure.azcollection.azure_rm_storageaccount:
        resource_group: ansible-lab
        # seeded so the name is stable across runs; an unseeded random
        # would create a new account every time and break changed=0
        name: "labsa{{ 99999 | random(seed=lookup('env', 'AZURE_SUBSCRIPTION_ID')) }}"
        account_type: Standard_LRS
        kind: StorageV2
        access_tier: Hot
        state: present
ansible-playbook play.yml --diff
ansible-playbook play.yml --diff # second run — changed=0
Inventory test:
# inv.azure_rm.yml
plugin: azure.azcollection.azure_rm
auth_source: env
plain_host_names: true
include_vm_resource_groups: ["ansible-lab"]
keyed_groups:
  - prefix: tag
    key: tags
ansible-inventory -i inv.azure_rm.yml --graph
Tear down:
ansible localhost -m azure.azcollection.azure_rm_resourcegroup -a "name=ansible-lab state=absent force_delete_nonempty=true auth_source=env subscription_id=$AZURE_SUBSCRIPTION_ID"
docker rm -f azurite
Common mistakes & troubleshooting
AzureHttpError: 401. Wrong tenant or wrong subscription. Run az account show to confirm what cli mode would resolve to; or verify AZURE_TENANT/AZURE_SUBSCRIPTION_ID env values.
Inventory returns 0 hosts. Either: (a) enable_plugins doesn’t list azure.azcollection.azure_rm; (b) the file isn’t named *.azure_rm.yml; (c) auth_source: msi but the host doesn’t have a managed identity; (d) the SP/MI lacks Reader on the resource groups.
azure_rm_virtualmachine rebuilds the VM unexpectedly. You changed vm_size: or image: — those are not in-place mutations. Use azure_rm_virtualmachine_resize for size changes; image: is essentially write-once.
azure_rm_keyvault says permission denied trying to read a secret. In RBAC mode (enable_rbac_authorization: true) the vault ignores access policies. The principal needs the Key Vault Secrets User role to read secret values (Key Vault Reader only exposes metadata, not secret contents); access-policy mode is the legacy path.
azure_rm_roleassignment fails with “principal not found.” Newly-created managed identities take a few seconds to propagate in Azure AD. Add a short pause before the assignment, or use until with retries on the role-assignment task.
shell: az group create … everywhere. The Azure equivalent of the AWS-CLI anti-pattern. Replace with azure_rm_resourcegroup (and friends) for idempotency, check-mode, and --diff.
Slow inventory. No cache:, no include_vm_resource_groups. Add both.
Best practices
- auth_source: msi when the control node is in Azure. Always.
- module_defaults with group/azure.azcollection.azure to avoid repeating subscription_id and auth_source per task.
- One play per subscription: clean and legible.
- Tag schema enforced via Azure Policy. Make untagged resources illegal.
- azure_rm_keyvault in RBAC mode (enable_rbac_authorization: true). Access-policy mode is legacy.
- Cache the inventory. cache: true with a 5-10 minute timeout.
- Pin collection versions. azure.azcollection 2.x.
- Build an Azure EE with azure.azcollection, the SDK requirements (azure-mgmt-*, msrestazure, azure-identity), and the az CLI for the rare modules that shell out to it.
- Use Workload Identity for AKS-hosted Container Groups: federated tokens, no SP secret.
- Mesh execution nodes inside the VNet. Azure regional API endpoints are public, but private-endpoint setups force in-VNet execution.
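An Execution Environment definition for the EE item above might start from the sketch below. It assumes ansible-builder’s version 3 schema; the base image name is a hypothetical placeholder for an image that already has Python and pip. ansible-builder normally picks up the Python requirements a collection declares, so the SDK packages are not listed by hand here; check the build output to confirm that for your collection version. The az CLI is not included in this sketch.

```yaml
# execution-environment.yml
# Sketch: minimal Azure-aware EE definition (ansible-builder schema v3).
version: 3
images:
  base_image:
    name: registry.example.com/ee-base:latest  # hypothetical base image
dependencies:
  ansible_core:
    package_pip: ansible-core>=2.17
  ansible_runner:
    package_pip: ansible-runner
  galaxy:
    collections:
      - name: azure.azcollection
        version: ">=2.0.0,<3.0.0"  # pinned to the 2.x line
```

Building with ansible-builder build -t azure-ee:1.0 -f execution-environment.yml produces an image you can push to Private Automation Hub (the tag is illustrative).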
Security notes
- Service Principal secrets are crown jewels. Rotate every 90 days, store in Ansible Vault or AAP credentials. Never commit them.
- Prefer Workload Identity / Managed Identity to long-lived SP secrets. Federated tokens are short-lived; the audit story is cleaner.
- Tag-based RBAC via custom roles + ABAC conditions: write a role that allows Microsoft.Compute/virtualMachines/start/action only when the VM has tag owner == ${userTag/team}. Ansible then can’t accidentally start the wrong team’s VMs.
- Activity Log every play. Set principal_id_session_name-equivalents (Azure logs the SP/MI identity automatically); aggregate to Log Analytics.
- RBAC mode for Key Vault is harder to misconfigure than access policies: fewer ways to accidentally grant * permissions.
- Air-gap-friendly EE. Push to Private Automation Hub, pin AAP by digest, mirror SDK packages internally.
- Storage account public access blocked by default. allow_blob_public_access: false on every azure_rm_storageaccount.
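The storage-account item can be sketched as a task. The account name and subscription ID are hypothetical placeholders; https_only and minimum_tls_version are additional hardening options that I believe the module exposes, so confirm them with ansible-doc for your collection version.

```yaml
# Sketch: storage account with anonymous blob access blocked.
- name: Hardened storage account
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Storage account with public blob access disabled
      azure.azcollection.azure_rm_storageaccount:
        resource_group: prod-eu-data
        name: prodeudata001  # hypothetical; must be globally unique
        account_type: Standard_LRS
        kind: StorageV2
        allow_blob_public_access: false
        https_only: true              # assumption: reject plain-HTTP requests
        minimum_tls_version: TLS1_2   # assumption: refuse older TLS versions
        subscription_id: "11111111-1111-1111-1111-111111111111"  # placeholder
        auth_source: msi
        state: present
```

Pairing the task with an Azure Policy that denies public blob access covers accounts created outside Ansible.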
Interview & exam Q&A
Q1. Why prefer Managed Identity over Service Principal?
Managed Identity issues short-lived tokens automatically, with no secrets on disk and no rotation burden. Service Principal secrets must be stored, rotated, and protected. Use SP only when the control node lives outside Azure.
Q2. Difference between system-assigned and user-assigned managed identity?
System-assigned MIs are tied to a single resource (one VM = one identity); they live and die with the resource. User-assigned MIs are standalone identities you attach to multiple resources, useful when an identity outlives any one VM (think “AAP execution-node MI” attached to all execution nodes).
Q3. What does auth_source: msi actually do?
It tells the SDK to call the Instance Metadata Service (http://169.254.169.254/metadata/identity/oauth2/token) to obtain an OAuth token bound to the resource’s identity. No client_id/secret/tenant — pure platform-issued.
Q4. How do you run a single playbook across 5 Azure subscriptions?
Either: (a) one play per subscription, with module_defaults setting subscription_id; or (b) per-task subscription_id:. Avoid implicit “default subscription” — be explicit.
Q5. What’s the difference between azure_rm_keyvault RBAC mode and access-policy mode?
RBAC mode (recommended) uses Azure RBAC role assignments at the vault scope (Key Vault Secrets User, Key Vault Crypto Officer, …). Access-policy mode is the legacy per-vault permission list. RBAC mode is auditable, principle-of-least-privilege-friendly, and what new vaults should default to.
Q6. How does the azure_rm inventory handle multi-subscription?
Either one inventory file per subscription with AZURE_SUBSCRIPTION_ID set per source, or one tenant-wide SP/MI with cross-subscription read access and the plugin enumerating VMs across all of them. The first is more explicit and cacheable.
Q7. What’s the most performance-impactful knob on the azure_rm plugin?
include_vm_resource_groups. Listing VMs in a 20-RG subscription is fast; listing them across the whole subscription is slow. Pre-filter aggressively.
Q8. How do you automate Azure RBAC at scale?
azure_rm_roleassignment per (principal_id, role_definition_id, scope) triple. Build a YAML inventory of “who gets what role at what scope” and loop the module over it.
Q9. Why does azure_rm_virtualmachine sometimes “rebuild” my VM?
Because you changed a property that isn’t in-place mutable — vm_size:, image:, os_disk_* are common ones. Use the dedicated azure_rm_virtualmachine_resize for size changes; image and OS disk are essentially write-once.
Q10. What’s the right place for Storage Account secrets / connection strings in plays?
Don’t have them. The azure.azcollection.azure_keyvault_secret lookup reads secrets from Key Vault at play time (azure_rm_keyvaultsecret is the module that writes them); never write them in clear in role defaults. Or use Managed Identity on the running app, with no connection string at all.
Q11. How does Workload Identity (AKS) compare to Managed Identity (VM)?
Workload Identity uses Kubernetes ServiceAccount tokens federated against Azure AD. Pods using a configured SA get short-lived tokens automatically: same UX as MI, container-native. AAP Container Groups in AKS should use Workload Identity.
Q12. Why is module_defaults with group/azure.azcollection.azure a big deal?
It removes 80% of the boilerplate. subscription_id, auth_source, tenant, client_id set once at play level; every azure_rm_* module inherits. Plays are dramatically more readable.
Q13. How do you bootstrap from a laptop without a service principal?
az login once; set auth_source: cli. Ansible reads the cached credentials in ~/.azure/. Useful for local development and for the creation of the automation Service Principal itself.
Q14. What goes into a production Azure EE?
ansible-builder with azure.azcollection, the Python SDK (azure-identity, azure-mgmt-resource, azure-mgmt-compute, azure-mgmt-network, azure-mgmt-storage, msrestazure), the az CLI (for the rare modules that shell out), and your shared utility collections. Push to Private Automation Hub.
Quick check
- What auth_source should an AAP Controller running on an Azure VM use?
- How do you avoid repeating subscription_id: and auth_source: on every Azure module call?
- Which inventory plugin knob is the biggest performance lever?
- What’s the recommended access mode for new Azure Key Vaults?
- Which Azure module is the right tool for “give principal X the role Y at scope Z”?
(Answers: msi; module_defaults with group/azure.azcollection.azure; include_vm_resource_groups; RBAC mode (enable_rbac_authorization: true); azure.azcollection.azure_rm_roleassignment.)
Exercise
With your free-tier subscription:
- Create a Service Principal with Contributor on a single RG. Set up ~/.azure/credentials and confirm auth_source: credential_file works.
- Write a play that creates an RG, VNet, subnet, NSG, and storage account, all idempotent. Use module_defaults to avoid repetition.
- Add azure_rm_virtualmachine to launch one Ubuntu VM. After creation, register the public IP and use add_host + wait_for_connection.
- Build an azure_rm inventory pointing at the RG. Verify --graph shows tag and location groups.
- (Stretch) Add an azure_rm_roleassignment task that grants a Managed Identity Reader on the RG.
- Run with --check --diff. Then for real. Then again: changed=0.
Certification mapping
| Cert | Coverage |
|---|---|
| EX374 — Red Hat Certified Specialist in Ansible Automation | Direct: cloud collections, dynamic inventory, EE. |
| AZ-104 / AZ-305 | Indirect: subscription/RG/role-assignment mental model. |
| AZ-400 (DevOps Engineer Expert) | Direct: deployment automation, RBAC at scale. |
Glossary
- Subscription: Azure billing/API boundary; every API call is scoped to one.
- Tenant: Azure AD directory; one per organisation, many subscriptions.
- Resource group: logical resource container; the unit of deletion.
- Service Principal (SP): automation identity in Azure AD with secret/cert.
- Managed Identity (MI): platform-issued identity bound to an Azure resource.
- Workload Identity: MI for AKS pods via OIDC federation.
- auth_source: Ansible’s auth-mode selector for azure.azcollection.
- module_defaults: play-level defaults applied to every module in a group/FQCN.
- azure_rm plugin: dynamic inventory plugin in azure.azcollection.
- RBAC mode (Key Vault): Azure RBAC at vault scope; recommended over access policies.
Next steps
You can now drive Azure from Ansible. Continue with Ansible for GCP to round out the cloud trio, Ansible for Windows because many Azure VMs run Windows, Ansible for Kubernetes for AKS-native ops, and Hybrid Multi-Cloud Orchestration to compose Azure + AWS + GCP + on-prem in one workflow.