Ansible Lesson 5 of 42

Ansible Playbooks, In Depth: Plays, Tasks, Modules, Become & Your First Playbook

Ad-hoc commands are wonderful for a one-off ansible.builtin.ping or restarting a service across fifty boxes, but they are typed, ephemeral, and unreviewable. The moment you want to describe the state of a system — “this package is installed, this file has these contents, this service is enabled and running” — and have that description live in Git, be code-reviewed, run in CI, and re-applied a thousand times without drift, you have left ad-hoc territory and entered the world of playbooks. A playbook is the unit of automation in Ansible. Everything serious you will ever do — provisioning, configuration, deployment, orchestration — is a playbook.

This lesson is a deep, every-keyword tour of the playbook. We will take apart a play keyword by keyword (the table alone is worth bookmarking), then a task, then walk the exact order in which Ansible executes them across your hosts. We will spend a long time on become — privilege escalation — because it is simultaneously the feature people use every single day and the one they understand least, and it is a guaranteed interview and exam topic. Finally we will go through every flag of the ansible-playbook command, because the difference between a junior and a senior at the keyboard is usually --check --diff, --limit, --tags, and --start-at-task. By the end you will write, syntax-check, dry-run, and execute a real first playbook against localhost and a couple of containers — for ₹0.

Everything here targets ansible-core 2.17+ / Ansible 10+ (the 2026 baseline) and uses FQCN (fully-qualified collection names like ansible.builtin.copy) throughout, which is the modern, unambiguous, and exam-correct way to name modules.

Learning objectives

By the end of this lesson you will be able to:

Prerequisites & where this fits

You should have completed Ansible Ad-Hoc Commands & Modules (ansible-ad-hoc-commands-modules-ansible-doc) — you need to be comfortable running ansible all -m ping, you need a working control node with ansible-core installed, an inventory (even a one-line one), and SSH access to at least one managed node (localhost counts). You should know what a module is (an idempotent unit of work that runs on the target and returns JSON) and what FQCN means. This lesson sits in the Foundation tier of the Ansible Zero-to-Hero course, module Playbooks. The very next lesson, Ansible Core Modules for Real Work (ansible-core-modules-package-service-copy-file-user), fills the playbooks you write here with the modules that do the actual work; this lesson is the grammar, that one is the vocabulary.

Core concepts: the playbook hierarchy

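A playbook is a YAML file with a specific, layered structure. Get the four-level mental model straight (a playbook is a list of plays, a play binds hosts to an ordered list of tasks, and each task calls exactly one module) and the rest is detail.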

YAML basics you must respect: indentation is two spaces, never tabs; a list item starts with - ; a mapping is key: value; strings with special characters (:, {, }, [) should be quoted. The document may begin with ---. A stray tab or a misaligned dash is the most common reason a beginner’s playbook will not run, so let your editor show whitespace.
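The quoting and indentation rules above are easier to see in a tiny file. The following is a minimal sketch, not part of the lesson's own material; the play name and variable names are made up for illustration, and it targets the implicit localhost so it needs no inventory.

```yaml
---
# A playbook is a list; "- " starts one list item and the indented keys belong to it.
- name: "Demo: quoting and indentation"      # quoted because the value contains ": "
  hosts: localhost
  gather_facts: false
  vars:
    greeting: "{{ inventory_hostname }} says hi"   # starts with {{ so the whole value is quoted
    ports: [80, 443]                               # inline (flow) list
    motd: "Maintenance: 22:00 to 23:00"            # ": " inside a value needs quotes
  tasks:
    - name: Show the values
      ansible.builtin.debug:
        msg: "{{ greeting }} / {{ ports | join(',') }} / {{ motd }}"
```

Removing the quotes around any of the three quoted values is a quick way to see what a YAML parse error looks like under --syntax-check.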

Here is the smallest complete playbook, annotated. Keep it in your head as the skeleton everything below decorates:

---
- name: Configure the web tier            # the PLAY (one item in the playbook list)
  hosts: web                              # which inventory hosts/groups this play targets
  become: true                            # escalate privilege for this whole play (→ root)
  gather_facts: true                      # collect system facts before tasks (default true)
  vars:                                   # play-scoped variables
    http_port: 80
  tasks:                                  # the ordered list of TASKS
    - name: Install nginx                 # task name (shown in output — always set one)
      ansible.builtin.package:            # the MODULE (FQCN)
        name: nginx                       #   module ARGS
        state: present
    - name: Ensure nginx is running
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true

Plays vs roles vs tasks (when you reach for each)

A common early confusion: tasks, roles, and plays all “contain things.” The distinction:

| Concept | What it is | Reusable across playbooks? | When you use it |
|---|---|---|---|
| Task | One module call | No (it lives inline) | The atomic action: install, copy, restart |
| Block | A group of tasks sharing keywords/error-handling | No (inline) | Apply when/become/rescue to several tasks at once |
| Play | hosts ↦ tasks/roles binding | No (it is the orchestration) | “On these hosts, run this work as this user” |
| Role | A packaged, parameterised bundle of tasks/handlers/templates/defaults | Yes (the unit of reuse) | Anything you will reuse: an nginx role, a users role |

Roles get their own lesson (ansible-roles-structure-dependencies-galaxy-collections). For now: a play can run tasks and roles; tasks are the literal grammar of work.
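To make the distinction concrete, here is a rough sketch of one play that uses all four concepts. The role name nginx is hypothetical (it assumes a roles/nginx directory next to the playbook), and the package chosen is arbitrary.

```yaml
---
- name: Web tier using a role plus inline tasks      # the PLAY
  hosts: web
  become: true
  pre_tasks:
    - name: Announce the run                          # a TASK that runs before roles
      ansible.builtin.debug:
        msg: "Starting on {{ inventory_hostname }}"
  roles:
    - role: nginx                                     # hypothetical ROLE in ./roles/nginx
  tasks:
    - name: Group tasks that share one condition      # a BLOCK
      when: ansible_facts['os_family'] == "RedHat"
      block:
        - name: Install a helper package
          ansible.builtin.package:
            name: tar
            state: present
        - name: Confirm the block ran
          ansible.builtin.debug:
            msg: "Block ran on a RedHat-family host"
```

The when on the block is evaluated for each task inside it, which is what "sharing keywords" means in practice.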

The play, keyword by keyword

A play is a mapping of keywords (sometimes called “directives”). The full set is large; the table below covers the ones you will actually use, with what each does, its accepted values, its default, and the gotcha. The less common keywords (connection, remote_user, port, environment, module_defaults, collections, run_once, throttle, order) are grouped at the end of the table.

| Play keyword | What it does | Values / type | Default | Gotcha |
|---|---|---|---|---|
| name | A human label for the play (printed in PLAY […]) | string | unnamed | Not required, but always set one; output is unreadable otherwise |
| hosts | The inventory pattern this play runs against | pattern string or list (web, all, web:!db, web[0:2]) | required | If the pattern matches nothing, the play is skipped with a warning, not an error |
| become | Turn on privilege escalation for every task in the play | boolean (true/false) | false | Inherited by tasks; a task can override it. See the become section |
| become_user | The user to become | username string | root | Becoming a non-root user is a frequent source of “permission denied”; see gotchas |
| become_method | How to escalate | sudo/su/doas/pbrun/pfexec/runas/ksu/machinectl/dzdo | sudo | runas is for Windows; su needs become_user’s password, not yours |
| become_flags | Extra flags passed to the become program | string | empty | e.g. -i to get a login shell for sudo |
| gather_facts | Run the implicit setup task to collect system facts before tasks | boolean | true (configurable) | Set false to speed up plays that don’t use facts; then ansible_* vars are unavailable |
| gather_subset | Limit which facts are gathered | list (min, network, hardware, !all, …) | platform default | Use min for a big speed-up when you only need a little |
| vars | Variables scoped to this play | mapping | none | Lower precedence than -e extra-vars and task vars |
| vars_files | Files of variables to load into the play | list of paths | none | Loaded at play start; paths are relative to the playbook |
| vars_prompt | Prompt the operator for variables interactively | list of prompt specs | none | Breaks unattended/CI runs; avoid in automation |
| tasks | The ordered list of tasks to run | list of task mappings | none | The heart of the play |
| pre_tasks | Tasks that run before roles and before tasks | list | none | Handlers notified here flush before roles run |
| post_tasks | Tasks that run after tasks | list | none | Useful for smoke tests at the end |
| roles | Roles to apply (run after pre_tasks) | list of role names/dicts | none | Role tasks run between pre_tasks and tasks |
| handlers | Handlers (tasks triggered by notify) | list | none | Run once, at end of play, only if notified |
| serial | Rolling batch size: how many hosts to process at a time | int, percentage, or list ([1, 5, "30%"]) | all hosts at once | The classic rolling-deploy lever; a failed batch can abort the rest |
| strategy | Host scheduling strategy | linear / free / host_pinned / debug | linear | free lets fast hosts race ahead; linear keeps them in lock-step per task |
| max_fail_percentage | Abort the play if more than N% of hosts in a batch fail | number (0–100) | 100 (effectively off) | Pairs with serial for safe rollouts |
| any_errors_fatal | If any host fails a task, stop all hosts | boolean | false | “All or nothing”; good for tightly-coupled clusters |
| ignore_unreachable | Continue even if a host is unreachable | boolean | false | Unreachable ≠ failed; this controls the former |
| force_handlers | Run notified handlers even if the play later fails | boolean | false (config: force_handlers) | Without it, a failure mid-play means notified handlers never fire |
| check_mode | Force this play into check (dry-run) mode regardless of CLI | boolean | inherits CLI | check_mode: false forces a play to always really run |
| diff | Force diff output for this play | boolean | inherits CLI | Per-play override of --diff |
| tags | Tags applied to the whole play | list/string | none | --tags/--skip-tags select on these |
| connection | Connection plugin for this play | ssh / local / winrm / psrp / community.docker.docker … | ssh (config) | Use local for control-node-only plays |
| remote_user | The SSH login user (the user you connect as, before become) | username | ansible.cfg / current user | Distinct from become_user (who you become after connecting) |
| port | SSH port for this play | int | 22 | Per-host ansible_port usually wins |
| environment | Environment variables for tasks (e.g. proxies, PATH) | mapping | none | Applies to module execution on the target |
| module_defaults | Default args applied to a module/group across the play | mapping | none | DRY way to set, e.g., a region for all amazon.aws.* tasks |
| collections | Search path for unqualified module names | list | none | Prefer FQCN instead; this is legacy convenience |
| run_once | Run a task on only the first host, share the result with all | boolean (task-level usually) | false | Great for one-time actions (DB migration) in a multi-host play |
| throttle | Cap concurrent hosts for a specific task | int | 0 (no cap) | Finer than serial; per-task |
| order | Order hosts are processed within a play | inventory/sorted/reverse_*/shuffle | inventory | shuffle helps avoid always hammering the same host first |

That table is the play. Note three pairs people conflate: remote_user (who you log in as) vs become_user (who you become after); serial (batch size) vs strategy (scheduling within a batch); and any_errors_fatal (one host’s failure kills all) vs max_fail_percentage (a threshold). Interviewers love all three.
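The remote_user versus become_user pair is the easiest of the three to verify for yourself. The sketch below assumes a login account called deploy exists on the targets and is allowed to use sudo; that account name is hypothetical.

```yaml
---
- name: Connect as one user, act as another
  hosts: web
  remote_user: deploy          # hypothetical SSH login account
  become: true
  become_user: root            # the default, spelled out for clarity
  gather_facts: false
  tasks:
    - name: Who did we log in as?
      ansible.builtin.command: whoami
      become: false            # skip escalation for this one task
      register: login_user
      changed_when: false

    - name: Who are we after become?
      ansible.builtin.command: whoami
      register: effective_user
      changed_when: false

    - name: Report both identities
      ansible.builtin.debug:
        msg: "login={{ login_user.stdout }} effective={{ effective_user.stdout }}"
```

The first whoami reports the connecting user and the second reports the user you escalated to, which is the whole difference between the two keywords.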

The task, keyword by keyword

Inside tasks: each list item is a task: a mapping with exactly one module key plus task-level keywords. Here is the exhaustive table of the keywords you will use on tasks. (Many also apply to blocks and roles — Ansible calls them “task keywords” generally.)

| Task keyword | What it does | Values / type | Default | Gotcha |
|---|---|---|---|---|
| name | The label printed in TASK […] | string | unnamed | Always name tasks; --start-at-task matches on this |
| (the module) | The work itself, e.g. ansible.builtin.copy: | one module key with its args | required | Exactly one module per task |
| args | Alternative way to pass module args (as a sub-mapping) | mapping | n/a | Rarely needed; pass args under the module key directly |
| vars | Variables scoped to this single task | mapping | none | Overrides play and block vars, but still loses to set_fact/registered vars, include_vars, and -e extra-vars |
| when | Run the task only if the condition is true | expression / list of expressions (AND) | always run | It’s Jinja without {{ }}; a list = all must be true |
| loop | Repeat the task once per item | list (or {{ var }}) | no loop | The modern replacement for with_*; item is {{ item }} |
| loop_control | Tune the loop (loop_var, label, index_var, pause) | mapping | none | Use label: to keep loop output readable |
| with_<lookup> | Legacy looping (with_items, with_dict, …) | varies | no loop | Prefer loop; with_* maps to lookup plugins |
| notify | Trigger handler(s) if this task reports changed | handler name or list | none | Fires only on changed, and handlers run at end of play |
| register | Save the task’s result into a variable | variable name | none | Inspect .rc, .stdout, .stdout_lines, .changed, .results (loops) |
| changed_when | Override when the task counts as changed | expression / bool | module decides | changed_when: false for read-only commands (a must for command/shell) |
| failed_when | Override when the task counts as failed | expression / bool | rc≠0 / module decides | Express your own failure condition (e.g. grep output) |
| ignore_errors | Continue the play even if this task fails | boolean | false | Marks ...ignoring; the host is not removed from the play |
| tags | Tags for selecting/skipping this task | list/string | none | always always runs; never only when explicitly named |
| become / become_user / become_method / become_flags | Per-task privilege escalation (override the play) | as play-level | inherits play | Set become: true on just the one task that needs root |
| delegate_to | Run this task on a different host | hostname / localhost | the current host | The facts/vars are still the original host’s |
| run_once | Run on first host only, copy result to the rest | boolean | false | Combine with delegate_to: localhost for control-node one-offs |
| local_action | Shorthand for delegate_to: localhost | module + args | n/a | Legacy; delegate_to is clearer |
| no_log | Suppress this task’s input/output in logs | boolean | false | Always set on tasks handling passwords/secrets |
| environment | Env vars for this task only | mapping | inherits | Per-task proxy/PATH overrides |
| retries / delay / until | Retry the task until a condition holds | int / int / expression | no retry | Requires until:; without it retries is ignored |
| async / poll | Run the task asynchronously (fire-and-forget or poll) | int seconds / int seconds | sync | poll: 0 = fire-and-forget; check later with async_status |
| throttle | Max hosts running this task concurrently | int | 0 | Per-task concurrency cap |
| check_mode | Force this task into (or out of) check mode | boolean | inherits CLI | check_mode: false runs a read task for real even under --check |
| diff | Force/suppress diff for this task | boolean | inherits CLI | Pair with no_log to avoid leaking secrets in a diff |
| delegate_facts | Assign gathered/registered facts to the delegated host | boolean | false | Subtle; for advanced delegation |
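A handful of these keywords cover most day-to-day tasks. The sketch below combines loop, when, register, until/retries/delay and no_log in one play. The account name appuser and the variable appuser_password_hash are hypothetical and would have to be defined elsewhere (ideally in a vaulted file).

```yaml
---
- name: Task keyword sampler
  hosts: web
  become: true
  tasks:
    - name: Install several packages with a loop
      ansible.builtin.package:
        name: "{{ item }}"
        state: present
      loop:
        - tar
        - unzip
      when: ansible_facts['os_family'] == "RedHat"   # no {{ }} around a when expression

    - name: Poll a local URL until it returns 200
      ansible.builtin.uri:
        url: http://localhost/
        status_code: 200
      register: probe
      until: probe.status == 200    # retry condition
      retries: 5                    # give up after five attempts
      delay: 3                      # seconds between attempts

    - name: Create an account without logging the password hash
      ansible.builtin.user:
        name: appuser                               # hypothetical account
        password: "{{ appuser_password_hash }}"     # hypothetical variable
      no_log: true
```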

Two task keywords deserve a flag now because new users trip on them constantly:

The execution model: how a play actually runs

This is the concept that separates people who can debug Ansible from people who cannot. The default strategy is linear, and it works horizontally, task by task:

  1. Ansible reads the play and resolves hosts into a concrete list of target hosts (filtered further by --limit).
  2. Unless gather_facts: false, it runs the implicit setup task on every host to collect facts.
  3. It takes task 1 and runs it on every host in parallel (up to forks, default 5 — see ansible.cfg). It waits for all hosts to finish task 1.
  4. Only then does it move to task 2, again across all hosts. And so on, top to bottom.

So the order is outer loop = tasks, inner loop = hosts — not “finish host A entirely, then host B.” A host that fails a task is removed from the rest of the play (it does not attempt later tasks) unless ignore_errors: true, rescue:, or ignore_unreachable says otherwise. The remaining hosts carry on. This is why a single failing host doesn’t abort everyone (unless you ask for that with any_errors_fatal or max_fail_percentage).
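The "failed hosts are dropped unless you say otherwise" rule can be observed with a deliberately failing command. This is an illustrative sketch only; it assumes /bin/false exists on the target, which is true on typical Linux systems.

```yaml
---
- name: Failure handling demo
  hosts: web
  gather_facts: false
  tasks:
    - name: Try something risky, recover if it fails
      block:
        - name: Fail on purpose
          ansible.builtin.command: /bin/false
          changed_when: false
      rescue:
        - name: Runs only because a task in the block failed
          ansible.builtin.debug:
            msg: "Recovered on {{ inventory_hostname }}"
      always:
        - name: Runs whether or not the block failed
          ansible.builtin.debug:
            msg: "Cleanup step"

    - name: A failure we choose to tolerate
      ansible.builtin.command: /bin/false
      changed_when: false
      ignore_errors: true

    - name: Still runs, because the host was never dropped
      ansible.builtin.debug:
        msg: "Host is still in the play"
```

In the recap, the first failure is counted under rescued and the second under ignored; neither adds to failed.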

Two play keywords bend this model:

forks (in ansible.cfg or -f) is the parallelism knob: how many hosts Ansible talks to at once within a batch. The default of 5 is conservative; bump it for large fleets.

Diagram-worthy summary: tasks march down the play; for each task, all targeted hosts run it together; failures peel hosts off; serial slices the fleet into waves; strategy decides whether hosts move in lock-step or run free.
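As a rough sketch of how serial and strategy combine with the failure thresholds, the play below restarts nginx in waves: one host first, then half of the group at a time. The batch sizes and the 30-second timeout are arbitrary choices for illustration.

```yaml
---
- name: Rolling restart in waves
  hosts: web
  become: true
  serial:
    - 1                      # first wave: a single canary host
    - "50%"                  # later waves: half of the group at a time
  max_fail_percentage: 0     # any failure in a wave stops the remaining waves
  strategy: linear           # hosts in a wave advance task by task, in lock-step
  tasks:
    - name: Restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted

    - name: Confirm the port is back before the next wave starts
      ansible.builtin.wait_for:
        port: 80
        timeout: 30
```

Each wave runs the whole task list before the next wave begins, so the linear sweep described above happens once per batch.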

Figure: Ansible playbook anatomy and execution model. A playbook contains plays, a play binds hosts to an ordered list of tasks, and the linear strategy runs each task across all hosts before advancing.

The diagram shows the playbook → play → task → module hierarchy on the left, the horizontal (task-by-task, host-by-host) execution sweep in the middle, and the privilege-escalation flow (connect as remote_user, then become become_user via become_method) on the right.

Privilege escalation: become in full depth

Most useful work needs root: installing packages, writing to /etc, managing services. But you should never SSH in as root (it’s a security anti-pattern, and many distros disable it by default). The pattern everywhere is: log in as an ordinary user, then escalate with sudo (or su, doas, etc.). Ansible’s name for this is become, and it is a small system with a few moving parts that you must understand precisely — this is the single most-tested operational topic in RHCE EX294.

The three become keywords (plus flags)

| Keyword | Question it answers | Default | Examples |
|---|---|---|---|
| become | Do we escalate? | false | become: true |
| become_user | Who do we become? | root | become_user: postgres |
| become_method | How do we escalate? | sudo | become_method: su |
| become_flags | Extra flags for the become program | empty | become_flags: '-i' (login shell), '-s /bin/sh' |

Read them as one sentence: “become the become_user using become_method.” The default sentence is “become root using sudo.”

The become methods (every one)

become_method selects a become plugin. The complete set shipped with ansible-core / common collections:

| Method | Platform | What it runs | Password it needs | Notes |
|---|---|---|---|---|
| sudo | Linux/Unix | sudo | your sudo password (if any) | The default; configured via /etc/sudoers |
| su | Linux/Unix | su | the target user’s password | No sudoers needed; classic on older systems |
| doas | OpenBSD/Linux | doas | your doas password (if any) | The minimalist sudo alternative; /etc/doas.conf |
| pbrun | Unix | PowerBroker pbrun | per PowerBroker policy | Enterprise privilege management |
| pfexec | Solaris/illumos | pfexec | RBAC profile | Solaris role-based access |
| dzdo | Unix | Centrify dzdo | per Centrify policy | Centrify DirectAuthorize |
| ksu | Unix (Kerberos) | ksu | Kerberos | Kerberised su |
| runas | Windows | run as another user | the target user’s password | The Windows escalation method (with WinRM) |
| machinectl | systemd Linux | machinectl shell | polkit | For systemd-nspawn / user sessions |

The two you will use 99% of the time are sudo (default, policy in /etc/sudoers) and occasionally su (when there’s no sudoers entry but you know the password). Remember the password difference: sudo asks for your password; su asks for the destination user’s password. Mixing these up is the most common become failure.

Where to set become: play, block, and task scope

become (and its companions) can be set at three levels; the inner level wins:

- name: Mixed-privilege play
  hosts: web
  become: true                 # PLAY level: default to root for the whole play
  tasks:
    - name: Read a public file (no root needed)
      ansible.builtin.command: cat /etc/hostname
      become: false            # TASK override: drop privilege for this one task
      changed_when: false

    - name: Database maintenance as the postgres user
      block:                   # BLOCK level: applies to every task inside
        - name: Run a vacuum
          ansible.builtin.command: vacuumdb --all
          changed_when: false
      become: true
      become_user: postgres    # become a NON-root user for the whole block

    - name: Install a package (inherits play-level become → root via sudo)
      ansible.builtin.package:
        name: htop
        state: present

Precedence is simply task > block > play > ansible.cfg/CLI defaults. A widespread good practice is to leave become: false at the play level and switch it on only for the specific tasks/blocks that genuinely need it — least privilege, and it makes the intent obvious in review.

How the become password flows

When become_method requires a password (sudo configured with a password, or any su), you must supply it. There are four ways, in increasing order of how production-appropriate they are:

| Mechanism | How | Use when |
|---|---|---|
| --ask-become-pass (-K) | ansible-playbook site.yml -K prompts once for the become password | Interactive runs from your laptop |
| ansible_become_password var | Set as a host/group var (ansible_become_pass is the older alias) | Per-host, but must be Vault-encrypted |
| Ansible Vault | Put ansible_become_password in a vault-encrypted group_vars file | The right way to store it for unattended runs |
| Passwordless sudo | Configure NOPASSWD in /etc/sudoers for the ansible user | Common on cloud images; no password to manage at all |

Two related details: -K is the become password; do not confuse it with -k (lowercase), which is the SSH connection password (--ask-pass). And there are matching ansible.cfg toggles under [privilege_escalation] (become, become_method, become_user, become_ask_pass) so you can set fleet-wide defaults. Environment overrides exist too (ANSIBLE_BECOME, ANSIBLE_BECOME_METHOD, ANSIBLE_BECOME_USER, ANSIBLE_BECOME_ASK_PASS).
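One common way to arrange the Vault option is to keep a readable variables file that only points at a vaulted value. The sketch below assumes a group_vars/web/ directory and a variable named vault_become_password; both the layout and that variable name are illustrative conventions, not something Ansible requires.

```yaml
# group_vars/web/vars.yml  (plain text, safe to review in Git)
ansible_become_method: sudo
ansible_become_password: "{{ vault_become_password }}"   # hypothetical variable name

# group_vars/web/vault.yml would define vault_become_password and be
# encrypted with: ansible-vault encrypt group_vars/web/vault.yml
```

Keeping the indirection in a plain file means reviewers can see that a become password is set without ever seeing its value.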

The classic become gotchas

These cause real-world and exam failures, so internalise them:

  1. Becoming a non-root unprivileged user breaks file transfer. When you connect as one unprivileged user and become another, the module’s temp files must be readable by the user you become. Ansible tries setfacl first and then a few fallbacks (such as chown); if none of them works you’ll get "Failed to set permissions on the temporary files". The fix is to install acl on the target, enable pipelining (which avoids the temp files for most modules), or set allow_world_readable_tmpfiles (a security trade-off). Becoming root never hits this; becoming postgres often does.
  2. become: true without a password where sudo needs one fails with “Missing sudo password”. Supply -K or vault ansible_become_password, or grant NOPASSWD.
  3. su failing with the wrong password — you supplied your password but su wants the destination user’s. Switch to sudo, or supply the right password.
  4. requiretty in sudoers (old systems) blocks sudo whenever no TTY is allocated. Either remove Defaults requiretty for the ansible user, or leave pipelining off so that SSH allocates a TTY. Modern distros don’t set this.
  5. Forgetting that become doesn’t change remote_user. You still connect as remote_user/ansible_user; become happens after the connection. Sudoers must permit that login user to sudo.
  6. pipelining + sudo requiretty are incompatible; if you enable pipelining (a big speed-up) make sure requiretty is off.
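Gotcha 1 is the one most people meet first, so here is a hedged sketch of the usual remedy: make sure setfacl is available before any task becomes an unprivileged user. The db group name is hypothetical, and the play assumes PostgreSQL is already installed on those hosts.

```yaml
---
- name: Become an unprivileged user safely
  hosts: db                        # hypothetical inventory group
  become: true
  tasks:
    - name: Ensure setfacl is available so temp files can be shared
      ansible.builtin.package:
        name: acl
        state: present

    - name: Run a read-only query as postgres
      ansible.builtin.command: psql -Atc "select version()"
      become_user: postgres        # overrides the default of root for this task
      register: pg_version
      changed_when: false
```

The package task runs as root and never hits the temp-file problem; only the second task, which becomes postgres, depends on the ACL support it installs.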

Building your first real playbook

Let’s put the grammar to work. We’ll write a small web-server playbook that demonstrates: a named play, become, facts, vars, multiple tasks with different keywords, a handler, a tag, register, and changed_when. Save it as webserver.yml:

---
- name: Provision a simple web server
  hosts: web
  become: true                       # most tasks need root
  gather_facts: true
  vars:
    page_title: "Hello from Ansible"
  tasks:
    - name: Install nginx
      ansible.builtin.package:
        name: nginx
        state: present
      tags: [packages]

    - name: Deploy the index page
      ansible.builtin.copy:
        dest: /usr/share/nginx/html/index.html
        content: "<h1>{{ page_title }} on {{ ansible_facts['hostname'] }}</h1>\n"
        owner: root
        group: root
        mode: "0644"
      notify: Restart nginx          # only fires if this task reports 'changed'
      tags: [content]

    - name: Ensure nginx is enabled and running
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true

    - name: Check that nginx answers locally (read-only)
      ansible.builtin.command: "curl -fsS http://localhost/"
      register: curl_result
      changed_when: false            # a check never "changes" anything
      failed_when: page_title not in curl_result.stdout

  handlers:
    - name: Restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted

Notice how many keywords from the tables show up: play-level name/hosts/become/gather_facts/vars/tasks/handlers; task-level name/module/args/notify/tags/register/changed_when/failed_when; a Jinja reference to a fact (ansible_facts['hostname']) and to a var ({{ page_title }}). The handler runs once, at the end of the play, only if the copy task changed the file.

The ansible-playbook command, every flag

You run a playbook with ansible-playbook [options] playbook.yml. The flags are where day-to-day fluency lives. Here is the complete, grouped reference.

Selection & inventory

| Flag | Long form | What it does |
|---|---|---|
| -i | --inventory | Inventory source(s); repeatable. A trailing comma makes a literal host list: -i host1, |
| -l | --limit | Restrict to a subset/pattern of the play’s hosts: -l web02, -l 'web:!db', -l @retry.file |
| -e | --extra-vars | Set variables (highest precedence): -e key=val, -e @vars.yml, -e '{"k":"v"}' |
| | --list-hosts | Print the hosts each play would target, then exit (no execution) |
| | --list-tasks | Print the tasks that would run (respects tags), then exit |
| | --list-tags | Print all tags available in the playbook, then exit |

Dry-run, safety & diff

| Flag | Long form | What it does |
|---|---|---|
| -C | --check | Dry run: report what would change without changing it (module-dependent) |
| -D | --diff | Show line-by-line diffs of files/templates a task changes (pair with --check to preview) |
| | --syntax-check | Parse the playbook (and includes) for YAML/structure errors only; run nothing |
| | --step | Prompt (N)o/(y)es/(c)ontinue before each task, so you can step through interactively |
| | --start-at-task | Begin execution at the first task whose name matches the given string |
| | --flush-cache | Clear the fact cache before running |
| | --force-handlers | Run all notified handlers even if a later task in the play fails |

Tags

| Flag | Long form | What it does |
|---|---|---|
| -t | --tags | Run only tasks/blocks/roles with these tags (--tags content,packages) |
| | --skip-tags | Run everything except these tags |

Special tag values: always runs unless explicitly skipped; never runs only if its tag is named; tagged / untagged / all are meta-selectors.
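A three-task sketch shows how the special values interact with an ordinary tag. The tag names content and debug are arbitrary examples.

```yaml
---
- name: Special tags demo
  hosts: web
  gather_facts: false
  tasks:
    - name: Runs on every invocation, whatever --tags says
      ansible.builtin.debug:
        msg: "preflight"
      tags: [always]

    - name: Runs by default, and with --tags content
      ansible.builtin.debug:
        msg: "content"
      tags: [content]

    - name: Runs only when asked for with --tags debug (or --tags never)
      ansible.builtin.debug:
        msg: "verbose diagnostics"
      tags: [never, debug]
```

Running with no tag options executes the first two tasks; --tags debug executes the first and third; --skip-tags always drops the first.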

Privilege escalation & connection

| Flag | Long form | What it does |
|---|---|---|
| -b | --become | Turn become on by default; a become: keyword set in the playbook still takes precedence |
| | --become-user | The user to become |
| | --become-method | sudo / su / doas / runas / … |
| -K | --ask-become-pass | Prompt for the become (privilege-escalation) password |
| -k | --ask-pass | Prompt for the SSH connection password |
| -u | --user | The remote SSH login user (the remote_user) |
| -c | --connection | Connection plugin: ssh (default), local, winrm, docker … |
| | --private-key / --key-file | SSH private key file |
| -T | --timeout | SSH connection timeout (seconds) |
| | --ssh-common-args / --ssh-extra-args | Pass extra args to ssh/scp/sftp (e.g. a ProxyJump) |

Vault

| Flag | Long form | What it does |
|---|---|---|
| -J | --ask-vault-password | Prompt for the Vault password |
| | --vault-password-file | Read the Vault password from a file/script |
| | --vault-id | Specify a labelled vault: --vault-id prod@prompt |

Parallelism, output & verbosity

| Flag | Long form | What it does |
|---|---|---|
| -f | --forks | Number of hosts to act on in parallel (default 5) |
| -v … -vvvv | --verbose | Increase verbosity: -v results, -vv task input, -vvv connection, -vvvv connection debug |
| | --check + -D (combo) | The standard “show me exactly what this would do” pre-flight |

The five flags you will reach for daily, in order of importance: --syntax-check (does it even parse?), --check --diff / -C -D (what would it do?), --limit / -l (do it to just this host first), --tags / -t (run only the relevant slice), and --start-at-task (resume a long playbook after fixing a failure). Commit those to muscle memory.

Reading the play recap

Every run ends with a PLAY RECAP line per host. Each counter has a precise meaning — interviewers ask “what’s the difference between changed and ok?” and “failed vs unreachable?”:

| Counter | Meaning |
|---|---|
| ok | Tasks that completed successfully, whether or not they changed anything (so every changed task is also counted here), including a successful gather_facts |
| changed | Tasks that made a change (this is your drift indicator: a fully converged system shows changed=0 on a re-run) |
| unreachable | Ansible could not connect (SSH/auth/host down); a transport failure, not a task failure |
| failed | A task ran and failed (non-zero rc, module error, or failed_when) |
| skipped | Tasks skipped because their when condition was false; tasks filtered out by --tags are simply not run and are not counted |
| rescued | Tasks in a block that failed but were handled by a rescue: |
| ignored | Tasks that failed but had ignore_errors: true |

The single most useful thing to know: a correctly written, idempotent playbook re-run should report changed=0. If a re-run still shows changes, either the system genuinely drifted, or (more often) one of your tasks isn’t idempotent — usually a command/shell task missing changed_when: false.
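Here is a small sketch of the three usual ways to stop a command task from reporting changed on every run. The script /usr/local/bin/sync-config and its "updated" output are hypothetical stand-ins for whatever command you actually wrap.

```yaml
---
- name: Keep command tasks idempotent
  hosts: web
  become: true
  tasks:
    - name: Read-only query never counts as a change
      ansible.builtin.command: nginx -v
      register: nginx_version
      changed_when: false

    - name: One-time action guarded by creates
      ansible.builtin.command:
        cmd: touch /var/lib/first-run.done
        creates: /var/lib/first-run.done     # skipped once this file exists

    - name: Derive changed from the command's own output
      ansible.builtin.command: /usr/local/bin/sync-config   # hypothetical script
      register: sync_result
      changed_when: "'updated' in sync_result.stdout"       # assumes it prints 'updated' on change
```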

Hands-on lab: write, check, and run your first playbook (₹0)

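Everything here runs on your control node plus two throwaway containers, so there is no cloud spend. You need ansible-core and either Docker or Podman installed locally. The Docker connection plugin used in Step 2 lives in the community.docker collection: it ships with the full ansible package, and on a bare ansible-core install you can add it with ansible-galaxy collection install community.docker.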

Step 1 — start two target containers. We use systemd-capable images so service/systemd work like a real box:

# Two CentOS-Stream-9 containers with systemd as PID 1
for n in 1 2; do
  docker run -d --name web0$n --hostname web0$n \
    --tmpfs /run --tmpfs /tmp -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
    quay.io/centos/centos:stream9 /usr/sbin/init
done
docker ps --format '{{.Names}}\t{{.Status}}'

Step 2 — make an inventory that targets them via the Docker connection (no SSH needed). Create inventory.ini:

[web]
web01 ansible_connection=docker
web02 ansible_connection=docker

[web:vars]
ansible_python_interpreter=/usr/bin/python3

Confirm connectivity:

ansible -i inventory.ini web -m ansible.builtin.ping
# Expect: web01 | SUCCESS => {"ping": "pong"}  (and web02)

Step 3 — save the webserver.yml from the section above into the same directory.

Step 4 — syntax-check first (always):

ansible-playbook -i inventory.ini webserver.yml --syntax-check
# Expect: playbook: webserver.yml   (no errors)

Step 5 — see what it would do with a dry run + diff (note: under the Docker connection, become is unnecessary because the container’s default user is root, so these examples omit -K):

ansible-playbook -i inventory.ini webserver.yml --check --diff
# 'changed' items are previewed; the index.html diff is printed. Nothing is actually altered.

Step 6 — run it for real:

ansible-playbook -i inventory.ini webserver.yml

Expected recap (first run): every host shows several changed items and failed=0 unreachable=0, e.g.

PLAY RECAP *********************************************************************
web01  : ok=6  changed=4  unreachable=0  failed=0  skipped=0  rescued=0  ignored=0
web02  : ok=6  changed=4  unreachable=0  failed=0  skipped=0  rescued=0  ignored=0

Step 7 — prove idempotency. Run the exact same command again:

ansible-playbook -i inventory.ini webserver.yml
# This time: changed=0 on every host. The handler does NOT fire (nothing changed).

Seeing changed=0 on the second run is the whole point of Ansible. Let it sink in.

Step 8 — exercise the flags:

# Only the content task (by tag), only on web01:
ansible-playbook -i inventory.ini webserver.yml --tags content --limit web01

# List what would run, with current tags:
ansible-playbook -i inventory.ini webserver.yml --list-tasks
ansible-playbook -i inventory.ini webserver.yml --list-tags

# Resume from a named task:
ansible-playbook -i inventory.ini webserver.yml --start-at-task "Ensure nginx is enabled and running"

# Step through interactively:
ansible-playbook -i inventory.ini webserver.yml --step

Step 9 — confirm the result from inside a container:

docker exec web01 curl -s http://localhost/
# Expect: <h1>Hello from Ansible on web01</h1>

Validation checklist. You have succeeded if: --syntax-check passed; the first run showed changed>0, failed=0; the second run showed changed=0; --tags content ran only the copy task; and the curl inside the container returns your page.

Cleanup (remove every container — leave nothing behind):

docker rm -f web01 web02
docker ps -a --format '{{.Names}}' | grep -E 'web0[12]' || echo "cleaned up"

Cost note: ₹0. Local containers only — no cloud resources are created at any point. The only cost is the disk space of the CentOS image, reclaimed on cleanup.

Common mistakes & troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| ERROR! Syntax Error while loading YAML | Tabs instead of spaces, or a misaligned -/key | Use 2-space indent, no tabs; run --syntax-check; let your editor show whitespace |
| Every re-run shows changed for a command task | command/shell always reports changed | Add changed_when: false (read-only) or a real changed_when: expression |
| Missing sudo password | become: true but sudo needs a password | Add -K, or vault ansible_become_password, or grant NOPASSWD |
| Failed to set permissions on the temporary files | Becoming a non-root user without ACL support | Install acl on the target, or use a root become, or set allow_world_readable_tmpfiles (trade-off) |
| Handler never runs | The notifying task reported ok, not changed (or the play failed before end) | Make the task actually change; or --force-handlers / force_handlers: true |
| --start-at-task doesn’t start where expected | It matches task names; unnamed tasks can’t be targeted | Name every task; match the exact name: string |
| Variable looks like a string "{{ x }}" literally | A bare {{ }} at the start of an unquoted value confuses YAML | Quote the whole value: key: "{{ x }}" |
| Play runs on no hosts (“skipping: no hosts matched”) | hosts: pattern matches nothing in the inventory | Check ansible-inventory --graph; verify group/host names and --limit |
| Confusing -k and -K | -k = SSH password, -K = become password | Remember: lowercase k = connect, uppercase K = escalate |
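
Two of the fixes above are easier to recognise in context. The following is a small sketch, assuming a variable named page_title and using uptime and echo as stand-in commands; none of these tasks come from the lesson's playbook.

```yaml
# Illustrative play only; the commands and variable value are placeholders.
- name: Troubleshooting fixes in one place
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    page_title: "Hello"
  tasks:
    # A read-only command: tell Ansible it never changes anything.
    - name: Check uptime without reporting changed
      ansible.builtin.command: uptime
      changed_when: false

    # A command that may change something: derive "changed" from its output.
    # "echo updated" stands in for a real tool that prints what it did.
    - name: Report changed only when the output says so
      ansible.builtin.command: echo updated
      register: result
      changed_when: "'updated' in result.stdout"

    # A value that begins with {{ must be quoted as a whole.
    - name: Print a quoted templated value
      ansible.builtin.debug:
        msg: "{{ page_title }} from Ansible"
```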

Best practices

Security notes

Interview & exam questions

  1. What is the difference between a play and a task? A play maps a set of inventory hosts to an ordered list of tasks (and handlers/roles) plus play-level settings like become; a task is a single call to one module. A playbook is a list of plays.
  2. Explain Ansible’s execution order with the default strategy. The linear strategy runs task by task across all targeted hosts: it runs task 1 on every host (in parallel, up to forks hosts at a time), waits for all of them to finish, then runs task 2 on every host, and so on. A host that fails a task is dropped from the rest of the play.
  3. What does become do, and what are become_user and become_method? become enables privilege escalation; become_user is who you become (default root); become_method is how (default sudo; also su, doas, runas, etc.). You connect as remote_user, then escalate.
  4. sudo vs su for become — what’s the password difference? sudo prompts for your (the connecting user’s) password; su prompts for the target user’s password. Supplying the wrong one is a common failure.
  5. What’s the difference between -k and -K? -k (--ask-pass) prompts for the SSH connection password; -K (--ask-become-pass) prompts for the privilege-escalation password.
  6. How do you do a dry run and preview changes? ansible-playbook site.yml --check --diff (-C -D): --check reports what would change without changing it; --diff shows the line-level changes.
  7. Why might a command task always report changed, and how do you fix it? Ansible can’t know whether an arbitrary command changed anything, so command/shell default to changed. Add changed_when: false for read-only commands (or a proper changed_when: expression).
  8. When do handlers run, and what triggers them? A handler runs once, at the end of the play (more precisely, after each of the pre_tasks, tasks, and post_tasks sections), and only if a task that notifies it reported changed. meta: flush_handlers forces notified handlers to run earlier; force_handlers/--force-handlers runs notified handlers even if a later task fails on that host.
  9. What’s the difference between serial and strategy? serial sets the batch size (how many hosts at a time — a rolling deploy); strategy controls scheduling within a batch (linear = lock-step per task; free = each host races ahead).
  10. unreachable vs failed in the recap? unreachable is a connection/transport failure (SSH, auth, host down); failed means a task ran and failed. They are counted separately.
  11. How do you run only part of a playbook? By tags (--tags/--skip-tags), by host (--limit), or by resuming at a named task (--start-at-task "name"); --step walks task by task.
  12. How should you handle the become password in an unattended pipeline? Store ansible_become_password in a Vault-encrypted group/host var (or use scoped NOPASSWD sudo) — never plaintext, never --ask-become-pass in CI.
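
Several of these answers (handlers, flush_handlers, serial, and strategy) can be seen together in one short play. This is a sketch under stated assumptions: the source file files/site.conf, the destination path, and the nginx service name are placeholders, not the contents of webserver.yml.

```yaml
# Illustrative play only; file paths and the service name are assumptions.
- name: Rolling config update
  hosts: all
  become: true
  serial: 1          # batch size: one host at a time (rolling)
  strategy: linear   # within a batch, lock-step per task (the default)
  tasks:
    - name: Deploy the site config
      ansible.builtin.copy:
        src: files/site.conf
        dest: /etc/nginx/conf.d/site.conf
        mode: "0644"
      notify: Reload nginx   # queued only if this task reports changed

    - name: Run any notified handlers now instead of waiting
      ansible.builtin.meta: flush_handlers

    - name: Smoke-test the page (read-only)
      ansible.builtin.command: curl -fsS http://localhost/
      changed_when: false

  handlers:
    - name: Reload nginx
      ansible.builtin.service:
        name: nginx
        state: reloaded
```

On a second run the copy task should report ok, so the handler is never queued and flush_handlers has nothing to do.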

Quick check

  1. In ansible.builtin.copy:, the keys under it (src, dest, mode) are called what?
  2. Which flag prompts for the privilege-escalation password — -k or -K?
  3. What does a re-run’s changed=0 tell you about your playbook?
  4. At which scopes can you set become (name three)?
  5. Which ansible-playbook flag parses the file for errors without running anything?

Answers

  1. The module’s arguments (module options/parameters).
  2. -K (--ask-become-pass). -k is the SSH connection password.
  3. It is idempotent — the system was already in the desired state, so nothing was changed (a converged, correct run).
  4. Play, block, and task level (also via ansible.cfg/CLI). Inner scope wins.
  5. --syntax-check.
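
Answer 4 can be sketched as a single play that sets become at all three scopes. The task names are hypothetical, and id -un is used only because it prints the effective user, which makes the scoping visible when the play is run with -v.

```yaml
# Illustrative play only; it assumes the connecting user may escalate with sudo.
- name: Become at three scopes
  hosts: all
  gather_facts: false
  become: true               # play level: tasks escalate by default
  tasks:
    - name: Runs escalated (inherits the play-level setting)
      ansible.builtin.command: id -un
      changed_when: false

    - name: Checks that need no escalation
      become: false          # block level: overrides the play default
      block:
        - name: Runs as the connecting user (inherits the block setting)
          ansible.builtin.command: id -un
          changed_when: false

        - name: Runs escalated again
          become: true       # task level: the innermost setting wins
          ansible.builtin.command: id -un
          changed_when: false
```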

Exercise

Extend webserver.yml into a small, production-flavoured playbook:

  1. Add a play-level vars_files: that loads vars/site.yml containing page_title and a new admin_user.
  2. Add a block that runs as a non-root become_user (create the user first with ansible.builtin.user as root, then in a later block become_user: "{{ admin_user }}" and write a file into that user’s home).
  3. Add a tags: [smoke] task at the end (a post_tasks command with changed_when: false) that curls the page and uses failed_when to fail if the title is missing.
  4. Set serial: 1 so the play rolls one container at a time, and add max_fail_percentage: 0.
  5. Run it with --check --diff, then for real, then again to prove changed=0; then run --tags smoke --limit web02 only. Confirm the recap counters match your expectations, then clean up the containers.

Bonus: add no_log: true to a task that writes a fake “password” line and confirm -v no longer prints it.

Certification mapping

This lesson maps directly to the Red Hat Certified Engineer (RHCE) EX294 objectives:

It also underpins the broader DevOps/automation competencies tested in CKA-adjacent and platform-engineering interviews, where “explain how a playbook executes across hosts” and “how do you escalate privilege safely” are staple questions.

Glossary

Next steps
