Ansible Lesson 15 of 42

Ansible Delegation, Strategies & Rolling Updates, In Depth: delegate_to, run_once, serial & free

By default Ansible runs every task of a play on every host in that play, task by task, in waves of up to forks hosts, until the whole inventory is done. That is exactly what you want for “install nginx on all web servers”, and exactly what you must not do for “upgrade the web tier without taking the site down”. A real production change needs two extra dimensions of control that the playbook basics never give you: where a task actually executes (you want the health-check to hit the load balancer, not the web node you are upgrading) and how the fleet is walked (a few hosts at a time, drained out of the pool first, with the whole rollout aborting if too many fail). This lesson is the complete reference for both dimensions. We cover delegation (delegate_to, delegate_facts, local_action, connection: local, run_once) and execution control (serial, the strategy plugins, forks, throttle, order), and then we assemble them into the single most important pattern an Ansible operator must be able to write from memory: a zero-downtime rolling update.

By the end you will know precisely which host a delegated task runs on and which host’s variables it sees, why run_once plus delegate_to: localhost is the idiomatic “do this exactly once, here” construct, the difference between serial: 2 and serial: "30%" and a ramp-up list like [1, 5, "30%"], how forks and serial interact (and which one actually limits concurrency), what the free strategy changes about play semantics, and how max_fail_percentage together with a pre_tasks drain and a post_tasks add-back gives you a rollout that fails safe.

Learning objectives

Prerequisites & where this fits

You should be comfortable writing multi-task plays with become, registering results and referencing variables and facts, and using when, loop, and handlers — the material in Ansible Conditionals, Loops, Handlers & Tags, In Depth and Ansible Error Handling, In Depth. Delegation leans on register/hostvars constantly, and the rolling-update pattern is built directly on max_fail_percentage and any_errors_fatal from the error-handling lesson, so have those fresh. This is lesson C1 in the Advanced tier of the KloudVin Ansible Zero-to-Hero course, in the Execution module. Everything here is ansible.builtin and ships with ansible-core (2.17+ assumed, Ansible 10+ in 2026) — no collections are required for the features themselves, though the lab uses community.docker to create throwaway practice hosts. FQCN (ansible.builtin.*) is used throughout. The next lesson, Tuning Ansible for Speed & Scale, In Depth, picks up forks and the free strategy from the performance angle.

Core concepts

Five mental models carry this entire lesson. Internalise them and the rest is detail.

Keep those five sentences in your head.


delegate_to: running a task on a different host

delegate_to: <hostname> is a task-level keyword that says “run this one task on <hostname> instead of on the host currently being iterated”. The play keeps iterating its normal hosts (web1, web2, …), but for the delegated task the connection, the module execution, and any file operations happen on the delegate.

The textbook use case is acting on infrastructure that is shared across the hosts you are looping over — a load balancer, a DNS server, a monitoring system, a database — once per iterated host. While upgrading web1, you delegate “remove web1 from the pool” to the load balancer; while upgrading web2, you delegate “remove web2” to the same load balancer; and so on.

- name: Take this web host out of the HAProxy pool before upgrading it
  community.general.haproxy:
    state: disabled
    host: "{{ inventory_hostname }}"     # the host being iterated…
    socket: /var/run/haproxy.sock
    backend: web-backend
    wait: true
  delegate_to: lb1                         # …but the action happens ON lb1

The mental model: you are still “on” web1 (that is what inventory_hostname resolves to, and that is whose hostvars you can read), but the task executes on lb1. This is why the example reads naturally — host: "{{ inventory_hostname }}" passes the iterated host’s name as the thing to disable, and delegate_to: lb1 chooses where the disabling happens.

Which variables and facts are in scope

This is the number-one delegation exam point and the number-one source of bugs:

| Quantity | Value during a delegate_to: lb1 task while iterating web1 |
|---|---|
| inventory_hostname | web1 (the iterated host, not the delegate) |
| Regular vars, host_vars, group_vars | web1’s |
| ansible_facts / ansible_* | web1’s gathered facts (delegate’s facts are not substituted by default) |
| Connection (user, host, port, become) | the delegate’s connection settings (it really connects to lb1) |
| hostvars['lb1'] | available, so you can explicitly read the delegate’s vars/facts if you need them |
| Registered result | stored against web1 (the iterated host), as usual |

So a delegated task connects to the delegate but “thinks” in terms of the original host’s data. If you need the delegate’s facts (e.g. the load balancer’s own IP), read them explicitly via hostvars['lb1']['ansible_facts'][...] — they exist only if lb1 was in a prior play that gathered facts, or you gather them, or you use delegate_facts (below).
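
As a minimal sketch of that scoping, the task below prints both views side by side. It assumes an inventory host named lb1, as in the example above; the default() fallbacks are only there so the task still prints something if facts were never gathered for either host.

```yaml
# Iterating the play's hosts (web1, web2, ...) while executing against lb1.
- name: Show whose data a delegated task sees
  ansible.builtin.debug:
    msg:
      - "iterated host: {{ inventory_hostname }}"
      - "iterated host's distribution: {{ ansible_facts.distribution | default('facts not gathered') }}"
      - "delegate's distribution: {{ hostvars['lb1'].ansible_facts.distribution | default('lb1 facts not gathered') }}"
  delegate_to: lb1
```

The first two lines change with each iterated host; the third stays the same because it is read explicitly from the delegate's hostvars.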

delegate_to: localhost — the most common delegate

By far the most frequent delegate is the control node itself. “Send a Slack notification”, “call a cloud API”, “render a report locally”, “look something up” — none of those should run on the managed host; they should run where Ansible runs:

- name: Post a deploy notification (runs on the control node, once)
  community.general.slack:
    token: "{{ slack_token }}"
    msg: "Deploying build {{ build_id }} to {{ ansible_play_hosts | length }} hosts"
  delegate_to: localhost
  run_once: true

delegate_to: localhost runs the task on the control node but still with the iterated host’s variables in scope — which is exactly why it pairs so naturally with run_once (do it one time) and with reading hostvars (gather data from all hosts, act once locally).

Connection details for the delegate

The delegate must be reachable and connectable like any host. Practical points:

delegate_to with loops and vars per delegate

delegate_to is evaluated per item when combined with loop, so you can delegate each iteration to a different host. And vars: on a delegated task lets you override connection variables for that task only — handy when the delegate needs special settings:

- name: Add each app host to the monitoring server's target list
  ansible.builtin.lineinfile:
    path: /etc/prometheus/targets.yml
    line: "  - {{ hostvars[item]['ansible_host'] }}:9100"
  delegate_to: monitor1
  loop: "{{ groups['app'] }}"
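
The task above sends every iteration to the same delegate. As a sketch of the per-item case, the following delegates each loop iteration to a different host. The loadbalancers group and the marker path are assumptions for illustration, not part of the original example.

```yaml
- name: Drop a marker file on every load balancer, one delegate per loop item
  ansible.builtin.file:
    path: /tmp/deploy-in-progress      # hypothetical marker path
    state: touch
    mode: "0644"
  delegate_to: "{{ item }}"            # evaluated separately for each item
  loop: "{{ groups['loadbalancers'] }}"
  run_once: true                       # one pass over the loop, not one per play host
```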

delegate_facts: storing gathered facts on the delegate

By default, facts gathered during a delegated task are attributed to the iterated host, not the delegate — which is usually not what you want when the whole point was to learn something about the delegate. delegate_facts: true flips this: facts set or gathered during the task are stored against the delegate instead.

The classic use is gathering facts about a host that is not in the current play, so you can read them from hostvars:

- name: Gather facts about the database host even though it's not in this play
  ansible.builtin.setup:
  delegate_to: "{{ db_host }}"
  delegate_facts: true
  run_once: true

- name: Now use the DB host's memory to size a connection pool
  ansible.builtin.debug:
    msg: "DB has {{ hostvars[db_host]['ansible_facts']['memtotal_mb'] }} MB RAM"
  run_once: true

| Setting | set_fact/gathered facts during the task are stored on… | Read later via |
|---|---|---|
| delegate_facts omitted (default false) | the iterated host (inventory_hostname) | hostvars[inventory_hostname] |
| delegate_facts: true | the delegate host | hostvars[<delegate>] |

The gotcha: people set delegate_to: dbhost, gather facts, then try to read hostvars['dbhost'].ansible_facts and find it empty — because without delegate_facts: true the facts landed on the iterated host. Set delegate_facts: true whenever you gather or set_fact on a delegate and intend to read it back as the delegate’s data.
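
The same rule applies to set_fact. A minimal sketch, reusing the db_host variable from the example above; the pool_size name and its value are hypothetical.

```yaml
- name: Store a value as the DB host's data, not the iterated host's
  ansible.builtin.set_fact:
    pool_size: 20                      # hypothetical value for illustration
  delegate_to: "{{ db_host }}"
  delegate_facts: true
  run_once: true

- name: Read it back from the delegate's hostvars
  ansible.builtin.debug:
    msg: "pool_size on {{ db_host }} is {{ hostvars[db_host]['pool_size'] }}"
  run_once: true
```

Without delegate_facts: true, the second task would find pool_size on the iterated host instead, and the hostvars[db_host] lookup would fail as undefined.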


local_action and connection: local: running on the control node

There are three ways to make a task run on the control node; they overlap but are not identical.

connection: local

connection: local overrides the connection plugin so Ansible executes the task on the control node as the host it is iterating — i.e. it does not open SSH; it runs the module locally, but inventory_hostname is still the target. This is what you put on a whole play whose hosts are abstract (e.g. cloud resources you are creating via API, where “the host” is just a name to iterate):

- name: Provision cloud VMs (no SSH; everything runs locally via API)
  hosts: to_create
  connection: local
  gather_facts: false
  tasks:
    - name: Create each VM through the provider API
      community.general.<provider>_instance:
        name: "{{ inventory_hostname }}"
        state: present

You can also set connection: local on a single task. Set it at the play level when every task runs locally; set ansible_connection=local in inventory for hosts that are always local (e.g. localhost ansible_connection=local).

local_action

local_action is shorthand for a single task with delegate_to: localhost and connection: local. It runs the task on the control node. The syntax is more compact (and older — you will see it in legacy playbooks):

# These two are equivalent:
- name: Wait for the host's SSH to come back (run from the control node)
  local_action:
    module: ansible.builtin.wait_for
    host: "{{ inventory_hostname }}"
    port: 22
    delay: 10
    timeout: 300

- name: Same thing, modern spelling
  ansible.builtin.wait_for:
    host: "{{ inventory_hostname }}"
    port: 22
    delay: 10
    timeout: 300
  delegate_to: localhost

local_action also has a terse single-line form — local_action: ansible.builtin.command echo hi — but prefer the dictionary form (or delegate_to: localhost) for readability. wait_for from the control node, waiting for a rebooted host’s port to reopen, is the textbook local_action/delegate_to: localhost use (you obviously cannot run the wait on the host that is rebooting).

The three spellings compared

| Construct | Runs on | inventory_hostname is | Variables in scope | Typical use |
|---|---|---|---|---|
| connection: local (play or task) | control node | the iterated host | the iterated host’s | whole-play local work (API provisioning) where no SSH is wanted |
| delegate_to: localhost | control node | the iterated host | the iterated host’s | one task locally (notify, lookup, render, wait_for) while iterating real hosts |
| local_action: {...} | control node | the iterated host | the iterated host’s | exact shorthand for delegate_to: localhost + connection: local; legacy |

In practice: use delegate_to: localhost for one-off local tasks inside a play that targets real hosts; use connection: local on a play whose hosts are abstractions you drive via API; reach for local_action only when maintaining code that already uses it. Note a subtle difference: delegate_to: localhost uses whatever connection localhost has (usually local), while connection: local forces local on the iterated host without changing the executing host’s identity. For 99% of cases the behaviour is identical and delegate_to: localhost is the clearer choice.

A small but important gotcha: delegate_to: localhost and delegate_to: 127.0.0.1 can behave differently if your inventory defines localhost with ansible_connection=local but treats 127.0.0.1 as an SSH host. Standardise on localhost.


run_once: do this exactly once for the whole play

run_once: true makes a task execute a single time, on the first host in the current play (or current serial batch), rather than once per host. Ansible then applies that result to every host in the same batch, so a registered variable holds the same value on each of them.

Use it whenever the task’s effect is fleet-global, not per-host: run a database migration once, create a shared resource once, send one notification, compute one value all hosts will use.

- name: Run the schema migration exactly once
  ansible.builtin.command: /opt/app/migrate.sh
  run_once: true                 # runs on the first host of the play only
  register: migration

- name: Every host can now read the migration result
  ansible.builtin.debug:
    msg: "Migration applied: {{ migration.stdout }}"
  # migration.stdout is available on ALL hosts, not just the first

run_once paired with delegate_to — the canonical idiom

On its own, run_once still picks some host (the first iterated one) to run the task on. Usually you do not care which managed host runs a fleet-global task — and often you do not want it to run on a managed host at all. The idiomatic construct is therefore run_once: true + delegate_to: localhost (“do this one time, on the control node”) or run_once: true + delegate_to: <shared-host> (“do this one time, on the load balancer / DB”):

- name: Notify the team once, from the control node
  community.general.slack:
    token: "{{ slack_token }}"
    msg: "Rollout complete across {{ ansible_play_hosts_all | length }} hosts"
  run_once: true
  delegate_to: localhost

- name: Reload the load balancer once after all hosts are back in the pool
  ansible.builtin.command: /usr/local/bin/lb-reload
  run_once: true
  delegate_to: lb1

Without delegate_to, run_once would run the migration/notification on web1 (the first web host) — usually harmless for a local-effect task but wrong for anything that should run on the control node or a specific shared host. run_once chooses how many times; delegate_to chooses where. Together they express “once, here”.

run_once behaviour under serial and free

These edges are heavily tested:

| Context | run_once behaviour |
|---|---|
| Normal linear play, no serial | runs once on the play’s first host; result shared to all |
| With serial | runs once per batch, i.e. once for each serial group, on that batch’s first host (a frequent surprise: it is not once for the entire play when serial is set) |
| With free strategy | not supported: Ansible prints a warning and runs the task on every host, so keep once-only tasks in a linear play |
| Failure of the run_once task | under linear, the failure is treated as fatal for every host in the batch, not only the host it ran on (unless errors are ignored); add any_errors_fatal: true if you want that intent explicit, since a failed global step usually invalidates the whole run |

The serial interaction is the one to remember: run_once is once per batch, not once per play, when serial is in effect. If you truly need once-per-entire-rollout, run that task in a separate, non-serial play (e.g. a pre_tasks-style play that targets the group once before the rolling play begins), or guard it so only the very first batch runs it.
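
A sketch of the separate-play approach, assuming both plays live in the same playbook file. The migration script path is the one used earlier in this lesson; the debug task is a placeholder for the real upgrade tasks.

```yaml
# Play 1: no serial, so run_once really is once for the whole rollout.
- name: One-time preparation before the rolling play
  hosts: webservers
  gather_facts: false
  tasks:
    - name: Run the schema migration exactly once for the entire rollout
      ansible.builtin.command: /opt/app/migrate.sh
      run_once: true

# Play 2: batched. Any run_once in here would repeat for every batch.
- name: Rolling upgrade
  hosts: webservers
  serial: 2
  tasks:
    - name: Upgrade this batch
      ansible.builtin.debug:
        msg: "upgrading {{ inventory_hostname }}"   # placeholder for the real upgrade tasks
```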


serial: batching the play for rolling updates

By default a play runs all its tasks on every host before finishing — but it walks the hosts in waves of forks for each task. serial changes the unit of work: instead of “task by task across all hosts”, the play runs the entire play, start to finish, on a subset of hosts, then repeats on the next subset. Each subset is a batch (also called a serial group). This is the foundation of every rolling update: finish hosts 1–2 completely (drain, upgrade, verify, restore) before touching hosts 3–4.

Integer, percentage, and ramp-up list

serial accepts three forms:

| serial value | Meaning | Example walk for 10 hosts |
|---|---|---|
| an integer | that many hosts per batch | serial: 2 → 2, 2, 2, 2, 2 (5 batches) |
| a percentage string | that fraction of the play’s host count, rounded down, minimum 1 | serial: "30%" → 3, 3, 3, 1 |
| a list (ramp-up / canary) | each element is one batch size, applied in order; the last element repeats for the remainder | serial: [1, 5, "30%"] → 1, 5, 3, 1 (30% of the play’s 10 hosts is 3; only 1 host is left for the final batch) |

The list form is the canary pattern: roll one host first (1), watch it, then a small wave (5), then larger waves. Mixing integers and percentages in the list is allowed: serial: [1, "10%", "25%"].

- name: Rolling upgrade of the web tier, canary first
  hosts: webservers
  serial:
    - 1            # batch 1: a single canary host
    - 5            # batch 2: five hosts
    - "30%"        # batch 3 onward: 30% of the hosts each, then repeat the last size
  tasks:
    - name: ...the drain / upgrade / verify / restore tasks...

Three precise rules people get wrong:

  1. Percentages are of the play’s host count and round down, with a floor of 1. "30%" of 4 hosts is 1 (1.2 rounded down). "10%" of 3 hosts is 1 (the floor kicks in), not 0 — Ansible never makes an empty batch.
  2. In a list, the final element is reused for all remaining batches. serial: [1, 2] over 10 hosts is 1, 2, 2, 2, 2, 1 (the last 2 repeats; the final partial batch is whatever is left).
  3. serial on a play means each batch runs the whole play. Handlers flush at the end of each batch, not at the very end. run_once is per batch. max_fail_percentage is evaluated per batch (see the rolling-update section).
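
One way to check these rules against your own inventory is a play that only prints batch membership. This is a sketch assuming a webservers group; ansible_play_batch is the list of hosts in the current batch.

```yaml
- name: Print batch membership (changes nothing on the hosts)
  hosts: webservers
  gather_facts: false
  serial: [1, 2]
  tasks:
    - name: Show the hosts in the current batch
      ansible.builtin.debug:
        msg: "batch of {{ ansible_play_batch | length }}: {{ ansible_play_batch | join(', ') }}"
      run_once: true                   # per batch, so this prints once for each batch
```

Over 10 hosts, rule 2 predicts six messages with batch sizes 1, 2, 2, 2, 2, 1.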

Where serial goes and what it touches

serial is a play keyword (you cannot set it per task). It interacts with several other play settings:


Execution strategies: linear, free, host_pinned, debug

A strategy plugin decides how hosts move through the tasks of a play. It is set with the play-level strategy: keyword (or strategy = ... in ansible.cfg, or ANSIBLE_STRATEGY). There are four built-in strategies.

| Strategy | How it walks tasks | Sync point | When to use | Trade-off / gotcha |
|---|---|---|---|---|
| linear (default) | All hosts run task 1, Ansible waits for every host to finish task 1, then all run task 2, and so on, in lock-step | After every task (a barrier per task) | The default; predictable, required when later tasks depend on all hosts finishing a prior one; rolling updates | The whole batch moves at the pace of the slowest host on each task |
| free | Each host races through all its tasks as fast as it can, independently; host A may be on task 9 while host B is still on task 3 | None until the end of the play | Many hosts, independent work, latency-bound tasks, big fleets where you want max throughput | No cross-host ordering; you cannot rely on “all hosts did X before any does Y”; run_once is not honoured (Ansible warns and runs the task on every host) and handler timing is less predictable |
| host_pinned | Like a bounded free: each worker (fork) picks a host and runs all that host’s tasks before picking the next host, instead of round-robining hosts per task | None per task; per-host completion | Fewer connections churned; useful with connection plugins where setup/teardown is expensive; keeps a host on one worker | Still no global ordering; behaves like free for dependency purposes |
| debug | linear, but drops into the interactive playbook debugger on a task error (or when debugger: triggers) | After every task (like linear) | Developing/troubleshooting a playbook interactively | Not for unattended runs, because it pauses for input on failure |

The two that matter day to day are linear and free.

linear (the default) and its barriers

With linear, there is a barrier after every task: Ansible will not start task N+1 on any host until task N has completed (or been skipped/failed-and-handled) on every host in the batch. This is what makes “gather facts from all hosts, then act on the aggregate” work, and it is essential for rolling updates where a pre_task must complete on all of a batch before the upgrade begins. The cost is that each task runs at the speed of the slowest host.
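
A sketch of the “gather from all hosts, then act on the aggregate” shape that the barrier makes safe. The /opt/app/VERSION file is a hypothetical path; the second task can read every host's registered result only because linear guarantees the first task has finished everywhere.

```yaml
- name: Aggregate a per-host value after every host has reported it
  hosts: webservers
  gather_facts: false
  tasks:
    - name: Read the deployed version on each host
      ansible.builtin.command: cat /opt/app/VERSION    # hypothetical version file
      register: app_version
      changed_when: false

    - name: Summarise the distinct versions once, on the control node
      ansible.builtin.debug:
        msg: "{{ ansible_play_hosts | map('extract', hostvars, ['app_version', 'stdout']) | unique | list }}"
      run_once: true
      delegate_to: localhost
```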

free and what it changes

- name: Independent maintenance on a large fleet, as fast as possible
  hosts: all
  strategy: free
  tasks:
    - name: Each host patches and reboots on its own timeline
      ansible.builtin.dnf:
        name: '*'
        state: latest

free removes the per-task barrier: every host charges through its task list independently. On a 500-host fleet of independent, latency-bound work, this can be dramatically faster. But you lose all cross-host ordering guarantees, so free is wrong for any play where a later task assumes an earlier one finished everywhere. It also does not honour run_once (Ansible warns and runs the task on every host) and makes handler timing harder to predict. Use free for embarrassingly parallel, host-independent maintenance; keep linear for anything with cross-host dependencies or rolling semantics. We revisit free from the performance angle in the next lesson.


forks: the parallelism ceiling

forks is the maximum number of hosts Ansible talks to at the same time. It is a global setting (not per-play): ansible.cfg [defaults] forks = N, the -f/--forks N command-line flag, or ANSIBLE_FORKS. The default is 5, which is conservative — five hosts in flight at once.

How forks interacts with the strategies:

Sizing forks: raise it well above 5 for any non-trivial fleet (50–100 is common), but mind (a) the control node’s CPU/RAM and open-file limits — each fork is a process and SSH connection — and (b) downstream rate limits (a package mirror, a cloud API). On big runs, forks plus SSH multiplexing (ControlPersist) and pipelining are the main throughput levers (next lesson).

The serial / forks relationship — the classic confusion

This is the single most-asked interview question in this area, so be exact:

Worked example: 30 hosts, serial: 10, forks: 5. The play runs in 3 batches of 10. Within a batch of 10, each task runs on at most 5 hosts at a time (two waves of 5 to cover the batch’s 10), with a barrier after the task (linear). So at no point are more than 5 hosts executing a given task, and the rollout proceeds 10 hosts at a time.

| Setting | Unit it controls | Scope | Default |
|---|---|---|---|
| serial | hosts per batch (whole-play passes) | per play | unset (= all hosts in one batch) |
| forks | hosts per task running concurrently | global | 5 |
| throttle | hosts per task running concurrently, but for that task only | per task/block | unset (= bounded by forks) |

A common rolling-update setup makes forks >= serial so that an entire batch runs each task in one wave (no sub-batching within the batch). If serial: 10 and forks: 5, each task in the batch still happens in two waves of 5 — fine, but be aware of it. If you want the batch to truly move in lock-step, set forks at least as large as the batch.
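
A sketch of the lock-step setup. forks is not a play keyword, so it is raised on the command line here; rolling.yml is a hypothetical filename and the debug task stands in for the real upgrade.

```yaml
# Invoke with forks matching the batch size, for example:
#   ansible-playbook -f 10 rolling.yml
- name: Batches of 10 that run each task in a single wave
  hosts: webservers
  serial: 10                           # 10 hosts per whole-play pass
  tasks:
    - name: Placeholder for the real upgrade step
      ansible.builtin.debug:
        msg: "upgrading {{ inventory_hostname }}"
```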


throttle: a per-task concurrency limit

throttle: N caps how many hosts run a specific task (or block) at once, below whatever forks allows. Where forks is a global ceiling, throttle is a local one for a task that must not be hammered in parallel — e.g. a step that hits a fragile licence server, or writes to a shared resource that tolerates only a few concurrent writers, or a database step you want done two-at-a-time even though forks is 50.

- name: Register each host with the licence server, only 2 at a time
  ansible.builtin.command: /opt/app/register-licence.sh
  throttle: 2          # at most 2 hosts run THIS task concurrently, regardless of forks

Rules:

This is the surgical tool: keep forks high for speed, drop a throttle onto the one task that cannot take the parallelism.
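
Because throttle is also accepted on a block, several related steps can share one limit. A sketch, where release-licence.sh is a hypothetical companion to the script used above:

```yaml
- name: Steps that touch the shared licence server, two hosts at a time
  throttle: 2                          # inherited by every task inside the block
  block:
    - name: Release the old licence seat
      ansible.builtin.command: /opt/app/release-licence.sh    # hypothetical script
    - name: Register each host with the licence server
      ansible.builtin.command: /opt/app/register-licence.sh
```

Tasks outside the block keep running at whatever concurrency forks allows.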


order: the sequence hosts are processed in

order is a play keyword controlling the order in which Ansible walks the play’s hosts (which also determines who lands in which serial batch and which host run_once/linear treats as “first”).

| order value | Host order | Use |
|---|---|---|
| inventory (default) | the order hosts appear in the inventory | predictable, matches your inventory layout |
| sorted | alphabetical/natural sort by name | deterministic regardless of inventory order |
| reverse_sorted | reverse alphabetical | deterministic, opposite end first |
| reverse_inventory | reverse of inventory order | walk the fleet from the other end |
| shuffle | random each run | spread risk / avoid always hitting the same host first; load-test ordering |

- name: Walk hosts in a stable alphabetical order so batches are reproducible
  hosts: webservers
  order: sorted
  serial: "25%"
  tasks: ...

Why it matters for this lesson: with serial, order decides which hosts go in which batch. order: sorted makes your canary and batches reproducible run-to-run; order: shuffle is useful when you do not want the same host to always be the canary (so a latent “always upgrade web1 first” assumption never hardens). order also fixes which host is “first” for run_once and for the head of a linear wave.
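
A sketch of the shuffled-canary variant, assuming the same webservers group. The debug task only reports batch membership so the effect of order is visible from run to run.

```yaml
- name: Rolling upgrade with a different canary on each run
  hosts: webservers
  gather_facts: false
  order: shuffle                       # random host order, so batch 1 is not always web1
  serial:
    - 1
    - "25%"
  tasks:
    - name: Report which hosts landed in this batch
      ansible.builtin.debug:
        msg: "{{ ansible_play_batch | join(', ') }}"
      run_once: true
```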


The complete zero-downtime rolling-update pattern

This is the payoff — the pattern an Ansible operator must be able to write from memory. The shape is always the same:

  1. serial to upgrade a few hosts at a time (often a canary list).
  2. max_fail_percentage so the rollout aborts before too much of the fleet is broken.
  3. pre_tasks to drain each host out of the load balancer (a delegate_to the LB), and optionally wait for in-flight connections to drain.
  4. The upgrade itself (deploy code, restart the service) plus a health check that the host is actually serving before proceeding.
  5. post_tasks to add the host back into the pool (again delegate_to the LB), and verify it is back in rotation.

Because the host is out of the pool while it is being touched, users never hit a half-upgraded or restarting instance — that is what “zero-downtime” means here.

- name: Zero-downtime rolling upgrade of the web tier
  hosts: webservers
  become: true
  serial:
    - 1                       # canary: one host first
    - "30%"                   # then 30% per batch
  max_fail_percentage: 25     # abort the whole rollout if >25% of a batch fails
  order: sorted               # reproducible batch membership

  pre_tasks:
    - name: Drain this host out of the load balancer pool
      community.general.haproxy:
        state: disabled
        host: "{{ inventory_hostname }}"
        backend: web-backend
        socket: /var/run/haproxy/admin.sock
        wait: true                 # wait until it stops receiving new connections
        drain: true                # let existing sessions finish
      delegate_to: "{{ groups['loadbalancers'][0] }}"

    - name: Give in-flight requests a moment to complete
      ansible.builtin.wait_for:
        timeout: 15
      delegate_to: localhost

  tasks:
    - name: Deploy the new application build
      ansible.builtin.unarchive:
        src: "/builds/app-{{ build_id }}.tar.gz"
        dest: /opt/app
        remote_src: false
      notify: Restart app

    - name: Apply the restart now (don't wait for end of batch)
      ansible.builtin.meta: flush_handlers

    - name: Confirm the app is healthy on its own port before re-adding it
      ansible.builtin.uri:
        url: "http://{{ ansible_host | default(inventory_hostname) }}:8080/health"
        status_code: 200
      register: health
      until: health.status == 200
      retries: 12
      delay: 5
      delegate_to: localhost       # poll from the control node, not the host itself

  post_tasks:
    - name: Put this host back into the load balancer pool
      community.general.haproxy:
        state: enabled
        host: "{{ inventory_hostname }}"
        backend: web-backend
        socket: /var/run/haproxy/admin.sock
        wait: true
      delegate_to: "{{ groups['loadbalancers'][0] }}"

    - name: Verify it is actually back in rotation
      ansible.builtin.uri:
        url: "http://{{ ansible_host | default(inventory_hostname) }}:8080/health"
        status_code: 200
      delegate_to: localhost

  handlers:
    - name: Restart app
      ansible.builtin.service:
        name: app
        state: restarted

Walk through why each piece is there:

Why the per-host, per-batch structure matters: each iterated host is independently drained, upgraded, verified, and restored, but the batch moves through these stages together (linear barriers), and the play only advances to the next batch when the current one is within tolerance. That is the whole machine.

Variations and add-ons


Diagram

Ansible execution control: delegate_to redirecting a task to the load balancer, run_once collapsing a task to once-per-batch, serial slicing the fleet into rolling batches with forks/throttle capping per-task parallelism, and the drain-upgrade-verify-restore rolling-update loop

The diagram shows the two dimensions side by side: on the left, a play iterating web1..webN with one task delegated to lb1 (the arrow leaving the host lane), a run_once task collapsing to a single execution, and forks/throttle shown as the width of the concurrent wave; on the right, serial slicing the fleet into ordered batches, each batch flowing through drain → upgrade → health-check → restore, with max_fail_percentage as the gate that aborts the rollout if a batch fails too heavily.

Hands-on lab

You will build and run a miniature rolling update on localhost plus a few throwaway containers — no cloud, no cost. You need ansible-core 2.17+ and either Docker or Podman; we use community.docker only to create the practice hosts. The delegation/serial features themselves are pure ansible.builtin.

Step 0 — set up

ansible --version                                  # expect 2.17 or newer
ansible-galaxy collection install community.docker # only to spin up practice hosts
mkdir -p ~/ansible-c1 && cd ~/ansible-c1

Create inventory.ini — three “web” hosts, one “lb” host, and the control node:

[web]
web1 ansible_connection=community.docker.docker
web2 ansible_connection=community.docker.docker
web3 ansible_connection=community.docker.docker

[lb]
lb1 ansible_connection=community.docker.docker

[local]
localhost ansible_connection=local

Step 1 — create the practice containers

Create setup.yml:

- name: Spin up practice containers
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Run lightweight containers we can target
      community.docker.docker_container:
        name: "{{ item }}"
        image: python:3.12-slim     # has python so Ansible modules work
        command: sleep infinity
        state: started
      loop: [web1, web2, web3, lb1]

ansible-playbook setup.yml          # changed (or ok on a re-run)

Step 2 — prove delegation, run_once, and serial

Create delegate.yml. There is no real load balancer, so we simulate the pool with a file on lb1 that we add/remove the host from — which is exactly the delegation pattern, just with lineinfile standing in for the HAProxy module.

- name: Demonstrate delegate_to, run_once, serial, throttle, order
  hosts: web
  gather_facts: true
  order: sorted                 # reproducible batches: web1, web2, web3
  serial:
    - 1                         # canary: web1
    - "100%"                    # then the rest in one batch (web2, web3)

  pre_tasks:
    - name: ANNOUNCE which batch we are in (runs once per BATCH, on the control node)
      ansible.builtin.debug:
        msg: "Starting batch containing: {{ ansible_play_batch | join(', ') }}"
      run_once: true
      delegate_to: localhost

    - name: Remove THIS host from the simulated LB pool (action happens ON lb1)
      ansible.builtin.lineinfile:
        path: /tmp/pool.txt
        line: "{{ inventory_hostname }}"
        state: absent
        create: true
      delegate_to: lb1

  tasks:
    - name: Upgrade the app (at most 2 hosts at a time, even if forks is higher)
      ansible.builtin.command: "echo upgrading {{ inventory_hostname }}"
      changed_when: true
      throttle: 2

    - name: Health check, polled from the control node (delegate_to localhost)
      ansible.builtin.command: "echo healthy {{ inventory_hostname }}"
      delegate_to: localhost
      changed_when: false

  post_tasks:
    - name: Add THIS host back to the simulated LB pool (on lb1)
      ansible.builtin.lineinfile:
        path: /tmp/pool.txt
        line: "{{ inventory_hostname }}"
        state: present
        create: true
      delegate_to: lb1

Step 3 — run it and read the output

ansible-playbook -i inventory.ini delegate.yml -f 5

Expected observations:

ansible -i inventory.ini lb1 -m ansible.builtin.command -a "cat /tmp/pool.txt"
# After a full successful run, all three web hosts are present (added back in post_tasks).

Now see a strategy difference. Create strategy.yml:

- name: Compare linear vs free pacing
  hosts: web
  gather_facts: false
  strategy: free               # try 'linear' too and compare ordering
  tasks:
    - name: A slow step whose duration varies per host
      ansible.builtin.command: "sleep {{ 2 if inventory_hostname == 'web1' else 1 }}"
      changed_when: false
    - name: A second step
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }} reached step 2"

Run it with each strategy in turn:

ansible-playbook -i inventory.ini strategy.yml          # free: fast hosts reach step 2 first
sed -i.bak 's/strategy: free/strategy: linear/' strategy.yml
ansible-playbook -i inventory.ini strategy.yml          # linear: ALL finish step 1 before ANY do step 2

With free, web2/web3 print “reached step 2” while web1 is still sleeping; with linear, nobody reaches step 2 until every host has finished step 1 — the per-task barrier made visible.

Validation

# The simulated pool should contain all three web hosts after a clean run:
ansible -i inventory.ini lb1 -m ansible.builtin.command -a "sort /tmp/pool.txt"

# Confirm delegate_to really executed on lb1 (the file exists THERE, not on web hosts):
ansible -i inventory.ini web -m ansible.builtin.stat -a "path=/tmp/pool.txt" \
  | grep -E '"exists": (true|false)'   # expect false on web hosts — the file lives on lb1
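
The same validation can also be expressed as a small playbook instead of ad-hoc commands. The following is a minimal sketch, assuming the lab inventory's web group and lb1 host; the file name verify.yml is hypothetical and not part of the lab.

```yaml
# verify.yml (hypothetical name): fail loudly if any web host is missing from the pool
- name: Verify the simulated pool after a run
  hosts: lb1
  gather_facts: false
  tasks:
    - name: Read the pool file from lb1
      ansible.builtin.slurp:
        src: /tmp/pool.txt
      register: pool

    - name: Assert every web host is back in the pool
      ansible.builtin.assert:
        that:
          - item in (pool.content | b64decode).splitlines()
        fail_msg: "{{ item }} is missing from /tmp/pool.txt"
      loop: "{{ groups['web'] }}"
```

Because the play targets lb1 directly, no delegation is needed here; slurp returns the file base64-encoded, which is why the assertion decodes it before splitting into lines.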

Cleanup

cat > teardown.yml <<'YAML'
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Remove practice containers
      community.docker.docker_container:
        name: "{{ item }}"
        state: absent
      loop: [web1, web2, web3, lb1]
YAML
ansible-playbook teardown.yml
rm -rf ~/ansible-c1

Cost note

₹0. Everything runs on localhost and four local containers using a public base image. No cloud resources are created, so there is nothing to bill and nothing left running after teardown.

Common mistakes & troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| Delegated task uses the delegate’s facts, but they’re empty | facts were never gathered on the delegate; by default a delegated task sees the iterated host’s facts | gather the delegate’s facts with delegate_to + delegate_facts: true, then read hostvars[delegate] |
| hostvars['lb1'].ansible_facts is empty after gathering on lb1 | gathered without delegate_facts: true, so facts landed on the iterated host | add delegate_facts: true to the gather task |
| run_once task ran multiple times | serial is set: run_once is once per batch, not once per play | move it to a separate non-serial play, or accept per-batch, or guard to the first batch |
| run_once ran on a random/unexpected host | run_once picks the first host of the batch; with free that is non-deterministic | add delegate_to: localhost (or a fixed host) to control where it runs |
| Health check passes against the old app version | the restart handler hasn’t fired yet (handlers run at the end of the tasks section for the batch) | insert ansible.builtin.meta: flush_handlers before the health check |
| Rolling update marched on and broke the whole fleet | no max_fail_percentage/any_errors_fatal, so failures didn’t stop the rollout | set max_fail_percentage (or any_errors_fatal: true) on the serial play |
| serial: "30%" made batches of 1 on a small fleet | percentages round down with a floor of 1 | use integers for small fleets, or accept the rounding |
| Whole batch crawls at one host’s pace | linear has a barrier after every task, so the slowest host paces each step | use free/host_pinned if tasks are host-independent; otherwise it’s inherent |
| throttle didn’t speed anything up / had no effect | throttle only lowers concurrency below forks; setting it above forks does nothing | raise forks; use throttle only to limit a sensitive task |
| delegate_to: localhost connected over SSH unexpectedly | inventory defined localhost as an SSH host, or you used 127.0.0.1 | ensure localhost ansible_connection=local; standardise on localhost |
| Effective concurrency lower than forks | the current serial batch is smaller than forks (min(forks, batch)) | enlarge the batch, or accept it, since batch size caps per-task parallelism |
| Delegated become escalated on the wrong host | become on a delegated task escalates on the delegate, not the iterated host | set become/become_user mindful of where the task runs |
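
The first three rows are easier to see in a play. The following is a minimal sketch, assuming the lab inventory (group web, host lb1) and the default inventory order; the first-batch guard is one possible approach, not the only one.

```yaml
- name: Sketch of delegate facts and a once-per-play guard under serial
  hosts: web
  gather_facts: false
  serial: 1
  tasks:
    - name: Gather lb1's facts and store them against lb1, not the iterated host
      ansible.builtin.setup:
      delegate_to: lb1
      delegate_facts: true
      run_once: true            # under serial this still repeats once per batch

    - name: Read the delegate's facts through hostvars
      ansible.builtin.debug:
        msg: "lb1 runs {{ hostvars['lb1'].ansible_facts['distribution'] | default('unknown') }}"

    - name: Announce the rollout only in the first batch (once per play)
      ansible.builtin.debug:
        msg: "Rollout starting for {{ ansible_play_hosts_all | length }} hosts"
      run_once: true
      delegate_to: localhost
      # Assumption: the play uses the default inventory order, so the first
      # host of the first batch is ansible_play_hosts_all[0]. With a custom
      # order: setting, check that this still identifies the first batch.
      when: inventory_hostname == ansible_play_hosts_all[0]
```

With run_once, the when condition is evaluated for the first host of each batch, so in every batch after the first the announcement is skipped rather than repeated.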

Best practices

Security notes

Interview & exam questions

  1. When a task has delegate_to: lb1 while the play is iterating web1, which host’s facts and variables are in scope? web1’s — the iterated host. The task executes on lb1 (its connection, become, files), but inventory_hostname is web1 and the variables/facts are web1’s. To use the delegate’s own data, read hostvars['lb1'], and to gather the delegate’s facts, add delegate_facts: true.
  2. What does delegate_facts: true change? It stores facts gathered/set_fact-ed during the task against the delegate instead of the iterated host, so hostvars[delegate].ansible_facts is populated. Without it, delegated-gathered facts land on the iterated host — the classic “why is hostvars['db'].ansible_facts empty?” bug.
  3. Explain run_once and how it behaves under serial. run_once: true runs the task once (on the batch’s first host) and applies the result to every host in the batch. Under serial it runs once per batch, not once per play, which is a frequent surprise. Pair it with delegate_to to control where the single execution happens.
  4. Why is run_once: true + delegate_to: localhost such a common idiom? run_once controls how many times (once); delegate_to: localhost controls where (the control node). Together they express “do this fleet-global thing exactly once, locally” — perfect for notifications, API calls, or computing a value all hosts will use.
  5. What is the difference between serial and forks? serial sets how many hosts form a batch, and each batch makes a complete pass through the whole play before the next one starts (the rolling-update unit). forks is the number of hosts running a single task concurrently (the parallelism ceiling, default 5, global). They are independent; effective per-task concurrency is min(forks, batch_size).
  6. Walk 30 hosts with serial: 10 and forks: 5. What happens? Three batches of 10. Within each batch, every task runs on at most 5 hosts at once (two waves of 5), with a linear barrier after each task. So never more than 5 hosts execute a given task, and the rollout advances 10 hosts at a time.
  7. What does serial: [1, 5, "30%"] mean, and how are percentages computed? Batch 1 = 1 host (canary), batch 2 = 5, batch 3 onward = 30% of the host count, and the last element repeats for the remainder. Percentages are of the play’s host count, rounded down, with a floor of 1 (so "30%" of 2 hosts is 1, never 0).
  8. Compare the linear and free strategies. linear (default) has a barrier after every task — all hosts finish task N before any starts N+1, so cross-host ordering holds and the batch moves at the slowest host’s pace. free lets each host race through all its tasks independently — maximum throughput, but no ordering guarantees and murkier run_once/handler timing. Use free only for host-independent work.
  9. What is host_pinned and how does it differ from free? host_pinned is a bounded free: each worker (fork) takes a host and runs all of that host’s tasks before moving to the next host, rather than round-robining hosts per task. It reduces connection churn but, like free, gives no global ordering.
  10. How does throttle relate to forks? throttle: N caps concurrency for a single task/block below forks; it can only lower, never raise (min(forks, throttle, batch)). Use it to protect a fragile or shared step while the rest of the play runs at full forks.
  11. What does max_fail_percentage do in a rolling update and why is it essential? Evaluated per batch, it aborts the entire play once more than that percentage of a batch’s hosts fail, stopping a bad rollout before it breaks the whole fleet. Without it (or any_errors_fatal), a serial play carries on to the next batch as long as at least one host in the current batch succeeded, breaking hosts as it goes. max_fail_percentage: 0 means “stop on any failure”.
  12. Describe the complete zero-downtime rolling-update pattern. serial (often a canary list) + max_fail_percentage for safety; pre_tasks to drain each host from the LB (delegate_to the LB, keyed on inventory_hostname, with graceful drain); the upgrade + flush_handlers + a health check (often delegate_to: localhost, with until/retries); post_tasks to add the host back to the LB and verify it is serving. Each host is out of rotation while touched, so users never hit a half-upgraded instance.
  13. Why use flush_handlers in a rolling update? Handlers normally fire at the end of the play section that notified them (pre_tasks, roles/tasks or post_tasks) for the current batch, which is after a health check placed in the same section. meta: flush_handlers forces the restart immediately so the health check tests the restarted application, not the old one, before you re-add the host to the pool.
  14. What does order: shuffle give you in a rolling context, versus sorted? shuffle randomises host order each run, so the same host is not always the canary/first — spreading risk and preventing a hidden “always upgrade web1 first” assumption from hardening. sorted gives reproducible batch membership instead. Both override the default inventory order.
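
The pattern described in question 12 can be sketched with ansible.builtin modules only. This is an illustrative skeleton under stated assumptions, not a production playbook: the pool file stands in for a real load balancer, and the myapp service name, port 8080 and /health path are hypothetical placeholders. It also assumes the inventory names resolve from the control node.

```yaml
# Sketch only: /tmp/pool.txt, "myapp", port 8080 and /health are placeholders.
- name: Zero-downtime rolling update (sketch)
  hosts: web
  serial: [1, "50%"]            # canary first, then half the fleet per batch
  max_fail_percentage: 0        # abort the play on any failure in a batch

  pre_tasks:
    - name: Drain this host from the pool (runs on lb1, keyed on inventory_hostname)
      ansible.builtin.lineinfile:
        path: /tmp/pool.txt
        line: "{{ inventory_hostname }}"
        state: absent
      delegate_to: lb1
      throttle: 1               # serialise writers to the single pool file

  tasks:
    - name: Deploy the new build (placeholder for the real deploy step)
      ansible.builtin.command: "echo deploying to {{ inventory_hostname }}"
      changed_when: true
      notify: Restart app

    - name: Fire the restart now, before the health check
      ansible.builtin.meta: flush_handlers

    - name: Poll the host from the control node until it reports healthy
      ansible.builtin.uri:
        url: "http://{{ inventory_hostname }}:8080/health"
        status_code: 200
      register: health
      until: health.status == 200
      retries: 10
      delay: 3
      delegate_to: localhost

  post_tasks:
    - name: Add this host back to the pool (runs on lb1)
      ansible.builtin.lineinfile:
        path: /tmp/pool.txt
        line: "{{ inventory_hostname }}"
        state: present
        create: true
      delegate_to: lb1
      throttle: 1

  handlers:
    - name: Restart app
      ansible.builtin.service:
        name: myapp
        state: restarted
```

If the health check exhausts its retries, the host fails, max_fail_percentage: 0 stops the play, and the post_tasks add-back never runs for that host, so an unhealthy node stays out of the pool.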

Quick check

  1. During delegate_to: lb1 while iterating web1, whose facts does the task see by default?
  2. Under serial, how many times does a run_once task run — once per play or once per batch?
  3. Which setting limits how many hosts run a single task in parallel: serial or forks?
  4. What play keyword aborts a rolling update once too many hosts in a batch fail?
  5. Which strategy removes the per-task barrier so hosts progress independently?

Answers

  1. web1’s — the iterated host’s facts/vars are in scope; the task merely executes on lb1. Use delegate_facts: true (to store) and hostvars['lb1'] (to read) for the delegate’s own data.
  2. Once per batch. run_once is per serial batch, not per whole play — a common gotcha.
  3. forks (global per-task parallelism, default 5). serial sizes the batch, not per-task concurrency. (throttle lowers it further for one task.)
  4. max_fail_percentage (or any_errors_fatal: true to stop on any single failure).
  5. free (each host races through all its tasks; no cross-host ordering). host_pinned is its bounded cousin.
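
The interplay between serial, forks and throttle from answers 2 and 3 can be observed with a small play. This is a minimal sketch, assuming the lab's three-host web group; the file name pacing.yml is hypothetical.

```yaml
# Run with: ansible-playbook -i inventory.ini pacing.yml -f 5
- name: Pacing sketch (serial sizes the batch, forks caps per-task parallelism)
  hosts: web
  gather_facts: false
  serial: 2                     # each batch of 2 makes a full pass of the play
  tasks:
    - name: Runs on min(forks, batch) hosts at once
      ansible.builtin.command: "sleep 1"
      changed_when: false

    - name: Runs on min(forks, throttle, batch) hosts at once, here 1
      ansible.builtin.command: "sleep 1"
      changed_when: false
      throttle: 1

    - name: Once per batch, so twice in this play (batches of 2 and 1)
      ansible.builtin.debug:
        msg: "Batch done: {{ ansible_play_batch | join(', ') }}"
      run_once: true
```

Raising -f above 2 changes nothing here because the batch is the smaller limit, while lowering it to 1 would serialise the first task as well.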

Exercise

Write a single playbook rolling.yml for hosts: appservers (assume groups appservers and loadbalancers exist, and a database group with one host) that performs a safe rolling deploy. (a) Use serial: [1, "25%"] with order: sorted and max_fail_percentage: 20. (b) In pre_tasks, run a single schema-readiness check once per play (not per batch): give it run_once: true, delegate it to the database host, and guard it so that it fires only in the first batch, because under serial run_once alone repeats once per batch. Gather that host’s facts with delegate_facts: true so a later step can read hostvars[db].ansible_facts; then drain the current host from the load balancer with a task delegate_to the first member of loadbalancers, keyed on inventory_hostname. (c) In tasks, deploy the build, notify a Restart app handler, flush_handlers, then a health check with until/retries/delay delegated to localhost; throttle a “warm the cache” task to throttle: 1. (d) In post_tasks, add the host back (delegate_to the LB) and verify, then send one Slack-style notification with run_once: true + delegate_to: localhost. Finally, in two sentences, explain (i) why the schema check uses run_once and delegate_to while the drain uses only delegate_to, and (ii) what max_fail_percentage: 20 protects you from that a bare serial does not.

Certification mapping

Glossary

Next steps

You can now control where a task runs (delegate_to, delegate_facts, local_action, connection: local), how many times (run_once), and how the fleet is walked (serial batches, the linear/free/host_pinned/debug strategies, forks, throttle, and order) — and you can assemble them into a safe, zero-downtime rolling update with max_fail_percentage and a delegated drain/add-back. The next lesson, Tuning Ansible for Speed & Scale, In Depth, takes the forks and free levers further into pure performance — SSH multiplexing and pipelining, fact gathering and caching, async/poll for long-running tasks, and profiling — so your rollouts are not just safe but fast. To revisit the flow-control primitives these patterns build on, see Ansible Conditionals, Loops, Handlers & Tags, In Depth; and for the failure-handling foundations under max_fail_percentage and any_errors_fatal, return to Ansible Error Handling, In Depth.
