Shell Lesson 32 of 42

Writing systemd Units That Wrap Shell Scripts Properly: Type Selection, Restart Policy, Hardening, Watchdogs, Timers & sd_notify

Why a systemd Unit Around Your Script Matters

Your script runs fine when you SSH in and run it. You want it to run on reboot, restart on failure, log to journald, time out after an hour, run with restricted privileges, and pull secrets from /etc/myapp/env only readable by root. You write a unit file:

[Service]
ExecStart=/opt/myapp/bin/run.sh

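It works. Six months later you discover what those two lines left to the defaults: the script runs as root, nothing restarts it after a crash, nothing stops it if it hangs, and nothing limits what it can touch.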

This lesson covers the unit file as a contract between your script and systemd: how to declare what the script promises (will exit normally, will signal readiness, won’t fork into the background), how to harden the runtime, and how to test that the unit does what you think it does. It’s the capstone of Tier 4 — every script you write at this level should ship with a hardened unit.

The Anatomy of a Service Unit File

A systemd unit lives at /etc/systemd/system/myapp.service (system-wide) or ~/.config/systemd/user/myapp.service (per-user). Reload after editing:

sudo systemctl daemon-reload
sudo systemctl enable --now myapp.service
sudo systemctl status myapp.service

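Unit file structure. The trailing # annotations here and in later listings are for reading only: systemd recognizes a comment only when # or ; starts the line, so strip them or move them onto their own lines before installing the unit: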

[Unit]                          # metadata: when to start, what it depends on
Description=My App Service
After=network-online.target     # ordering: start after this
Wants=network-online.target     # weak dep: pulls it in if available
Requires=postgresql.service     # strong dep: fail if this can't start
ConditionPathExists=/etc/myapp/config.json  # don't start if this is missing

[Service]                       # how to run
Type=simple                     # how systemd knows when "started"
User=myapp                      # don't run as root
Group=myapp
EnvironmentFile=/etc/myapp/env  # source env vars from a file
ExecStartPre=/opt/myapp/bin/preflight.sh  # run BEFORE main; failure aborts service
ExecStart=/opt/myapp/bin/run.sh # the main process
ExecReload=/bin/kill -HUP $MAINPID  # systemctl reload sends this
ExecStopPost=/opt/myapp/bin/cleanup.sh  # run AFTER stop, success or fail
Restart=on-failure
RestartSec=5
TimeoutStartSec=60
TimeoutStopSec=30

[Install]                       # who triggers `systemctl enable`
WantedBy=multi-user.target      # start at multi-user (normal boot)

Three sections. [Unit] is the metadata; [Service] is the contract; [Install] is what systemctl enable activates.
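Because systemd treats # as a comment only at the start of a line, an annotation left after a value becomes part of that value. A minimal sketch of a pre-install check follows; the function name find_trailing_comments is hypothetical, and the pattern is a heuristic that will also flag a value that legitimately contains " #".

```shell
#!/usr/bin/env bash
# Hypothetical helper: list lines of a unit file that carry a trailing
# "# ..." annotation. Whole-line comments (starting with # or ;) are skipped.
find_trailing_comments() {
  local unit_file=$1
  # First non-blank character is not a comment marker, and a "#" appears
  # later on the line after some whitespace.
  grep -nE '^[[:space:]]*[^#;[:space:]].*[[:space:]]#' "$unit_file"
}
```

Run it over a unit before daemon-reload; every line it prints is one to clean up. grep exits non-zero when nothing matches, so under set -e call it as find_trailing_comments myapp.service || true.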

Type=: How systemd Knows When You’re “Ready”

Type= is the most-confused field in unit files. Pick wrong and your dependencies start before your service is actually ready, or systemd thinks your service crashed when it didn’t.

Type | Meaning | Use for
-----|---------|--------
simple (default) | Service is “active” the moment ExecStart’s first process is forked | Daemons that don’t fork; wrong for scripts that exit
forking | Service is “active” once the parent process exits (the child becomes the daemon) | Old-style daemons that double-fork (Apache 2.2, dhcpd)
oneshot | Run ExecStart, wait for it to exit, mark service as completed (or failed) | Setup scripts, one-shot jobs, anything that’s not long-running
notify | Service must call sd_notify(READY=1) to signal it’s ready | Long-running daemons that have a non-trivial init phase
notify-reload | Like notify, but systemctl reload sends a signal (SIGHUP by default) and waits for the service to report RELOADING=1 followed by READY=1 (systemd 253+) | Services with reload handling
dbus | Service is ready when it claims a D-Bus name | D-Bus services
idle | Like simple, but delay execution until other jobs finish | Avoiding interleaved console output at boot

Type=simple is wrong for shell scripts that exit

# WRONG: a script that runs to completion, with Type=simple.
[Service]
Type=simple
ExecStart=/opt/myapp/bin/sync-data.sh

This unit “succeeds” the moment fork returns, even if the script exits 200ms later. systemctl is-active returns “active” briefly, then “inactive” — confusing for monitoring. Worse, dependents that say After=myapp.service will start concurrently with sync-data.sh, not after.

Use Type=oneshot for “run once and exit”

[Service]
Type=oneshot
ExecStart=/opt/myapp/bin/sync-data.sh
RemainAfterExit=yes   # report active even after exit (so dependents see "completed")

oneshot waits for the script to exit. RemainAfterExit=yes keeps is-active returning “active” so dependents like backup-completion services can chain off of it.

Use Type=notify for daemons with a real init phase

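[Service]
Type=notify
# systemd-notify runs as a child of the script. Under an unprivileged User= it
# cannot present itself as the main PID, so NotifyAccess=main may drop its messages.
NotifyAccess=all
ExecStart=/opt/myapp/bin/run-daemon.sh
WatchdogSec=30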

Inside the script:

#!/usr/bin/env bash
set -Eeuo pipefail

# ... initialization that takes a while ...
load_config
warm_caches
open_database

# Tell systemd we're ready.
systemd-notify --ready --status="Listening on :8080"

# Main loop.
while :; do
  # ... do work ...
  systemd-notify WATCHDOG=1   # reset the watchdog timer
  sleep 10
done

systemd-notify is the shell-friendly wrapper around sd_notify(3). --ready tells systemd “I’m done initializing.” After that, dependents start.
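When you run the same script by hand for debugging, there is no notification socket to talk to. A minimal sketch of a wrapper that makes the calls harmless outside systemd is shown here; the function name notify is hypothetical, while NOTIFY_SOCKET is the variable systemd sets for services that are allowed to send notifications.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around systemd-notify that is safe to call anywhere.
notify() {
  # Only talk to systemd when it gave us a socket and the tool is installed.
  if [[ -n ${NOTIFY_SOCKET:-} ]] && command -v systemd-notify >/dev/null 2>&1; then
    # A failed notification is reported but never aborts the script.
    systemd-notify "$@" || echo "warning: systemd-notify $* failed" >&2
  fi
  return 0
}
```

In the script above you would then write notify --ready --status="Listening on :8080" and notify WATCHDOG=1, and the same file runs unchanged from an SSH session.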

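WatchdogSec=30 says: if the service doesn’t send WATCHDOG=1 within 30 seconds, systemd assumes it’s hung, kills it (SIGABRT by default), and marks it failed. Add Restart=on-failure (or on-watchdog) so it is then restarted. This is the canonical way to detect a daemon that’s running but stuck.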
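Hard-coding the sleep next to WatchdogSec= means two numbers to keep in sync. systemd exports the configured timeout to the service as WATCHDOG_USEC (microseconds), so the script can derive its own ping interval. A minimal sketch, assuming a hypothetical function name and a fallback of 10 seconds when the variable is absent:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a watchdog ping interval in whole seconds.
# Uses half of WATCHDOG_USEC when systemd provides it, else a default.
watchdog_ping_interval() {
  local default_sec=${1:-10}
  local usec=${WATCHDOG_USEC:-}
  local sec
  if [[ $usec =~ ^[0-9]+$ ]]; then
    # Half the timeout, converted from microseconds to seconds.
    sec=$(( 10#$usec / 2000000 ))
    # Never return 0, which would turn the main loop into a busy loop.
    (( sec < 1 )) && sec=1
    echo "$sec"
  else
    echo "$default_sec"
  fi
}
```

The main loop can then end with sleep "$(watchdog_ping_interval)" instead of a fixed sleep 10.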

Restart Policy: Don’t Death-Loop

[Unit]
StartLimitIntervalSec=60
StartLimitBurst=5

[Service]
Restart=on-failure
RestartSec=5

Restart=:

Value | Restart on
------|-----------
no | Never (the default for every service type)
on-success | Clean exit (rare; useful for retry-loops)
on-failure | Non-zero exit, signal kill, timeout, watchdog timeout
on-abnormal | Signal kill, timeout, or watchdog only (not a plain non-zero exit)
on-watchdog | Watchdog timeout only
on-abort | Uncaught signal only (for example SIGABRT)
always | Every exit, success or fail

RestartSec=5: wait 5 seconds before restarting. The default is 100ms, so without it a script that crashes immediately is respawned in a tight loop until the start limit stops it.

StartLimitIntervalSec=60 + StartLimitBurst=5: if there are 5 starts within 60 seconds, refuse further restarts. This is the kill-switch that prevents infinite respawn loops.

Combined: a buggy script gets 5 retries, then systemd gives up, marks the unit failed, and stops trying. You see systemctl status say “start request repeated too quickly” — exactly the diagnosis you want, not a CPU-burning host.
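The three numbers interact: if RestartSec= is large relative to the interval, the burst count is never reached and the unit restarts forever. A rough sketch of the arithmetic, assuming the script fails almost instantly so that starts are spaced RestartSec apart (the function name is hypothetical):

```shell
#!/usr/bin/env bash
# Hypothetical check: can the start limit ever trip with these settings?
# Arguments: RestartSec, StartLimitBurst, StartLimitIntervalSec (all seconds).
start_limit_can_trip() {
  local restart_sec=$1 burst=$2 interval_sec=$3
  # The start that would exceed the burst happens roughly
  # burst * restart_sec after the first one; it must fall inside the window.
  (( burst * restart_sec < interval_sec ))
}
```

With the values above, start_limit_can_trip 5 5 60 succeeds (25 seconds is inside the 60-second window). With RestartSec=30 it fails, which is a hint to widen StartLimitIntervalSec= or shorten the delay.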

Sandboxing: Defense in Depth From the Service File

Hardening directives let you restrict the script’s privileges from outside the script. Even if the script has bugs (or is compromised), the impact is contained.

[Service]
# ─── Identity ────────────────────────────────────────────────────────
User=myapp
Group=myapp
DynamicUser=no            # if yes: systemd creates a transient user; great for one-shots

# ─── Filesystem isolation ───────────────────────────────────────────
ProtectSystem=strict       # whole file system read-only except /dev, /proc, /sys
ReadWritePaths=/var/lib/myapp /var/log/myapp   # opt-in writable paths
ProtectHome=true           # /home, /root, /run/user invisible
PrivateTmp=true            # private /tmp, /var/tmp; cleared on stop
PrivateDevices=true        # only /dev/null, /dev/zero, /dev/random, etc.
ProtectKernelTunables=true # /proc/sys, /sys read-only
ProtectKernelModules=true  # cannot load modules
ProtectControlGroups=true  # /sys/fs/cgroup read-only
ProtectClock=true          # cannot change system time

# ─── Capabilities & privilege ───────────────────────────────────────
NoNewPrivileges=true       # PR_SET_NO_NEW_PRIVS; cannot gain privileges via setuid
CapabilityBoundingSet=     # drop ALL capabilities (empty = none)
AmbientCapabilities=       # no ambient caps either
RestrictSUIDSGID=true      # cannot create SUID/SGID files

# ─── Network ─────────────────────────────────────────────────────────
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6   # block AF_PACKET, AF_NETLINK
PrivateNetwork=false        # set true for no network at all (offline-only scripts)
IPAddressDeny=any           # deny all (then allow specific):
IPAddressAllow=10.0.0.0/8 127.0.0.0/8

# ─── System call filtering ──────────────────────────────────────────
SystemCallFilter=@system-service
SystemCallFilter=~@privileged @resources @debug @mount @raw-io
SystemCallArchitectures=native    # block 32-bit ABI on 64-bit kernels

# ─── Resource limits ────────────────────────────────────────────────
MemoryMax=512M
CPUQuota=50%
TasksMax=128
LimitNOFILE=4096

Hardening rationale

Test what hardening actually applies

# Score the unit's sandboxing settings (the unit must be loaded, not necessarily running).
systemd-analyze security myapp.service
# Outputs a score 0-10 (lower = more hardened) and a per-directive breakdown.

# Show what each setting expanded to.
systemctl show myapp.service | grep -E '^(Protect|Restrict|Cap|System|Memory)'

systemd-analyze security is the audit tool. Score under 3 is excellent; under 5 is acceptable; over 7 means you’re running with way too much privilege.

Logging: Just Use Journald

[Service]
StandardOutput=journal
StandardError=journal
SyslogIdentifier=myapp

journal is the default. SyslogIdentifier sets the tag in journalctl output (otherwise journald uses the executable name).

In your script: write to stdout/stderr. Don’t open /var/log/myapp.log yourself. journald captures everything, indexes by unit, retains structured fields.

journalctl -u myapp.service                    # all logs for this unit
journalctl -u myapp.service -f                 # follow
journalctl -u myapp.service --since '1h ago'   # last hour
journalctl -u myapp.service -p err             # errors and worse
journalctl -u myapp.service -o json            # structured output

For priority-tagged logs from shell, prefix each line with a syslog level in angle brackets:

# In your script:
log() {
  # SyslogIdentifier= already supplies the tag, so print only priority and message.
  printf '<%s>%s\n' "$1" "$2"
}
log 6 "service starting"     # priority 6 = info (syslog severity)
log 3 "database unreachable" # priority 3 = error

systemd-journald reads the leading <N> and assigns the priority field. Combined with journalctl -p err, you can filter precisely.
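journald parses the prefix per line, so a multi-line message needs the <N> on every line or the continuation lines fall back to the default priority. A small sketch with named levels; the function names log_at, log_info, log_warn, and log_err are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical logging helpers: prefix every output line with <priority>.
log_at() {
  local priority=$1 line
  shift
  # Split the message on newlines so each line carries its own prefix.
  while IFS= read -r line; do
    printf '<%d>%s\n' "$priority" "$line"
  done <<< "$*"
}
log_info() { log_at 6 "$@"; }   # 6 = info
log_warn() { log_at 4 "$@"; }   # 4 = warning
log_err()  { log_at 3 "$@"; }   # 3 = error
```

A stack trace passed to log_err then shows up in journalctl -p err as a whole, not just its first line.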

EnvironmentFile: Secrets and Config Without Hard-Coding

[Service]
EnvironmentFile=-/etc/myapp/env
EnvironmentFile=/etc/myapp/local-env

The leading - means “okay if missing.” Files are loaded in order; later files override earlier.

The file:

# /etc/myapp/env
DATABASE_URL=postgres://app:hidden@db.internal/app
API_KEY=secret-value
LOG_LEVEL=info

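Restrict access. systemd reads EnvironmentFile= as root before switching to User=, so root:root with mode 0600 is enough when nothing else needs the file; the group-readable variant below also lets the service account read it directly: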

chown root:myapp /etc/myapp/env
chmod 0640 /etc/myapp/env
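Permissions drift, so it can help to check them every time the service starts, for example from the preflight script. A minimal sketch, assuming GNU stat (the -c option) and a hypothetical function name:

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed only if "other" has no permissions on the file.
# Returns 2 if the file cannot be inspected.
env_file_is_private() {
  local file=$1 mode
  mode=$(stat -c '%a' "$file") || return 2
  # The last octal digit holds the permission bits for "other".
  (( (8#$mode & 8#007) == 0 ))
}
```

In a preflight script this reads env_file_is_private /etc/myapp/env || { echo "env file is world-accessible" >&2; exit 1; }, which aborts the start through ExecStartPre=.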

Better: LoadCredential (systemd 247+)

For modern systemd, LoadCredential= reads a secret into the service’s ${CREDENTIALS_DIRECTORY} and exposes it without leaking via env or /proc/$pid/environ:

[Service]
LoadCredential=db-password:/etc/myapp/db-password
ExecStart=/opt/myapp/bin/run.sh

In the script:

DB_PASSWORD=$(< "${CREDENTIALS_DIRECTORY}/db-password")

${CREDENTIALS_DIRECTORY} points to an in-memory directory that systemd sets up for this service alone; other services never see it. If your systemd is new enough, prefer it to env-var secrets.
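Outside the unit, CREDENTIALS_DIRECTORY is unset and the one-liner above would try to read /db-password. A small sketch of a reader that fails loudly instead; the function name read_credential is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one credential by name, or fail with a message
# when the script is not running under a unit that uses LoadCredential=.
read_credential() {
  local name=$1
  local dir=${CREDENTIALS_DIRECTORY:-}
  if [[ -z $dir || ! -r $dir/$name ]]; then
    echo "credential '$name' not available" >&2
    return 1
  fi
  # $(<file) drops trailing newlines, which suits single-line secrets.
  printf '%s' "$(< "$dir/$name")"
}
```

Usage in the script: DB_PASSWORD=$(read_credential db-password) || exit 1.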

Timer Units: Replacing cron

A timer unit triggers a service unit on a schedule. Two files: the timer and its corresponding service.

# /etc/systemd/system/backup.service
[Unit]
Description=Nightly backup
ConditionACPower=true       # don't run on battery (laptops)

[Service]
Type=oneshot
User=backup
ExecStart=/opt/backup/bin/run.sh
Nice=19                     # lowest CPU priority
IOSchedulingClass=idle      # only run when no other I/O

# /etc/systemd/system/backup.timer
[Unit]
Description=Run backup nightly

[Timer]
OnCalendar=*-*-* 03:00:00     # every day at 3 AM
RandomizedDelaySec=900        # spread load: actual run is 03:00–03:15
Persistent=true               # if missed (host was off), run on next boot
Unit=backup.service           # what to start

[Install]
WantedBy=timers.target

Enable:

sudo systemctl enable --now backup.timer
systemctl list-timers --all
# NEXT                         LEFT       LAST  PASSED  UNIT          ACTIVATES
# Tue 2025-01-14 03:11:23 UTC  14h left   -     -       backup.timer  backup.service

OnCalendar syntax

OnCalendar=daily               # 00:00:00 every day
OnCalendar=hourly              # minute 0 of every hour
OnCalendar=Mon..Fri 09:00      # weekdays at 9 AM
OnCalendar=*-*-01 04:00:00     # 1st of every month at 4 AM
OnCalendar=2025-12-31 23:59:00 # one specific time
OnCalendar=*:0/15              # every 15 minutes
OnCalendar=*-*-* 03:00:00      # every day at 3 AM

systemd-analyze calendar 'Mon..Fri 09:00' validates and shows the next firing.
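The same command works as a gate in a deploy script, so a typo in a schedule is caught before the timer is installed. A minimal sketch, assuming a hypothetical function name and a return code of 2 for hosts where systemd-analyze is not installed:

```shell
#!/usr/bin/env bash
# Hypothetical gate: 0 = expression accepted, 1 = rejected,
# 2 = systemd-analyze not available on this host.
check_calendar() {
  local expr=$1
  command -v systemd-analyze >/dev/null 2>&1 || return 2
  if systemd-analyze calendar "$expr" >/dev/null 2>&1; then
    return 0
  fi
  return 1
}
```

For example: check_calendar '*-*-* 03:00:00' || echo "fix the OnCalendar= line before enabling the timer" >&2.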

Cron equivalents

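cron line | systemd OnCalendar
----------|-------------------
0 3 * * * | *-*-* 03:00:00
*/15 * * * * | *:0/15
0 9 * * 1-5 | Mon..Fri 09:00
0 0 1 * * | *-*-01 00:00:00
@reboot | no OnCalendar form; use OnBootSec=2min in the [Timer] section, or a plain service enabled with WantedBy=multi-user.target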

Why timers beat cron

A Hardened Production Template

# /etc/systemd/system/myapp.service

[Unit]
Description=My App
After=network-online.target postgresql.service
Wants=network-online.target
Requires=postgresql.service
StartLimitIntervalSec=60
StartLimitBurst=5

[Service]
Type=notify
# all, not main: systemd-notify is a child process and cannot claim the main PID under User=myapp
NotifyAccess=all
WatchdogSec=30

User=myapp
Group=myapp
SupplementaryGroups=

EnvironmentFile=-/etc/myapp/env
LoadCredential=db-password:/etc/myapp/db-password
WorkingDirectory=/opt/myapp

ExecStartPre=/opt/myapp/bin/preflight.sh
ExecStart=/opt/myapp/bin/run.sh
ExecReload=/bin/kill -HUP $MAINPID

Restart=on-failure
RestartSec=5
TimeoutStartSec=120
TimeoutStopSec=30
KillSignal=SIGTERM
KillMode=mixed

# Logging
StandardOutput=journal
StandardError=journal
SyslogIdentifier=myapp
LogRateLimitIntervalSec=10
LogRateLimitBurst=200

# Filesystem
ProtectSystem=strict
ReadWritePaths=/var/lib/myapp /var/log/myapp
ProtectHome=true
PrivateTmp=true
PrivateDevices=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectKernelLogs=true
ProtectControlGroups=true
ProtectClock=true
ProtectHostname=true
ProtectProc=invisible
ProcSubset=pid

# Privilege
NoNewPrivileges=true
CapabilityBoundingSet=
AmbientCapabilities=
RestrictSUIDSGID=true
RestrictRealtime=true
LockPersonality=true
MemoryDenyWriteExecute=true

# Network
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6
RestrictNamespaces=true

# Syscalls
SystemCallFilter=@system-service
SystemCallFilter=~@privileged @resources @debug @mount @raw-io @reboot @swap
SystemCallArchitectures=native
SystemCallErrorNumber=EPERM

# Resources
MemoryMax=1G
MemoryHigh=768M
CPUQuota=200%
TasksMax=256
LimitNOFILE=8192
LimitNPROC=128
LimitCORE=0

# Misc
UMask=0027

[Install]
WantedBy=multi-user.target

This template scores well on systemd-analyze security (typically 1–2 out of 10, the “OK” range). Adjust it by relaxing the restrictions that block something your script actually needs (for example, remove MemoryDenyWriteExecute=true for JIT languages).

Real-World Recipes

Recipe 1: One-shot setup script with idempotent guard

[Unit]
Description=One-time database initialization
ConditionPathExists=!/var/lib/myapp/initialized
After=postgresql.service
Requires=postgresql.service

[Service]
Type=oneshot
User=myapp
ExecStart=/opt/myapp/bin/init-db.sh
ExecStartPost=/usr/bin/touch /var/lib/myapp/initialized
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target

The ConditionPathExists=!/var/lib/myapp/initialized means: don’t run if the marker file exists. After the script succeeds, ExecStartPost creates the marker. On reboot, the unit is “skipped (precondition not met)” and journalctl logs that. Idempotent across reboots.

Recipe 2: Long-running daemon with watchdog

/opt/myapp/bin/run-daemon.sh:

#!/usr/bin/env bash
set -Eeuo pipefail

# Initialization phase.
load_config
warm_caches
open_database

# Tell systemd we're ready.
systemd-notify --ready --status="Listening on :8080"

# Main loop with periodic watchdog ping.
while :; do
  if ! main_iteration; then
    systemd-notify --status="Iteration failed; exiting"
    exit 1
  fi
  systemd-notify WATCHDOG=1 --status="Iteration completed at $(date -u +%FT%TZ)"
  sleep 5
done

The unit sets Type=notify, WatchdogSec=30, and Restart=on-failure. If main_iteration hangs for more than 30 seconds, the watchdog fires, systemd kills the service, and the restart policy brings it back.

Recipe 3: Timer-driven backup with offline persistence

/etc/systemd/system/myapp-backup.service:

[Unit]
Description=MyApp backup

[Service]
Type=oneshot
User=backup
ExecStart=/opt/myapp/bin/backup.sh
Nice=19
IOSchedulingClass=idle
TimeoutStartSec=2h
StandardOutput=journal
SyslogIdentifier=myapp-backup
ProtectSystem=strict
ReadWritePaths=/var/lib/backup /var/lib/myapp
PrivateTmp=true
NoNewPrivileges=true

/etc/systemd/system/myapp-backup.timer:

[Unit]
Description=Run myapp backup daily

[Timer]
OnCalendar=*-*-* 03:00:00
RandomizedDelaySec=15min
Persistent=true
Unit=myapp-backup.service

[Install]
WantedBy=timers.target

Persistent=true means: if the host was off at 03:00, run the backup as soon as the host is up. This is what cron’s @reboot should be — guaranteed catch-up.

Recipe 4: Service with reload that re-reads config

[Service]
Type=notify
ExecStart=/opt/myapp/bin/run.sh
ExecReload=/bin/kill -HUP $MAINPID

In the script:

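reload_config() {
  echo "received SIGHUP; reloading config"
  # Announce the reload before doing the work.
  # --reloading was added in systemd 253; older versions take RELOADING=1.
  systemd-notify --reloading
  load_config
  warm_caches
  systemd-notify --ready --status="Reloaded at $(date -u +%FT%TZ)"
}
trap reload_config HUP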

# main loop

systemctl reload myapp sends SIGHUP, and the script re-reads its config without exiting. Prefer systemctl reload over systemctl restart when the change is config-only: there is no downtime.
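One bash detail decides whether the reload feels instant: a trap runs only after the current foreground command returns, so a main loop sitting in sleep 60 handles SIGHUP up to a minute late. Sleeping in the background and waiting on it avoids that, because wait returns as soon as a trapped signal arrives. A minimal sketch with a hypothetical function name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: a sleep that a trapped signal can cut short.
interruptible_sleep() {
  local seconds=$1 pid
  sleep "$seconds" &
  pid=$!
  # wait returns early (status > 128) when a trapped signal arrives,
  # and bash then runs the trap handler.
  wait "$pid" || true
  # Stop the sleeper if it is still running after an early return.
  kill "$pid" 2>/dev/null || true
}
```

In the main loop, replace sleep 5 with interruptible_sleep 5 and the reload_config handler runs as soon as the signal is delivered.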

Recipe 5: Per-instance template units

You have 5 worker queues, identical config except for the queue name. Use a template unit:

/etc/systemd/system/myapp-worker@.service:

[Unit]
Description=MyApp worker for queue %i

[Service]
Type=notify
User=myapp
Environment=QUEUE_NAME=%i
ExecStart=/opt/myapp/bin/worker.sh
Restart=on-failure

Enable with the instance name after @:

sudo systemctl enable --now myapp-worker@orders.service
sudo systemctl enable --now myapp-worker@billing.service
sudo systemctl enable --now myapp-worker@notifications.service

%i in the unit file is replaced with the part after @. Five instances, one unit file, individual control: systemctl restart myapp-worker@orders.
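With more than a couple of queues, generating the instance names keeps the list in one place. A small sketch that only prints names and executes nothing; the function name worker_units is hypothetical, and queue names containing characters outside letters, digits, and a few punctuation marks would need systemd-escape first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one template instance name per queue argument.
worker_units() {
  local queue
  for queue in "$@"; do
    printf 'myapp-worker@%s.service\n' "$queue"
  done
}
```

systemctl accepts several units at once, so sudo systemctl enable --now $(worker_units orders billing notifications) enables all three in a single call.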

Footgun List

  1. Type=simple for a script that exits. Use oneshot, not simple. Service will appear to “succeed” then immediately become inactive.

  2. Restart=always without RestartSec. The default delay is 100ms, so a crash loop respawns as fast as it can until the start limit trips. Always set RestartSec= and StartLimitBurst=.

  3. User=root because you didn’t think about it. Default is root. Always set User= to a service account; for one-shots, consider DynamicUser=true.

  4. After=network.target instead of network-online.target. network.target only guarantees the network stack is initialized, not that the network is up. For network-dependent services, use After=network-online.target together with Wants=network-online.target.

  5. EnvironmentFile= with permissive mode. chmod 0640 with group read for the service account; never world-readable.

  6. Logging to a file you also tee to journald. Pick one. Logs in two places means a 2x storage bill and grep confusion.

  7. ExecReload=systemctl reload-or-try-restart (recursive) — don’t do this. ExecReload is the implementation of reload, usually kill -HUP $MAINPID.

  8. ProtectSystem=strict with no ReadWritePaths=. Service has nowhere to write. Add ReadWritePaths=/var/lib/myapp etc. for the legitimate writable paths.

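  9. WatchdogSec= in a service that never sends WATCHDOG=1. The watchdog only helps if the script pings it through sd_notify (Type=notify plus systemd-notify WATCHDOG=1 in the main loop); a service that sets WatchdogSec= but never pings is killed each time the interval expires.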

  10. ConditionPathExists= confused with RequiresMountsFor=. ConditionPathExists is checked once at start; if false, the unit is skipped. RequiresMountsFor= ensures a path’s mount is up. Different semantics.

  11. Editing the unit and forgetting daemon-reload. systemd caches unit files; without daemon-reload, your changes don’t apply.

  12. Forgetting [Install] means systemctl enable does nothing. The WantedBy= is what creates the symlink that triggers auto-start.

Quick-Reference Card

┌─ Type SELECTION ──────────────────────────────────────────────────────┐
│  Type=oneshot     scripts that run-and-exit (with RemainAfterExit)   │
│  Type=simple      daemons that don't fork; trivial init               │
│  Type=notify      daemons with non-trivial init (call sd_notify)      │
│  Type=forking     legacy double-forking daemons (rare today)          │
└────────────────────────────────────────────────────────────────────────┘

┌─ RESTART POLICY ──────────────────────────────────────────────────────┐
│  Restart=on-failure    most services                                  │
│  RestartSec=5          back-off between restarts                      │
│  StartLimitIntervalSec=60                                             │
│  StartLimitBurst=5     max 5 starts in 60s; then "failed"             │
└────────────────────────────────────────────────────────────────────────┘

┌─ HARDENING (DEFAULT-ON FOR NEW SERVICES) ─────────────────────────────┐
│  ProtectSystem=strict + ReadWritePaths=...                            │
│  ProtectHome=true                                                     │
│  PrivateTmp=true                                                      │
│  NoNewPrivileges=true                                                 │
│  CapabilityBoundingSet=                                               │
│  SystemCallFilter=@system-service                                     │
│  RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6                     │
│  MemoryMax=N TasksMax=N CPUQuota=X%                                   │
└────────────────────────────────────────────────────────────────────────┘

┌─ sd_notify FROM SHELL ────────────────────────────────────────────────┐
│  systemd-notify --ready                  service is started            │
│  systemd-notify WATCHDOG=1               kick the watchdog timer       │
│  systemd-notify --status="..."          set status field for status   │
│  systemd-notify --stopping              shutting down                  │
│  systemd-notify --reloading             reloading config               │
└────────────────────────────────────────────────────────────────────────┘

┌─ TIMER ESSENTIALS ────────────────────────────────────────────────────┐
│  OnCalendar=*-*-* 03:00:00     daily 3 AM                             │
│  RandomizedDelaySec=15min      jitter to avoid thundering herd        │
│  Persistent=true               run on boot if missed                  │
│  systemd-analyze calendar EXP  validate the expression                │
└────────────────────────────────────────────────────────────────────────┘

┌─ AUDIT COMMANDS ──────────────────────────────────────────────────────┐
│  systemd-analyze security UNIT     hardening score                    │
│  systemctl show UNIT               all expanded settings              │
│  systemctl status UNIT             current state + recent log         │
│  journalctl -u UNIT [-f] [--since] log access                         │
│  systemctl list-timers --all       all configured timers              │
│  systemd-analyze calendar 'EXPR'   validate timer schedule            │
└────────────────────────────────────────────────────────────────────────┘

Tier 4 Capstone

This lesson closes Tier 4. You now have the toolset:

What ties them together: every script you write at this level is inspectable, reversible, and bounded. You can trace what it did, you can roll it back, and you can put a wall around what it can do (Linux capabilities, systemd hardening, IAM scope). These are the skills that separate scripts that survive five years from scripts that break next quarter.

The next tier (Wave 4: Tier 5 Specialist) takes these foundations and applies them to specific operator domains: bootstrap and cloud-init, monitoring and watchdogs, backup/restore, database admin, log analysis at scale, self-healing systems, migrations, compliance, and forensics. Each lesson treats shell as the integration glue between disciplined script craft and the operational realities of running production systems.
