Configure PostgreSQL Continuous Archiving and Point-in-Time Recovery with pgBackRest to S3

At 02:14 on a Tuesday a release migration on a fintech’s primary PostgreSQL 16 cluster runs UPDATE ledger_entry SET status = 'void' without its WHERE clause, and 9.4 million rows flip to void before someone kills the session at 02:17. The nightly pg_dump is fourteen hours stale, so restoring it would lose a full trading day. What the on-call DBA actually needs is the ability to rewind the database to 02:13:30 — the instant before the bad transaction — and not one second further. That is point-in-time recovery (PITR), and pg_dump cannot do it. This guide builds the capability that can: pgBackRest taking full and incremental backups, PostgreSQL continuously archiving its write-ahead log (WAL) to Amazon S3, and a rehearsed recovery to an exact timestamp or transaction ID. We wire it the way a regulated shop runs it — credentials out of files, recoveries gated through change control, and the whole thing watched.

PITR works because PostgreSQL already writes every change to the WAL before it touches a data file. If you keep a base backup plus an unbroken chain of WAL segments from that backup forward, you can restore the base and then replay WAL up to any recovery_target_time, recovery_target_xid, or named restore point in between. pgBackRest is the tool that makes this robust at scale: parallel compressed backups, incremental and differential backups, block-level deltas, integrity checksums on every file, and first-class S3 support so your backups and WAL live in durable, versioned object storage rather than on the box that just failed.

Prerequisites

A PostgreSQL 13+ primary you administer (examples use PostgreSQL 16 on Ubuntu 22.04, data dir /var/lib/postgresql/16/main).
pgbackrest 2.50+ installed on the database host (apt-get install pgbackrest or the PGDG repo).
An S3 bucket with Object Lock and versioning available — examples use s3://kv-pgbackrest-prod in ap-south-1.
An AWS IAM role or user scoped to that bucket only (policy shown in Step 1). In production the credential is delivered by HashiCorp Vault’s AWS secrets engine as a short-lived, dynamically generated key — never a static key in a file.
Human and machine identity through Okta federated to AWS IAM Identity Center (SSO), so a DBA assumes the pgbackrest-operator role with their own audited identity rather than a shared login. Entra ID federates the same way for staff who live in the Microsoft tenant.
sudo/root on the DB host, and the ability to restart PostgreSQL once.

Target topology

Configure PostgreSQL Continuous Archiving and Point-in-Time Recovery with pgBackRest to S3 — topology

The shape is deliberately simple. The PostgreSQL primary runs the pgbackrest binary locally. On a schedule (driven by cron, or by a Jenkins job or GitHub Actions workflow for auditable, logged runs), pgBackRest pushes full, differential, and incremental backups to an S3 repository. Independently and continuously, PostgreSQL’s archive_command hands every completed WAL segment to pgBackRest, which copies it to the same S3 repository — so the gap between “last backup” and “now” is always covered by archived WAL. A standby or a throwaway recovery host can pull a base backup plus the WAL chain back from S3 and replay to any chosen point. Terraform provisions the bucket, IAM, KMS key, and lifecycle rules; Ansible installs and templates pgbackrest.conf across hosts so every node is configured identically.

1. Provision the S3 repository and a scoped IAM identity

Treat the bucket as immutable infrastructure. This Terraform creates a versioned, encrypted, Object-Lock bucket plus a lifecycle rule that expires old backup objects, and an IAM policy scoped to exactly that bucket. Object Lock in GOVERNANCE mode means even a compromised DB credential cannot delete recent backups — directly defeating ransomware that targets backups first.

resource "aws_s3_bucket" "pgbackrest" {
  bucket              = "kv-pgbackrest-prod"
  object_lock_enabled = true
}

resource "aws_s3_bucket_versioning" "pgbackrest" {
  bucket = aws_s3_bucket.pgbackrest.id
  versioning_configuration { status = "Enabled" }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "pgbackrest" {
  bucket = aws_s3_bucket.pgbackrest.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.pgbackrest.arn
    }
    bucket_key_enabled = true
  }
}

resource "aws_s3_bucket_lifecycle_configuration" "pgbackrest" {
  bucket = aws_s3_bucket.pgbackrest.id
  rule {
    id     = "expire-old-backups"
    status = "Enabled"
    noncurrent_version_expiration { noncurrent_days = 35 }
  }
}

The IAM policy the DB host (or Vault) uses is the minimum pgBackRest needs — list, get, put, and delete inside the bucket prefix only:

{
  "Version": "2012-10-17",
  "Statement": [
    { "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::kv-pgbackrest-prod" },
    { "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::kv-pgbackrest-prod/*" }
  ]
}

For dynamic credentials, configure Vault’s AWS secrets engine so the host leases a key that auto-expires:

vault write aws/roles/pgbackrest-prod \
  credential_type=iam_user \
  policy_arns="arn:aws:iam::123456789012:policy/pgbackrest-s3-rw"
# At runtime the host reads a short-lived key, scoped and time-boxed:
vault read aws/creds/pgbackrest-prod   # -> access_key, secret_key, lease_duration=1h

2. Install and configure pgBackRest on the primary

Install the package, then create the config. The [global] section defines the S3 repository; the [kv-prod] stanza names this specific cluster.

sudo apt-get install -y pgbackrest
sudo -u postgres mkdir -p /var/log/pgbackrest /var/lib/pgbackrest
sudo install -d -o postgres -g postgres -m 750 /etc/pgbackrest

Write /etc/pgbackrest/pgbackrest.conf (in production this file is rendered by Ansible from a Jinja2 template so every host is byte-identical and the S3 key is injected from Vault, never hardcoded):

[global]
repo1-type=s3
repo1-s3-bucket=kv-pgbackrest-prod
repo1-s3-endpoint=s3.ap-south-1.amazonaws.com
repo1-s3-region=ap-south-1
repo1-s3-key-type=auto          ; use the host's IAM role / Vault-leased key
repo1-path=/pgbackrest
repo1-cipher-type=aes-256-cbc   ; client-side encryption on top of SSE-KMS
repo1-cipher-pass=ENV:PGBACKREST_REPO_CIPHER   ; never the literal passphrase
repo1-retention-full=4         ; keep 4 full backups
repo1-retention-diff=6
repo1-bundle=y                  ; bundle small files for fewer S3 PUTs
repo1-block=y                   ; block-level incrementals
process-max=4                   ; parallel workers
compress-type=zst
compress-level=3
start-fast=y
log-level-console=info
log-level-file=detail

[kv-prod]
pg1-path=/var/lib/postgresql/16/main
pg1-port=5432

repo1-cipher-pass=ENV:... tells pgBackRest to read the repository encryption passphrase from an environment variable; the variable itself is populated from Vault at process start, so the passphrase is never written to disk. Lose it and the backups are unreadable — store it in Vault as the single source of truth and back that up.

3. Turn on continuous WAL archiving in PostgreSQL

This is the half people forget, and without it you have backups but no PITR. Point PostgreSQL’s archive_command at pgBackRest so every completed WAL segment is shipped to S3.

sudo -u postgres psql -c "ALTER SYSTEM SET archive_mode = 'on';"
sudo -u postgres psql -c "ALTER SYSTEM SET archive_command = 'pgbackrest --stanza=kv-prod archive-push %p';"
sudo -u postgres psql -c "ALTER SYSTEM SET wal_level = 'replica';"
sudo -u postgres psql -c "ALTER SYSTEM SET archive_timeout = '60';"   # force a segment at least every 60s
sudo -u postgres psql -c "ALTER SYSTEM SET max_wal_senders = 10;"

archive_mode and wal_level require a restart; the rest reload. Schedule this through change control — it is the one disruptive step:

sudo systemctl restart postgresql@16-main
sudo -u postgres psql -c "SHOW archive_command;"   # confirm it stuck

archive_timeout = 60 caps your recovery point objective: even on an idle database a WAL segment is forced to S3 at least once a minute, so you can never lose more than ~60 seconds. The archive_command must return non-zero on failure so PostgreSQL retains the segment and retries — pgBackRest does this correctly, which is exactly why you do not roll your own aws s3 cp here.

4. Initialize the stanza and take the first full backup

Create the stanza (one-time repository setup for this cluster), verify the wiring, then take a full backup.

sudo -u postgres pgbackrest --stanza=kv-prod stanza-create
sudo -u postgres pgbackrest --stanza=kv-prod check        # validates archiving end-to-end
sudo -u postgres pgbackrest --stanza=kv-prod --type=full --log-level-console=info backup

check is the most important command in this guide — it forces a WAL switch and confirms the segment actually arrived in S3, catching a broken archive_command before you depend on it. Now wire the schedule. A pragmatic rotation: a weekly full, daily differentials, and hourly incrementals, run from Jenkins (or a GitHub Actions scheduled workflow) so each run is logged, retried, and visible, rather than a silent cron line:

# crontab for the postgres user (or equivalent Jenkins stages)
00 02 * * 0   pgbackrest --stanza=kv-prod --type=full backup    # Sun 02:00 full
00 02 * * 1-6 pgbackrest --stanza=kv-prod --type=diff backup    # Mon-Sat differential
00 *  * * *   pgbackrest --stanza=kv-prod --type=incr backup    # hourly incremental

Inspect what you have at any time:

sudo -u postgres pgbackrest --stanza=kv-prod info

stanza: kv-prod
    status: ok
    db (current)
        full backup: 20260610-020000F
            timestamp start/stop: 2026-06-10 02:00:01 / 2026-06-10 02:03:12
            wal start/stop: 0000000100000A8B000000F2 / 0000000100000A8B000000F4
        incr backup: 20260610-090000F_20260610-100000I
            backup reference list: 20260610-020000F

5. Recover to a point in time

The drill that matters. To rewind to 02:13:30 — just before the runaway UPDATE — stop PostgreSQL, restore the base, set the recovery target, and let WAL replay stop precisely there. Always recover onto a separate host or a stopped clone first; never overwrite a live primary blind.

sudo systemctl stop postgresql@16-main

# Restore base + WAL, targeting the exact instant. --delta only fetches changed files.
sudo -u postgres pgbackrest --stanza=kv-prod \
  --type=time --target="2026-06-10 02:13:30+05:30" \
  --target-action=promote --delta restore

pgBackRest writes the recovery settings into postgresql.auto.conf and drops recovery.signal. Start the server; it replays WAL from the base backup up to the target and then stops:

sudo systemctl start postgresql@16-main
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"   # 'f' once promoted to the target
sudo -u postgres psql -c "SELECT count(*) FROM ledger_entry WHERE status='void';"  # back to normal

You can target other anchors instead of a wall-clock time, which is often more precise:

# To a transaction id (recover up to but not including a known-bad xid):
--type=xid --target="987654321" --target-exclusive

# To a named restore point you created before a risky migration:
#   SELECT pg_create_restore_point('pre_ledger_migration');
--type=name --target="pre_ledger_migration"

# To the latest possible moment (full disaster recovery, lose nothing):
--type=default

Once you confirm the data is correct and pg_is_in_recovery() returns f, the timeline has forked; take a fresh --type=full backup immediately so the new timeline has its own backup baseline.

Validation

Trust nothing you have not tested. Run these after setup and on a schedule:

# 1. End-to-end archiving health (run hourly via monitoring):
sudo -u postgres pgbackrest --stanza=kv-prod check

# 2. Verify every backup's checksums against the repo (catches bit-rot / partial uploads):
sudo -u postgres pgbackrest --stanza=kv-prod verify

# 3. Confirm WAL is actually flowing and not piling up locally:
sudo -u postgres psql -c "SELECT last_archived_wal, last_failed_wal, last_archived_time
                          FROM pg_stat_archiver;"

# 4. A real restore rehearsal to a scratch host — the only test that proves DR works:
pgbackrest --stanza=kv-prod --type=time --target="$(date -d '-1 hour' '+%F %T')" \
  --pg1-path=/var/lib/postgresql/16/restore_test restore

Make rehearsal #4 a monthly, ticketed event. The metrics — last_failed_wal being non-null, backup duration, repo size, time-since-last-good-backup — should flow into Datadog or Dynatrace via the PostgreSQL integration and a small exporter parsing pgbackrest info --output=json. Wire an alert: any failed WAL archive, or no successful backup in 26 hours, pages the on-call and auto-opens a ServiceNow incident so the gap is tracked, not just logged. A successful monthly restore rehearsal closes a recurring ServiceNow change task — auditors love that paper trail.

Rollback / teardown

If you need to back this capability out cleanly (e.g. decommissioning the cluster), reverse the order: stop archiving first, then remove the repository.

# 1. Stop PostgreSQL from archiving (reload, no restart needed to disable the command):
sudo -u postgres psql -c "ALTER SYSTEM SET archive_command = '/bin/true';"
sudo -u postgres psql -c "SELECT pg_reload_conf();"
# To fully disable archiving (requires a restart):
sudo -u postgres psql -c "ALTER SYSTEM SET archive_mode = 'off';"
sudo systemctl restart postgresql@16-main

# 2. Drop the stanza and expire backups from the repo:
sudo -u postgres pgbackrest --stanza=kv-prod stop
sudo -u postgres pgbackrest --stanza=kv-prod --force stanza-delete

# 3. Tear down the bucket via Terraform ONLY after the Object Lock retention window
#    has elapsed; locked objects cannot be deleted before then, by design.
terraform destroy -target=aws_s3_bucket.pgbackrest

To abort a single bad recovery instead, stop PostgreSQL and re-run restore with a corrected --target — restores are idempotent with --delta, so you can iterate toward the right instant without re-fetching the whole base each time.

Common pitfalls

Archiving silently broken. If archive_command fails, PostgreSQL keeps WAL in pg_wal/ and the disk fills until the database stops accepting writes. The pg_stat_archiver.last_failed_wal check and a disk alert catch this; never deploy without them.
Forgetting wal_level / archive_mode need a restart. Setting them with ALTER SYSTEM and a reload does nothing until the restart — people “enable” PITR and discover months later it was never on. Run pgbackrest check to prove it.
Losing repo1-cipher-pass. Client-side encryption is non-recoverable without the passphrase. Store it in Vault, and back up Vault. A backup you cannot decrypt is not a backup.
Time zone ambiguity in --target. Always pass an explicit offset (02:13:30+05:30) or run in UTC. A naive timestamp can replay to the wrong instant by hours.
Restoring over the live primary. Recover to a separate path or host, validate, then promote or repoint. Overwriting the primary turns a data-loss incident into an outage.
No retention math. repo1-retention-full=4 with weekly fulls means ~4 weeks of recoverability. Confirm that meets your RPO/retention policy before an auditor asks.

Security notes

Defense in depth: S3 SSE-KMS encrypts at rest, pgBackRest client-side aes-256-cbc encrypts before upload so the bytes are opaque to AWS, and Object Lock (GOVERNANCE) plus versioning make recent backups undeletable by a compromised host — the single best control against backup-targeting ransomware. The IAM credential is least-privilege and scoped to one bucket, ideally Vault-leased and short-lived rather than static. Human access to run recoveries flows Okta → AWS IAM Identity Center (or Entra ID for the Microsoft tenant), so every restore is performed under a named, MFA-backed identity, not a shared key. Wiz (and Wiz Code scanning the Terraform in the pull request) continuously checks the bucket for public-exposure drift, missing encryption, or an over-broad policy, and flags it before merge or in production. CrowdStrike Falcon runs on the DB host for runtime threat detection — a process exfiltrating WAL or tampering with pgbackrest.conf is exactly the behavior it surfaces to the SOC. If your DB sits behind one, route S3 egress through the approved virtual appliance/NAT path and via Akamai or an S3 VPC endpoint so backup traffic stays on controlled networks.

Cost notes

The dominant costs are S3 storage and S3 request charges. zst compression typically shrinks PostgreSQL backups 3–5x; repo1-bundle=y collapses thousands of tiny WAL and file objects into far fewer S3 PUTs, which materially cuts request cost on busy clusters. Use incrementals and differentials so only changed blocks ship after the weekly full, and a lifecycle rule to expire non-current versions (35 days above) and optionally transition older fulls to S3 Glacier Instant Retrieval. Keep the repository in the same region as the database (ap-south-1 here) to avoid cross-region transfer fees, and budget for the restore-rehearsal GETs as a small, predictable line item — a DR capability you never exercise is the truly expensive one. For a mid-size cluster (a few hundred GB, hourly incrementals, 4 weekly fulls), expect single-digit-thousands of rupees per month — a rounding error against the cost of losing a trading day, which is the whole point.

Configure PostgreSQL Continuous Archiving and Point-in-Time Recovery with pgBackRest to S3

Prerequisites

Target topology

1. Provision the S3 repository and a scoped IAM identity

2. Install and configure pgBackRest on the primary

3. Turn on continuous WAL archiving in PostgreSQL

4. Initialize the stanza and take the first full backup

5. Recover to a point in time

Validation

Rollback / teardown

Common pitfalls

Security notes

Cost notes

Written by Vinod

Comments

Keep Reading

Data Contracts and Schema Registry for Reliable Pipelines

Data Quality and Observability Architecture

Enterprise Data Catalog, Lineage and Governance