Skip to content

Backup and Recovery

PostgreSQL offers multiple backup strategies from logical dumps to continuous WAL archiving. The choice depends on database size, RPO/RTO requirements, and operational complexity tolerance.

Backup Tools Comparison

Tool Type Incremental Parallel Cloud-Native
pg_dump Logical No Restore only No
pg_basebackup Physical No No No
pg_probackup Physical Yes (FULL/PAGE/DELTA) Yes S3 via Ceph
WAL-G Physical Yes (delta) Yes S3/GCS/Azure

pg_dump / pg_dumpall

Logical backup - exports SQL statements. Works across PostgreSQL versions.

# Single table
pg_dump -h HOST -p 5432 -U user -d dbname --table=tablename > dump.sql

# Custom format (compressed, parallel restore)
pg_dump -Fc dbname > dump.custom
pg_restore -d newdb -j 4 dump.custom  # parallel restore with 4 jobs

# All databases + roles
pg_dumpall > full_dump.sql

pg_basebackup

Physical backup - copies entire data directory. Base for streaming replication.

pg_basebackup -h PRIMARY_IP -p 5432 -U repuser -R -D /path/to/data
# -R creates standby.signal + connection info
# --no-slot avoids replication slot creation

pg_probackup

Advanced backup utility: incremental backups, validation, multi-threaded operations.

# Initialize catalog
pg_probackup init -B /backup/catalog
pg_probackup add-instance -B /backup/catalog -D /var/lib/postgresql/15/main --instance main

# Full backup
pg_probackup backup -B /backup/catalog --instance main -b FULL --stream

# Incremental (PAGE mode - only changed pages)
pg_probackup backup -B /backup/catalog --instance main -b PAGE --stream

# Restore
pg_probackup restore -B /backup/catalog --instance main -i BACKUP_ID

At scale (1500+ instances): DELTA-only strategy (no periodic FULL), --no-validate and --no-sync during backup, merge retention. Incremental ratio ~0.12 (12% of full backup size).

WAL-G

Cloud-native backup with S3/GCS/Azure support.

# Configure for S3
export WALG_S3_PREFIX=s3://my-bucket/pg-backups
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...

# Full backup
wal-g backup-push /var/lib/postgresql/15/main

# Restore latest
wal-g backup-fetch /var/lib/postgresql/15/main LATEST

PITR (Point-in-Time Recovery)

Uses WAL archiving to restore to any point in time.

# Enable WAL archiving (postgresql.conf)
archive_mode = on
archive_command = 'cp %p /archive/%f'

# Restore to specific time
recovery_target_time = '2024-06-15 14:30:00'

Each recovery creates a new timeline. pg_rewind re-syncs diverged replicas without full re-copy.

Pre-Loading Optimizations (Bulk Load)

Before bulk loading large datasets: 1. Disable autocommit 2. Drop indexes (recreate after load) 3. Drop foreign keys (recreate after) 4. Increase maintenance_work_mem 5. Increase max_wal_size

Data Checksums

# Enable at initdb or via utility (requires shutdown)
pg_checksums --enable -D /var/lib/postgresql/15/main

Detects silent data corruption (bitrot). ~1-2% read overhead. Validated during backup (catches block corruption). Check: SHOW data_checksums;

Production Backup Checklist

  • Take backups from replica (avoid primary performance impact)
  • Test restores regularly (automated restore testing)
  • Monitor archive_command success
  • Retention policy: 7-30 days depending on requirements
  • Separate backup storage from database storage
  • RPO (Recovery Point Objective): how much data loss is acceptable
  • RTO (Recovery Time Objective): how fast must recovery complete

Gotchas

  • pg_dump is logical - does not preserve physical optimizations (table order, TOAST)
  • pg_basebackup copies entire data directory - slow for large databases
  • Lost WAL segments break backup chain for PITR
  • Replication slots prevent WAL cleanup but can fill disk
  • pg_restore -j parallelism only works with custom format (-Fc)
  • PostgreSQL timelines: WAL files include timeline ID - don't replay WAL from wrong recovery branch

See Also