Backups That Actually Work
Everyone has backups. Not everyone has tested backups. The difference becomes apparent at the worst possible moment: when you actually need to restore.
This guide covers how to build a cron-based backup strategy that not only runs automatically, but verifies itself, rotates old backups, and alerts you when something goes wrong.
The Backup Script
#!/bin/bash
set -euo pipefail
# Configuration
DB_NAME="production"
BACKUP_DIR="/backups/postgres"
RETENTION_DAYS=30
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="${BACKUP_DIR}/${DB_NAME}_${DATE}.sql.gz"
MIN_SIZE_KB=100
# Create backup directory
mkdir -p "$BACKUP_DIR"
# Create backup
pg_dump "$DB_NAME" | gzip > "$BACKUP_FILE"
# Validate backup size
ACTUAL_SIZE=$(du -k "$BACKUP_FILE" | cut -f1)
if [ "$ACTUAL_SIZE" -lt "$MIN_SIZE_KB" ]; then
echo "ERROR: Backup too small (${ACTUAL_SIZE}KB)" >&2
rm -f "$BACKUP_FILE"
exit 1
fi
# Validate backup integrity
if ! gunzip -t "$BACKUP_FILE"; then
echo "ERROR: Backup file is corrupted" >&2
rm -f "$BACKUP_FILE"
exit 1
fi
# Rotate old backups
find "$BACKUP_DIR" -name "${DB_NAME}_*.sql.gz" -mtime +${RETENTION_DAYS} -delete
# Report
BACKUP_COUNT=$(find "$BACKUP_DIR" -name "${DB_NAME}_*.sql.gz" | wc -l)
echo "Backup complete: ${BACKUP_FILE} (${ACTUAL_SIZE}KB, ${BACKUP_COUNT} backups retained)"
# Check in with monitoring
curl -fsS --retry 3 https://cronguard.app/api/ping/your-monitor-id \
-d "Backup: ${ACTUAL_SIZE}KB, ${BACKUP_COUNT} retained"
Scheduling Strategy
Frequency
How often to backup depends on how much data you can afford to lose. This is your Recovery Point Objective (RPO).
| Data Sensitivity | Backup Frequency | Crontab |
|---|---|---|
| Critical (e-commerce, finance) | Every hour | 0 * * * * |
| Important (SaaS, user data) | Every 6 hours | 0 */6 * * * |
| Standard (content, internal tools) | Daily | 0 2 * * * |
| Low priority (dev, staging) | Weekly | 0 2 * * 0 |
Timing
Schedule backups during low-traffic periods. Database dumps can be I/O intensive and may slow down your application. For most services, 2-4 AM local time works well.
Rotation Strategy
A common rotation scheme (grandfather-father-son):
- Daily backups: Keep for 7 days
- Weekly backups: Keep for 4 weeks (every Sunday backup)
- Monthly backups: Keep for 12 months (first of month backup)
#!/bin/bash
# Simple rotation: keep daily for 7 days, weekly for 4 weeks, monthly for 1 year
BACKUP_DIR="/backups/postgres"
DB_NAME="production"
# Delete daily backups older than 7 days (except first-of-month and sundays)
find "$BACKUP_DIR" -name "${DB_NAME}_*.sql.gz" -mtime +7 \
! -name "${DB_NAME}_*01_*.sql.gz" | while read f; do
DAY=$(basename "$f" | grep -oP '\d{8}' | cut -c7-8)
DOW=$(date -d "$(basename "$f" | grep -oP '\d{8}')" +%u 2>/dev/null || echo "")
if [ "$DOW" != "7" ]; then
rm -f "$f"
fi
done
# Delete weekly backups older than 4 weeks (except first-of-month)
find "$BACKUP_DIR" -name "${DB_NAME}_*.sql.gz" -mtime +28 \
! -name "${DB_NAME}_*01_*.sql.gz" -delete
# Delete monthly backups older than 1 year
find "$BACKUP_DIR" -name "${DB_NAME}_*01_*.sql.gz" -mtime +365 -delete
Offsite Storage
Backups on the same server as the database are useless if the server dies. Push backups to offsite storage:
# Upload to S3
aws s3 cp "$BACKUP_FILE" "s3://my-backups/postgres/"
# Upload to B2 (Backblaze)
b2 upload-file my-bucket "$BACKUP_FILE" "postgres/$(basename $BACKUP_FILE)"
# Rsync to remote server
rsync -az "$BACKUP_FILE" backup-server:/backups/postgres/
Restore Testing
A backup that cannot be restored is not a backup. Schedule periodic restore tests:
#!/bin/bash
set -euo pipefail
LATEST=$(ls -t /backups/postgres/production_*.sql.gz | head -1)
TEST_DB="restore_test_$(date +%Y%m%d)"
# Create test database
createdb "$TEST_DB"
# Restore
gunzip -c "$LATEST" | psql "$TEST_DB"
# Run a basic validation query
ROWS=$(psql -t -c "SELECT COUNT(*) FROM users" "$TEST_DB")
if [ "$ROWS" -lt 1 ]; then
echo "ERROR: Restore test failed - users table empty" >&2
dropdb "$TEST_DB"
exit 1
fi
echo "Restore test passed: $ROWS users found"
dropdb "$TEST_DB"
# Check in
curl -fsS https://cronguard.app/api/ping/restore-test-monitor \
-d "Restore OK: $ROWS users"
MySQL Specifics
# MySQL dump with single transaction (no locking)
mysqldump --single-transaction --routines --triggers \
-u backup_user -p"$MYSQL_PASSWORD" production | gzip > "$BACKUP_FILE"
MongoDB Specifics
# mongodump with oplog for point-in-time recovery
mongodump --uri="$MONGO_URI" --oplog --gzip --archive="$BACKUP_FILE"
Monitoring Your Backups
Monitor both the backup job AND the restore test. A backup that runs but produces corrupt files is just as bad as no backup at all.
- Backup job - dead man's switch, alert if it does not complete
- Backup size - alert if the backup is suspiciously small or large
- Restore test - weekly dead man's switch on the restore test job
- Offsite sync - alert if uploads to S3/B2 fail
Conclusion
A reliable backup strategy has four parts: automated creation (cron), intelligent rotation (grandfather-father-son), offsite storage (S3, B2, rsync), and verified restores (automated testing). Monitoring every step ensures you find out about failures immediately, not when you desperately need a restore.