10 Cron Job Best Practices Every Engineer Should Follow

Beyond the Crontab

Setting up a cron job is easy. Keeping it running reliably in production for months and years is hard. Most cron failures come from missing one of these practices.

1. Use Lock Files to Prevent Overlap

If a job takes longer than its interval, you get two instances running simultaneously. This causes data corruption, duplicate processing, and race conditions.

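#!/bin/bash
LOCKFILE=/tmp/my-job.lock

# Create the lock atomically: with noclobber set, the redirect fails if the file exists,
# so two runs cannot both pass the check at the same moment
if ! ( set -o noclobber; echo "$$" > "$LOCKFILE" ) 2>/dev/null; then
  echo "Job already running, skipping"
  exit 0
fi

# Remove the lock on exit; single quotes defer expansion until the trap fires
trap 'rm -f "$LOCKFILE"' EXIT

# Your actual work here
/path/to/process_data.sh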

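Or use flock for a more robust solution. The kernel releases the lock when the process exits, so a job that is killed cannot leave a stale lock file behind: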

# In crontab - flock ensures only one instance runs
*/5 * * * * /usr/bin/flock -n /tmp/my-job.lock /path/to/my-job.sh
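
If you would rather keep the locking inside the script than in the crontab, the same flock call works on a file descriptor. This is a minimal sketch, assuming the util-linux flock command is installed; the acquire_lock helper name and the choice of descriptor 9 are illustrative, not part of any standard.

```shell
#!/bin/bash
# Hypothetical helper: take an exclusive, non-blocking lock on the given file.
# Returns non-zero if another process already holds the lock.
acquire_lock() {
  local lockfile=$1
  # Open the lock file on file descriptor 9; it stays open for the life of the script
  exec 9>"$lockfile"
  # -n: fail immediately instead of waiting for the lock
  flock -n 9
}

if ! acquire_lock "${LOCKFILE:-/tmp/my-job.lock}"; then
  echo "Job already running, skipping"
  exit 0
fi

# Your actual work goes here; the lock is dropped when the script exits
```

Nothing has to clean up afterwards: the lock belongs to the open descriptor, not to the file's existence, so the file can stay in place between runs.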

2. Always Use set -euo pipefail

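By default, bash keeps going after a command fails, so a failed database connection does not stop the rest of the script from running with stale or missing data. set -e exits on the first failed command, -u treats unset variables as errors, and -o pipefail makes a pipeline fail if any stage in it fails.

#!/bin/bash
set -euo pipefail

# Now an unhandled failure stops the script instead of being ignored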
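
To see what each flag changes, here is a small demonstration. It is only a sketch: each snippet runs in a child bash so the parent can report what happened, and UNDEFINED_VAR is assumed not to be set in your environment.

```shell
#!/bin/bash
# Each snippet runs in a child bash so its exit status can be inspected.

# -e: the script exits at the failing command, so "after" is never printed
bash -c 'set -e; false; echo "after"' || echo "errexit: child exited with $?"

# -u: an unset variable is an error instead of an empty string
bash -c 'set -u; echo "value: $UNDEFINED_VAR"' 2>/dev/null || echo "nounset: child exited non-zero"

# pipefail: without it, only the last stage of a pipeline decides the status
bash -c 'false | true' && echo "default: pipeline reported success"
bash -c 'set -o pipefail; false | true' || echo "pipefail: pipeline reported failure"
```

One caveat: set -e does not fire for commands tested by if, while, or the left side of && and ||, so checks you write yourself still behave as written.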

3. Log Everything with Timestamps

When debugging a 3 AM failure at 9 AM, timestamps in logs are the difference between finding the cause in minutes versus hours.

#!/bin/bash
LOG=/var/log/my-job.log

log() {
  echo "[$(date +'%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG"
}

log "Starting backup..."
pg_dump mydb > backup.sql
log "Backup completed: $(du -h backup.sql | cut -f1)"
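
The log function only stamps the messages you pass to it. Output from the commands themselves, including errors on stderr, goes unstamped. One way to cover that is to pipe everything through a filter. This is a sketch, and timestamp_lines is a hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical helper: prefix every line read from stdin with a timestamp.
timestamp_lines() {
  local line
  # The extra test keeps a final line that lacks a trailing newline
  while IFS= read -r line || [ -n "$line" ]; do
    printf '[%s] %s\n' "$(date +'%Y-%m-%d %H:%M:%S')" "$line"
  done
}

# Example: stdout and stderr are merged, then both get a timestamp
{ echo "written to stdout"; echo "written to stderr" >&2; } 2>&1 | timestamp_lines
```

In a real job you would replace the example group with your own commands and append the result to the log file.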

4. Set Explicit Timeouts

A job that hangs indefinitely is worse than one that fails. At least a failed job can be detected. A hung job just sits there, holding resources and blocking the next run.

# Kill the job if it runs longer than 30 minutes
timeout 1800 /path/to/my-job.sh

# For curl operations
curl --max-time 30 --connect-timeout 10 https://api.example.com/data
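
GNU timeout exits with status 124 when it had to stop the command, which lets you tell a timeout apart from an ordinary failure. The wrapper below is a sketch under that assumption; run_with_timeout is a hypothetical name and the limit is given in seconds.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command under a time limit and report why it stopped.
run_with_timeout() {
  local limit=$1; shift
  local status=0
  # -k 10: if the command ignores SIGTERM, send SIGKILL 10 seconds later
  timeout -k 10 "$limit" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "Timed out after ${limit}s: $*" >&2
  elif [ "$status" -ne 0 ]; then
    echo "Failed with status $status: $*" >&2
  fi
  return "$status"
}
```

A call such as run_with_timeout 1800 /path/to/my-job.sh then leaves a clear line in the log saying whether the job failed or was killed for running too long.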

5. Handle Temporary Failures with Retries

Network glitches, API rate limits, and temporary database locks are all transient. Retrying with backoff handles these automatically.

#!/bin/bash
MAX_RETRIES=3
RETRY_DELAY=30  # seconds; doubled after each failed attempt

for i in $(seq 1 "$MAX_RETRIES"); do
  if curl -fsS https://api.example.com/sync; then
    break
  fi
  if [ "$i" -eq "$MAX_RETRIES" ]; then
    echo "Failed after $MAX_RETRIES attempts" >&2
    exit 1
  fi
  sleep "$RETRY_DELAY"
  RETRY_DELAY=$((RETRY_DELAY * 2))  # back off: 30s, then 60s
done

6. Validate Output, Not Just Exit Codes

A backup script that creates a 0-byte file technically "succeeds." Check that the output actually makes sense.

#!/bin/bash
set -euo pipefail

pg_dump mydb > backup.sql

# Validate the backup is not empty
MIN_SIZE=1000  # bytes
# stat -f%z is the BSD/macOS form; fall back to GNU stat -c%s on Linux
ACTUAL_SIZE=$(stat -f%z backup.sql 2>/dev/null || stat -c%s backup.sql)

if [ "$ACTUAL_SIZE" -lt "$MIN_SIZE" ]; then
  echo "Backup suspiciously small: ${ACTUAL_SIZE} bytes" >&2
  exit 1
fi
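
A size check catches the empty file but not a truncated one. If the output has a recognizable marker, you can check for that too. The helper below is a sketch; validate_file and its three arguments are hypothetical, and the marker text is whatever your own output reliably contains.

```shell
#!/bin/bash
# Hypothetical helper: fail unless the file exists, meets a minimum size,
# and contains an expected marker string.
validate_file() {
  local file=$1 min_size=$2 marker=$3
  [ -f "$file" ] || { echo "Missing: $file" >&2; return 1; }
  local size
  # wc -c reads the byte count portably; the arithmetic strips any padding
  size=$(( $(wc -c < "$file") ))
  if [ "$size" -lt "$min_size" ]; then
    echo "Too small: $file is $size bytes" >&2
    return 1
  fi
  # -F: match the marker as a fixed string; -q: status only, no output
  grep -qF -- "$marker" "$file" || { echo "Marker not found in $file" >&2; return 1; }
}
```

For a plain-text pg_dump, a call might look like validate_file backup.sql 1000 'PostgreSQL database dump complete'. That trailer comment is an assumption here, so check what your pg_dump version actually writes before relying on it.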

7. Stagger Your Schedules

Scheduling everything at midnight or on the hour creates resource spikes. Stagger jobs to spread the load.

# Bad: everything at midnight
0 0 * * * /path/to/backup.sh
0 0 * * * /path/to/cleanup.sh
0 0 * * * /path/to/report.sh

# Better: staggered
0 0 * * * /path/to/backup.sh
15 0 * * * /path/to/cleanup.sh
30 0 * * * /path/to/report.sh
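
Fixed offsets work on a single machine. When the same crontab is deployed to many hosts, a random delay at the start of the script spreads the load in a similar way. This is a sketch that relies on bash's built-in RANDOM variable, which yields values from 0 to 32767; jitter_seconds is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: print a pseudo-random delay between 0 and max-1 seconds.
# max must not exceed 32768, because RANDOM only goes up to 32767.
jitter_seconds() {
  local max=$1
  echo $(( RANDOM % max ))
}
```

Starting the job with sleep "$(jitter_seconds 300)" then delays each host by up to five minutes before the real work begins.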

8. Use Absolute Paths Everywhere

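Cron does not load your shell profile, so aliases and user-specific PATH entries are not available. Relative paths resolve against cron's working directory, typically the user's home directory, not the directory the script lives in.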

# Bad
python backup.py

# Good
/usr/local/bin/python3 /home/deploy/scripts/backup.py
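
Inside the script itself, you can remove the dependency on cron's environment by setting PATH yourself and failing early when a tool is missing. This is a sketch; the PATH value is an assumption about a typical Linux layout, and require_cmd is a hypothetical helper.

```shell
#!/bin/bash
# Cron provides a minimal environment, so set the PATH this script needs explicitly
export PATH=/usr/local/bin:/usr/bin:/bin

# Hypothetical helper: fail early if a required command cannot be found
require_cmd() {
  command -v "$1" >/dev/null 2>&1 || { echo "Required command not found: $1" >&2; return 1; }
}

# Resolve the directory containing this script, so files next to it can be
# referenced without depending on cron's working directory
SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]:-$0}")" && pwd)
```

With that in place, a line such as require_cmd python3 || exit 1 at the top of the job turns a confusing "command not found" halfway through into an immediate, clearly logged failure.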

9. Separate Concerns: One Job, One Task

A single cron job that does backups AND sends reports AND cleans up old files is hard to debug, hard to monitor, and hard to maintain. Split them into separate jobs with separate monitoring.

10. Monitor with a Dead Man's Switch

Add a check-in at the end of every critical cron job. If the check-in does not arrive on schedule, you get alerted. Because the alert is triggered by silence and not by an error message, it covers crashes, hangs, timeouts, permission errors, and jobs that never start at all.

#!/bin/bash
set -euo pipefail

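# Do the work; with set -e, a failure here exits before the check-in is sent
/path/to/actual-work.sh

# Check in with monitoring (bounded by a timeout, per practice 4)
curl -fsS --max-time 10 --retry 3 https://cronguard.app/api/ping/your-monitor-id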

Frequently asked questions about cron job best practices

How do I stop two runs of the same cron job overlapping? Use a lock file. If a job takes longer than its interval you get two instances running simultaneously, which causes data corruption, duplicate processing, and race conditions. A script can check for a lock file and exit early if one exists, or you can wrap the command in the crontab with flock -n so only one instance ever runs.

Why should every bash cron script start with set -euo pipefail? Without it, bash scripts continue executing after errors. A failed database connection does not stop the rest of the script from running with stale or missing data. With it, any error stops execution immediately.

What happens if a cron job hangs instead of failing? A job that hangs indefinitely is worse than one that fails, because at least a failed job can be detected. A hung job just sits there, holding resources and blocking the next run. Set explicit timeouts — wrap the job in timeout so it is killed after a fixed duration, and give curl operations a max time and connect timeout.

Is a successful exit code enough to know a backup worked? No. A backup script that creates a 0-byte file technically succeeds, so you should validate the output rather than trusting the exit code alone. Check the size of the file you produced and exit with an error if it is suspiciously small.

Why stagger cron schedules instead of running everything at midnight? Scheduling everything at midnight or on the hour creates resource spikes. Spreading the jobs out — for example running the backup at the top of the hour, the cleanup fifteen minutes later, and the report fifteen minutes after that — spreads the load instead.
