Different Paradigms, Same Goal
Traditional crontab runs scripts on a single server. Kubernetes CronJobs run containers in a cluster. The scheduling syntax is similar, but everything else is different: how jobs execute, how they fail, and how you monitor them.
Basic Kubernetes CronJob
apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-backup
spec:
  schedule: "0 2 * * *"
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      backoffLimit: 3
      activeDeadlineSeconds: 3600
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: myregistry/backup:latest
              command: ["/bin/sh", "-c", "./backup.sh"]
              env:
                - name: DATABASE_URL
                  valueFrom:
                    secretKeyRef:
                      name: db-credentials
                      key: url
Key Differences
Isolation
Crontab jobs share the host's filesystem, environment, and resources. Kubernetes CronJobs run in isolated containers with their own filesystem, network namespace, and resource limits.
This is both a feature and a source of bugs. Jobs cannot accidentally interfere with each other (good), but they also cannot access local files without explicit volume mounts (confusing at first).
Concurrency Control
Crontab has no built-in concurrency control. If a job takes longer than its interval, you get overlapping instances. Kubernetes offers concurrencyPolicy:
- Allow - multiple jobs can run simultaneously (default)
- Forbid - skip the new job if the previous one is still running
- Replace - kill the running job and start the new one
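For comparison, crontab users usually hand-roll the equivalent of Forbid with a lock. The sketch below is one minimal way to do that, assuming a lock directory path of your choosing. The name run_exclusive and any paths you pass to it are hypothetical, not part of cron or Kubernetes.

```shell
#!/bin/sh
# Hypothetical helper: run a command only if no earlier instance holds the lock.
# mkdir is atomic, so two instances cannot both create the same directory.
run_exclusive() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        # Same outcome as concurrencyPolicy: Forbid: the new run is skipped.
        echo "previous run still active, skipping" >&2
        return 0
    fi
    "$@"
    status=$?
    rmdir "$lockdir"
    return "$status"
}
```

A crontab entry could source this file and call something like run_exclusive /tmp/sync.lock /opt/jobs/sync.sh (both paths are placeholders). A job that is killed mid-run leaves the lock directory behind, which is the kind of cleanup problem concurrencyPolicy spares you, because the CronJob controller tracks active Jobs itself.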
Retry Logic
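Crontab does not retry: if a job fails, it stays failed until the next scheduled run. Kubernetes CronJobs support automatic retries through backoffLimit (how many retries are allowed before the Job is marked failed) and restartPolicy: OnFailure (restart the failed container in place), with exponential backoff between attempts.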
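To make the backoff concrete, here is a rough shell model of the retry behaviour. It is the kind of wrapper a crontab job would need to get similar semantics. It is a sketch, not how the Job controller is implemented: retry_with_backoff, BASE_DELAY, and MAX_DELAY are hypothetical names, and the defaults assume the delays described in the Kubernetes Job documentation (10s, 20s, 40s, and so on, capped at six minutes).

```shell
#!/bin/sh
# Hypothetical helper: run a command, retrying up to $1 times after the first failure.
# The wait doubles after each failed attempt and never exceeds MAX_DELAY.
retry_with_backoff() {
    limit=$1
    shift
    delay=${BASE_DELAY:-10}        # assumed first delay, in seconds
    max_delay=${MAX_DELAY:-360}    # assumed cap, in seconds
    failures=0
    while :; do
        if "$@"; then
            return 0
        fi
        failures=$((failures + 1))
        if [ "$failures" -gt "$limit" ]; then
            echo "giving up after $failures attempts" >&2
            return 1
        fi
        sleep "$delay"
        delay=$((delay * 2))
        if [ "$delay" -gt "$max_delay" ]; then
            delay=$max_delay
        fi
    done
}
```

With a limit of 3, the command gets one initial attempt plus up to three retries, which roughly mirrors backoffLimit: 3 in the manifest above.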
Resource Limits
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi
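Kubernetes lets you cap CPU and memory for each job's container: requests reserve capacity when the pod is scheduled, and limits are enforced while it runs. Crontab jobs can consume unlimited resources unless you use external tools like ulimit or cgroups.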
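On the crontab side, the ulimit route looks roughly like the sketch below. It only caps CPU time, so it is a loose analogue of a Kubernetes limit rather than an equivalent. run_with_cpu_limit is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with a cap on CPU seconds.
# The limit is set inside a subshell so it does not leak into the caller.
run_with_cpu_limit() {
    cpu_seconds=$1
    shift
    (
        # Refuse to run at all if the limit cannot be applied.
        ulimit -t "$cpu_seconds" || exit 1
        "$@"
    )
}
```

A process that exceeds a ulimit CPU cap is killed, whereas a container that hits its Kubernetes CPU limit is throttled. The wrapper also has to be added to every job by hand, while the resources block lives in the manifest next to the schedule.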
New Failure Modes in Kubernetes
1. Image Pull Failures
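If the container image cannot be pulled (registry down, credentials expired, image deleted), the container never starts: the pod sits in ErrImagePull or ImagePullBackOff while Kubernetes keeps retrying the pull. This failure mode does not exist in crontab, where scripts are local files.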
2. Pod Scheduling Failures
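If the cluster lacks the resources to schedule the pod, the pod waits in Pending state. It might eventually run, or the Job might exceed activeDeadlineSeconds and be marked failed before its container ever starts. While the Job is stuck, concurrencyPolicy: Forbid also causes later scheduled runs to be skipped.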
3. Node Failures During Execution
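If the node running your job dies mid-execution, the pod is lost along with any partial work. Kubernetes may or may not start a replacement pod on another node, depending on your configuration, such as how much of the backoffLimit and activeDeadlineSeconds budget remains.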
4. Missed Schedules
If the CronJob controller is overwhelmed or the cluster is under heavy load, scheduled runs can be missed entirely. Set startingDeadlineSeconds to control how late a job can start before being considered missed.
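The deadline check itself is simple arithmetic. The function below is a simplified model of it, not the controller's actual code: should_start is a hypothetical name, times are epoch seconds, and an empty deadline stands for an unset startingDeadlineSeconds.

```shell
#!/bin/sh
# Hypothetical model: would a run scheduled at $1 still be started at time $2,
# given a starting deadline of $3 seconds? Returns 0 for yes, 1 for no.
should_start() {
    scheduled=$1
    now=$2
    deadline=$3
    if [ -z "$deadline" ]; then
        # No deadline configured: lateness alone never rules the run out.
        return 0
    fi
    late_by=$((now - scheduled))
    [ "$late_by" -le "$deadline" ]
}
```

For example, with a deadline of 300 seconds, a run that is 100 seconds late still starts, while one that is 400 seconds late counts as missed.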
Monitoring Kubernetes CronJobs
Kubernetes provides more visibility than crontab (pod status, events, logs), but it also introduces more failure modes. External monitoring remains essential.
kubectl Monitoring
# Check CronJob status
kubectl get cronjobs
# See recent job runs
kubectl get jobs --sort-by=.status.startTime
# Check for failed pods
kubectl get pods --field-selector=status.phase=Failed
Dead Man's Switch
The same pattern works for Kubernetes CronJobs. Add a check-in at the end of your container's entry script:
#!/bin/sh
set -e
# Do the work
python process_data.py
# Check in with external monitoring
curl -fsS --retry 3 https://cronguard.app/api/ping/your-monitor-id
Migration Checklist
When moving from crontab to Kubernetes CronJobs:
- Set concurrencyPolicy: Forbid if overlap is dangerous
- Configure activeDeadlineSeconds to prevent hung jobs
- Set backoffLimit for automatic retries
- Add resource requests and limits
- Mount secrets and config as environment variables
- Keep external monitoring (dead man's switch) regardless of Kubernetes-native monitoring
- Set startingDeadlineSeconds to detect missed schedules
Conclusion
Kubernetes CronJobs offer better isolation, concurrency control, and retry logic compared to crontab. But they also introduce container-specific failure modes: image pulls, scheduling delays, and missed runs. External monitoring through a dead man's switch remains the most reliable way to ensure your scheduled tasks actually complete, regardless of the platform.