People expect “24/7” to mean: I schedule a cron job and my agent reliably does it forever.
In practice, cron reliability depends on three things:
- The gateway is always running (daemon/service, not a one-off terminal session).
- The cron scheduler is healthy (no regressions, correct timezone, persistent storage).
- Your delivery strategy is robust (announce vs explicit sends vs webhooks).
This guide focuses on repeatable diagnostics and designs that leave evidence.
What this guide helps you finish
By the end, you should be able to:
- prove the gateway is actually capable of 24/7 scheduling,
- verify that cron runs executed instead of merely re-scheduling,
- choose a delivery pattern that leaves durable evidence,
- and confirm the workflow survives a restart or upgrade.
Who this is for (and not for)
Use this guide if you already have OpenClaw installed and you want reliable scheduled behavior on a gateway, VPS, Docker host, or always-on workstation.
This is not the first page to read if the gateway itself is still unstable or not installed as a long-lived service.
If your OpenClaw install is not yet stable, do that first:
Before you trust a schedule: collect these five facts
Before changing cron jobs, confirm these five facts on the gateway host:
- What process keeps the gateway alive after terminal exit or reboot?
- Where does gateway state live, and is it persisted across restart/redeploy?
- What timezone is the scheduler using?
- How does the job report success today: announce, explicit send, webhook, or artifact only?
- What artifact proves the last successful run actually happened?
Those facts prevent the classic mistake of debugging prompt content when the real problem is uptime, state, or delivery.
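Most of these facts can be collected in one pass on the gateway host. A minimal sketch using generic POSIX tools; the state path and the OPENCLAW_STATE_DIR override are assumptions, adjust to your install:

```shell
#!/usr/bin/env sh
# Gather environment facts before touching any cron job.
date                                   # wall clock and timezone the host reports
echo "TZ=${TZ:-unset}"                 # explicit TZ override, if any
STATE="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}"   # assumed default state path
if [ -d "$STATE" ] && [ -w "$STATE" ]; then
  echo "state dir: $STATE (writable)"
else
  echo "state dir: $STATE (missing or not writable)"
fi
```

If the state directory is missing or read-only, stop here: no amount of prompt tuning fixes a scheduler that cannot persist its own runs.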
0) First principles: cron runs inside the gateway
Cron is executed by the gateway process. If the gateway is down, sleeping, or constantly restarting, cron can’t be reliable.
Baseline checks (gateway host):
openclaw status --deep
openclaw logs --follow
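If the gateway currently lives in a one-off terminal session, make it a service before debugging anything else. A minimal systemd unit sketch; the unit name, binary path, run subcommand, and state location are assumptions for illustration, not the documented defaults:

```
# /etc/systemd/system/openclaw-gateway.service (hypothetical name and paths)
[Unit]
Description=OpenClaw gateway
After=network-online.target

[Service]
ExecStart=/usr/local/bin/openclaw gateway
Restart=always
RestartSec=5
Environment=OPENCLAW_STATE_DIR=/var/lib/openclaw

[Install]
WantedBy=multi-user.target
```

Restart=always is the point: a gateway that dies and stays down turns every schedule into a silent no-op.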
If you are on WSL2, ensure systemd is enabled so the gateway stays alive:
If you are on Docker, ensure the state directory is persisted and writable:
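On Docker, the usual failure is an ephemeral state directory that vanishes on container recreation. A compose sketch; the image name, mount paths, and env var are assumptions, adjust to your deployment:

```
# docker-compose.yml (hypothetical image and paths)
services:
  openclaw:
    image: openclaw:latest
    restart: unless-stopped
    environment:
      OPENCLAW_STATE_DIR: /data
    volumes:
      - openclaw-state:/data   # named volume survives container recreation
volumes:
  openclaw-state:
```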
1) Prove cron is running (and see its history)
List jobs:
openclaw cron list
Inspect runs for a job:
openclaw cron runs --id <job-id>
If nextRunAtMs advances but runs stays empty, that’s not “your prompt is bad” — it’s usually scheduler/timezone/storage.
Dedicated fix page:
2) The most common cron failure modes (and what to do)
2.1 Cron never executes (runs stay empty)
Symptoms:
- Job looks enabled, nextRunAtMs keeps moving, but no runs are recorded.
Fix:
- Upgrade past known scheduler regressions, restart gateway, verify timezone and cron storage.
- Follow: /troubleshooting/solutions/cron-jobs-not-firing-next-run-advances
2.2 Cron executes, but announce delivery fails
Symptoms:
- Runs exist, but you see cron announce delivery failed (often on Telegram delivery).
High-value workaround:
- Disable announce delivery and send explicitly inside the run (or switch to webhook delivery).
Fix page:
2.3 Cron delivers only a summary (not full output)
Symptoms:
- A multi-line report used to arrive; now you only get a short summary.
Fix page:
2.4 Isolated jobs fail immediately on tools.function validation
Symptoms:
- The cron job exists and runs are attempted, but isolated execution fails immediately with Field required / tools.function.
Fix page:
2.5 Isolated agentTurn jobs enqueue, then stall or time out under default concurrency
Symptoms:
- openclaw cron run <job-id> says the run was accepted/enqueued.
- A session stub may appear, but the real transcript/run never materializes.
- openclaw cron runs --id <job-id> stays empty or only shows timeouts.
- Logs can include lane-wait messages while using sessionTarget: "isolated" + payload.kind: "agentTurn".
High-value workaround:
- Raise cron concurrency:
{
cron: {
maxConcurrentRuns: 2,
},
}
- Restart the gateway and retest:
openclaw gateway restart
openclaw cron run <job-id>
openclaw cron runs --id <job-id> --limit 20
Why this helps: the documented default is still maxConcurrentRuns: 1, but isolated agentTurn jobs can end up waiting on work queued behind the cron lane itself. Raising concurrency to 2 removes that self-blocking path in affected versions.
If you cannot change concurrency right now, move the job to sessionTarget: "main" + systemEvent temporarily so it avoids the isolated execution path while you validate the rest of the workflow.
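A hedged sketch of that temporary fallback as a job fragment; the exact field names and nesting are assumptions pieced together from the shapes mentioned above, so check them against your actual job schema before applying:

```
{
  // hypothetical job fragment: avoid the isolated lane while validating
  sessionTarget: "main",
  payload: {
    kind: "systemEvent",
  },
}
```

Switch back to sessionTarget: "isolated" + agentTurn once concurrency is raised and the rest of the workflow is proven.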
3) Design cron prompts that always leave evidence
The biggest quality-of-life upgrade is: stop treating delivery as the only “proof” that the job ran.
Recommended patterns:
- Write a timestamped artifact (file) into the workspace each run.
- Send an explicit message with the artifact path + a short summary.
- Use a webhook for durable delivery when channels are flaky.
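The artifact pattern above can be sketched as a shell step the job runs every time; the workspace layout and filenames are assumptions:

```shell
#!/usr/bin/env sh
set -e
# Write a timestamped evidence file into the workspace on every run.
WORKDIR="${OPENCLAW_WORKSPACE:-./workspace}/cron-evidence"   # assumed layout
mkdir -p "$WORKDIR"
STAMP="$(date -u +%Y%m%dT%H%M%SZ)"
REPORT="$WORKDIR/report-$STAMP.txt"
{
  echo "run-at: $STAMP"
  echo "status: ok"
} > "$REPORT"
# The delivered message can then reference a path that provably exists.
echo "artifact: $REPORT"
```

Even if every delivery channel is down, the file is still there to inspect later, which is exactly the evidence this guide keeps asking for.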
If you ever see “it said it saved a file but nothing exists”, fix workspace/persistence first:
4) Heartbeat expectations (why “it’s lazy” happens)
Heartbeat is not magic free will — it is a scheduling mechanism and still depends on:
- gateway uptime
- provider availability (rate limits)
- stable state directory
If your cron/heartbeat runs but the model silently fails, use:
5) Upgrade-safe automation
Upgrades and redeploys often “break cron” indirectly by wiping state or changing the runtime binary.
Do these once:
- Persist state (~/.openclaw/ or $OPENCLAW_STATE_DIR)
- Pin gateway auth via env vars (so restores are reproducible)
- Practice a restore drill
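The restore drill can be rehearsed without touching the live install. A sketch; the state path is an assumption, and mkdir -p is only there so the drill also works on a throwaway host:

```shell
#!/usr/bin/env sh
set -e
# Back up the state dir, restore it elsewhere, and prove the copies match.
STATE="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}"   # assumed default location
mkdir -p "$STATE"                                # no-op on a real install
BACKUP="$(mktemp -d)/openclaw-state.tar.gz"
tar -czf "$BACKUP" -C "$(dirname "$STATE")" "$(basename "$STATE")"
RESTORE="$(mktemp -d)"
tar -xzf "$BACKUP" -C "$RESTORE"
diff -r "$STATE" "$RESTORE/$(basename "$STATE")" # fails loudly on drift
echo "restore drill: OK"
```

Run the drill before an upgrade, not during the outage it was meant to prevent.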
Useful references:
6) Verification checklist after the first real week
Do not call cron “reliable” after one green run. Verify all of these:
- at least two scheduled runs created visible artifacts,
- the runs appear in openclaw cron runs --id <job-id>,
- delivery succeeded or the artifact path was still recoverable without delivery,
- a gateway restart did not wipe auth, state, or run history,
- and one failure path still left evidence you could inspect later.
7) What to tighten first when cron feels flaky
If the workflow still feels unreliable, tighten in this order:
- gateway uptime and restart policy,
- state persistence and restore reproducibility,
- delivery design,
- then prompt quality and output formatting.
That order keeps you from tuning the least important layer first.