
Compaction deadlock: /new and /reset hang — emergency out-of-band session reset

If compaction timeouts deadlock the session lane and even `/new` / `/reset` hang, recover by stopping the gateway and resetting the stuck session from disk (backup first).

By CoClaw Team

Symptoms

  • Your gateway is “up” (service running, channel polling reconnects), but it stops processing messages.
  • You see a typing indicator for a while, then nothing.
  • Recovery commands hang indefinitely because they queue behind compaction:
    • /new
    • /reset
    • openclaw acp ... --reset-session
  • Gateway logs show compaction starting repeatedly and timing out (for example 300s/600s timeouts), often triggered on every inbound message for the same session.
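To confirm the loop before touching anything, it can help to count timeout lines in a gateway log. The following is a minimal sketch, assuming you pass the path to your own log file and that timeout lines mention "compaction" together with "timeout" or "timed out". The exact log wording is not documented here, so treat the pattern as a guess and adjust it to what your logs print. `count_compaction_timeouts` is a hypothetical helper name, not an OpenClaw command.

```shell
# Hypothetical helper: count log lines that look like compaction timeouts.
# The pattern is an assumption about the wording; adapt it to your real log lines.
count_compaction_timeouts() {
  local log_file="$1"
  # -E extended regex, -i case-insensitive, -c print the number of matching lines
  grep -Eic 'compaction.*(timeout|timed out)' "$log_file"
}
```

A count that keeps growing with every inbound message for the same session is consistent with the failure loop described above.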

Cause

OpenClaw session processing is effectively single-lane for a given session key.

If compaction enters a failure loop (timeouts, rate limits, repeated retries), it can block the lane long enough that:

  • normal inbound messages cannot complete, and
  • administrative “in-band” resets (/new, /reset, --reset-session) can’t run either because they share the same lane.

At that point, the practical recovery is out-of-band: stop the gateway process and reset the stuck session from disk.

Fix (Emergency out-of-band reset)

Safety rule: Back up first. You’re going to touch files under your gateway state directory.

0) Make sure you are on the gateway host

If you run the gateway on a remote machine/container, SSH into that host first. The session files you need are on the gateway host, not on your laptop UI.

Default state dir is ~/.openclaw unless you set OPENCLAW_STATE_DIR.
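A quick way to check that you are on the right host is to resolve the sessions directory and confirm it exists. This is a small sketch that uses the default described above; `sessions_dir_for` is a hypothetical helper name, and the agent id defaults to `main`.

```shell
# Hypothetical helper: print the sessions directory for an agent,
# or fail if it does not exist on this machine.
sessions_dir_for() {
  local agent_id="${1:-main}"
  # Honor OPENCLAW_STATE_DIR if set, otherwise fall back to ~/.openclaw
  local state_dir="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}"
  local dir="$state_dir/agents/$agent_id/sessions"
  if [ ! -d "$dir" ]; then
    echo "not found: $dir (wrong host or custom state dir?)" >&2
    return 1
  fi
  printf '%s\n' "$dir"
}
```

If `sessions_dir_for main` fails, you are probably on the wrong machine or the gateway runs with a different state directory.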

1) Stop the gateway completely (do not rely on in-band commands)

Try a normal stop first:

openclaw gateway stop

If it doesn’t stop quickly, use your OS process manager and force-kill the gateway process. (The goal is: no OpenClaw process should be writing session files while you reset them.)
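On macOS/Linux, the force-kill can be sketched as an escalating stop: send SIGTERM, wait a few seconds, then send SIGKILL to whatever is still alive. `force_stop_pids` is a hypothetical helper that takes a timeout and a list of PIDs. How you find the PIDs depends on your setup, so the `pgrep` pattern in the usage note is an assumption.

```shell
# Hypothetical helper: force_stop_pids <timeout_seconds> <pid>...
# Sends SIGTERM, polls once per second, then SIGKILLs any survivor.
force_stop_pids() {
  local timeout="$1"; shift
  local pid waited
  for pid in "$@"; do
    kill -TERM "$pid" 2>/dev/null || continue   # process already gone
    waited=0
    # kill -0 only tests whether the process still exists
    while kill -0 "$pid" 2>/dev/null && [ "$waited" -lt "$timeout" ]; do
      sleep 1
      waited=$((waited + 1))
    done
    if kill -0 "$pid" 2>/dev/null; then
      kill -KILL "$pid" 2>/dev/null
    fi
  done
}
```

For example, inspect `pgrep -fl openclaw` first, then run `force_stop_pids 10 <pid>` with the PIDs you identified as gateway processes. The pattern `openclaw` may match unrelated processes, so check the list before killing anything.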

2) Back up the sessions directory

On macOS/Linux (default agent id main):

STATE_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}"
AGENT_ID="main"
TS="$(date +%Y%m%d-%H%M%S)"

mkdir -p "$STATE_DIR/_recovery"
rsync -a "$STATE_DIR/agents/$AGENT_ID/sessions/" "$STATE_DIR/_recovery/sessions.backup.$TS/"
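Before moving on, it is worth confirming that the backup matches the live directory. This is a minimal sketch using `diff -r`; `verify_backup` is a hypothetical helper name.

```shell
# Hypothetical helper: return 0 only if the backup tree matches the live tree.
verify_backup() {
  local src="$1"
  local dst="$2"
  # diff -r walks both trees; any missing or differing file gives a non-zero exit
  diff -r "$src" "$dst" >/dev/null
}
```

With the variables from the backup step still set, a check could look like `verify_backup "$STATE_DIR/agents/$AGENT_ID/sessions" "$STATE_DIR/_recovery/sessions.backup.$TS" && echo "backup OK"`.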

On Windows, copy the equivalent ...\.openclaw\agents\main\sessions\ folder to a safe backup location (Explorer is fine).

3) Choose a reset strategy

Option A (fastest, most reliable): move the whole sessions folder aside

This resets all sessions for that agent (your channel will come back on a clean slate), while preserving a backup.

STATE_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}"
AGENT_ID="main"
TS="$(date +%Y%m%d-%H%M%S)"

mv "$STATE_DIR/agents/$AGENT_ID/sessions" "$STATE_DIR/agents/$AGENT_ID/sessions.stuck.$TS"
mkdir -p "$STATE_DIR/agents/$AGENT_ID/sessions"

Option B (more targeted): reset only the stuck session key

Use this if you want to preserve other sessions, and you’re comfortable editing one JSON file.

  1. Open the store file:
     • ~/.openclaw/agents/<agentId>/sessions/sessions.json
  2. Find the stuck session key (common examples):
     • DM continuity: agent:main:main
     • Telegram DM with isolated scope: agent:main:telegram:dm:<peerId>
  3. Note the sessionId for that key, then:
     • Rename the transcript: ~/.openclaw/agents/<agentId>/sessions/<sessionId>.jsonl → something like <sessionId>.jsonl.reset.manual.<timestamp>
     • Delete the session key entry from sessions.json (deleting entries is safe; OpenClaw recreates them on demand).
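If you prefer not to edit the JSON by hand, the same steps can be sketched with `jq`. This assumes that sessions.json is a top-level object keyed by session key and that each entry has a `sessionId` field. The steps above imply that layout but do not confirm it, so check your file first. `reset_session_key` is a hypothetical helper, and the gateway must be stopped before you run it.

```shell
# Hypothetical helper: reset_session_key <sessions_dir> <session_key> <timestamp>
# Assumption: sessions.json is an object keyed by session key, and each entry
# has a "sessionId" field. Verify this against your own file before running.
reset_session_key() {
  local sessions_dir="$1" key="$2" ts="$3"
  local store="$sessions_dir/sessions.json"
  local session_id

  session_id="$(jq -r --arg k "$key" '.[$k].sessionId // empty' "$store")" || return 1
  if [ -z "$session_id" ]; then
    echo "no sessionId found for key: $key" >&2
    return 1
  fi

  # Keep the old transcript under a new name rather than deleting it
  if [ -f "$sessions_dir/$session_id.jsonl" ]; then
    mv "$sessions_dir/$session_id.jsonl" \
       "$sessions_dir/$session_id.jsonl.reset.manual.$ts"
  fi

  # Drop only the stuck key; write to a temp file, then replace the store
  jq --arg k "$key" 'del(.[$k])' "$store" > "$store.tmp" && mv "$store.tmp" "$store"
}
```

A call might look like `reset_session_key "$STATE_DIR/agents/main/sessions" "agent:main:main" "$(date +%Y%m%d-%H%M%S)"`. Other session keys and their transcripts are left untouched.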

4) Restart the gateway and immediately reset the chat

Restart your gateway service (LaunchAgent, systemd, Docker, etc.), then send /new once in the affected chat/channel to make sure you’re on a fresh session.

Verify

  • The gateway responds quickly again:
    • openclaw health (or openclaw gateway status)
    • /status in chat returns promptly
  • The affected channel stops “typing forever”.
  • /new and /reset execute immediately again.

Prevention (so it doesn’t happen again)

  • Keep huge tool outputs out of the main session history. Write logs/results to files and send a short summary + links/paths.
  • Enable session maintenance so session stores don’t grow unbounded:
    • Run openclaw sessions cleanup --dry-run to preview impact.
    • Consider setting session.maintenance.mode: "enforce" with sane pruneAfter + maxEntries.
  • Enable tool-result pruning (Anthropic/OpenRouter Anthropic) to reduce toolResult bloat between compactions:
{
  agents: {
    defaults: {
      contextPruning: { mode: "cache-ttl", ttl: "5m" },
    },
  },
}
  • If you hit repeated compaction failures, do a proactive /new (or /compact) before the session becomes a recovery incident.
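The first prevention point (keep large outputs out of the session history) can be sketched as a wrapper that sends a command's full output to a file and prints only a short summary. `run_with_log` is a hypothetical helper, not part of OpenClaw. How you wire it into your own tools or scripts is up to you.

```shell
# Hypothetical helper: run_with_log <out_file> <command> [args...]
# Writes full stdout/stderr to <out_file>, prints a one-line summary plus the
# last 5 lines, and returns the command's exit status.
run_with_log() {
  local out_file="$1"; shift
  local status=0
  "$@" > "$out_file" 2>&1 || status=$?
  printf 'exit=%s lines=%s log=%s\n' \
    "$status" "$(wc -l < "$out_file" | tr -d ' ')" "$out_file"
  tail -n 5 "$out_file"
  return "$status"
}
```

This way the session sees at most six lines, and the full output stays on disk at a path you can share.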

Verification & references

  • Reviewed by: CoClaw Code Team
  • Last reviewed: March 14, 2026
  • Verified on: macOS · Linux · Windows
