Reducing OpenClaw token usage

TL;DR

I had reduced my OpenClaw heartbeat cadence a few weeks ago expecting a big drop in LLM credit usage. It barely moved the bill. Digging into the per-call usage logs, I found that every persistent session had compactionCount: 0, so each cron run and group chat reply was replaying its own growing transcript on top of a 12k-character bootstrap. The fixes that helped were elsewhere: a 500 KB compaction trigger, deduplicated overlap between the heartbeat and a daily cron, and a reset of the stale sessions. Together they took my daily token total from ~2.2M to ~256k. Slowing the heartbeat to once a day contributed a small share.

Motivation

I had reduced the heartbeat cadence weeks earlier expecting credit usage to fall in proportion. It did not, and the bill kept growing. I had been treating the heartbeat as the dominant cost without checking. The next step was to find out which session was burning the tokens.

What the trajectory files told me

The per-event session transcripts at ~/.openclaw/agents/main/sessions/<sid>.jsonl are just the user/assistant message stream. The numbers I needed were one file over: <sid>.trajectory.jsonl. Every model.completed event has a data.usage object with input, cacheRead, output, and total.

import glob
import json
from collections import defaultdict

base = "/path/to/agents/main/sessions"
# day -> sessionKey -> [input, cacheRead, output, call count]
day_key = defaultdict(lambda: defaultdict(lambda: [0, 0, 0, 0]))

for fp in glob.glob(f"{base}/*.trajectory.jsonl"):
    # Close each file promptly instead of leaking the handle.
    with open(fp) as fh:
        for line in fh:
            if not line.strip():
                continue  # tolerate blank lines in the JSONL
            e = json.loads(line)
            if e.get("type") != "model.completed":
                continue
            u = e.get("data", {}).get("usage") or {}
            d = e.get("ts", "")[:10]  # date prefix, assuming an ISO-style timestamp string
            sk = e.get("sessionKey", "?")
            day_key[d][sk][0] += u.get("input", 0)
            day_key[d][sk][1] += u.get("cacheRead", 0)
            day_key[d][sk][2] += u.get("output", 0)
            day_key[d][sk][3] += 1

Aggregating by sessionKey made the picture clear. The heartbeat was not the biggest spender. Four daily personal-routine crons plus a handful of group chat replies each accounted for more tokens than the heartbeat ticks I had been pruning. Each call averaged about 160k tokens, billed at frontier-model rates, with most of that in the cacheRead column.
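Turning the nested counters into a ranking takes only a short helper. This is a sketch rather than the exact script I ran: rank_sessions is a name made up here, the session keys and numbers in the sample are invented, and the total is simply the sum of the three token columns, which may not match the usage.total field.

```python
def rank_sessions(day_key, day):
    """Return (sessionKey, total_tokens, calls) for one day, biggest first.

    Assumes each value is [input, cacheRead, output, calls], the layout
    used by the aggregation loop. "total" is the sum of the first three.
    """
    rows = []
    for session_key, (inp, cache_read, out, calls) in day_key.get(day, {}).items():
        rows.append((session_key, inp + cache_read + out, calls))
    return sorted(rows, key=lambda r: r[1], reverse=True)


# Made-up numbers and session keys, only to show the shape of the output.
sample = {
    "2026-01-01": {
        "cron:morning": [4_000, 150_000, 1_000, 1],
        "heartbeat": [2_000, 60_000, 500, 2],
    }
}
for key, total, calls in rank_sessions(sample, "2026-01-01"):
    print(f"{key:<16} {total:>9,} tokens in {calls} call(s)")
```

Sorting by total rather than by call count is what exposed the crons: they ran rarely but each run was expensive.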

Every session was leaking credits

I checked compactionCount for every persistent session in agents/main/sessions/sessions.json. All of them sat at 0. They had been alive for days or weeks, accumulating message history, and had never been summarised. By default, OpenClaw triggers compaction only when the prompt nears the model’s token limit. My context window was large enough that this trigger never fired in normal use, so each call replayed the whole growing transcript.
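A quick way to list the sessions that have never compacted is sketched below. The layout of sessions.json is an assumption: I treat it as a mapping from session key to a record holding compactionCount, so adjust the traversal if your file is shaped differently. The function names and sample keys are made up for illustration.

```python
import json


def never_compacted(sessions):
    """Return the keys of sessions whose compactionCount is missing or 0.

    Assumes `sessions` maps session key -> record dict. That is a guess at
    the sessions.json layout, not a documented schema.
    """
    return sorted(
        key
        for key, record in sessions.items()
        if isinstance(record, dict) and not record.get("compactionCount")
    )


def load_sessions(path):
    """Read sessions.json from `path` and return the parsed object."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Inline sample instead of a real file, with hypothetical session keys.
sample = {
    "cron:daily-question": {"compactionCount": 0},
    "group:family": {"compactionCount": 2},
    "heartbeat": {},
}
print(never_compacted(sample))
```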

The schema for agents.defaults.compaction has a knob for this: maxActiveTranscriptBytes. Left unset, OpenClaw does no transcript-size-driven compaction at all. The other knob worth turning on is memoryFlush.enabled with forceFlushTranscriptBytes, which runs a memory-extraction pass to save key facts to a separate file before the compactor throws history away.

The duplication I missed

One section of my HEARTBEAT.md said “if no question has been sent today, send one” for a daily-question routine. A separate cron job also fired at 10:00 to send the same question. The cron payload referenced HEARTBEAT.md as a pre-read. So:

The cron evaluated the same instructions and sent the question.
Every subsequent heartbeat tick that day re-read HEARTBEAT.md, checked whether the question had been sent, and replied HEARTBEAT_OK.

Each of those HEARTBEAT_OK ticks still loaded the workspace bootstrap, the full session transcript, and tools to inspect the relevant state files before deciding to do nothing. With the heartbeat firing four times a day, the no-op evaluations cost more than the question itself.

I split ownership: the cron owns the once-a-day outbound question, and the telegram group’s systemPrompt owns the reactive “update the state files when a reply lands” half. The cron payload became self-contained with an idempotency check against the question log. HEARTBEAT.md shrank to one line.
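The idempotency check itself lives in the cron prompt as plain instructions, not as code. Expressed as a sketch, the rule is roughly the following. The log format (one JSON object per line with a date field) and the function name are hypothetical, chosen only to make the logic concrete.

```python
import json
from datetime import date


def question_already_sent(log_lines, today):
    """Return True if any log entry carries today's date.

    `log_lines` is an iterable of JSONL strings. Each entry is assumed to
    have a "date" field in YYYY-MM-DD form (a hypothetical format).
    """
    target = today.isoformat()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        if json.loads(line).get("date") == target:
            return True
    return False


log = ['{"date": "2026-01-01", "question": "..."}']
print(question_already_sent(log, date(2026, 1, 1)))  # entry exists: skip sending
print(question_already_sent(log, date(2026, 1, 2)))  # no entry yet: send
```

Keeping the check next to the action means a retried or duplicated cron run cannot send the question twice, and nothing else needs to re-evaluate it during the day.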

The changes

In openclaw.json:

"agents": {
  "defaults": {
    "heartbeat": {
      "every": "24h"           // was "4h"
    },
    "compaction": {
      "truncateAfterCompaction": true,
      "notifyUser": true,
      "maxActiveTranscriptBytes": "500kb",
      "memoryFlush": {
        "enabled": true,
        "forceFlushTranscriptBytes": "1mb"
      }
    }
  }
}

In the cron payload, I removed the HEARTBEAT.md pre-read and inlined the question-asking rules with an idempotency check. In the telegram group systemPrompt, I added the reactive update rules. In HEARTBEAT.md, I deleted everything except the active-threads scan line.

I then reset the heartbeat and the cron sessions so the new prompts took effect immediately rather than waiting for the live transcripts to roll over. The reset is the same procedure I wrote up here.

What happened

The day before the changes, my VM consumed about 2.2M tokens across 14 model calls. The first full day after, it consumed 256k tokens across 7 calls. Per-call averages dropped from roughly 160k tokens to 35k. The cacheRead column, which had been growing as transcripts accumulated, fell about 7x.

None of the sessions has compacted yet. With the heartbeat at one tick a day and crons running once a day each, a fresh session takes a while to reach 500 KB. The runtime now has a trigger to act on whenever a session does reach that threshold, instead of letting the transcript grow until the context window can no longer fit it.
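To see how far each session is from the trigger, comparing transcript file sizes against the threshold is a reasonable proxy. The sketch assumes that the limit is measured against the on-disk size of the session's .jsonl transcript and that "500kb" means 500 × 1024 bytes. I have not confirmed either against the OpenClaw source, and the helper names are my own.

```python
import os

LIMIT_BYTES = 500 * 1024  # assumed meaning of "500kb"


def fill_ratio(size_bytes, limit_bytes=LIMIT_BYTES):
    """Fraction of the compaction threshold a transcript has used."""
    return size_bytes / limit_bytes


def transcript_sizes(directory):
    """Map each plain transcript file (not *.trajectory.jsonl) to its size."""
    sizes = {}
    for name in os.listdir(directory):
        if name.endswith(".jsonl") and not name.endswith(".trajectory.jsonl"):
            sizes[name] = os.path.getsize(os.path.join(directory, name))
    return sizes


def report(sizes):
    """Return (name, percent_of_limit) pairs, fullest first."""
    rows = [(name, round(100 * fill_ratio(size), 1)) for name, size in sizes.items()]
    return sorted(rows, key=lambda r: r[1], reverse=True)


# Hypothetical sizes; use report(transcript_sizes("/path/to/sessions")) for real ones.
print(report({"a.jsonl": 128_000, "b.jsonl": 512_000}))
```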

Lessons

Two things took me longer than they should have. I assumed heartbeat frequency was the dominant cost, and it was not; once I aggregated by sessionKey, the distribution showed up. I also assumed compaction was a thing that “just happens”, and it is not. Without a transcript-size trigger, I was relying on the model’s context window pressure to do compaction for me, and a generous context window means that pressure may never arrive.

Update, two weeks later

The numbers above held for about twelve days. Daily totals tracked between 150k and 300k tokens, the weekly cap looked manageable, and I assumed the fix had stuck. Then the cap hit 25% with five days of the week still to go.

The culprit was the heartbeat again, but the failure mode was different. To slow it down I had at one point removed the agents.defaults.heartbeat block from openclaw.json entirely, on the assumption that “no config” meant “no heartbeat”. That assumption was wrong. With no block present, OpenClaw falls back to a built-in 30-minute default. After I restarted the gateway, it picked up that default, and the heartbeat fired 48 times a day. On the day I noticed, it had consumed about 2.5M tokens, more than the entire preceding week combined.
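The trap is easy to reproduce in miniature. The function below is not OpenClaw's code, just an illustration of how a schema default takes over when a block is missing. The only number taken from my setup is the 30-minute default, and the everyMinutes key is a simplification invented for this sketch.

```python
DEFAULT_INTERVAL_MIN = 30  # the built-in default I ran into


def effective_interval_minutes(config):
    """Resolve the heartbeat interval in minutes; missing means default, not off.

    `config` is a plain dict standing in for openclaw.json. The key
    "everyMinutes" is made up for this sketch.
    """
    heartbeat = config.get("agents", {}).get("defaults", {}).get("heartbeat")
    if not heartbeat or "everyMinutes" not in heartbeat:
        return DEFAULT_INTERVAL_MIN
    return heartbeat["everyMinutes"]


def ticks_per_day(interval_minutes):
    """Number of heartbeat ticks in 24 hours at the given interval."""
    return 24 * 60 / interval_minutes


with_block = {"agents": {"defaults": {"heartbeat": {"everyMinutes": 24 * 60}}}}
without_block = {"agents": {"defaults": {}}}

print(ticks_per_day(effective_interval_minutes(with_block)))
print(ticks_per_day(effective_interval_minutes(without_block)))
```

Deleting the block moves the result from one tick a day to 48, which is the jump I saw on the bill.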

The fix was to put the block back with an explicit interval too long to matter:

"heartbeat": {
  "every": "7d"
}

OpenClaw clamps any interval past Node’s setTimeout ceiling (about 24.85 days) and logs a warning. Seven days sits well under that ceiling, so the heartbeat now fires at most once a week.
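The 24.85-day figure follows from Node limiting timer delays to a signed 32-bit number of milliseconds. A quick check of the arithmetic, with "7d" converted to milliseconds by hand rather than by OpenClaw's own parser:

```python
MAX_TIMEOUT_MS = 2**31 - 1  # largest delay Node's setTimeout honours
MS_PER_DAY = 24 * 60 * 60 * 1000

ceiling_days = MAX_TIMEOUT_MS / MS_PER_DAY
seven_days_ms = 7 * MS_PER_DAY

print(f"ceiling: {ceiling_days:.3f} days")
print(f"7d = {seven_days_ms} ms, under ceiling: {seven_days_ms < MAX_TIMEOUT_MS}")
```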

What I take from this: removing a config block did not disable the feature. It fell back to the schema default. To turn something off I now set the knob explicitly. I also missed it because I never checked the gateway logs after the removal. The line that would have told me (heartbeat: started intervalMs:1800000) was in /tmp/openclaw/ from the start, and I just hadn’t looked.
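A check I could have run right after the restart is sketched below. It assumes only that the log line looks like the one quoted above. The file names under /tmp/openclaw/ are left to the caller, and the function name is my own.

```python
import re

# Matches the line quoted above, e.g. "heartbeat: started intervalMs:1800000".
# Whitespace after "intervalMs:" is tolerated in case the format varies.
PATTERN = re.compile(r"heartbeat: started\s+intervalMs:\s*(\d+)")


def heartbeat_intervals_minutes(lines):
    """Return every heartbeat interval found in `lines`, in minutes."""
    found = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            found.append(int(match.group(1)) / 60_000)
    return found


sample_log = [
    "gateway: listening",  # made-up line for contrast
    "heartbeat: started intervalMs:1800000",
]
print(heartbeat_intervals_minutes(sample_log))
```

Anything far below the interval you think you configured is a sign that a default has taken over.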

The 2026.5.28 release notes list “bound heartbeat defaults” as a fix, so this behaviour appears to be addressed upstream, though I have upgraded only recently and haven’t observed long enough to confirm. There is also an open issue (openclaw/openclaw#85025) tracking the broader pattern of opt-in safety knobs around session and compaction defaults; I added a comment on it after hitting the byte-size compaction trigger never firing on codex-backed sessions.