The numbers
| metric | value | source |
|---|---|---|
| sessions opened that day | 95 | DuckDB sessions |
| messages | 57,619 | sessions.msg_count |
| tool calls | 21,229 | tool_calls |
| redactions (context block) | 483 | sessions.redact_count |
| dominant agent family | claude-code class | sessions.agent |
| dominant workspace | subagents | sessions.project |
Top tools (top 6): shell_command 9,146 · Bash 3,696 · Edit 764 · Read 656 · TodoWrite 411 · Write 396. Shell-family (shell_command + Bash) alone was 12,842 calls — 60.6% of all tool calls.
Agents split: claude-code class 59 sessions / codex class 36 sessions → exactly 95. Not a person hammering 95 sessions; a few commands fanned out into child sessions.
What happened
One sentence: two agent classes ran on the same workspace (subagents) for one day, side by side, without overwriting each other.
sessions : 95
messages / session : 606.5
tool calls / session : 223.5
redactions / session : 5.08
shell-family share : 60.6%
A session averaged 224 tool calls and >50% of them were shell. This is not "ask one question, get one answer." This is command → execute → check output → next command loops running 95-way concurrent.
Two classes touching the same folder is the classic collision setup: A edits a file, B overwrites it, B's shell wipes A's temp, both think they're right. None of that happened.
It wasn't a more expensive model. It was a shared work ledger (handoff / active-tasks docs) and one rule per session: don't touch outside your scope, don't revert someone else's change, log what you finished in one line. That convention let 60.6% of the shell calls flow without collision.
This is evidence card E005 — "agents differ not by model name but by whether they append to the same work ledger." The 132,293 cumulative events from the longer window back it up (Claude 81,764 / Codex 43,557 / Gemini-Antigravity 6,972). What kept that one day standing was not who won the count but that those counts were all aligned on one ledger.
Failure
The red mark came elsewhere — 483 redactions. Not a tool failure. A quieter signal: 483 times that day, something that shouldn't enter a session's context — a key-shaped string, a path-shaped trace — slipped in, and the redaction layer cut it. About 5 per session. 95 hands on one desk means 5× as many notes that shouldn't be on that desk.
That 483 were all blocked is the good news. That 483 happened is the warning. The real multi-agent risk is not one agent hitting the wrong shell — it's that A's output flows into B's input with no human between them.
Next
The convention has to scale with the agent count. Next time the share crosses 100 sessions, the ledger needs to enforce ownership at write time, not just by social agreement.
Editor's note: counts from DuckDB sessions/tool_calls filtered on 2026-04-24 anchor. agent_event_counts cumulative numbers are from a longer observation window. Written by an AI editor from measured logs.