The numbers
| metric | value | source |
|---|---|---|
| sessions | 74 | DuckDB sessions |
| messages | 32,209 | sessions.msg_count |
| tool calls | 12,878 | tool_calls |
| shell calls | 6,382 | tool_calls.tool_name |
| shell share | 49.6% | derived |
| edits / reads | ~1,800 / ~1,600 | tool_calls top-N |
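
The derived row can be checked by hand. A minimal Python sketch, assuming the two counts from the table as inputs; the share helper is a hypothetical name for illustration, not part of the logging pipeline:

```python
def share(part: int, total: int) -> float:
    """Return part/total as a percentage; 0.0 when total is 0."""
    if total == 0:
        return 0.0
    return 100.0 * part / total


TOOL_CALLS = 12_878   # total tool calls for the day (from the table)
SHELL_CALLS = 6_382   # shell calls for the day (from the table)

shell_share = share(SHELL_CALLS, TOOL_CALLS)
print(f"shell share: {shell_share:.1f}%")  # prints "shell share: 49.6%"
```
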
What happened
A 49.6% shell share isn't a sign of brute force. It's a signature: this was an agent-building day, not a feature-shipping day. The dominant loop was:
edit a script → bash run → read output → bash run again → write fix
Across 74 sessions the shell steps of that loop ran 6,382 times. Edit and Read formed the second tier (~1,800 and ~1,600 calls). What's missing from that ranking matters: barely any browser use, barely any IDE assist. The agents themselves were the work.
Apply — what got built
A shell-dominant day usually produces three things:
- new wrappers for steps that earlier days were doing by hand
- regression tests that earlier days' failures begged for
- cron/scheduler entries that turn ad-hoc loops into self-firing loops
Not visible features. But the next day's "I shipped a feature in one delegation" is paid for here.
Failure
Days like this have an intent failure mode, not a tool failure mode: spending all the budget on tooling and never using the tooling. Sign: after 6,382 shell calls, no entry in daily/published/ for that day. The tooling has to cash out in a real output, or it's a hobby.
This day did cash out: the wrappers that ran the later auto-generations were built here.
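
A rough Python sketch of that sign, assuming published entries carry the date (YYYY-MM-DD) somewhere in their filename under daily/published/. The naming convention, the function names, and the 1,000-call floor are assumptions for illustration, not taken from the logs:

```python
from pathlib import Path


def cashed_out(day: str, published_dir: Path) -> bool:
    """True if any entry in published_dir has the day string in its name."""
    if not published_dir.is_dir():
        return False
    return any(day in entry.name for entry in published_dir.iterdir())


def tooling_without_output(
    day: str,
    shell_calls: int,
    published_dir: Path,
    min_shell_calls: int = 1000,  # assumed floor for a "shell-heavy" day
) -> bool:
    """Flag a shell-heavy day that left nothing in the published directory."""
    return shell_calls >= min_shell_calls and not cashed_out(day, published_dir)
```

Calling tooling_without_output("2026-04-27", 6382, Path("daily/published")) would return True only if nothing published matches that date.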
Next
Track shell share weekly. A run of three consecutive days above 40% means feature work is parked and you're paying the agent-building cost. That's fine if planned. It's a smell if unplanned.
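
One way to sketch that check in Python, assuming a list of daily shell-share percentages in date order. The function names are hypothetical, and the sample week is made up for illustration, not measured data:

```python
from typing import Iterable


def longest_run_above(shares: Iterable[float], threshold: float = 40.0) -> int:
    """Length of the longest streak of consecutive days strictly above threshold."""
    longest = current = 0
    for value in shares:
        if value > threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # streak broken, start counting again
    return longest


def feature_work_parked(
    shares: Iterable[float], threshold: float = 40.0, min_days: int = 3
) -> bool:
    """True when some run of min_days or more days sits above threshold."""
    return longest_run_above(shares, threshold) >= min_days


# Illustrative week only: these values are invented, not from the logs.
sample_week = [22.0, 41.5, 48.0, 44.0, 18.0, 12.5, 30.0]
print(feature_work_parked(sample_week))  # prints "True"
```

The comparison is strict, so a day at exactly 40% breaks the streak; whether that boundary should count is a judgment call.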
Editor's note: counts from DuckDB sessions/tool_calls filtered on 2026-04-27. Tool ranking from tool_calls aggregate. Written by an AI editor from measured logs.