fix(hooks): persist tool stats from short-lived hook processes #1492

Merged
JeremyDev87 merged 1 commit into master from fix/stats-tool-counter-zero · Apr 12, 2026

Conversation

@JeremyDev87
Owner

Summary

PostToolUse hooks run in fresh, short-lived Python processes — each one creates a SessionStats, calls record_tool_call() exactly once, then exits. With the default flush_interval=10, that single increment lived only in the in-memory accumulator and was discarded at process exit. The disk file never advanced past tool_count: 0, so the Stop hook summary always read [CB] Xm | 0 tools | 0 errors no matter how many tools were used.

Root cause

  • hooks/lib/stats.py flush() only writes when _pending_count >= flush_interval.
  • hooks/post-tool-use.py records exactly once per process and never calls flush().
  • hooks/stop.py reads disk in a separate process whose own in-memory accumulator is empty, so its flush() is a no-op and format_summary() reports the stale 0.
  • tests/test_stats.py::test_data_persists_across_instances already had to call s1.flush() explicitly — the library required caller flushing but the real caller didn't do it.
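The failure mode above can be reproduced in miniature. The sketch below is not the real `hooks/lib/stats.py`: `SketchStats`, `simulate_hook_process`, and `tool_count_after` are hypothetical names, and it assumes that the threshold gates the automatic flush triggered by `record_tool_call()` while an explicit `flush()` writes whatever is pending (which is what the explicit-flush fix implies).

```python
import json
import os
import tempfile


class SketchStats:
    """Hypothetical stand-in for SessionStats; not the real implementation."""

    def __init__(self, path, flush_interval=10):
        self.path = path
        self.flush_interval = flush_interval
        self._pending_count = 0  # in-memory accumulator, lost at process exit

    def _read_disk(self):
        if not os.path.exists(self.path):
            return {"tool_count": 0}
        with open(self.path) as f:
            return json.load(f)

    def record_tool_call(self):
        self._pending_count += 1
        # Assumption: the automatic flush only fires at the threshold.
        if self._pending_count >= self.flush_interval:
            self.flush()

    def flush(self):
        if self._pending_count == 0:
            return  # nothing pending: this is why the Stop hook's flush is a no-op
        data = self._read_disk()
        data["tool_count"] += self._pending_count
        with open(self.path, "w") as f:
            json.dump(data, f)
        self._pending_count = 0


def simulate_hook_process(path, explicit_flush):
    """Mimic one PostToolUse process: fresh instance, one record, then exit."""
    stats = SketchStats(path)
    stats.record_tool_call()
    if explicit_flush:
        stats.flush()
    # Returning here stands in for process exit; the instance is discarded.


def tool_count_after(n_processes, explicit_flush):
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stats.json")
        for _ in range(n_processes):
            simulate_hook_process(path, explicit_flush)
        return SketchStats(path)._read_disk()["tool_count"]
```

Without the explicit flush, every simulated process drops its single increment, so the count on disk never moves.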

Fix

  • Caller (post-tool-use.py): explicit stats.flush() right after record_tool_call() for immediate visibility (statusline, Stop summary).
  • Library (stats.py): track every live SessionStats in a module-level weakref.WeakSet and flush them via a single atexit handler. WeakSet keeps GC unchanged and avoids the per-instance handler leak that atexit.register(self.method) would cause when many instances are created in one process.
  • finalize(): drops the instance from the WeakSet so the post-finalize flush sweep does not resurrect the cleaned-up file.
  • Module docstring: documents the short-lived hook process pattern and the two safety nets.

Regression tests (TestShortLivedProcess)

  1. test_instance_added_to_live_set — newly constructed instances are tracked.
  2. test_record_persists_via_module_atexit_handler — records reach disk via _flush_all_pending() even when the caller forgets flush().
  3. test_finalize_removes_instance_from_live_set — finalized instances disappear and the file is not resurrected.
  4. test_only_one_atexit_handler_regardless_of_instance_count — creating 20 instances must not grow the atexit handler list (no leak).
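One possible way to check the no-leak invariant from test 4 is to patch `atexit.register` and count calls while instances are created. This is a sketch, not the test in `tests/test_stats.py`; `SharedHandlerStats`, `NaiveStats`, and `registrations_while_creating` are hypothetical names introduced for illustration.

```python
import atexit
import weakref
from unittest import mock

_live_instances = weakref.WeakSet()


def _flush_all_pending():
    for inst in list(_live_instances):
        inst.flush()


atexit.register(_flush_all_pending)  # single registration, at import time


class SharedHandlerStats:
    """Joins the shared WeakSet; never registers its own handler."""

    def __init__(self):
        _live_instances.add(self)

    def flush(self):
        pass


class NaiveStats:
    """Registers one handler per instance, which is the leak being avoided."""

    def __init__(self):
        atexit.register(self.flush)

    def flush(self):
        pass


def registrations_while_creating(cls, n):
    """Count atexit.register calls made while constructing n instances."""
    with mock.patch("atexit.register") as fake_register:
        instances = [cls() for _ in range(n)]
        assert len(instances) == n
        return fake_register.call_count
```

Patching keeps the naive variant from actually polluting the interpreter's handler list during the test.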

Test plan

  • yarn workspace codingbuddy-claude-plugin lint — clean
  • yarn workspace codingbuddy-claude-plugin format:check — clean
  • yarn workspace codingbuddy-claude-plugin typecheck — clean
  • yarn workspace codingbuddy-claude-plugin test:coverage — 144/144
  • yarn workspace codingbuddy-claude-plugin circular — no cycles
  • yarn workspace codingbuddy-claude-plugin build — ok
  • python3 -m pytest tests/test_stats.py tests/test_post_tool_use.py tests/test_stop.py — 56/56
  • Security audit (codingbuddy + codingbuddy-claude-plugin + landing-page) — clean
  • End-to-end simulation: 5 short-lived processes (Bash, Edit, Read, Bash, Grep) → disk shows tool_count: 5, tool_names: {Bash:2, Edit:1, Read:1, Grep:1} and format_summary returns [CB] 0m | 5 tools | 0 errors | Bash:2 Edit:1 Read:1 Grep:1.
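An end-to-end check of this kind could be scripted roughly as follows. The child script is a hypothetical stand-in for `post-tool-use.py` (it writes a JSON file directly instead of going through `SessionStats`), and the `tool_count` / `tool_names` layout is taken from the result quoted above.

```python
import json
import os
import subprocess
import sys
import tempfile

# Hypothetical child: one tool call per process, persisted before exit.
CHILD = r"""
import json, os, sys
path, tool = sys.argv[1], sys.argv[2]
data = {"tool_count": 0, "tool_names": {}}
if os.path.exists(path):
    with open(path) as f:
        data = json.load(f)
data["tool_count"] += 1
data["tool_names"][tool] = data["tool_names"].get(tool, 0) + 1
with open(path, "w") as f:
    json.dump(data, f)
"""


def run_simulation(tools):
    """Run one short-lived Python process per tool call, sequentially."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stats.json")
        for tool in tools:
            subprocess.run(
                [sys.executable, "-c", CHILD, path, tool], check=True
            )
        with open(path) as f:
            return json.load(f)
```

The processes run one after another here, so this sketch does not exercise the concurrent case.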

Note: pre-existing test_hud.py failures (◕‿◕ vs ◕ᴗ◕ unicode regression on master) are unrelated to this change and tracked separately.

Follow-ups (out of scope, separate issues recommended)

  1. flush() read-modify-write race: _locked_read and _locked_write release the lock between calls, allowing concurrent PostToolUse processes to lose updates.
  2. record_hook_timing is never called: defined in stats.py but no hook actually invokes it, so hook timing aggregation is dead code today.
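For follow-up 1, one common remedy is to hold a single exclusive lock across the whole read-modify-write instead of locking the read and the write separately. The sketch below assumes a POSIX advisory lock via `fcntl.flock` and a JSON file with a `tool_count` field; the locking mechanism actually used by `_locked_read` and `_locked_write` is not shown in this PR, and `locked_increment` is a hypothetical helper.

```python
import fcntl
import json
import os


def locked_increment(path, delta):
    """Add delta to tool_count while holding one lock for the full update."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until no other process holds it
        try:
            raw = f.read()
            data = json.loads(raw) if raw else {"tool_count": 0}
            data["tool_count"] += delta
            # Rewrite in place while the lock is still held.
            f.seek(0)
            f.truncate()
            f.write(json.dumps(data))
            f.flush()
            return data["tool_count"]
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

With the lock spanning both steps, a second process cannot read the old value between another process's read and write. `fcntl` is Unix-only, so a Windows-compatible plugin would need a different primitive.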

PostToolUse hooks run in fresh Python processes that record exactly
one tool call before exiting. With flush_interval=10, that single
record_tool_call() never reached disk, leaving the Stop hook summary
permanently stuck at "[CB] Xm | 0 tools | 0 errors" no matter how
many tools were used in the session.

- post-tool-use.py: explicit stats.flush() right after record for
  immediate visibility (statusline, Stop summary).
- stats.py: track every live SessionStats in a module-level WeakSet
  and flush them via a single atexit handler. WeakSet keeps GC
  unchanged and avoids the per-instance handler leak that a naive
  atexit.register(self.method) would cause.
- finalize() drops the instance from the WeakSet so the post-finalize
  flush sweep does not resurrect the cleaned-up file.
- Module docstring documents the short-lived process pattern.
- Regression tests cover: WeakSet membership, persistence via the
  module flush handler, finalize() removal, and the no-handler-leak
  invariant (20 instances must not register 20 handlers).
@JeremyDev87 added the "bug" label (Something isn't working) on Apr 12, 2026
@vercel
vercel Bot commented Apr 12, 2026

Project: codingbuddy-landing
Deployment: Ready
Actions: Preview, Comment
Updated (UTC): Apr 12, 2026 0:32am
