perf(parser): Phase B — TreeCursor migration for parser.Walk #146

Merged
aksOps merged 1 commit into main from perf/enrich-oom-phase-b on May 13, 2026
@aksOps aksOps commented May 13, 2026

Summary

Phase B of the enrich OOM fix plan. Single task: rewrite `parser.Walk` from recursive `Node.Child(i)` to iterative TreeCursor traversal. Public API unchanged; all callers untouched.
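As a rough illustration of the rewrite described above, here is a minimal sketch of a cursor-style pre-order walk next to its recursive counterpart. It does not use the real binding: `node`, `cursor`, `walkCursor`, `walkRecursive`, and the callback signature are hypothetical stand-ins over an in-memory tree, and the three cursor moves only mimic the first-child / next-sibling / parent moves a tree-sitter cursor offers. The actual `parser.Walk` signature may differ.

```go
package main

import "fmt"

// node is a hypothetical in-memory stand-in for a syntax-tree node.
type node struct {
	kind     string
	children []*node
}

// frame records a node and its index among its parent's children.
type frame struct {
	n   *node
	idx int
}

// cursor is a toy cursor; path holds the frames from the root to the current node.
type cursor struct{ path []frame }

func (c *cursor) current() *node { return c.path[len(c.path)-1].n }

func (c *cursor) gotoFirstChild() bool {
	n := c.current()
	if len(n.children) == 0 {
		return false
	}
	c.path = append(c.path, frame{n: n.children[0], idx: 0})
	return true
}

func (c *cursor) gotoNextSibling() bool {
	if len(c.path) < 2 {
		return false // the root has no siblings
	}
	top := c.path[len(c.path)-1]
	parent := c.path[len(c.path)-2].n
	if top.idx+1 >= len(parent.children) {
		return false
	}
	c.path[len(c.path)-1] = frame{n: parent.children[top.idx+1], idx: top.idx + 1}
	return true
}

func (c *cursor) gotoParent() bool {
	if len(c.path) < 2 {
		return false
	}
	c.path = c.path[:len(c.path)-1]
	return true
}

// walkCursor visits nodes in pre-order without Go-level recursion.
func walkCursor(root *node, fn func(*node)) {
	if root == nil {
		return
	}
	c := &cursor{path: []frame{{n: root}}}
	for {
		fn(c.current())
		if c.gotoFirstChild() {
			continue
		}
		// No children: climb until a next sibling exists, or stop at the root.
		for !c.gotoNextSibling() {
			if !c.gotoParent() {
				return
			}
		}
	}
}

// walkRecursive is the child-index form, kept here for an order comparison.
func walkRecursive(n *node, fn func(*node)) {
	if n == nil {
		return
	}
	fn(n)
	for i := 0; i < len(n.children); i++ {
		walkRecursive(n.children[i], fn)
	}
}

// kinds collects the visitation order produced by a walk function.
func kinds(walk func(*node, func(*node)), root *node) []string {
	var out []string
	walk(root, func(n *node) { out = append(out, n.kind) })
	return out
}

func equal(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func sampleTree() *node {
	body := &node{kind: "body", children: []*node{{kind: "call"}}}
	fn := &node{kind: "func", children: []*node{{kind: "name"}, body}}
	return &node{kind: "file", children: []*node{fn, {kind: "comment"}}}
}

func main() {
	root := sampleTree()
	fmt.Println(kinds(walkRecursive, root))
	fmt.Println(kinds(walkCursor, root))
	fmt.Println(equal(kinds(walkRecursive, root), kinds(walkCursor, root)))
}
```

Both walks visit the sample tree in the same pre-order sequence, which is the property the determinism claim in this PR relies on. The toy cursor keeps its path in a Go slice; a real cursor tracks position inside the parsed tree instead.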

Honest accounting

The plan estimated a 90%+ allocation reduction from TreeCursor, based on a generic claim that cursors avoid per-node allocation. In smacker/go-tree-sitter v0.x, `TreeCursor.CurrentNode()` still routes through `Tree.cachedNode`, which heap-allocates a `*Node` on first visit per node. So Phase B does not reduce allocations beyond what Phase A already achieved.

What this PR DOES deliver:

  • Removes Go-level recursion frames per descent (stack discipline).
  • Matches the canonical tree-sitter walking idiom (code clarity).
  • Preserves determinism — pre-order DFS visitation order matches the recursive form exactly.

The `cachedNode` hot spot that pprof flagged on airflow, accounting for 91% of allocations, was driven by per-node re-parse. Task A1 fixed that by parsing once per file, so the additional allocation savings from Phase B alone are within noise.

Why ship it anyway

  • The plan is structured as four phase PRs; this is Phase B.
  • Future binding upgrades (or a switch to a binding that exposes a non-allocating cursor API) would benefit from already being on the cursor path.
  • Code is shorter and matches tree-sitter's documented traversal idiom.

Test plan

  • `go test ./internal/parser/... ./internal/intelligence/extractor/...` — 39 pass across 6 packages
  • `go test ./... -count=1` — 877 pass (unchanged)
  • `go vet ./...` clean
  • CI on this PR

Replaces the recursive Node.Child(i) traversal with an iterative
tree-sitter TreeCursor walk. Matches the canonical tree-sitter idiom
and removes Go-level recursion frames per descent.

Honest accounting: smacker's *TreeCursor.CurrentNode() still routes
through Tree.cachedNode, so the per-visit *Node allocation is
unchanged. The 91%-of-allocations cachedNode hot spot pprof flagged
on airflow was driven by per-node *re-parse* (Task A1 fixed that by
parsing once per file). Phase B's structural change keeps the public
Walk(root, fn) API identical; callers and tests are untouched.

Plan: docs/superpowers/plans/2026-05-13-enrich-oom-fix.md Task B1.

Verification:
- go test ./internal/parser/... ./internal/intelligence/extractor/... pass
- go test ./... -count=1: 877 pass (unchanged from main)
- Determinism preserved: pre-order DFS visitation order matches the
  recursive form exactly.
@aksOps aksOps merged commit 548a5ec into main May 13, 2026
13 checks passed
@aksOps aksOps deleted the perf/enrich-oom-phase-b branch May 13, 2026 13:28