docs: enrich OOM-fix streaming-refactor plan + gitignore fix #144

Merged
aksOps merged 1 commit into main from docs/enrich-oom-plan
May 13, 2026

aksOps commented May 13, 2026

Summary

Lands the comprehensive implementation plan for fixing the `codeiq enrich` OOM at `~/projects/` scale (49k files / 434k nodes; exit 137 on a 15 GB host).

Plan path: `docs/superpowers/plans/2026-05-13-enrich-oom-fix.md` (829 lines).

Plan structure

| Phase | Tasks | Risk | Expected impact |
|---|---|---|---|
| A: Quick wins | A1 (parse once/file), A2 (bounded goroutine pool), A3 (cap Kuzu BufferPoolSize), A4 (free GraphBuilder maps) | low | ~/projects peak RSS 9-15 GB → 2-4 GB |
| B: TreeCursor migration | B1 (rewrite parser.Walk) | medium | 90%+ allocation reduction |
| C: Streaming three-pass refactor | C1-C5 (interfaces, index pass, linker pass, load pass, cutover) | medium-high | memory-bounded by construction; scales to 10M+ nodes |
| D: Verification harness | D1 (perf-gate CI bench), D2 (real-world acceptance) | low | locks the regression bar in CI |
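
As an illustration of task A2, here is a minimal sketch of a bounded goroutine pool built on a buffered channel used as a semaphore. The function name `runBounded`, the worker limit, and the empty work function are assumptions made for this example; the plan's actual implementation may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// runBounded calls work once per path using at most limit concurrent
// goroutines. It returns how many paths were processed and the highest
// number of workers observed running at the same time.
// Hypothetical helper for illustration; not taken from the codeiq source.
func runBounded(paths []string, limit int, work func(path string)) (done, peak int) {
	sem := make(chan struct{}, limit) // one slot per allowed worker
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		running int
	)
	for _, p := range paths {
		sem <- struct{}{} // blocks while limit workers are already running
		wg.Add(1)
		go func(path string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when this worker ends

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			work(path)

			mu.Lock()
			running--
			done++
			mu.Unlock()
		}(p)
	}
	wg.Wait()
	return done, peak
}

func main() {
	paths := make([]string, 1000) // stand-in for the discovered file list
	done, peak := runBounded(paths, 4, func(string) { /* parse + detect would go here */ })
	fmt.Printf("processed=%d peak_workers=%d\n", done, peak)
}
```

Because the producer blocks on the semaphore before spawning each goroutine, the number of in-flight parse results is capped by `limit` rather than by the number of files.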

Ralph-loop ready

The plan includes a §"Ralph-loop execution recipe" with:

  • Completion promise: `OOM FIXED on ~/projects/` gated by 6 acceptance criteria
  • Per-iteration recipe (graphviz DOT diagram)
  • Progress file format at `.claude/oom-fix-progress.md`
  • Inter-task / inter-phase semantics (commits per task, one PR per phase)
  • Failure / blocker handling (same-step-twice → mark blocked, don't retry)

.gitignore fix

`docs/superpowers/*` was ignoring the `plans/` directory itself, so the later `!docs/superpowers/plans/*.md` negation couldn't re-include files. Added `!docs/superpowers/plans/` to un-ignore the directory before un-ignoring the .md files.
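
The rule ordering described above can be sketched as the following fragment. It assumes these three patterns appear in this order with no other rules for the same paths in between; the repository's actual `.gitignore` may contain more.

```gitignore
# Ignore everything directly under docs/superpowers/, including plans/ itself.
docs/superpowers/*
# Re-include the plans/ directory first; git cannot re-include a file
# whose parent directory is still excluded.
!docs/superpowers/plans/
# With the directory visible again, this negation can take effect.
!docs/superpowers/plans/*.md
```

Running `git check-ignore -v` on a plan file shows which rule, if any, still matches it.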

Research provenance

Four parallel research agents informed this plan:

  • Code-walk: memory ownership through every enrich stage
  • Empirical pprof on airflow target: 91% allocations from `tree-sitter.cachedNode`
  • Kuzu API research: no streaming/Appender API exists; chunked COPY FROM + BufferPool cap are the levers
  • ETL patterns: ID-only dedup + compact NodeIndex is the surgical streaming pattern
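
The ID-only dedup and compact NodeIndex idea from the last item can be sketched as follows. All type and function names here (`Node`, `IndexEntry`, `NodeIndex`, `StreamNodes`) are hypothetical and chosen for this example, and the field set is an assumption; the point is only that the index retains a small entry per ID while the full node is forwarded and then dropped.

```go
package main

import "fmt"

// Node is a hypothetical full graph node as a detector might emit it.
type Node struct {
	ID         string
	Kind       string
	Name       string
	Properties map[string]string // heavy payload, not retained by the index
}

// IndexEntry keeps only what a linker would need to resolve references.
type IndexEntry struct {
	Kind string
	Name string
}

// NodeIndex deduplicates by ID and stores compact entries only.
type NodeIndex struct {
	entries map[string]IndexEntry
}

func NewNodeIndex() *NodeIndex {
	return &NodeIndex{entries: make(map[string]IndexEntry)}
}

// Observe records n if its ID is new and reports whether it was new.
func (ix *NodeIndex) Observe(n Node) bool {
	if _, seen := ix.entries[n.ID]; seen {
		return false
	}
	ix.entries[n.ID] = IndexEntry{Kind: n.Kind, Name: n.Name}
	return true
}

// Len returns the number of distinct IDs seen so far.
func (ix *NodeIndex) Len() int { return len(ix.entries) }

// StreamNodes forwards each first-seen node to emit (for example a CSV
// writer) and returns how many nodes were forwarded. The full Node is not
// kept after emit returns.
func StreamNodes(nodes []Node, ix *NodeIndex, emit func(Node)) int {
	count := 0
	for _, n := range nodes {
		if ix.Observe(n) {
			emit(n)
			count++
		}
	}
	return count
}

func main() {
	nodes := []Node{
		{ID: "class:A", Kind: "class", Name: "A", Properties: map[string]string{"file": "a.go"}},
		{ID: "class:B", Kind: "class", Name: "B", Properties: map[string]string{"file": "b.go"}},
		{ID: "class:A", Kind: "class", Name: "A", Properties: map[string]string{"file": "a.go"}},
	}
	ix := NewNodeIndex()
	emitted := StreamNodes(nodes, ix, func(n Node) { fmt.Println("emit", n.ID) })
	fmt.Println("emitted:", emitted, "indexed:", ix.Len())
}
```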

Out of scope

Plan §"Out of scope" lists:

  • Duplicate-PK service IDs (`service:checkbox`, `service:src`)
  • CSV escape bug in BulkLoadEdges (commas in JSON properties)
  • Kuzu version upgrade
  • Distributed enrich

Each is its own future PR.
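
For context on the CSV escape item, the sketch below shows the general class of problem: a JSON property containing commas breaks a naively joined row, while Go's `encoding/csv` writer quotes the field so it round-trips. This is not the `BulkLoadEdges` code, and the helper names are hypothetical; whether this quoting matches what the Kuzu loader expects is an assumption that would need checking.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"strings"
)

// naiveRow joins fields with commas and performs no escaping.
func naiveRow(fields []string) string { return strings.Join(fields, ",") }

// naiveColumns counts the columns a simple comma splitter would see.
func naiveColumns(row string) int { return len(strings.Split(row, ",")) }

// csvRow encodes fields with encoding/csv, which quotes any field that
// contains a comma, a double quote, or a newline.
func csvRow(fields []string) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	if err := w.Write(fields); err != nil {
		return "", err
	}
	w.Flush()
	return buf.String(), w.Error()
}

// parseRow decodes a single CSV record.
func parseRow(row string) ([]string, error) {
	return csv.NewReader(strings.NewReader(row)).Read()
}

func main() {
	// from, to, properties: the third field is JSON with embedded commas.
	fields := []string{"n1", "n2", `{"a":1,"b":"x"}`}

	fmt.Println("naive columns:", naiveColumns(naiveRow(fields)))

	row, err := csvRow(fields)
	if err != nil {
		panic(err)
	}
	parsed, err := parseRow(row)
	if err != nil {
		panic(err)
	}
	fmt.Println("escaped columns:", len(parsed))
}
```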

Test plan

  • Plan file lands at `docs/superpowers/plans/2026-05-13-enrich-oom-fix.md`
  • `.gitignore` change makes `docs/superpowers/plans/*.md` trackable (verified via `git check-ignore`)
  • No code changes in this PR — pure docs + gitignore

Adds the comprehensive implementation plan for fixing the
`codeiq enrich` OOM at ~/projects/ scale (49k files / 434k nodes,
exit 137 on 15 GB host).

Plan structure: 4 phases (Quick wins -> TreeCursor -> Streaming
three-pass refactor -> Verification harness), 12 tasks total, each
shippable as one PR. Includes a ralph-loop execution recipe so the
loop can drive the plan to completion without human gates inside
each phase.

Research backing the plan:
- Empirical pprof on airflow (9,151 files): 91% allocations from
  tree-sitter.(*Tree).cachedNode, peak RSS 3.8 GB
- Trajectory: istio 1.1 GB / airflow 3.8 GB / ~/projects ~9-15 GB
- Code walk: GraphBuilder dedup maps never released after Snapshot;
  ServiceDetector emits 434k CONTAINS edges
- Kuzu defaults: 80% of system RAM for buffer pool, no streaming
  Appender API in v0.7.1-v0.11.3 (issue #2739)
- ETL patterns: ID-only dedup + compact NodeIndex for linkers
  (drops Properties/Annotations) fits ~35 MB at scale

Completion criterion (from plan):
  /usr/bin/time -v codeiq enrich ~/projects/
must complete with peak RSS < 4 GiB, exit 0, populated graph.

Also fixes the .gitignore quirk that was preventing
`docs/superpowers/plans/*.md` from being trackable:
`docs/superpowers/*` ignored the plans/ directory itself, so the
later `!docs/superpowers/plans/*.md` negation couldn't take effect.
Added `!docs/superpowers/plans/` to un-ignore the directory first.

aksOps merged commit 9994900 into main on May 13, 2026
13 checks passed
aksOps deleted the docs/enrich-oom-plan branch on May 13, 2026 12:43