perf(enrich): Phase A quick wins for OOM fix by aksOps · Pull Request #145 · RandomCodeSpace/codeiq

aksOps · 2026-05-13T13:01:08Z

Summary

Phase A of the enrich OOM fix plan (docs/superpowers/plans/2026-05-13-enrich-oom-fix.md). Four surgical fixes that target the actual hot spots pprof exposed on a real-world polyglot Python target (airflow, 9,151 files → 3.8 GB peak RSS; trajectory extrapolates to OOM at ~/projects/ scale).

Tasks landed

Task	What	Commit
A1	Parse tree-sitter tree once per file, not once per node. Adds `ExtractFromTree(ctx, tree, nodes) []Result` to `LanguageExtractor`; all 4 language extractors implement it; enricher.go parses once per file. Cuts the 91% `tree-sitter.(*Tree).cachedNode` hot spot pprof flagged.	`60d02d9`
A2	Bound the enricher goroutine pool to `2 * GOMAXPROCS`. Caps simultaneously-live trees + file content strings. New test `TestEnricher_BoundedConcurrency` drives 4×cap files through a tracking extractor; asserts peak in-flight ≤ cap.	`21f07d8`
A3	Cap Kuzu `BufferPoolSize` (new default: 2 GiB) and `MaxNumThreads` (`min(4, GOMAXPROCS)`) via new `OpenOptions` + `OpenWithOptions`. Without the cap, `kuzu.DefaultSystemConfig()` reserves 80% of system RAM as buffer pool, ~12 GiB on a 15 GiB host.	`e311d99`
A4	`GraphBuilder.Snapshot()` nils its dedup maps before returning. Frees ~280 MB of duplicate references that previously coexisted with the snapshot slices through the rest of the enrich pipeline. New `TestSnapshotReleasesDedupMaps`.	`3170fe3`

Expected impact

Per the plan's success criterion: `~/projects/` peak RSS should drop from 9-15 GB (OOM-killed at exit 137) to ~2-4 GB. Real-world verification will run once Phase B + C land too.

Test plan

`go test ./... -count=1` — 876 pass (one new bounded-concurrency test added on top of 875)
`fixture-minimal` index → enrich → stats: identical 45 nodes / 68 edges / 1 service output vs pre-Phase-A
`go vet ./...` clean
CI on this PR

Next phases

This PR is Phase A of 4 from the plan. Phase B (TreeCursor migration), Phase C (streaming three-pass refactor), Phase D (perf-gate CI + real-world acceptance) ship as separate PRs after A merges.

…ask A1)

Each LanguageExtractor.Extract reparsed the source file at its top; on Python at ~13 nodes/file that meant ~13x over-parse. pprof on airflow flagged 91% of total allocations from tree-sitter.(*Tree).cachedNode driven by the per-node re-parse storm.

Adds ExtractFromTree(ctx, tree, nodes) []Result to the LanguageExtractor interface. The orchestrator now parses the file once and calls ExtractFromTree(tree, allNodes); the AST is walked multiple times for distinct node-kinds but never re-parsed. Extract is retained as a thin wrapper for single-node convenience callers and tests.

Plan: docs/superpowers/plans/2026-05-13-enrich-oom-fix.md Task A1.

Per-file caches: matchAllList (py), matchInterfaceAssertion (go), collectExports (ts) are computed once per file rather than once per matching node.

Verification:
- go test ./internal/intelligence/extractor/... -count=1: 28 pass
- go test ./... -count=1: 875 pass
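The parse-once idea from Task A1 can be sketched roughly as follows. This is a minimal illustration, not the code in commit `60d02d9`: `Tree`, `Node`, `Result`, `parse`, `pyExtractor`, `enrichFile`, and `demo` are hypothetical stand-ins, and only the `ExtractFromTree` / `Extract` method names come from the commit message. The parameter types are assumptions.

```go
package main

import (
	"context"
	"fmt"
)

// Tree is a stand-in for a parsed syntax tree. The real type would come from
// the tree-sitter binding; this placeholder only keeps the sketch runnable.
type Tree struct{ Source string }

// Node and Result are simplified, hypothetical shapes.
type Node struct{ Name string }
type Result struct{ Node, Detail string }

// parseCount counts parser invocations so the saving is observable.
var parseCount int

// parse is a hypothetical stand-in for the expensive tree-sitter parse.
func parse(src string) *Tree {
	parseCount++
	return &Tree{Source: src}
}

// LanguageExtractor follows the method names in the commit message;
// the parameter types are assumptions.
type LanguageExtractor interface {
	ExtractFromTree(ctx context.Context, tree *Tree, nodes []Node) []Result
	Extract(ctx context.Context, src string, node Node) []Result
}

type pyExtractor struct{}

// ExtractFromTree handles every node of a file against one shared tree.
func (pyExtractor) ExtractFromTree(ctx context.Context, tree *Tree, nodes []Node) []Result {
	out := make([]Result, 0, len(nodes))
	for _, n := range nodes {
		if ctx.Err() != nil {
			break // stop early if the caller cancelled
		}
		out = append(out, Result{Node: n.Name, Detail: fmt.Sprintf("%d source bytes", len(tree.Source))})
	}
	return out
}

// Extract is the thin single-node wrapper: parse, then delegate.
func (e pyExtractor) Extract(ctx context.Context, src string, node Node) []Result {
	return e.ExtractFromTree(ctx, parse(src), []Node{node})
}

// enrichFile parses once and passes all of the file's nodes in one call.
func enrichFile(ctx context.Context, e LanguageExtractor, src string, nodes []Node) []Result {
	return e.ExtractFromTree(ctx, parse(src), nodes)
}

// demo compares parse counts for per-node Extract and per-file enrichFile.
func demo(nodeCount int) (perNodeParses, perFileParses, results int) {
	ctx := context.Background()
	e := pyExtractor{}
	nodes := make([]Node, nodeCount)
	for i := range nodes {
		nodes[i] = Node{Name: fmt.Sprintf("fn%d", i)}
	}
	src := "def f(): pass"

	parseCount = 0
	for _, n := range nodes {
		e.Extract(ctx, src, n) // old path: one parse per node
	}
	perNodeParses = parseCount

	parseCount = 0
	results = len(enrichFile(ctx, e, src, nodes)) // new path: one parse per file
	perFileParses = parseCount
	return
}

func main() {
	perNode, perFile, results := demo(13)
	fmt.Printf("per-node parses: %d, per-file parses: %d, results: %d\n", perNode, perFile, results)
}
```

With 13 nodes in one file, the per-node path parses 13 times and the per-file path parses once, which matches the ~13x over-parse figure quoted above.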

Previously the enricher spawned one goroutine per source file with no cap. On polyglot Python repos (airflow: 7,456 files) that produced 7k+ concurrent live tree-sitter Trees + file content strings, driving the OOM-prone RSS spike pprof exposed.

Adds a semaphore-bounded fan-out at 2*runtime.GOMAXPROCS(0). Tasks still write to indexed slots, so determinism (sorted file path order) is preserved. Polyglot real-world targets see materially lower peak RSS at no measurable wall-time cost.

Plan: docs/superpowers/plans/2026-05-13-enrich-oom-fix.md Task A2.

Verification:
- New TestEnricher_BoundedConcurrency asserts peak in-flight calls <= 2*GOMAXPROCS by driving 4*cap files through a tracking extractor.
- go test ./... -count=1: 876 pass.
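A semaphore-bounded fan-out with indexed output slots, as described for Task A2, might look like the sketch below. It is an illustration under assumptions rather than the enricher's actual code: `processAll` and `measurePeak` are hypothetical names, and acquiring the semaphore before spawning each goroutine is one possible design (the commit may acquire it inside the goroutine instead).

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// processAll runs work(i) for every index in [0, n) with at most limit tasks
// in flight. Each task writes only to its own slot, so the output order does
// not depend on goroutine scheduling.
func processAll(n, limit int, work func(i int) string) []string {
	out := make([]string, n)
	sem := make(chan struct{}, limit) // buffered channel used as a counting semaphore
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		sem <- struct{}{} // blocks once limit tasks are in flight
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when the task ends
			out[i] = work(i)
		}(i)
	}
	wg.Wait()
	return out
}

// measurePeak drives n tasks through processAll and reports the highest
// number of tasks observed in flight at the same time.
func measurePeak(n, limit int) (int64, []string) {
	var inFlight, peak int64
	out := processAll(n, limit, func(i int) string {
		cur := atomic.AddInt64(&inFlight, 1)
		for {
			old := atomic.LoadInt64(&peak)
			if cur <= old || atomic.CompareAndSwapInt64(&peak, old, cur) {
				break
			}
		}
		runtime.Gosched() // give other tasks a chance to overlap
		atomic.AddInt64(&inFlight, -1)
		return fmt.Sprintf("file-%d", i)
	})
	return atomic.LoadInt64(&peak), out
}

func main() {
	limit := 2 * runtime.GOMAXPROCS(0)
	peak, out := measurePeak(4*limit, limit)
	fmt.Println("peak in flight:", peak, "limit:", limit, "results:", len(out))
}
```

Acquiring the slot in the spawning loop also bounds the number of goroutines that exist at once, not only the number doing work. The trade-off is that the loop itself blocks while the pool is full.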

…sk A3)

kuzu.DefaultSystemConfig() allocates 80% of system RAM as the buffer pool (~12 GiB on a 15 GiB host) before any enrich work runs. Combined with Go-side enricher memory that's enough to OOM the process. The default also allocates full GOMAXPROCS worth of internal threads, amplifying COPY-side working set.

Adds OpenOptions struct + OpenWithOptions(path, opts). Open(path) now applies safe defaults via OpenWithOptions(path, OpenOptions{}):
- BufferPoolBytes: 2 GiB (DefaultBufferPoolBytes)
- MaxThreads: min(4, GOMAXPROCS)

OpenReadOnly is unchanged externally (same signature) but routes through OpenWithOptions internally, so read paths inherit the same buffer pool cap (2 GiB is plenty for read-side caching at our graph scale).

Plan: docs/superpowers/plans/2026-05-13-enrich-oom-fix.md Task A3.

Future polish: surface --max-buffer-pool and --copy-threads CLI flags for power-user tuning (deferred).

Verification:
- go test ./internal/graph/... -count=1: 44 pass
- go test ./... -count=1: 876 pass
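The defaulting behaviour described for Task A3 could be sketched as below. The names `OpenOptions`, `BufferPoolBytes`, `MaxThreads`, and `DefaultBufferPoolBytes` come from the commit message; the field types, the "zero value means use the default" rule, and the `resolve`, `defaultMaxThreads`, and `systemConfig` helpers are assumptions made for illustration. `systemConfig` is a local stand-in so the sketch runs without the Kuzu dependency.

```go
package main

import (
	"fmt"
	"runtime"
)

// DefaultBufferPoolBytes is the 2 GiB cap named in the commit message.
const DefaultBufferPoolBytes uint64 = 2 << 30

// OpenOptions uses the field names from the commit message. The field types
// and the zero-means-default rule are assumptions of this sketch.
type OpenOptions struct {
	BufferPoolBytes uint64
	MaxThreads      uint64
}

// systemConfig is a hypothetical local stand-in for the Kuzu system config,
// carrying the two settings the PR caps (BufferPoolSize, MaxNumThreads).
type systemConfig struct {
	BufferPoolSize uint64
	MaxNumThreads  uint64
}

// defaultMaxThreads returns min(4, GOMAXPROCS).
func defaultMaxThreads() uint64 {
	n := runtime.GOMAXPROCS(0)
	if n > 4 {
		n = 4
	}
	return uint64(n)
}

// resolve fills unset options with the capped defaults and returns the
// configuration that would be handed to the database open call.
func resolve(opts OpenOptions) systemConfig {
	cfg := systemConfig{
		BufferPoolSize: opts.BufferPoolBytes,
		MaxNumThreads:  opts.MaxThreads,
	}
	if cfg.BufferPoolSize == 0 {
		cfg.BufferPoolSize = DefaultBufferPoolBytes
	}
	if cfg.MaxNumThreads == 0 {
		cfg.MaxNumThreads = defaultMaxThreads()
	}
	return cfg
}

func main() {
	cfg := resolve(OpenOptions{}) // what Open(path) would apply in this sketch
	fmt.Printf("buffer pool: %d bytes, max threads: %d\n", cfg.BufferPoolSize, cfg.MaxNumThreads)
}
```

Keeping the defaulting in one function means every entry point (read-write or read-only) that routes through it inherits the same caps, which is the property the commit message relies on for OpenReadOnly.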

GraphBuilder.Snapshot extracted deduped nodes/edges into sorted slices but left builder.nodes and builder.edges maps holding references to the same objects. With the slices and maps coexisting for the rest of the enrich pipeline (~30 sec wall time on ~/projects/), ~280 MB of duplicate references stayed live needlessly.

Clear the maps inside Snapshot before returning. Snapshot is now single-shot: calling it twice on the same builder returns an empty snapshot (acceptable; the only caller is analyzer.Enrich, which calls once).

Plan: docs/superpowers/plans/2026-05-13-enrich-oom-fix.md Task A4.

Verification:
- New TestSnapshotReleasesDedupMaps asserts both nodes + edges maps are nilled after Snapshot returns.
- go test ./... -count=1: 876 pass (no regressions).
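The single-shot Snapshot behaviour from Task A4 can be illustrated with a reduced sketch. Only the `GraphBuilder` and `Snapshot` names and the `nodes` map come from the commit message; the `Node` shape, `NewGraphBuilder`, `AddNode`, and `demo` are hypothetical, and the edges map is omitted for brevity (it would be handled the same way).

```go
package main

import (
	"fmt"
	"sort"
)

// Node is a simplified, hypothetical node shape.
type Node struct{ ID string }

// GraphBuilder deduplicates nodes by ID while the graph is being built.
type GraphBuilder struct {
	nodes map[string]*Node
}

func NewGraphBuilder() *GraphBuilder {
	return &GraphBuilder{nodes: make(map[string]*Node)}
}

// AddNode keeps the first node seen for each ID. It must not be called after
// Snapshot, because writing to the nil map would panic.
func (b *GraphBuilder) AddNode(n *Node) {
	if _, seen := b.nodes[n.ID]; !seen {
		b.nodes[n.ID] = n
	}
}

// Snapshot copies the deduped nodes into a sorted slice, then drops the map
// so the builder no longer holds a second set of references.
func (b *GraphBuilder) Snapshot() []*Node {
	out := make([]*Node, 0, len(b.nodes))
	for _, n := range b.nodes {
		out = append(out, n)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].ID < out[j].ID })
	b.nodes = nil // release the dedup map; a second Snapshot sees no nodes
	return out
}

// demo builds a small graph and returns the sizes of two consecutive snapshots.
func demo() (first, second int, released bool) {
	b := NewGraphBuilder()
	for _, id := range []string{"b", "a", "b", "c", "a"} {
		b.AddNode(&Node{ID: id})
	}
	first = len(b.Snapshot())
	released = b.nodes == nil
	second = len(b.Snapshot()) // ranging over a nil map is safe and yields nothing
	return
}

func main() {
	first, second, released := demo()
	fmt.Println("first snapshot:", first, "second snapshot:", second, "map released:", released)
}
```

Setting the map to nil rather than deleting its keys lets the garbage collector reclaim the map's backing storage as well as the entries, provided nothing else references it.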

Stale doc references after Phase 6 (Java deletion, #132) and the Kuzu 0.7.1 → 0.11.3 bump (#155 + #159).

- CLAUDE.md / PROJECT_SUMMARY.md: bump Kuzu 0.7.1 → 0.11.3, go-sqlite3 1.14.22 → 1.14.44, cobra to 1.10.2; note native FTS.
- AGENTS.md: rewrite "What this repo is" (no more "REST API"); flip `mvn -B -ntp clean verify` → `go test ./...`; clarify that REST + React SPA were deleted in Phase 6 and won't return.
- SECURITY.md: rewrite scope. Drop the dead JAR / serve / REST API / React UI / H2 / Neo4j Embedded references. New in-scope list covers every codeiq subcommand, the 10 MCP tools (with `run_cypher` mutation gate called out), `.codeiq/cache/` (SQLite) + `.codeiq/graph/` (Kuzu), and `read_file` path sandboxing. Add the security CI workflows (CodeQL, Semgrep, OSV-Scanner, Trivy, Gitleaks, SBOM, Socket Security) + perf-gate to the hardening references.
- CHANGELOG.md: populate [Unreleased] with the OOM-fix saga (PRs #145-#148), the five correctness fixes (#149-#153), the Kuzu 0.7.1 → 0.11.3 bump (#155-#158), the FTS migration (#159), the Dependabot config rewrite (#154), and the enrich CLI knobs.

No code changes.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

aksOps added 5 commits May 13, 2026 12:53

fix(enricher_test): remove unused rel assignment (staticcheck SA4006)

a6cbff7

aksOps merged commit 9f54673 into main May 13, 2026
13 checks passed

aksOps deleted the perf/enrich-oom-phase-a branch May 13, 2026 13:22
