feat(incremental): delta-aware index + enrich (skip-by-hash + manifest short-circuit) #174

Merged
aksOps merged 12 commits into main from feat/incremental-indexing
May 15, 2026
@aksOps aksOps commented May 15, 2026

Summary

  • Re-running codeiq index skips parse+detect for files whose content hash already lives in the cache (~5-10× faster on unchanged trees).
  • Re-running codeiq enrich short-circuits when the cache manifest hash matches what the graph stored on its last run (86 ms vs 582 ms full rebuild on fixture-multi-lang).
  • enrich is now safely re-runnable: a new Store.Reset() drops prior CodeNode+GraphMeta data before a full rebuild, so re-runs no longer collide on primary keys.
  • New codeiq diff subcommand previews the cache vs disk delta as JSON without touching anything; index --force and enrich --force/--diff flags surface the new behaviour.
  • Linker emissions (Topic / Entity / ModuleContainment) carry a source tag so future delta-apply work can wipe + re-emit them en bloc via Store.WipeLinkerEdges.
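The skip-by-hash idea from the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `hashCache`, `stats`, `indexFile`, the path-keyed lookup, and the choice of SHA-256 are all assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashCache is a hypothetical stand-in for the real cache: path -> content hash.
type hashCache map[string]string

// stats loosely mirrors the "new Stats counters"; the field names are assumed.
type stats struct {
	CacheHits int
	Parsed    int
}

// contentHash returns the hex-encoded SHA-256 of a file's bytes.
// The hash algorithm is an assumption; the PR does not name one.
func contentHash(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// indexFile skips the expensive parse step when the cached hash for path
// matches the current content, unless force is set.
func indexFile(c hashCache, path string, content []byte, force bool, s *stats, parse func(string, []byte)) {
	h := contentHash(content)
	if prev, ok := c[path]; ok && prev == h && !force {
		s.CacheHits++ // unchanged file: no parse, no detect
		return
	}
	parse(path, content)
	c[path] = h
	s.Parsed++
}

func main() {
	c := hashCache{}
	noop := func(string, []byte) {}
	files := map[string][]byte{"a.go": []byte("package a"), "b.py": []byte("x = 1")}
	for run := 1; run <= 2; run++ {
		s := stats{}
		for p, b := range files {
			indexFile(c, p, b, false, &s, noop)
		}
		fmt.Printf("run %d: parsed=%d cacheHits=%d\n", run, s.Parsed, s.CacheHits)
	}
	// prints:
	// run 1: parsed=2 cacheHits=0
	// run 2: parsed=0 cacheHits=2
}
```

In this sketch the `force` argument plays the role that `index --force` plays on the CLI: it bypasses the hash comparison and re-parses every file.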

What's not in this PR (deferred)

  • True per-file delta-apply to the graph (use ReplaceFile per changed path instead of full rebuild). The primitives ship here — Store.RemoveFile / InsertFile / ReplaceFile are wired and tested — but the enrich orchestrator still does a full rebuild when the manifest mismatches. The short-circuit case (the common one) is fully delta-aware.
  • MCP-side freshness signals / auto-bootstrap — separate plans.
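To make the current enrich behaviour concrete (short-circuit when the manifest matches, full rebuild otherwise), here is a hedged sketch. The `graphStore` type, its `reset` method, and the way `manifestHash` is derived are illustrative assumptions; the real `Store.Reset()` and `ManifestHash` may be implemented quite differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// manifestHash folds every (path, content hash) pair into one digest.
// Paths are sorted first so the result is independent of map iteration order.
func manifestHash(files map[string]string) string {
	paths := make([]string, 0, len(files))
	for p := range files {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	h := sha256.New()
	for _, p := range paths {
		// NUL separators keep ("ab", "c") distinct from ("a", "bc").
		fmt.Fprintf(h, "%s\x00%s\x00", p, files[p])
	}
	return hex.EncodeToString(h.Sum(nil))
}

// graphStore is a hypothetical in-memory stand-in for the graph store.
type graphStore struct {
	manifest string   // manifest hash recorded by the last rebuild
	nodes    []string // placeholder for graph contents
}

// reset drops all prior data so a rebuild cannot collide with old rows.
func (g *graphStore) reset() {
	g.nodes = nil
	g.manifest = ""
}

// enrich reports whether it rebuilt the graph. It short-circuits when the
// stored manifest already matches the cache, unless force is set.
func enrich(g *graphStore, cache map[string]string, force bool) bool {
	want := manifestHash(cache)
	if !force && g.manifest == want {
		return false // graph already matches the cache manifest
	}
	g.reset()
	for p := range cache {
		g.nodes = append(g.nodes, p)
	}
	sort.Strings(g.nodes)
	g.manifest = want
	return true
}

func main() {
	cache := map[string]string{"a.go": "h1", "b.py": "h2"}
	g := &graphStore{}
	fmt.Println("first run rebuilt:", enrich(g, cache, false))  // true
	fmt.Println("second run rebuilt:", enrich(g, cache, false)) // false
	cache["b.py"] = "h3"
	fmt.Println("after edit rebuilt:", enrich(g, cache, false)) // true
	fmt.Println("forced rebuilt:", enrich(g, cache, true))      // true
}
```

The deferred per-file delta-apply would replace the `reset` plus full loop in the mismatch branch with targeted per-path updates; that part is deliberately not sketched here.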

Phases (12 atomic commits)

| Phase | Commit | Surface |
|---|---|---|
| 1 | feat(cache) | GetFileByPath, AllFiles, PurgeByPath, ManifestHash |
| 2 | feat(analyzer): Diff | Delta{Added,Modified,Deleted,Unchanged} |
| 3 | feat(analyzer): cache-hit early-exit | Skip-by-hash + Options.Force + new Stats counters |
| 4 | feat(graph): RemoveFile/InsertFile/ReplaceFile | per-file mutation APIs |
| 5 | feat(graph): GraphMeta + manifest | ReadManifest/WriteManifest |
| 6 | feat(linker+graph): source-tag + WipeLinkerEdges | linker re-emit primitives |
| 7 | feat(enrich): manifest short-circuit + Reset | re-runnable enrich |
| 8 | feat(cli) | index --force, enrich --force/--diff, codeiq diff |
| 9 | test(integration) | incremental==full determinism, idempotence, delete-then-add |
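As a rough illustration of the phase 2 `Delta{Added,Modified,Deleted,Unchanged}` classification, the sketch below compares a cached path-to-hash map against the hashes currently on disk. The `diff` function, the map-based inputs, and the JSON field names are assumptions for the example; the actual output shape of `codeiq diff` is not shown in this PR description.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Delta groups paths by how the disk state differs from the cache.
// The JSON tags are assumed, not taken from the real CLI output.
type Delta struct {
	Added     []string `json:"added"`
	Modified  []string `json:"modified"`
	Deleted   []string `json:"deleted"`
	Unchanged []string `json:"unchanged"`
}

// diff classifies every path seen in either map. Both maps are path -> content hash.
func diff(cached, onDisk map[string]string) Delta {
	// Non-nil empty slices so JSON renders [] rather than null.
	d := Delta{Added: []string{}, Modified: []string{}, Deleted: []string{}, Unchanged: []string{}}
	for p, h := range onDisk {
		prev, ok := cached[p]
		switch {
		case !ok:
			d.Added = append(d.Added, p)
		case prev != h:
			d.Modified = append(d.Modified, p)
		default:
			d.Unchanged = append(d.Unchanged, p)
		}
	}
	for p := range cached {
		if _, ok := onDisk[p]; !ok {
			d.Deleted = append(d.Deleted, p)
		}
	}
	// Sort each bucket so output is deterministic despite map iteration order.
	for _, s := range [][]string{d.Added, d.Modified, d.Deleted, d.Unchanged} {
		sort.Strings(s)
	}
	return d
}

func main() {
	cached := map[string]string{"a.go": "h1", "b.py": "h2", "c.ts": "h3"}
	onDisk := map[string]string{"a.go": "h1", "b.py": "h9", "d.rs": "h4"}
	out, err := json.MarshalIndent(diff(cached, onDisk), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because the function only reads its two inputs, it has the same preview-only character as the `codeiq diff` subcommand described in the summary.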

Test plan

  • CGO_ENABLED=1 go test ./... -count=1 — 937 passing across 44 packages
  • CGO_ENABLED=1 go test ./internal/analyzer/ ./internal/graph/ ./internal/cache/ -race -count=1 — 143 passing, no data races
  • go vet ./... — clean
  • Manual smoke on testdata/fixture-minimal:
    • First codeiq index .: 5 files Added, 0 cache hits
    • Second codeiq index .: 5 Unchanged, 100% cache hits
    • Second codeiq enrich .: "enrich short-circuited: graph already matches cache manifest"
    • codeiq enrich --force .: rebuilds cleanly without PK collisions
    • codeiq diff .: returns JSON with empty Added/Modified/Deleted and 5 Unchanged
  • New integration tests cover: incremental == full determinism, 3× idempotent re-runs short-circuiting, delete-then-recreate-with-same-content
  • Reviewer: try on a real polyglot project to verify the short-circuit timing claim at scale

Known follow-up (out of scope)

govulncheck flags 2 stdlib net/http CVEs (go1.26.2 → fix in 1.26.3) — toolchain bump, separate work.

@aksOps aksOps merged commit e7b0c26 into main May 15, 2026
13 checks passed
@aksOps aksOps deleted the feat/incremental-indexing branch May 15, 2026 08:52