refactor(graph): use Kuzu 0.11 native features (FTS, param LIMIT, []string) #159
Merged
…tring)

Kuzu 0.11.3 bundles features that were unavailable or broken in 0.7.1. This commit unwinds the workarounds documented in CLAUDE.md.

### FTS (fulltext search)

`CreateIndexes()` was a no-op because Kuzu 0.7.1's FTS extension needed a network INSTALL (incompatible with air-gapped builds). 0.11.3 ships FTS pre-bundled. `CreateIndexes()` now:

- `INSTALL fts; LOAD EXTENSION fts;`
- `CALL DROP_FTS_INDEX` / `CALL CREATE_FTS_INDEX` for two indexes:
  - `code_node_label_fts` over `(label, fqn_lower)`
  - `code_node_lexical_fts` over `(prop_lex_comment, prop_lex_config_keys)`

`SearchByLabel` / `SearchLexical` route through `CALL QUERY_FTS_INDEX` with BM25 score ranking. A trailing `*` is auto-appended when the user query is a single bare token, giving prefix-match UX similar to the old CONTAINS behaviour. CONTAINS-based fallbacks remain in place for graphs that never ran enrich (FTS index would be missing).

The mutation gate (`MutationKeyword`) allows the read-only `CALL QUERY_FTS_INDEX` procedure; the catalog writers `CALL CREATE_FTS_INDEX` / `CALL DROP_FTS_INDEX` stay blocked under `OpenReadOnly`.

### Parameterized LIMIT / SKIP

Kuzu 0.7.1 rejected `$lim` / `$skip` bindings; values had to be inline literals. 0.11.3 accepts them as bound parameters. Affected sites:

- `internal/graph/indexes.go`: SearchByLabel / SearchLexical
- `internal/graph/reads.go`: FindByKindPaginated
- `internal/query/service.go`: FindCycles, FindDeadCode
- `internal/mcp/tools_graph.go`: list-edges, ego-neighbours, endpoints-by-id

Helper `intLiteral` is removed (was only used to format inline LIMITs).

### Drop `stringsToAny` widener

Kuzu 0.7's Go binding required `[]any` for list parameters; `[]string` tripped `unsupported type` in `goValueToKuzuValue`. 0.11.3's binding accepts `[]string` directly. The widener helper is removed and its two callers (`query.FindDeadCode`, `topology.FindServicesContainingNodes`) pass `[]string` straight.

### CLAUDE.md

Reworked the Kuzu quirks section into "lifted in 0.11.3" vs "still present" buckets so future contributors don't reintroduce workarounds that the runtime no longer needs.

### Verification

- `cd go && CGO_ENABLED=1 go test ./... -count=1`: 883 passed
- End-to-end on `~/projects/polyglot-bench/airflow`: enrich exit 0, 95k nodes, 246k edges, FTS search returns BM25-ranked hits
- End-to-end on `~/projects/`: enrich exit 0, 187k nodes, 414k edges, 1m 29s wall, 1.88 GiB peak RSS; FTS `'service*'` returns top-5 ranked at scores ~12-14

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
aksOps added a commit that referenced this pull request on May 14, 2026
Stale doc references after Phase 6 (Java deletion, #132) and the Kuzu 0.7.1 → 0.11.3 bump (#155 + #159).

- CLAUDE.md / PROJECT_SUMMARY.md: bump Kuzu 0.7.1 → 0.11.3, go-sqlite3 1.14.22 → 1.14.44, cobra to 1.10.2; note native FTS.
- AGENTS.md: rewrite "What this repo is" (no more "REST API"); flip `mvn -B -ntp clean verify` → `go test ./...`; clarify that REST + React SPA were deleted in Phase 6 and won't return.
- SECURITY.md: rewrite scope. Drop the dead JAR / serve / REST API / React UI / H2 / Neo4j Embedded references. New in-scope list covers every codeiq subcommand, the 10 MCP tools (with `run_cypher` mutation gate called out), `.codeiq/cache/` (SQLite) + `.codeiq/graph/` (Kuzu), and `read_file` path sandboxing. Add the security CI workflows (CodeQL, Semgrep, OSV-Scanner, Trivy, Gitleaks, SBOM, Socket Security) + perf-gate to the hardening references.
- CHANGELOG.md: populate [Unreleased] with the OOM-fix saga (PRs #145-#148), the five correctness fixes (#149-#153), the Kuzu 0.7.1 → 0.11.3 bump (#155-#158), the FTS migration (#159), the Dependabot config rewrite (#154), and the enrich CLI knobs.

No code changes.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Kuzu 0.11.3 (merged in #155) lifts several restrictions that 0.7.1 had. This PR unwinds the workarounds we coded against the older runtime and exposes the native capabilities — most notably real FTS with BM25 ranking instead of CONTAINS predicates.
What changed
FTS (fulltext search) — `internal/graph/indexes.go`
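The commit message says a trailing `*` is auto-appended when the user query is a single bare token, to approximate the old CONTAINS prefix behaviour. A minimal sketch of that rule follows. The helper name `ftsPattern` is hypothetical, and the definition of "bare" used here (one whitespace-separated token with no wildcard or quote characters) is an assumption, not necessarily what `internal/graph/indexes.go` does.

```go
package main

import (
	"fmt"
	"strings"
)

// ftsPattern is a hypothetical helper (the name does not come from the PR).
// It appends a trailing `*` when the query is a single bare token, so an FTS
// lookup behaves like a prefix match. Anything else is returned trimmed but
// otherwise unchanged.
func ftsPattern(userQuery string) string {
	q := strings.TrimSpace(userQuery)
	if q == "" {
		return q
	}
	// Zero or several whitespace-separated tokens: leave the query as typed.
	if len(strings.Fields(q)) != 1 {
		return q
	}
	// Assumption: a token that already carries a wildcard or a quote is not
	// "bare", so it is passed through untouched.
	if strings.ContainsAny(q, "*\"'") {
		return q
	}
	return q + "*"
}

func main() {
	for _, q := range []string{"service", "user service", "serv*"} {
		fmt.Printf("%q -> %q\n", q, ftsPattern(q))
	}
}
```

The resulting string would then be bound as the query argument of `CALL QUERY_FTS_INDEX`; the exact call shape is left out here because the PR text does not show it.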
Parameterized LIMIT / SKIP
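Kuzu 0.11.3 accepts `LIMIT $param` and `SKIP $param` as bound parameters; 0.7.1 required inline literals. Cleaned up at the sites listed in the commit message: `internal/graph/indexes.go`, `internal/graph/reads.go`, `internal/query/service.go`, and `internal/mcp/tools_graph.go`.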
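To illustrate the difference between the two styles, here is a rough sketch that builds the same paginated query both ways. The function names, the `CodeNode` table, and the `kind` / `id` properties are assumptions made for the example; only the `$lim` and `$skip` parameter names appear in the commit message.

```go
package main

import "fmt"

// pagedByKind is a hypothetical helper, not the project's FindByKindPaginated.
// It returns Cypher text with SKIP and LIMIT as bound parameters, plus the
// argument map that would be passed alongside the prepared statement.
func pagedByKind(kind string, limit, skip int64) (string, map[string]any) {
	// Table and property names are assumed for illustration.
	const q = "MATCH (n:CodeNode) WHERE n.kind = $kind RETURN n.id ORDER BY n.id SKIP $skip LIMIT $lim"
	return q, map[string]any{"kind": kind, "skip": skip, "lim": limit}
}

// inlinePagedByKind shows the older style the PR removes: the integers are
// formatted straight into the query text, so each page yields a new string.
func inlinePagedByKind(kind string, limit, skip int64) (string, map[string]any) {
	q := fmt.Sprintf("MATCH (n:CodeNode) WHERE n.kind = $kind RETURN n.id ORDER BY n.id SKIP %d LIMIT %d", skip, limit)
	return q, map[string]any{"kind": kind}
}

func main() {
	q, args := pagedByKind("class", 50, 100)
	fmt.Println(q, args)

	q, args = inlinePagedByKind("class", 50, 100)
	fmt.Println(q, args)
}
```

With bound parameters the query text stays constant across pages, which is why a formatting helper like `intLiteral` is no longer needed.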
Drop `stringsToAny` widener
Kuzu 0.7's Go binding required `[]any` for list parameters; 0.11.3 accepts `[]string` directly. The widener helper is gone; `query.FindDeadCode` and `topology.FindServicesContainingNodes` pass `[]string` straight.
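For readers unfamiliar with the pattern, a widener of this kind typically looks like the sketch below. This is a reconstruction under assumptions: the removed helper's actual body is not shown in the PR, and the parameter name `ids` and the sample values are made up for the example.

```go
package main

import "fmt"

// stringsToAny is a reconstruction of a typical []string -> []any widener.
// The body of the helper removed by this PR is not shown in the PR text, so
// treat this as an illustration rather than the original code.
func stringsToAny(in []string) []any {
	out := make([]any, len(in))
	for i, s := range in {
		out[i] = s
	}
	return out
}

func main() {
	ids := []string{"node-1", "node-2"} // sample values, assumed

	// 0.7-era call site: copy into []any before binding the list parameter.
	before := map[string]any{"ids": stringsToAny(ids)}

	// 0.11.3-era call site, per the PR: bind the []string as-is.
	after := map[string]any{"ids": ids}

	fmt.Printf("before: %T, after: %T\n", before["ids"], after["ids"])
}
```

Dropping the copy removes one allocation per call and one place where the element type could silently drift from what the query expects.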
Still present in 0.11.3 (workarounds retained)
CLAUDE.md
Rewrote the Kuzu quirks section as "lifted in 0.11.3" vs "still present" buckets so future contributors don't reintroduce workarounds that the runtime no longer needs.
Test plan
Out of scope
🤖 Generated with Claude Code