codeiq

Deterministic code-knowledge-graph CLI + stdio MCP server

Map a polyglot codebase into a queryable graph. 100 detectors. 35+ languages. Zero AI in the pipeline.


100 detectors · 35+ languages · 880+ tests · MCP stdio · Kuzu 0.11.3 · CGO required


Note on SonarQube: codeiq deliberately uses an in-house OSS-CLI security stack (CodeQL, Semgrep, OSV-Scanner, Trivy, Gitleaks, jscpd, govulncheck) instead of Sonar — see docs/07-integrations.md & security.yml.


Why codeiq

Deterministic

Same input → same output, byte-for-byte. Detector emissions are confidence-tagged (LEXICAL / SYNTACTIC / RESOLVED); the graph builder dedup-merges with confidence-aware property union and drops phantom edges at snapshot. Every detector ships a determinism test.

Agent-ready

Stdio MCP server with 10 read-only tools, wired for Claude Code / Cursor / Cline. Mode-driven surface (graph_summary, find_in_graph, inspect_node, trace_relationships, analyze_impact, topology_view) plus run_cypher for power users.

Supply-chain hardened

Goreleaser + Cosign keyless via GitHub OIDC + Sigstore Rekor transparency log + Syft SPDX SBOMs + SLSA build provenance attestation + OpenSSF Scorecard + 6 OSS-CLI security scanners in CI.

Polyglot

100 detectors across 35+ languages: Java, Kotlin, Scala, Python, TypeScript, JavaScript, Go, Rust, C#, C++, plus IaC (Terraform, Bicep, Helm, Kubernetes, Docker, CloudFormation), config (YAML/JSON/TOML/INI), SQL, protobuf, shell, and more.

No AI in the pipeline

Indexing, enrichment, and the graph queries behind the MCP tools are pure static analysis. The only LLM touchpoint is the opt-in review path (the codeiq review subcommand and the review_changes MCP tool). No telemetry. No auto-update. No outbound network during core flows.

Single static binary

~25 MB. CGO embeds Kuzu (graph) + SQLite (cache) + tree-sitter (parser). No daemons. No external services. Works behind corporate firewalls / air-gapped after the initial install.
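
The dedup-merge described under "Deterministic" can be pictured with a small sketch. This is not codeiq's GraphBuilder: the `Node` type, the `MergeNodes` function, the numeric ordering of the confidence levels, and the tie-break rule are all assumptions made for illustration. It only shows one way a confidence-aware property union can stay independent of input order.

```go
package main

import (
	"fmt"
	"sort"
)

// Confidence mirrors the README's three tags. Ranking them numerically
// (higher wins) is an assumption of this sketch.
type Confidence int

const (
	Lexical Confidence = iota
	Syntactic
	Resolved
)

// Node is a hypothetical, simplified stand-in for a detector emission.
type Node struct {
	ID         string
	Confidence Confidence
	Props      map[string]string
}

// MergeNodes folds emissions that share an ID into one node each.
// Properties are unioned; on a key conflict the value from the
// higher-confidence emission wins, and equal confidence falls back to the
// lexicographically smaller value so the result does not depend on input order.
func MergeNodes(emissions []Node) []Node {
	merged := map[string]*Node{}
	propConf := map[string]map[string]Confidence{}
	for _, e := range emissions {
		n, ok := merged[e.ID]
		if !ok {
			n = &Node{ID: e.ID, Confidence: e.Confidence, Props: map[string]string{}}
			merged[e.ID] = n
			propConf[e.ID] = map[string]Confidence{}
		}
		if e.Confidence > n.Confidence {
			n.Confidence = e.Confidence
		}
		for k, v := range e.Props {
			c, seen := propConf[e.ID][k]
			if !seen || e.Confidence > c || (e.Confidence == c && v < n.Props[k]) {
				n.Props[k] = v
				propConf[e.ID][k] = e.Confidence
			}
		}
	}
	// Sort by ID so the emitted order is stable across runs.
	out := make([]Node, 0, len(merged))
	for _, n := range merged {
		out = append(out, *n)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].ID < out[j].ID })
	return out
}

func main() {
	nodes := MergeNodes([]Node{
		{ID: "svc:orders", Confidence: Lexical, Props: map[string]string{"lang": "unknown", "file": "Orders.java"}},
		{ID: "svc:orders", Confidence: Resolved, Props: map[string]string{"lang": "java"}},
	})
	for _, n := range nodes {
		fmt.Println(n.ID, n.Confidence, n.Props["lang"], n.Props["file"])
	}
}
```

The final sort matters as much as the merge rule: Go map iteration order is randomized, so without it two runs over the same input could emit nodes in different orders.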


How it works

   source                                                         ┌─────────────┐
   tree   ─►  index ──────►  ┌──────────┐  ──►  enrich ──────►   │   Kuzu      │
              FileDiscovery  │  SQLite  │       linkers +         │   graph     │
              tree-sitter    │   cache  │       layer classify    │  (FTS-idx)  │
              100 detectors  │          │       intelligence      │             │
              dedup + sort   └──────────┘       ServiceDetector   └──────┬──────┘
                                                bulk COPY → Kuzu         │
                                                                         ▼
                              ┌───────────────────────────────────────────────┐
                              │  Read-only consumers (all powered by Kuzu):   │
                              │    stats, find, query, cypher, flow, graph,   │
                              │    topology, review (+ Ollama LLM)            │
                              │    mcp (stdio JSON-RPC, 10 tools)             │
                              └───────────────────────────────────────────────┘

Three commands cover the lifecycle:

| Step | Command | What lands |
|------|---------|------------|
| 1. Index | codeiq index <path> | <path>/.codeiq/cache/codeiq.sqlite (content-hash keyed; resumable) |
| 2. Enrich | codeiq enrich <path> | <path>/.codeiq/graph/codeiq.kuzu/ + BM25 FTS indexes |
| 3. Query | codeiq mcp / stats / find / query / cypher / ... | Read-only consumers of the Kuzu store |

See docs/04-main-flows.md for per-flow entry points + failure modes.
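
To illustrate what "content-hash keyed; resumable" buys, here is a minimal sketch of a cache keyed by file content. It is an in-memory stand-in, not codeiq's SQLite schema: the `Cache` type, the `Analyze` method, and the choice of SHA-256 are assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// ContentKey returns the hex-encoded SHA-256 digest of the file bytes.
// The hash algorithm is an assumption; the README only says "content-hash".
func ContentKey(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// Cache is a hypothetical in-memory stand-in for the analysis cache.
type Cache struct {
	results map[string]string // content key -> analysis result
}

func NewCache() *Cache {
	return &Cache{results: map[string]string{}}
}

// Analyze returns the stored result when identical content was seen before;
// otherwise it runs analyze once and stores the result under the content key.
// hit reports whether the result came from the cache.
func (c *Cache) Analyze(content []byte, analyze func([]byte) string) (result string, hit bool) {
	key := ContentKey(content)
	if r, ok := c.results[key]; ok {
		return r, true
	}
	r := analyze(content)
	c.results[key] = r
	return r, false
}

func main() {
	cache := NewCache()
	countBytes := func(b []byte) string { return fmt.Sprintf("%d bytes", len(b)) }

	src := []byte("package main\n")
	_, hit1 := cache.Analyze(src, countBytes)
	_, hit2 := cache.Analyze(src, countBytes) // unchanged content: served from cache
	fmt.Println(hit1, hit2)
}
```

Because the key depends only on the bytes, renaming or touching a file does not invalidate its entry, and an interrupted run can pick up where it stopped by skipping keys that are already present.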


Install

Pre-built binary (Linux amd64 / arm64, macOS arm64)

# Detects OS and CPU architecture; edit the asset name by hand if your platform differs
curl -L https://github.com/RandomCodeSpace/codeiq/releases/latest/download/codeiq_$(uname -s | tr A-Z a-z)_$(uname -m | sed -e s/x86_64/amd64/ -e s/aarch64/arm64/).tar.gz | tar xz
sudo install codeiq /usr/local/bin/
codeiq --version

go install

CGO_ENABLED=1 go install github.com/randomcodespace/codeiq/cmd/codeiq@latest

Requires Go 1.25.0+ and a C/C++ toolchain (Kuzu, SQLite, and tree-sitter all need CGO).

Build from source

git clone https://github.com/RandomCodeSpace/codeiq.git
cd codeiq
CGO_ENABLED=1 go build -o /usr/local/bin/codeiq ./cmd/codeiq
codeiq --version

Full setup checklist in docs/01-local-setup.md.


Quickstart

# 1. Scan files → SQLite cache
codeiq index /path/to/repo

# 2. Load cache → Kuzu graph + FTS indexes
codeiq enrich /path/to/repo

# 3. Ask questions
codeiq stats        /path/to/repo
codeiq find         endpoints /path/to/repo
codeiq query        consumers <node-id> /path/to/repo
codeiq topology     /path/to/repo
codeiq flow         overview /path/to/repo --format mermaid

# 4. Wire into your AI agent (Claude Code / Cursor / Cline)
codeiq mcp          /path/to/repo

# 5. Get an LLM-driven PR review (local Ollama by default)
codeiq review       /path/to/repo --base origin/main --head HEAD

MCP integration

Add to your MCP client config (.mcp.json at the repo root, or your editor's MCP settings):

{
  "mcpServers": {
    "codeiq": {
      "command": "codeiq",
      "args": ["mcp", "/path/to/repo"]
    }
  }
}

Ten user-facing tools

| Tool | Modes |
|------|-------|
| graph_summary | overview / categories / capabilities / provenance |
| find_in_graph | nodes / edges / text / fuzzy / by_file / by_endpoint |
| inspect_node | neighbors / ego / evidence / source |
| trace_relationships | callers / consumers / producers / dependencies / dependents / shortest_path |
| analyze_impact | blast_radius / trace / cycles / circular_deps / dead_code / dead_services / bottlenecks |
| topology_view | summary / service / service_deps / service_dependents / flow |
| run_cypher | Read-only Cypher escape hatch; mutation gate enforced |
| read_file | Path-sandboxed source reader (full file or line range) |
| generate_flow | Architecture flow diagrams (mermaid / dot / yaml), 5 views |
| review_changes | LLM-driven git-diff review against the graph (Ollama) |
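
For readers curious what an MCP client actually sends over stdio, the sketch below builds a JSON-RPC 2.0 `tools/call` request for `graph_summary`. The envelope (`jsonrpc`, `id`, `method`, `params.name`, `params.arguments`) follows the MCP specification; the `mode` argument name is an assumption inferred from the Modes column, so check the tool's published input schema before relying on it. `NewToolCall` and the struct names are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolCallParams is the params object of an MCP tools/call request.
type ToolCallParams struct {
	Name      string                 `json:"name"`
	Arguments map[string]interface{} `json:"arguments"`
}

// Request is a JSON-RPC 2.0 request envelope.
type Request struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  ToolCallParams `json:"params"`
}

// NewToolCall serializes a tools/call request as a single line of JSON.
// The MCP stdio transport delimits messages with newlines, so the caller
// writes these bytes followed by '\n' to the server's stdin.
func NewToolCall(id int, tool string, args map[string]interface{}) ([]byte, error) {
	return json.Marshal(Request{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  ToolCallParams{Name: tool, Arguments: args},
	})
}

func main() {
	// "mode" is an assumed argument name, not confirmed by the README.
	msg, err := NewToolCall(1, "graph_summary", map[string]interface{}{"mode": "overview"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(msg))
}
```

A real session starts with the `initialize` handshake before any `tools/call`; MCP-aware editors handle that for you once the config above is in place.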

CLI cheatsheet

| Command | Purpose |
|---------|---------|
| index [path] | Scan files → SQLite analysis cache |
| enrich [path] | Load cache → Kuzu graph + build FTS indexes |
| mcp [path] | Stdio MCP server for Claude Code / Cursor |
| stats [path] | Categorized statistics (graph / languages / frameworks / infra / connections / auth / architecture) |
| query <kind> <id> [path] | consumers / producers / callers / dependencies / dependents |
| find <preset> [path] | endpoints / guards / entities / topics / queues / services / databases / components |
| cypher <query> [path] | Read-only Cypher against Kuzu |
| flow <view> [path] | Architecture diagrams: overview / ci / deploy / runtime / auth |
| graph [path] | Export full graph as json / yaml / mermaid / dot |
| topology <sub> [path] | Service topology + service-detail / blast-radius / bottlenecks / circular / dead / path |
| review [path] | LLM-driven PR review (Ollama local by default; cloud via OLLAMA_API_KEY) |
| cache <action> | Inspect / list / inspect-row / clear the SQLite cache |
| plugins <action> | List + inspect registered detectors |
| version | Build info (version, commit, date, Go toolchain, platform, features) |

Run codeiq <cmd> --help for full flag listings. Full reference in docs/05-configuration.md.


Architecture at a glance

codeiq/
├── cmd/codeiq/main.go      ── 5-line entry shim
├── internal/
│   ├── analyzer/           ── index + enrich pipelines + GraphBuilder + ServiceDetector
│   ├── cache/              ── SQLite cache (WAL, content-hash keyed, 5 tables)
│   ├── cli/                ── cobra subcommands + detectors_register.go (choke point)
│   ├── detector/           ── 100 detectors organized by family
│   │   ├── jvm/{java,kotlin,scala}/   python/   typescript/   golang/
│   │   ├── frontend/  csharp/  systems/{cpp,rust}/  iac/  structured/
│   │   ├── auth/  proto/  sql/  markup/  script/shell/  generic/
│   │   └── base/           ── shared helpers (NOT detectors)
│   ├── flow/               ── architecture-flow diagram engine
│   ├── graph/              ── Kuzu facade + FTS + mutation gate
│   ├── intelligence/       ── Lexical enricher + per-language extractors
│   ├── mcp/                ── 10 MCP tools (stdio JSON-RPC)
│   ├── model/              ── CodeNode / CodeEdge / NodeKind (34) / EdgeKind (28) / Confidence / Layer
│   ├── parser/             ── tree-sitter + structured parsers
│   ├── query/              ── service / topology / stats / dead-code Cypher templates
│   └── review/             ── PR-review pipeline (diff + Ollama)
├── testdata/               ── fixture-minimal + fixture-multi-lang
├── .github/workflows/      ── go-ci, perf-gate, release-go, release-darwin, security, scorecard
└── .goreleaser.yml         ── Goreleaser v2 (CGO multi-arch + Cosign + Syft)

Deep dive in docs/02-architecture.md and docs/03-code-map.md.
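
The "mutation gate" listed under internal/graph/ is what keeps run_cypher and the cypher subcommand read-only. The sketch below shows one conservative way such a gate could work. It is not the project's implementation: `CheckReadOnly`, the keyword list, and the literal-stripping step are assumptions for illustration, and a production gate would more likely inspect the parsed statement than match keywords.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// stringLiteral matches single- or double-quoted strings, including escapes,
// so that keywords inside literals are not mistaken for clauses.
var stringLiteral = regexp.MustCompile(`'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*"`)

// mutating matches clauses that write to the graph or schema.
// The keyword list is illustrative, not exhaustive.
var mutating = regexp.MustCompile(`(?i)\b(CREATE|MERGE|DELETE|SET|REMOVE|DROP|ALTER|COPY)\b`)

// CheckReadOnly returns an error when the query appears to contain a
// mutating clause. It errs on the side of rejection: an identifier that
// happens to be named like a keyword is refused too.
func CheckReadOnly(query string) error {
	stripped := stringLiteral.ReplaceAllString(query, "''")
	if kw := mutating.FindString(stripped); kw != "" {
		return fmt.Errorf("query rejected: mutating clause %q", strings.ToUpper(kw))
	}
	return nil
}

func main() {
	queries := []string{
		"MATCH (n) RETURN n LIMIT 5",
		"MATCH (n) WHERE n.name = 'create' RETURN n",
		"MATCH (n) DETACH DELETE n",
	}
	for _, q := range queries {
		if err := CheckReadOnly(q); err != nil {
			fmt.Println("blocked:", err)
		} else {
			fmt.Println("allowed:", q)
		}
	}
}
```

Rejecting too much is the safer failure mode here: a false positive costs the user a rephrased query, while a false negative would let a tool call modify the graph.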


Verification (supply chain)

Every release artifact is keyless-signed via Cosign + GitHub OIDC and recorded in the Sigstore Rekor transparency log. SLSA build provenance attestations land in GitHub's attestations store.

Verify the checksum manifest signature

cosign verify-blob \
  --bundle checksums.sha256.cosign.bundle \
  --certificate-identity-regexp 'https://github.com/RandomCodeSpace/codeiq/.github/workflows/release-go.yml@.*' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  checksums.sha256

Verify the darwin tarball (signed separately)

cosign verify-blob \
  --bundle codeiq_0.4.1_darwin_arm64.tar.gz.cosign.bundle \
  --certificate-identity-regexp 'https://github.com/RandomCodeSpace/codeiq/.github/workflows/release-darwin.yml@.*' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  codeiq_0.4.1_darwin_arm64.tar.gz

Verify the SLSA build provenance

gh attestation verify codeiq_0.4.1_linux_amd64.tar.gz --owner RandomCodeSpace

Documentation

| Starter pack | Reference | Operate |
|--------------|-----------|---------|
| Project overview | Code map | Testing |
| Local setup | Configuration | Build / deploy / release |
| Architecture | Data model | Known risks + TODOs |
| Main flows | Integrations | Agent handoff |

Architectural decisions: docs/adr/. Repo-specific Claude Code instructions: CLAUDE.md.


Project status

| Surface | State |
|---------|-------|
| CLI core (index / enrich / stats / find / query / cypher) | Production |
| MCP stdio server (10 tools) | Production |
| Kuzu 0.11.3 + native FTS (BM25) | Production |
| Goreleaser pipeline + Cosign keyless | Production |
| 884+ tests passing (race + vet + staticcheck + gosec + govulncheck on every PR) | Production |
| codeiq review (LLM PR review) | Beta: works end-to-end against local Ollama |

Currently on v0.4.2. Release history was reset at v0.4.0 — see docs/00-project-overview.md for context.


Contributing

  • Branch off main. Conventional-commit subjects (feat:, fix:, chore:, refactor:, test:, docs:, perf:).
  • One logical change per commit. Squash-merge only.
  • Tests + race + vet must pass. Run CGO_ENABLED=1 go test ./... -race -count=1.
  • Determinism is non-negotiable. Every new detector ships positive / negative / determinism tests.
  • Read-only MCP. Tool calls never mutate the graph. Index/enrich happen via the CLI.
  • New detector? Don't forget to blank-import it in internal/cli/detectors_register.go — see CLAUDE.md for the full how-to.

Security: please report privately via GitHub Security Advisories.


License

MIT License

Copyright © codeiq contributors. See LICENSE.