VAR CLI

A static analysis engine that parses TypeScript/JavaScript codebases into queryable dependency graphs, powered by Tree-sitter AST parsing and LLM-driven code annotation.

Codemap scans your source code, extracts every symbol and its relationships, stores the full dependency graph in PostgreSQL, and optionally annotates it with AI-generated metadata — turning any codebase into a structured, searchable knowledge base.

Why I Built This

Cursor and Claude Code are powerful, but they are expensive in context tokens.

They grep through files, store memory in plain markdown, and burn tokens trying to understand codebases that are fundamentally relational — functions calling functions, modules depending on modules, types referencing types.

A markdown file is a terrible brain for a relational problem.

So I built var-cli as a local brain for AI coding tools. It parses your entire TypeScript/JavaScript codebase into a PostgreSQL dependency graph — every symbol, every relationship, every import chain — and syncs instantly as code changes.

When your AI needs to find the hot path through a system, it can query a structured graph with deterministic relationships instead of grepping 50 files and hallucinating connections. LLM-generated annotations guide it further: each symbol gets a purpose summary and a category tag, so the AI knows not just what exists but what it does.

The result: faster, cheaper, more accurate codebase navigation for AI coding tools.

Why Codemap?

Understanding large codebases is hard. Grep and IDE search find text matches, not meaning. Codemap solves this by building a complete graph of your code's structure — every class, function, type, and the relationships between them — then layering on AI-powered annotations that describe what each piece does and why it exists.

Use cases:

- Onboard onto unfamiliar codebases in minutes, not days
- Ask natural language questions about how systems work
- Audit dependency chains and identify tightly coupled modules
- Auto-generate architectural documentation from source code
- Power downstream tools (code review bots, migration planners, refactoring assistants)

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                         CLI (Commander)                         │
│                  scan  ·  annotate  ·  query                    │
└──────────┬──────────────────┬──────────────────┬────────────────┘
           │                  │                  │
     ┌─────▼─────┐    ┌──────▼──────┐    ┌──────▼──────┐
     │  Scanner  │    │  Annotator  │    │   Query     │
     │  walker   │    │  context    │    │   Engine    │
     │  filter   │    │  prompts    │    │  (OpenAI)   │
     └─────┬─────┘    │  gemini     │    │  symbol     │
           │          │  parser     │    │  search     │
     ┌─────▼─────┐    └──────┬──────┘    └──────┬──────┘
     │  Parser   │           │                  │
     │  treesit  │           │                  │
     │  symbols  │     ┌─────▼──────────────────▼─────┐
     │  imports  │     │                              │
     └─────┬─────┘     │     PostgreSQL Database      │
           │           │                              │
     ┌─────▼─────┐     │  codebases · folders · files │
     │  Edges    │     │  symbols · file_edges        │
     │  file     ├────►│  symbol_edges · annotations  │
     │  symbol   │     │                              │
     └───────────┘     └──────────────────────────────┘

Pipeline: Scan → Parse → Store → Resolve Edges → Annotate → Query

Phase	What happens	Output
Scan	Recursively walk directories, apply exclusion filters	File & folder inventory
Parse	Tree-sitter AST analysis on every file	Symbols + imports/exports
Store	Batch insert into PostgreSQL	Persisted graph nodes
Edges	Resolve file-to-file and symbol-to-symbol relationships	Dependency edges
Annotate	LLM generates purpose, category, confidence per symbol	Annotation metadata
Query	Natural language search over the graph via OpenAI tool calling (GPT-4.1 by default)	Answers with source refs

Tech Stack

Layer	Technology
Runtime	Bun / Node.js
Language	TypeScript 5.7
AST Parsing	Tree-sitter (JS/TS/TSX grammars)
Database	PostgreSQL
CLI Framework	Commander
Annotation LLM	Google Gemini 2.5 Flash
Query LLM	OpenAI GPT-4.1

Getting Started

Prerequisites

Bun >= 1.0
PostgreSQL >= 14
API keys for Gemini and/or OpenAI (optional, only needed for annotate and query)

Install

git clone https://github.com/zoosphar/var-cli
cd var-cli
bun install

Configure Environment

cp .env.example .env
# Edit .env with your credentials

Environment variables:

Variable	Required for	Description
`DATABASE_URL`	All commands	PostgreSQL connection string
`GEMINI_API_KEY`	`annotate`	Google Generative AI API key
`OPENAI_API_KEY`	`query`	OpenAI API key
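
As an illustration of the table above, a command could check its variables before doing any work. This is a minimal sketch, not the project's actual code: `missingEnv` and `REQUIRED_ENV` are hypothetical names, and in the real CLI the `--api-key` flag can stand in for `OPENAI_API_KEY`.

```typescript
// Hypothetical helper: which variables each command needs, per the table above.
type Command = "scan" | "annotate" | "query";

const REQUIRED_ENV: Record<Command, readonly string[]> = {
  scan: ["DATABASE_URL"],
  annotate: ["DATABASE_URL", "GEMINI_API_KEY"],
  query: ["DATABASE_URL", "OPENAI_API_KEY"],
};

// Returns the names of required variables that are unset or blank.
function missingEnv(
  command: Command,
  env: Record<string, string | undefined>,
): string[] {
  return REQUIRED_ENV[command].filter((name) => !env[name]?.trim());
}

// Usage: report what is missing before running `annotate`.
// In a real CLI you would pass process.env instead of a literal.
const missing = missingEnv("annotate", {
  DATABASE_URL: "postgres://localhost/var_codemap",
});
if (missing.length > 0) {
  console.log(`Missing environment variables: ${missing.join(", ")}`);
}
```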

Create the Database

createdb var_codemap

Then run the schema migration:

-- Codebases
CREATE TABLE codebases (
  id SERIAL PRIMARY KEY,
  slug VARCHAR(255) UNIQUE NOT NULL,
  name VARCHAR(255) NOT NULL,
  root_path TEXT NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Folders
CREATE TABLE folders (
  id SERIAL PRIMARY KEY,
  codebase_slug VARCHAR(255) NOT NULL,
  parent_folder_id INTEGER REFERENCES folders(id),
  relative_path TEXT NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Files
CREATE TABLE files (
  id SERIAL PRIMARY KEY,
  codebase_slug VARCHAR(255) NOT NULL,
  relative_path TEXT NOT NULL,
  language VARCHAR(50) NOT NULL,
  hash VARCHAR(64) NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Symbols
CREATE TABLE symbols (
  id SERIAL PRIMARY KEY,
  codebase_slug VARCHAR(255) NOT NULL,
  file_id INTEGER NOT NULL REFERENCES files(id),
  name VARCHAR(255) NOT NULL,
  kind VARCHAR(50) NOT NULL,
  start_line INTEGER NOT NULL,
  end_line INTEGER NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- File-level dependency edges
CREATE TABLE file_edges (
  id SERIAL PRIMARY KEY,
  codebase_slug VARCHAR(255) NOT NULL,
  from_file_id INTEGER NOT NULL REFERENCES files(id),
  to_file_id INTEGER REFERENCES files(id),
  kind VARCHAR(50) NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Symbol-level dependency edges
CREATE TABLE symbol_edges (
  id SERIAL PRIMARY KEY,
  codebase_slug VARCHAR(255) NOT NULL,
  from_symbol_id INTEGER NOT NULL REFERENCES symbols(id),
  to_symbol_id INTEGER REFERENCES symbols(id),
  kind VARCHAR(50) NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Indexes
CREATE INDEX idx_folders_codebase ON folders(codebase_slug);
CREATE INDEX idx_files_codebase ON files(codebase_slug);
CREATE INDEX idx_symbols_codebase ON symbols(codebase_slug);
CREATE INDEX idx_symbols_file ON symbols(file_id);
CREATE INDEX idx_file_edges_codebase ON file_edges(codebase_slug);
CREATE INDEX idx_file_edges_from ON file_edges(from_file_id);
CREATE INDEX idx_file_edges_to ON file_edges(to_file_id);
CREATE INDEX idx_symbol_edges_codebase ON symbol_edges(codebase_slug);
CREATE INDEX idx_symbol_edges_from ON symbol_edges(from_symbol_id);
CREATE INDEX idx_symbol_edges_to ON symbol_edges(to_symbol_id);
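
Once the tables exist, the graph can be queried with plain SQL. The sketch below builds a recursive query that lists every file reachable from a starting file through `file_edges`. It is an illustration against the schema above, not code from this repository: `transitiveImportsQuery` is a hypothetical helper, and the `{ text, values }` shape assumes a node-postgres style client, which may not be what the project uses.

```typescript
// Query shape accepted by node-postgres' client.query({ text, values }).
interface QueryConfig {
  text: string;
  values: unknown[];
}

// Hypothetical helper: files reachable from `fileId` via file_edges.
// `maxDepth` bounds the recursion so import cycles cannot loop forever.
function transitiveImportsQuery(
  codebaseSlug: string,
  fileId: number,
  maxDepth = 10,
): QueryConfig {
  const text = `
    WITH RECURSIVE deps(file_id, depth) AS (
      SELECT to_file_id, 1
      FROM file_edges
      WHERE codebase_slug = $1
        AND from_file_id = $2
        AND to_file_id IS NOT NULL
      UNION
      SELECT fe.to_file_id, d.depth + 1
      FROM file_edges fe
      JOIN deps d ON fe.from_file_id = d.file_id
      WHERE fe.codebase_slug = $1
        AND fe.to_file_id IS NOT NULL
        AND d.depth < $3
    )
    SELECT f.relative_path, MIN(d.depth) AS min_depth
    FROM deps d
    JOIN files f ON f.id = d.file_id
    GROUP BY f.relative_path
    ORDER BY min_depth, f.relative_path`;
  return { text, values: [codebaseSlug, fileId, maxDepth] };
}

// Usage: pass the result to your PostgreSQL client of choice.
const query = transitiveImportsQuery("my-api", 42);
console.log(query.values);
```

The existing `idx_file_edges_from` index covers the join on `from_file_id`, so each recursion step should stay an index lookup.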

Usage

`scan` — Build the dependency graph

Scans a codebase, extracts all symbols and relationships, and persists them to PostgreSQL.

# Scan a project
codemap scan /path/to/project

# Custom name
codemap scan /path/to/project --name "my-api"

# Exclude additional directories
codemap scan /path/to/project --exclude tests fixtures e2e

# Force a fresh scan (new slug with timestamp)
codemap scan /path/to/project --fresh

# Scan and immediately annotate with AI
codemap scan /path/to/project --annotate --max-files 50

Options:

Flag	Description
`-n, --name <name>`	Custom codebase name (defaults to directory name)
`-e, --exclude <patterns...>`	Additional directories to exclude from scan
`--fresh`	Generate a unique slug with timestamp
`--annotate`	Run LLM annotation immediately after scanning
`--max-files <n>`	Limit annotation to first N files
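
To make the `--name` and `--fresh` behaviour concrete, here is one way slug generation could work. This is a sketch under assumptions: the exact slug format used by `src/utils/slug.ts` is not documented here, so `slugify`, `codebaseSlug`, and the timestamp layout are hypothetical.

```typescript
// Lowercase, replace runs of non-alphanumerics with "-", trim stray dashes.
function slugify(name: string): string {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// With fresh=true, append a UTC timestamp so a new scan gets its own slug
// instead of reusing the one from an earlier scan.
function codebaseSlug(
  name: string,
  fresh = false,
  now: Date = new Date(),
): string {
  const base = slugify(name);
  if (!fresh) return base;
  // "2024-05-01T12:30:45.000Z" -> "20240501123045"
  const stamp = now.toISOString().replace(/\D/g, "").slice(0, 14);
  return `${base}-${stamp}`;
}

// Usage
console.log(codebaseSlug("My API"));
console.log(codebaseSlug("My API", true));
```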

Example output:

Scanning codebase: my-api
   Path: /Users/you/projects/my-api

Walking directory structure...
   Found 42 folders and 187 files

Inserting folders...
Processing files and extracting symbols...
   Extracted 2,341 symbols

Resolving file dependencies...
   Created 612 file edges

Resolving symbol dependencies...
   Created 1,847 symbol edges

Scan complete!
   Codebase slug: my-api
   Folders: 42
   Files: 187
   Symbols: 2,341
   File edges: 612
   Symbol edges: 1,847

`annotate` — AI-powered code annotation

Generates structured metadata for every file and symbol using Google Gemini. Each annotation includes a responsibility summary, a category tag, and a confidence score.

# Annotate a previously scanned codebase
codemap annotate my-api

# Re-annotate (overwrite existing annotations)
codemap annotate my-api --force

# Annotate only a specific folder
codemap annotate my-api --folder src/services

# Control parallelism
codemap annotate my-api --workers 10

# Limit scope
codemap annotate my-api --max-files 20 --verbose

Options:

Flag	Description
`--force`	Re-annotate even if annotations already exist
`--max-files <n>`	Limit number of files to process
`--folder <path>`	Only annotate files under this relative path
`--workers <n>`	Number of parallel LLM requests (default: 5)
`-v, --verbose`	Show detailed progress per file
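
The `--workers` flag caps how many LLM requests run at once. A bounded worker pool is one common way to do that; the sketch below shows the idea with a stand-in task instead of a real Gemini call. `mapWithConcurrency` is a hypothetical name, not a function from this repository.

```typescript
// Runs `task` over `items` with at most `workers` tasks in flight at once.
// Results come back in input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  workers: number,
  task: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each worker repeatedly claims the next unprocessed index.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index], index);
    }
  }

  const poolSize = Math.max(1, Math.min(workers, items.length));
  await Promise.all(Array.from({ length: poolSize }, () => worker()));
  return results;
}

// Usage: a stand-in for the annotation call, capped at 5 in flight.
const files = ["a.ts", "b.ts", "c.ts"];
mapWithConcurrency(files, 5, async (file) => `annotated:${file}`).then(
  (out) => console.log(out),
);
```

In this sketch a rejected task rejects the whole batch. A real annotator would likely catch per-file errors so one failed request does not discard the rest.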

Annotation categories: api, ui, component, utility, config, data, service, middleware, model, test, style, build
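
LLM output is not guaranteed to be well formed, so it helps to validate each annotation before storing it. The sketch below assumes the model returns JSON with `purpose`, `category`, and `confidence` fields and that confidence lies between 0 and 1. Both the JSON shape and the range are assumptions for illustration; the real format is defined in `src/annotate/parser.ts`.

```typescript
const CATEGORIES = [
  "api", "ui", "component", "utility", "config", "data",
  "service", "middleware", "model", "test", "style", "build",
] as const;
type Category = (typeof CATEGORIES)[number];

interface Annotation {
  purpose: string;
  category: Category;
  confidence: number; // assumed to be in [0, 1]
}

// Returns a validated annotation, or null if the response is unusable.
function parseAnnotation(raw: string): Annotation | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof data !== "object" || data === null) return null;

  const { purpose, category, confidence } = data as Record<string, unknown>;
  if (typeof purpose !== "string" || purpose.trim() === "") return null;
  if (
    typeof category !== "string" ||
    !(CATEGORIES as readonly string[]).includes(category)
  ) {
    return null; // unknown category tag
  }
  if (typeof confidence !== "number" || !(confidence >= 0 && confidence <= 1)) {
    return null; // out-of-range score
  }
  return { purpose: purpose.trim(), category: category as Category, confidence };
}

// Usage
console.log(
  parseAnnotation('{"purpose":"Validates JWTs","category":"middleware","confidence":0.9}'),
);
```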

`query` — Ask questions about your codebase

Uses an OpenAI model (`gpt-4.1` by default) with tool calling to search the dependency graph and answer natural language questions about your code.

# One-off question
codemap query my-api "How does authentication work?"

# Interactive mode (continuous Q&A)
codemap query my-api --interactive

# Use a specific model
codemap query my-api "What calls the PaymentService?" --model gpt-4.1

Options:

Flag	Description
`--api-key <key>`	OpenAI API key (overrides env var)
`--model <model>`	OpenAI model to use (default: `gpt-4.1`)
`--temperature <n>`	LLM temperature (default: `0.3`)
`--interactive`	Enter continuous Q&A mode

How it works under the hood:

1. Your question is sent to the OpenAI model along with the codebase's top-level symbol index.
2. The model uses tool calling to invoke a `get_symbols_from_query` function.
3. Symbols are scored by name relevance, structural importance, and proximity.
4. The top 20 matching symbols (with full source code) are returned to the model.
5. The model synthesizes a final answer with file paths and line references.
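
The scoring step can be pictured as a weighted sum of the three signals named above. The sketch below is purely illustrative: the weights, the `SymbolCandidate` fields, and the reading of "proximity" as graph distance from a direct name match are all assumptions, not the actual logic in `src/tools/symbol-search.ts`.

```typescript
interface SymbolCandidate {
  name: string;
  inboundEdges: number;  // how many symbols depend on this one
  hopsFromMatch: number; // graph distance to a direct name match (0 = direct)
}

// Fraction of query terms that appear in the symbol name (case-insensitive).
function nameRelevance(name: string, terms: readonly string[]): number {
  if (terms.length === 0) return 0;
  const lower = name.toLowerCase();
  const hits = terms.filter((t) => lower.includes(t.toLowerCase())).length;
  return hits / terms.length;
}

// Hypothetical weights: name match dominates, structure and distance refine.
function scoreSymbol(c: SymbolCandidate, terms: readonly string[]): number {
  const relevance = nameRelevance(c.name, terms);
  // Log scale so a symbol with 1000 dependents does not swamp everything.
  const importance = Math.min(1, Math.log1p(c.inboundEdges) / Math.log1p(100));
  const proximity = 1 / (1 + c.hopsFromMatch);
  return 0.6 * relevance + 0.25 * importance + 0.15 * proximity;
}

// Highest-scoring candidates first, truncated to `limit` (20 in the README).
function topSymbols(
  candidates: readonly SymbolCandidate[],
  terms: readonly string[],
  limit = 20,
): SymbolCandidate[] {
  return candidates
    .map((c) => ({ c, score: scoreSymbol(c, terms) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.c);
}

// Usage
const ranked = topSymbols(
  [
    { name: "formatDate", inboundEdges: 50, hopsFromMatch: 3 },
    { name: "AuthService", inboundEdges: 10, hopsFromMatch: 0 },
  ],
  ["auth"],
);
console.log(ranked.map((s) => s.name));
```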

Data Model

Symbol Kinds

Kind	Description	Example
`class`	Class declarations	`class UserService {}`
`function`	Function declarations & arrow functions	`function validate() {}`
`method`	Class methods	`getUser() {}`
`variable`	Mutable bindings	`let count = 0`
`const`	Immutable bindings	`const API_URL = "..."`
`interface`	Interface declarations	`interface User {}`
`type`	Type aliases	`type ID = string`
`enum`	Enum declarations	`enum Status {}`

Edge Kinds

File edges — relationships between files:

Kind	Meaning
`imports`	File A imports from File B
`re_exports`	File A re-exports from File B
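
Creating a file edge means turning an import specifier such as `./db/client` into a file from the scanned inventory. The sketch below shows a simplified version of that lookup. It is an assumption-laden illustration: `resolveImport` is a hypothetical name, and the sketch ignores `tsconfig` path aliases and `.js` specifiers that point at `.ts` sources. Returning `null` for package imports is one plausible reason `to_file_id` is nullable in the schema.

```typescript
const EXTENSIONS = [".ts", ".tsx", ".js", ".jsx", ".mts", ".cts", ".mjs", ".cjs"];

// Joins a directory and a relative specifier, collapsing "." and ".." segments.
function joinPosix(dir: string, specifier: string): string {
  const out: string[] = [];
  for (const part of `${dir}/${specifier}`.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return out.join("/");
}

// Returns the matching relative path, or null for package imports and
// specifiers that do not match any scanned file.
function resolveImport(
  fromFile: string,
  specifier: string,
  knownFiles: ReadonlySet<string>,
): string | null {
  if (!specifier.startsWith(".")) return null; // bare specifier, e.g. "react"
  const slash = fromFile.lastIndexOf("/");
  const dir = slash === -1 ? "" : fromFile.slice(0, slash);
  const base = joinPosix(dir, specifier);
  const candidates = [
    base,                                              // exact path
    ...EXTENSIONS.map((ext) => base + ext),            // implicit extension
    ...EXTENSIONS.map((ext) => `${base}/index${ext}`), // directory index
  ];
  return candidates.find((c) => knownFiles.has(c)) ?? null;
}

// Usage
const inventory = new Set(["src/index.ts", "src/db/client.ts", "src/db/index.ts"]);
console.log(resolveImport("src/index.ts", "./db/client", inventory));
console.log(resolveImport("src/index.ts", "react", inventory));
```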

Symbol edges — relationships between symbols:

Kind	Meaning
`imports`	Symbol A is imported as Symbol B
`calls`	Function A invokes Function B
`extends`	Class A extends Class B
`implements`	Class A implements Interface B
`references`	Symbol A references Symbol B
`type_reference`	Symbol A uses a type from Symbol B
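
Because edges are typed and directed, questions such as "what breaks if I change this function?" become graph walks. The sketch below runs a breadth-first search against edge direction over an in-memory list of resolved symbol edges. `SymbolEdge` and `transitiveDependents` are hypothetical names used for illustration; in practice the edges would be loaded from the `symbol_edges` table.

```typescript
type SymbolEdgeKind =
  | "imports" | "calls" | "extends"
  | "implements" | "references" | "type_reference";

interface SymbolEdge {
  from: number; // from_symbol_id
  to: number;   // to_symbol_id (resolved edges only)
  kind: SymbolEdgeKind;
}

// Every symbol that reaches `target` through edges of the given kinds,
// in breadth-first order. Cycles are handled by the `seen` set.
function transitiveDependents(
  edges: readonly SymbolEdge[],
  target: number,
  kinds: readonly SymbolEdgeKind[] = ["calls"],
): number[] {
  // Reverse adjacency list: symbol -> symbols that point at it.
  const incoming = new Map<number, number[]>();
  for (const edge of edges) {
    if (!kinds.includes(edge.kind)) continue;
    const list = incoming.get(edge.to) ?? [];
    list.push(edge.from);
    incoming.set(edge.to, list);
  }

  const seen = new Set<number>([target]);
  const queue = [target];
  const result: number[] = [];
  for (let i = 0; i < queue.length; i++) {
    for (const dependent of incoming.get(queue[i]) ?? []) {
      if (seen.has(dependent)) continue;
      seen.add(dependent);
      result.push(dependent);
      queue.push(dependent);
    }
  }
  return result;
}

// Usage: 1 calls 2, 2 calls 3, 4 calls 3, and 3 calls 1 (a cycle).
const edges: SymbolEdge[] = [
  { from: 1, to: 2, kind: "calls" },
  { from: 2, to: 3, kind: "calls" },
  { from: 4, to: 3, kind: "calls" },
  { from: 3, to: 1, kind: "calls" },
  { from: 5, to: 3, kind: "references" },
];
console.log(transitiveDependents(edges, 3));
```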

Default Exclusions

Directories: node_modules, .git, dist, build, .next, .nuxt, .output, .cache, .turbo, coverage, .nyc_output, __pycache__, .venv, venv, .idea, .vscode

Files: package-lock.json, yarn.lock, pnpm-lock.yaml, bun.lockb, .DS_Store, Thumbs.db

Supported extensions: .ts, .tsx, .js, .jsx, .mts, .cts, .mjs, .cjs
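
Put together, the three lists above amount to a simple predicate over relative paths. The sketch below is an approximation of what `src/scanner/filter.ts` might do, with the extra directories from `--exclude` passed in as a parameter; `shouldScan` is a hypothetical name.

```typescript
const EXCLUDED_DIRS = new Set([
  "node_modules", ".git", "dist", "build", ".next", ".nuxt", ".output",
  ".cache", ".turbo", "coverage", ".nyc_output", "__pycache__", ".venv",
  "venv", ".idea", ".vscode",
]);

const EXCLUDED_FILES = new Set([
  "package-lock.json", "yarn.lock", "pnpm-lock.yaml", "bun.lockb",
  ".DS_Store", "Thumbs.db",
]);

const SUPPORTED_EXTENSIONS = [
  ".ts", ".tsx", ".js", ".jsx", ".mts", ".cts", ".mjs", ".cjs",
];

// True if the file at `relativePath` (forward slashes) should be parsed.
function shouldScan(
  relativePath: string,
  extraExcludedDirs: readonly string[] = [],
): boolean {
  const parts = relativePath.split("/");
  const fileName = parts[parts.length - 1];
  const dirs = parts.slice(0, -1);

  // Skip anything nested under an excluded directory at any depth.
  if (dirs.some((d) => EXCLUDED_DIRS.has(d) || extraExcludedDirs.includes(d))) {
    return false;
  }
  if (EXCLUDED_FILES.has(fileName)) return false;
  return SUPPORTED_EXTENSIONS.some((ext) => fileName.endsWith(ext));
}

// Usage
console.log(shouldScan("src/index.ts"));
console.log(shouldScan("node_modules/react/index.js"));
console.log(shouldScan("tests/auth.test.ts", ["tests"]));
```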

Development

# Run in dev mode
bun run dev scan /path/to/project

# Type check
bun run typecheck

# Build for production
bun run build

Project Structure

src/
├── index.ts              # CLI entry point and command definitions
├── scanner/
│   ├── walker.ts         # Recursive directory traversal
│   └── filter.ts         # File/directory exclusion rules
├── parser/
│   ├── treesitter.ts     # Tree-sitter initialization
│   ├── symbols.ts        # Symbol extraction from ASTs
│   └── imports.ts        # Import/export statement parsing
├── edges/
│   ├── file-edges.ts     # File-to-file dependency resolution
│   └── symbol-edges.ts   # Symbol-to-symbol relationship extraction
├── db/
│   ├── client.ts         # PostgreSQL connection
│   ├── schema.ts         # TypeScript type definitions
│   ├── queries.ts        # CRUD operations
│   └── annotations.ts    # Annotation persistence
├── annotate/
│   ├── index.ts          # Annotation orchestrator
│   ├── context.ts        # Context builder for LLM prompts
│   ├── gemini.ts         # Google Gemini API client
│   ├── prompts.ts        # Prompt templates
│   └── parser.ts         # LLM response parser
├── tools/
│   ├── openai-client.ts  # OpenAI GPT integration
│   └── symbol-search.ts  # Symbol scoring and retrieval
└── utils/
    ├── slug.ts           # Codebase slug generation
    └── hash.ts           # SHA-256 content hashing

License

MIT
