feat(cli): gradata projects subcommand (GRA-1238 / GH #206)#218

Open: Gradata wants to merge 2 commits into main from feat/gradata-projects-subcommand-v2

@Gradata Gradata commented May 20, 2026

Summary

Adds gradata projects — a table listing every project the SDK knows about from ~/.gradata/projects.toml. Columns: name, brain_dir, rules count, last_correction timestamp, sync_status. Supports --json for machine consumption.

Schema

Uses stdlib tomllib (Python 3.11+) — no new dependency:

[[projects]]
name = "my-project"
brain_dir = "/home/user/projects/my-project/.gradata/brain"

Column semantics

| Column | Source |
| --- | --- |
| name | registry entry |
| brain_dir | registry entry |
| rules | COUNT(*) FROM events WHERE type='RULE_GRADUATED' |
| last_correction | MAX(ts) FROM events WHERE type='CORRECTION', or (never) if there are none |
| sync_status | ok if brain_dir exists, missing if not, error if db unreadable |
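
The column rules above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `cli.py`: the function name `probe_brain` and the returned dict keys are hypothetical, the `events` table is assumed to have `type` and `ts` columns (as the two queries imply), and a brain directory without `system.db` is assumed to report `ok` with zero rules.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def probe_brain(brain_dir: Path) -> dict:
    """Derive rules, last_correction and sync_status for one brain directory."""
    row = {"rules": 0, "last_correction": None, "sync_status": "ok"}
    if not brain_dir.is_dir():
        row["sync_status"] = "missing"
        return row
    db_path = brain_dir / "system.db"
    if not db_path.exists():
        return row  # assumed: a fresh brain with no db yet still counts as ok
    try:
        # closing() guarantees the connection is closed even if a query fails.
        with closing(sqlite3.connect(str(db_path))) as con:
            row["rules"] = con.execute(
                "SELECT COUNT(*) FROM events WHERE type='RULE_GRADUATED'"
            ).fetchone()[0]
            # MAX() over zero rows yields a single NULL, which maps to None.
            row["last_correction"] = con.execute(
                "SELECT MAX(ts) FROM events WHERE type='CORRECTION'"
            ).fetchone()[0]
    except (sqlite3.Error, OSError):
        row["sync_status"] = "error"
    return row
```

A text renderer would then show `None` as `(never)`, while JSON output can keep it as `null`.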

Failure modes

  • Missing ~/.gradata/projects.toml → friendly hint, exit 0 (no crash)
  • Malformed TOML → Error: malformed projects.toml (...) + exit 1
  • Missing brain_dir for a registered project → row still emitted with sync_status=missing

Test plan

tests/test_projects_command.py — 9 tests, all passing:

  • missing registry → hint + exit 0 (text and --json)
  • empty registry (valid TOML, no [[projects]])
  • single project, brain_dir missing on disk
  • single project, brain_dir exists, no system.db (fresh)
  • multi-project registry mixing ok/missing
  • malformed TOML → SystemExit(1)
  • --json output shape
  • real sqlite db with RULE_GRADUATED + CORRECTION rows → rules count and last_correction populated

$ cd Gradata && pytest tests/test_projects_command.py -xvs
============================== 9 passed in 0.24s ===============================

Live smoke (no registry present):

$ gradata projects
No projects registered. Run `gradata init <dir>` to add one.
$ gradata projects --json
[]
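
The two output modes shown in the smoke run could be rendered along these lines. This is a sketch only: the JSON key names are assumed to match the column names, and `render`, `COLUMNS` and `HINT` are hypothetical names rather than identifiers from `cli.py`.

```python
import json

# Assumed key names: the PR lists the columns but not the exact JSON keys.
COLUMNS = ("name", "brain_dir", "rules", "last_correction", "sync_status")
HINT = "No projects registered. Run `gradata init <dir>` to add one."


def _cell(row: dict, col: str) -> str:
    """Format one table cell; a missing correction is shown as (never)."""
    value = row.get(col)
    if col == "last_correction" and value is None:
        return "(never)"
    return str(value)


def render(rows: list[dict], as_json: bool = False) -> str:
    """Return either a JSON array of objects or a padded text table."""
    if as_json:
        return json.dumps(rows, indent=2)
    if not rows:
        return HINT
    widths = [max(len(c), *(len(_cell(r, c)) for r in rows)) for c in COLUMNS]
    lines = ["  ".join(c.ljust(w) for c, w in zip(COLUMNS, widths))]
    for r in rows:
        lines.append("  ".join(_cell(r, c).ljust(w) for c, w in zip(COLUMNS, widths)))
    return "\n".join(lines)
```

With an empty list this reproduces both smoke outputs: the hint in text mode and `[]` in JSON mode.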

Layering check

cmd_projects lives in cli.py (Layer 2). Only stdlib imports introduced: tomllib, sqlite3, pathlib, json — all already used elsewhere in cli.py. No Layer 0 → 2 violation.

Risk

Backwards-compatible additive change. No schema migration. New subcommand only; no existing behaviour modified. Pre-existing ruff warnings on lines 618/892 are untouched and unrelated.

Closes GRA-1238.

Adds 'gradata projects' which lists every project in ~/.gradata/projects.toml
with columns: name, brain_dir, rules count, last_correction timestamp, and
sync_status.

Schema (stdlib tomllib, no new dep):

    [[projects]]
    name = "my-project"
    brain_dir = "/path/to/.gradata/brain"

Column semantics:
  * name             — project identifier from the registry
  * brain_dir        — path to the brain directory on disk
  * rules            — COUNT(*) of RULE_GRADUATED events in system.db
  * last_correction  — MAX(ts) of CORRECTION events, or '(never)'
  * sync_status      — 'ok' (brain_dir exists), 'missing' (brain_dir gone),
                       or 'error' (db unreadable)

Behaviour:
  * Missing ~/.gradata/projects.toml → friendly hint, exit 0 (no crash)
  * Malformed TOML → 'Error: malformed projects.toml (...)' + exit 1
  * --json emits the same data as a JSON array of objects

Tests: tests/test_projects_command.py covers missing registry, empty registry,
single/multi project, missing brain_dir → sync_status='missing', malformed
TOML, --json shape, and a real sqlite db with RULE_GRADUATED/CORRECTION rows.
9 tests, all passing.

Layering: cmd_projects lives in cli.py (Layer 2). The only imports it needs are
stdlib modules that cmd_status already uses (sqlite3, pathlib, json, tomllib),
so no Layer 0 module imports from Layer 2.

coderabbitai Bot commented May 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a41f7113-9a18-48fa-9ee2-3fa5f25d2301

📥 Commits

Reviewing files that changed from the base of the PR and between 7d80462 and adc07b1.

📒 Files selected for processing (1)
  • Gradata/tests/test_projects_command.py
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: pytest ubuntu-latest / py3.12
  • GitHub Check: pytest windows-latest / py3.11
  • GitHub Check: pytest windows-latest / py3.12
  • GitHub Check: pytest ubuntu-latest / py3.11
  • GitHub Check: pytest macos-latest / py3.11
  • GitHub Check: pytest macos-latest / py3.12
  • GitHub Check: pytest (py3.11)
  • GitHub Check: pytest (py3.12)
🧰 Additional context used
📓 Path-based instructions (1)
Gradata/tests/**/*.py

📄 CodeRabbit inference engine (Gradata/AGENTS.md)

Gradata/tests/**/*.py: Set BRAIN_DIR environment variable via tmp_path in conftest.py for test isolation — ensure _paths.py module cache refreshes when calling Brain.init() directly inside tests
Add unit tests in tests/test_*.py for every CI push without LLM calls (deterministic); mark integration tests with @pytest.mark.integration and skip them by default (they hit real LLM APIs)

Files:

  • Gradata/tests/test_projects_command.py
🔇 Additional comments (1)
Gradata/tests/test_projects_command.py (1)

87-87: LGTM!

Also applies to: 95-96, 114-114, 172-172


📝 Walkthrough
  • New CLI subcommand: gradata projects — lists projects from ~/.gradata/projects.toml with columns: name, brain_dir, rules (count of RULE_GRADUATED), last_correction (latest CORRECTION ts or (never)), and sync_status (ok/missing/error).
  • JSON output support: --json emits project rows as a JSON array of objects (machine-consumable).
  • Error handling: missing registry prints a friendly hint and exits 0; malformed TOML prints Error: malformed projects.toml (...) and exits 1; unreadable DB yields sync_status=error; missing brain_dir yields sync_status=missing but still emits a row.
  • No new runtime dependencies: uses only stdlib modules (tomllib, sqlite3, pathlib, json); requires Python 3.11+ for tomllib.
  • New public API: added cmd_projects(args) in gradata/cli.py.
  • Tests: added 9 tests in tests/test_projects_command.py covering missing/empty registries, single/multi projects, missing brain_dir, malformed TOML, --json shape, and real sqlite DB behavior (all passing).
  • Minor code changes: gradata status import simplified (uses datetime only); gradata prove formatting adjusted without logic changes.
  • Non-breaking, additive change: no schema migration; registry uses TOML [[projects]] entries with name and brain_dir.
  • Test portability tweak: tests write TOML paths as forward-slash strings (tmp_path.as_posix()) and normalize Path comparisons for Windows compatibility.

Walkthrough

The PR adds a new gradata projects CLI subcommand that reads a ~/.gradata/projects.toml registry, queries each project's system.db for rule counts and correction timestamps, maps database accessibility to sync status, and outputs results as a formatted table or JSON. Comprehensive tests validate all edge cases: missing registries, malformed TOML, missing directories, and real database queries.

Changes

Projects CLI subcommand

| Layer / File(s) | Summary |
| --- | --- |
| Core implementation and CLI wiring (Gradata/src/gradata/cli.py) | cmd_projects handler loads and iterates the ~/.gradata/projects.toml registry, probes each project's brain directory and system.db for RULE_GRADUATED event counts and latest correction timestamps, maps readability to sync_status (ok/missing/error), and renders output as a table or JSON array. Argument parser and command dispatch are wired to register and route the projects subcommand. |
| Minor CLI formatting and imports tweak (Gradata/src/gradata/cli.py) | cmd_status datetime import simplified (removed timezone import) and cmd_prove's "Corrections per session" output block was re-indented/reformatted without changing logic. |
| Projects command test suite (Gradata/tests/test_projects_command.py) | Nine test functions validate missing/empty registries with friendly hints, missing/present brain directories, multi-project handling, malformed TOML error exit codes, JSON output schema keys, and an integration test that creates a real system.db with events table rows and asserts rule counts, last correction timestamps, and sync status. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Gradata/gradata#210: Both PRs modify Gradata/src/gradata/cli.py; #210 adds the cmd_prove function while this PR only re-indents the "Corrections per session" verdict section without changing the logic.

Suggested labels

feature

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 29.41% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly describes the main feature addition: a new 'gradata projects' subcommand, with relevant ticket references. |
| Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, providing schema, column semantics, failure modes, tests, and risk assessment. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 OpenGrep (1.21.0)

OpenGrep fatal error (exit code 2):
┌──────────────┐
│ Opengrep CLI │
└──────────────┘

✔ Opengrep OSS
✔ Basic security coverage for first-party code vulnerabilities.

 Loading rules from local config...
[00.24][ERROR]: Error: exception Glob.Lexer.Syntax_error("malformed glob pattern: missing ']'")
Raised at Glob__Lexer.syntax_error in file "libs/glob/Lexer.mll", line 8, characters 2-26
Called from Glob__Lexer.__ocaml_lex_token_rec in file "libs/glob/Lexer.mll", line 29, characters 26-53
Cal


@coderabbitai coderabbitai Bot added the feature label May 20, 2026
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Gradata/src/gradata/cli.py`:
- Around line 347-351: Validate each item from raw_projects before treating it
as a dict: in the loop that iterates over raw_projects (the for proj in
raw_projects block where name = proj.get(...) and brain_dir = proj.get(...)),
check isinstance(proj, dict) and if not, raise or return a clear
malformed-registry error (or skip with a logged warning) indicating the project
entry has an unexpected type; ensure the error message names the offending entry
and/or index so users can fix their registry instead of allowing an
AttributeError to propagate.
- Around line 365-377: The code currently opens a SQLite connection with
_sqlite3.connect(...) and calls con.close() only in the try path, so if a query
raises an exception the connection is leaked; modify the block around con, cur,
rules_count and last_correction so the connection is always closed — either use
a context manager (with _sqlite3.connect(str(db_path)) as con:) or ensure
con.close() is called in a finally block and guard against con being undefined
if connect() fails; update the code that uses cur.execute(...) and the except
clause to preserve existing error handling while guaranteeing con.close() runs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 81335e1b-cd6c-49ee-b216-1aff7f2bb943

📥 Commits

Reviewing files that changed from the base of the PR and between 2914efb and 7d80462.

📒 Files selected for processing (2)
  • Gradata/src/gradata/cli.py
  • Gradata/tests/test_projects_command.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: pytest windows-latest / py3.11
  • GitHub Check: pytest ubuntu-latest / py3.12
  • GitHub Check: pytest ubuntu-latest / py3.11
  • GitHub Check: pytest windows-latest / py3.12
  • GitHub Check: pytest (py3.12)
  • GitHub Check: pytest (py3.11)
🧰 Additional context used
📓 Path-based instructions (2)
Gradata/src/**/*.py

📄 CodeRabbit inference engine (Gradata/AGENTS.md)

Gradata/src/**/*.py: Prefer sentence-transformers for local embeddings, google-genai for Gemini embeddings, cryptography for AES-GCM encrypted system.db, bm25s for BM25 rule ranking, and mem0ai for external memory adapters — guard all optional dependency imports with try / except ImportError at the call site, never at module level
Maintain strict layering: Layer 0 (Primitives: _types.py, _db.py, _events.py, _paths.py, _file_lock.py; Patterns: contrib/patterns/) must never import from Layer 1 (Enhancements: enhancements/, rules/) or Layer 2 (Public API: brain.py, cli.py, daemon.py, mcp_server.py)
Never use bare except: pass — use typed exceptions or at minimum logger.warning(...) with exc_info=True to avoid silent failure in a memory product
Never import from out-of-scope sibling directories ../Sprites/ or ../Hausgem/ within gradata/* code — that is a layering bug
Never leak private-sibling paths into public docs/code — no references to ../Sprites/, ../Hausgem/, email addresses, OneDrive paths, or Sprites-specific examples from inside gradata/*
Use atomic-write helper when writing JSON files to prevent corruption from mid-write crashes

Files:

  • Gradata/src/gradata/cli.py
Gradata/tests/**/*.py

📄 CodeRabbit inference engine (Gradata/AGENTS.md)

Gradata/tests/**/*.py: Set BRAIN_DIR environment variable via tmp_path in conftest.py for test isolation — ensure _paths.py module cache refreshes when calling Brain.init() directly inside tests
Add unit tests in tests/test_*.py for every CI push without LLM calls (deterministic); mark integration tests with @pytest.mark.integration and skip them by default (they hit real LLM APIs)

Files:

  • Gradata/tests/test_projects_command.py
🔇 Additional comments (2)
Gradata/src/gradata/cli.py (1)

159-159: LGTM!

Also applies to: 896-909, 1980-1983, 2323-2323

Gradata/tests/test_projects_command.py (1)

1-179: LGTM!

Comment on lines +347 to +351
    raw_projects = data.get("projects", []) or []
    rows = []
    for proj in raw_projects:
        name = proj.get("name", "(unnamed)")
        brain_dir = proj.get("brain_dir", "")

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate registry entry shape before dereferencing project fields.

raw_projects items are assumed to be dicts. A schema-invalid but parseable TOML value (e.g. projects = ["x"]) crashes with AttributeError at Line 350 instead of returning a friendly malformed-registry error.

Proposed fix
     raw_projects = data.get("projects", []) or []
     rows = []
     for proj in raw_projects:
+        if not isinstance(proj, dict):
+            print("Error: malformed projects.toml (each [[projects]] entry must be a table)")
+            raise SystemExit(1) from None
         name = proj.get("name", "(unnamed)")
         brain_dir = proj.get("brain_dir", "")
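
A standalone sketch of the same check that also names the offending index, which the review prompt asks for. `validate_projects` and `_fail` are hypothetical helpers; the extra guard on the container type is an addition beyond the proposed fix, covering a registry such as `projects = "x"`, where iterating would otherwise walk over characters.

```python
def _fail(detail: str) -> None:
    """Print a malformed-registry error and exit with status 1."""
    print(f"Error: malformed projects.toml ({detail})")
    raise SystemExit(1)


def validate_projects(raw_projects: object) -> list[dict]:
    """Ensure the registry value is a list of tables before any .get() call."""
    if not isinstance(raw_projects, list):
        _fail("'projects' must be an array of tables")
    for index, proj in enumerate(raw_projects):
        if not isinstance(proj, dict):
            _fail(f"entry {index} is a {type(proj).__name__}, expected a table")
    return raw_projects
```

Failing fast on the first bad entry matches the existing malformed-TOML behaviour; skipping bad entries with a warning would be the other option the review mentions.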

Comment on lines +365 to +377
    try:
        con = _sqlite3.connect(str(db_path))
        cur = con.cursor()
        rules_count = cur.execute(
            "SELECT COUNT(*) FROM events WHERE type='RULE_GRADUATED'"
        ).fetchone()[0]
        row = cur.execute(
            "SELECT MAX(ts) FROM events WHERE type='CORRECTION'"
        ).fetchone()
        last_correction = row[0] if row else None
        con.close()
    except (_sqlite3.OperationalError, _sqlite3.DatabaseError, OSError):
        sync_status = "error"

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Close SQLite connections on query failures.

If any query fails after connect() succeeds, execution jumps to except and con.close() is skipped. Repeated errors across many projects can leak file descriptors/locks.

Proposed fix
             else:
                 try:
-                    con = _sqlite3.connect(str(db_path))
-                    cur = con.cursor()
-                    rules_count = cur.execute(
-                        "SELECT COUNT(*) FROM events WHERE type='RULE_GRADUATED'"
-                    ).fetchone()[0]
-                    row = cur.execute(
-                        "SELECT MAX(ts) FROM events WHERE type='CORRECTION'"
-                    ).fetchone()
-                    last_correction = row[0] if row else None
-                    con.close()
+                    with _sqlite3.connect(str(db_path)) as con:
+                        cur = con.cursor()
+                        rules_count = cur.execute(
+                            "SELECT COUNT(*) FROM events WHERE type='RULE_GRADUATED'"
+                        ).fetchone()[0]
+                        row = cur.execute(
+                            "SELECT MAX(ts) FROM events WHERE type='CORRECTION'"
+                        ).fetchone()
+                        last_correction = row[0] if row else None
                 except (_sqlite3.OperationalError, _sqlite3.DatabaseError, OSError):
                     sync_status = "error"

Windows tmp_path uses backslashes which TOML interprets as escape sequences
('Invalid hex value'). Use as_posix() to write forward slashes (works on
Windows too) and compare brain_dir via Path normalization.