No SQLite. No RocksDB. No Postgres bolted under the hood. Every byte — buffer pool, WAL, lock manager, optimizer — is ours.
StoreMy is an educational, production-patterned relational database engine. It implements the hard parts of a DBMS — buffer pool, page cache, write-ahead log, lock manager, cost-based optimizer, multiple join algorithms — entirely in safe Rust. Built to be read, hacked on, and learned from.
- Edition 2024, zero unsafe in hot paths. Strict clippy.
- Clean module boundaries: read it like a textbook.
- Integration tests against the public Database API.
make quickstart
# or
docker compose up storemy
Boots the interactive SQL REPL.
Data persists in the |
git clone https://github.com/utkarsh-priyadarshi/storemy.git
cd storemy
cargo run -p storemy -- repl # interactive REPL
cargo run -p storemy -- "SELECT 1;" # one-shot SQL
cargo build -p storemy --release # optimized binary
| Variable | Description | Default |
|---|---|---|
| DATA_DIR | Directory for WAL, catalog, REPL history | ./data |
| RUST_LOG | tracing filter, e.g. storemy=debug | info |
-- DDL
CREATE TABLE employees (
id INT,
name VARCHAR,
department VARCHAR,
salary FLOAT,
hire_date VARCHAR
);
-- DML
INSERT INTO employees (id, name, department, salary, hire_date)
VALUES (1, 'Alice Johnson', 'Engineering', 95000.00, '2023-01-15');
-- Filter
SELECT name, salary
FROM employees
WHERE salary > 80000;
-- Join
SELECT e.name, d.department_name, e.salary
FROM employees e
JOIN departments d ON e.department = d.id
WHERE e.salary > 70000;
-- Aggregate
SELECT department, COUNT(*), AVG(salary)
FROM employees
GROUP BY department;
-- Mutate
UPDATE employees SET salary = 100000.00 WHERE id = 1;
DELETE FROM employees WHERE hire_date < '2020-01-01';
DROP TABLE employees;

flowchart LR
A[SQL text] --> B[Lexer]
B --> C[Parser]
C --> D[Planner<br/>logical]
D --> E[Optimizer<br/>physical + join order]
E --> F[Executor<br/>iterator operators]
F --> G[(Rows)]
classDef step fill:#1f1410,stroke:#ff7849,color:#ffd9c2,stroke-width:1.5px
classDef io fill:#0e0a08,stroke:#22c55e,color:#bbf7d0,stroke-width:1.5px
class A,G io
class B,C,D,E,F step
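
To make the first stage of the pipeline above concrete, here is a minimal lexer sketch. The `Token` enum and the `lex` function are hypothetical names chosen for illustration and are not taken from StoreMy's `parser` module; the real lexer very likely tracks keywords, positions, and multi-character operators that this sketch ignores.

```rust
/// Hypothetical token type for illustration; not StoreMy's actual AST input.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Word(String),   // keyword or identifier
    Number(String), // digits with optional dots (not validated further)
    Str(String),    // contents of a single-quoted literal
    Symbol(char),   // single-character punctuation or operator
}

/// Split SQL text into tokens, or report the first character it cannot handle.
pub fn lex(sql: &str) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = sql.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c.is_ascii_alphabetic() || c == '_' {
            let start = i;
            while i < chars.len() && (chars[i].is_ascii_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            tokens.push(Token::Word(chars[start..i].iter().collect()));
        } else if c.is_ascii_digit() {
            let start = i;
            while i < chars.len() && (chars[i].is_ascii_digit() || chars[i] == '.') {
                i += 1;
            }
            tokens.push(Token::Number(chars[start..i].iter().collect()));
        } else if c == '\'' {
            let start = i + 1;
            i = start;
            while i < chars.len() && chars[i] != '\'' {
                i += 1;
            }
            if i == chars.len() {
                return Err("unterminated string literal".to_string());
            }
            tokens.push(Token::Str(chars[start..i].iter().collect()));
            i += 1; // skip the closing quote
        } else if "(),;*=<>.".contains(c) {
            tokens.push(Token::Symbol(c));
            i += 1;
        } else {
            return Err(format!("unexpected character {c:?} at offset {i}"));
        }
    }
    Ok(tokens)
}
```

Calling `lex("SELECT name FROM employees WHERE salary > 80000;")` yields a flat token list that a parser can then turn into an AST.
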
flowchart TB
REPL[REPL / HTTP server] --> ENG[engine<br/>SQL commands]
ENG --> EXE[execution<br/>scan · join · aggregate]
EXE --> CAT[catalog<br/>system tables · stats]
EXE --> HEAP[heap<br/>slotted pages]
EXE --> IDX[index<br/>B+Tree · Hash]
HEAP --> BP[buffer_pool<br/>LRU · NO-STEAL · FORCE]
IDX --> BP
BP --> TX[transaction<br/>2PL · lock manager · deadlock]
BP --> WAL[wal<br/>write-ahead log]
TX --> WAL
classDef l1 fill:#1f1410,stroke:#ff7849,color:#ffd9c2,stroke-width:1.5px
classDef l2 fill:#15101a,stroke:#9333ea,color:#e9d5ff,stroke-width:1.5px
classDef l3 fill:#0e0a08,stroke:#22c55e,color:#bbf7d0,stroke-width:1.5px
class REPL,ENG l1
class EXE,CAT,HEAP,IDX l2
class BP,TX,WAL l3
📁 Repository layout
StoreMy/
├── db/ # crate: `storemy`
│ ├── src/
│ │ ├── buffer_pool/ # page cache, LRU, NO-STEAL / FORCE
│ │ ├── catalog/ # system tables, statistics
│ │ ├── engine/ # SQL command executors (CREATE / INSERT / …)
│ │ ├── execution/ # operators: scan, join, aggregate, set-ops
│ │ ├── heap/ # slotted pages, heap files
│ │ ├── index/ # B+Tree, Hash
│ │ ├── parser/ # lexer, parser, AST
│ │ ├── repl/ # interactive SQL shell
│ │ ├── wal/ # write-ahead log, recovery
│ │ ├── web/ # HTTP handlers (storemy-server)
│ │ └── transaction.rs # 2PL, lock manager, deadlock detection
│ ├── benches/ # Criterion benchmarks
│ └── tests/integration* # E2E tests against public `Database` API
├── storemy-codec-derive/ # proc-macros for on-disk Encode/Decode
├── monitoring/ # Prometheus / Grafana / Jaeger stack
├── Dockerfile # multi-stage release image
└── docker-compose.yml # repl · tests · benchmarks · monitoring
🔒 Concurrency & recovery details
Concurrency control
- Page-level shared / exclusive locks
- Automatic upgrade S → X when needed
- Wait-for graph + cycle detection → abort & retry
- Strict 2PL ⇒ serializable isolation
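
The wait-for graph and cycle detection listed above can be sketched roughly as follows. `TxId`, `WaitForGraph`, and its methods are hypothetical names for this illustration, not the API of `transaction.rs`; the sketch only shows the idea of recording "waits for" edges and checking whether a transaction can reach itself.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical transaction id type for this sketch.
pub type TxId = u64;

/// Wait-for graph: an edge `a -> b` means transaction `a` waits for a lock held by `b`.
#[derive(Default)]
pub struct WaitForGraph {
    edges: HashMap<TxId, HashSet<TxId>>,
}

impl WaitForGraph {
    /// Record that `waiter` is blocked on a lock held by `holder`.
    pub fn add_wait(&mut self, waiter: TxId, holder: TxId) {
        self.edges.entry(waiter).or_default().insert(holder);
    }

    /// Drop every edge that mentions `tx`, e.g. after it commits or aborts.
    pub fn remove(&mut self, tx: TxId) {
        self.edges.remove(&tx);
        for holders in self.edges.values_mut() {
            holders.remove(&tx);
        }
    }

    /// Returns true if `start` can reach itself, i.e. it is part of a deadlock.
    pub fn in_cycle(&self, start: TxId) -> bool {
        let mut stack: Vec<TxId> =
            self.edges.get(&start).into_iter().flatten().copied().collect();
        let mut seen = HashSet::new();
        while let Some(tx) = stack.pop() {
            if tx == start {
                return true;
            }
            if seen.insert(tx) {
                if let Some(next) = self.edges.get(&tx) {
                    stack.extend(next.iter().copied());
                }
            }
        }
        false
    }
}
```

In this sketch a lock manager would call `add_wait` before blocking, run `in_cycle` on the waiter, and abort one participant (then `remove` it) when a cycle is found.
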
Recovery
- WAL protocol: log record on disk before page mutation
- Force-at-commit: COMMIT fsync'd before ack
- LSN chaining for log traversal
- Record types: BEGIN · COMMIT · ABORT · INSERT · UPDATE · DELETE
- Undo on abort restores before-images
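
As a rough illustration of LSN chaining and undo via before-images, here is an in-memory sketch. All names (`Log`, `Record`, `Kind`, `undo`) are assumptions made for this example and do not come from the `wal` module. To stay short it models INSERT, UPDATE, and DELETE as one `Update` variant with optional before and after values, and it skips disk I/O entirely; a real log would append to a file and flush it (for example with `File::sync_all`) before acknowledging COMMIT.

```rust
use std::collections::HashMap;

pub type Lsn = u64;
pub type TxId = u64;

/// Simplified record kinds. INSERT has `before: None`, DELETE has `after: None`.
#[derive(Debug, Clone, PartialEq)]
pub enum Kind {
    Begin,
    Commit,
    Abort,
    Update { key: String, before: Option<i64>, after: Option<i64> },
}

#[derive(Debug, Clone)]
pub struct Record {
    pub lsn: Lsn,
    pub prev_lsn: Option<Lsn>, // previous record written by the same transaction
    pub tx: TxId,
    pub kind: Kind,
}

#[derive(Default)]
pub struct Log {
    records: Vec<Record>,         // the vector index doubles as the LSN here
    last_lsn: HashMap<TxId, Lsn>, // tail of each transaction's chain
}

impl Log {
    /// Append a record and link it to the transaction's previous record.
    pub fn append(&mut self, tx: TxId, kind: Kind) -> Lsn {
        let lsn = self.records.len() as Lsn;
        let prev_lsn = self.last_lsn.insert(tx, lsn);
        self.records.push(Record { lsn, prev_lsn, tx, kind });
        lsn
    }

    /// Walk `tx`'s chain backwards, restoring before-images into `table`,
    /// then log an ABORT record.
    pub fn undo(&mut self, tx: TxId, table: &mut HashMap<String, i64>) {
        let mut cursor = self.last_lsn.get(&tx).copied();
        while let Some(lsn) = cursor {
            let rec = &self.records[lsn as usize];
            if let Kind::Update { key, before, .. } = &rec.kind {
                match before {
                    Some(v) => {
                        table.insert(key.clone(), *v);
                    }
                    None => {
                        table.remove(key);
                    }
                }
            }
            cursor = rec.prev_lsn;
        }
        self.append(tx, Kind::Abort);
    }
}
```

The caller is expected to `append` the record before touching the table, which mirrors the log-before-mutation rule in the list above.
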
The cost-based optimizer picks a join algorithm per query based on predicate type, cardinality, and available memory.
| Algorithm | Best for | Time | Space |
|---|---|---|---|
| 🔁 Block Nested Loop | Non-equality predicates, small relations | O(\|R\| + (\|R\|/B)·\|S\|) | O(B) |
| 🗂 Hash Join | Equi-joins with enough memory | O(\|R\| + \|S\|) avg | O(\|S\|) |
| 🪜 Sort‑Merge | Pre-sorted / very large inputs | O(\|R\| log \|R\| + \|S\| log \|S\|) | O(1) merge |
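
To show where the hash join row's O(|R| + |S|) time and O(|S|) space come from, here is a minimal in-memory sketch. The function name `hash_join` and the `(key, payload)` row shape are assumptions for illustration; StoreMy's operator works on its own tuple types and iterator interface.

```rust
use std::collections::HashMap;

/// Equi-join two relations on their first field.
/// Rows are hypothetical `(key, payload)` pairs for this sketch.
pub fn hash_join<'a>(
    r: &'a [(i64, &'a str)],
    s: &'a [(i64, &'a str)],
) -> Vec<(i64, &'a str, &'a str)> {
    // Build phase: one pass over S, storing every row. O(|S|) time and space.
    let mut table: HashMap<i64, Vec<&str>> = HashMap::new();
    for &(key, payload) in s {
        table.entry(key).or_default().push(payload);
    }

    // Probe phase: one pass over R with an expected O(1) lookup per row.
    let mut out = Vec::new();
    for &(key, left) in r {
        if let Some(matches) = table.get(&key) {
            for &right in matches {
                out.push((key, left, right));
            }
        }
    }
    out
}
```

Building on the smaller input keeps the hash table small, which is why the technique depends on having enough memory for one side.
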
| Layer | Knob | Default |
|---|---|---|
| Page size | fixed | 4 KB |
| Buffer pool | capacity | 1 000 pages (≈ 4 MB) |
| B+Tree | point & range | O(log n) |
| Hash index | average lookup | O(1) |
| Lock granularity | — | page-level |
| Deadlock retry | max attempts | 100, 1 ms → 50 ms backoff |
| Join block size | configurable | 100 tuples |
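
The deadlock retry row (100 attempts, 1 ms → 50 ms backoff) could be realized along these lines. The table does not say how the delay grows, so the doubling schedule below is an assumption, as are the names `backoff` and `retry`. The sketch also treats every error as a retryable deadlock abort, which a real engine would not do.

```rust
use std::time::Duration;

pub const MAX_ATTEMPTS: u32 = 100;
const BASE: Duration = Duration::from_millis(1);
const CAP: Duration = Duration::from_millis(50);

/// Delay before retry number `attempt` (0-based).
/// Doubling each time is an assumption; only the 1 ms and 50 ms bounds are given.
pub fn backoff(attempt: u32) -> Duration {
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    BASE.saturating_mul(factor).min(CAP)
}

/// Run `op` until it succeeds or the attempt budget is exhausted.
pub fn retry<T, E>(mut op: impl FnMut() -> Result<T, E>) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(v) => return Ok(v),
            // Out of attempts: surface the last error to the caller.
            Err(e) if attempt + 1 >= MAX_ATTEMPTS => return Err(e),
            Err(_) => {
                std::thread::sleep(backoff(attempt));
                attempt += 1;
            }
        }
    }
}
```

Under this schedule the delay reaches the 50 ms cap on the seventh retry and stays there.
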
Run benchmarks locally:
docker compose --profile benchmark up storemy-benchmark
# reports land in ./benchmark-results

cargo nextest run --workspace # full suite (CI runs this)
cargo nextest run -p storemy --test integration # end-to-end only
make quick-test # storemy lib unit tests only
make check # fmt + clippy + ci-test

| Tier | Location | Notes |
|---|---|---|
| Unit | next to each module under db/src/** | fast, focused |
| Integration | db/tests/ | drives the public Database API |
| Benchmarks | db/benches/ | Criterion-based |
make docker-build # release image: storemy + metrics_exporter
make docker-demo # interactive REPL in a container
make docker-test # cargo test -p storemy --test integration
make docker-clean # tear down volumes & images

docker-compose.yml ships profiles for default (REPL), test, benchmark, and monitoring (Prometheus + Grafana + Jaeger).
- Separation of concerns: storage, execution, and concurrency never reach across layers.
- Iterator everywhere: one uniform interface lets operators compose like Unix pipes.
- Strategy pattern: join algorithms are pluggable; the optimizer picks.
- ACID, not eventually: strict 2PL + WAL with force‑at‑commit.
- Production patterns: typed errors (thiserror), structured logging (tracing), zero-warning clippy.
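
The "iterator everywhere" principle can be illustrated with a small sketch in which each operator wraps a child iterator and yields rows on demand. The `Row` alias and the `scan`, `filter`, and `project` functions are hypothetical and lean on Rust's standard `Iterator` trait; StoreMy's `execution` module may define its own operator trait instead.

```rust
/// Hypothetical row type for the sketch: (name, department, salary).
pub type Row = (String, String, f64);

/// Scan: yields every row of an in-memory "table".
pub fn scan(table: Vec<Row>) -> impl Iterator<Item = Row> {
    table.into_iter()
}

/// Filter: wraps any child operator and keeps rows matching `pred`.
pub fn filter<I, P>(child: I, pred: P) -> impl Iterator<Item = Row>
where
    I: Iterator<Item = Row>,
    P: Fn(&Row) -> bool,
{
    child.filter(move |row| pred(row))
}

/// Project: wraps any child operator and keeps only name and salary.
pub fn project<I>(child: I) -> impl Iterator<Item = (String, f64)>
where
    I: Iterator<Item = Row>,
{
    child.map(|(name, _department, salary)| (name, salary))
}
```

Because every operator consumes and produces the same kind of iterator, a plan such as `project(filter(scan(table), |r| r.2 > 80000.0))` composes without any operator knowing what sits below it, and rows are pulled one at a time rather than materialized between stages.
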
Issues, PRs, and architecture discussions are very welcome — clarity is a feature here. See CONTRIBUTING.md for the full workflow.
cargo +nightly-2026-04-01 fmt --all -- --check
cargo clippy --workspace --all-targets -- -D warnings
cargo nextest run --workspace --profile ci

Inspired by Database System Concepts (Silberschatz et al.), Database Management Systems (Ramakrishnan & Gehrke), CMU 15‑445/645, and the architectures of PostgreSQL, SQLite, and MySQL.
Released under the MIT License — see LICENSE.