A production-ready schema diff and SQL migration generator for PostgreSQL, MySQL, and SQLite β built with defense-in-depth security, dialect-aware Jinja2 templates, and topological ordering by foreign key dependencies.
- About
- Pipeline
- Stack
- Security
- Prerequisites
- Getting Started
- CLI Reference
- YAML Schema Format
- Dialects
- Safety Features
- Testing
- Environment Variables
- Project Structure
- Design Decisions
- Contributing
- License
deltadb compares two database schemas β live connections or YAML files β and produces a ready-to-run SQL migration with UP and DOWN blocks.
Key capabilities:
- Multi-source diffing: connect strings (
postgresql://...) and YAML schemas as both source and target - Dialect-aware SQL generation via Jinja2 templates for PostgreSQL, MySQL, and SQLite
- Topological ordering by foreign key dependencies β
DROP TABLEorder is always safe - Rename detection using
difflib.SequenceMatcher(opt-in, configurable threshold) - Destructive operation markers (
-- WARNING: DESTRUCTIVE) and a--no-destructiveflag - Defense-in-depth security across 9 independent layers: YAML safety, SQL identifier injection prevention, sandboxed templates, credential masking, path traversal prevention, and more
- Rich terminal diff output and JSON export
[Source: DB or YAML] [Target: DB or YAML]
| |
[Loader] [Loader]
| |
[SchemaModel A] --------> [Diff Engine] <------ [SchemaModel B]
|
[list[Change]]
|
[Rename Detection] (opt-in, --detect-renames)
|
[Topological Sort]
(FK dependency order)
|
[SQL Generator]
(Jinja2 SandboxedEnv)
/ \
[UP block] [DOWN block]
\ /
[.sql output]
|
[Rich Terminal Output]
[Live Database]
|
[DbLoader]
(SQLAlchemy reflection)
|
[SchemaModel]
|
[YAML Serializer]
|
[schema.yml]
| Layer | Technology |
|---|---|
| Language | Python 3.11+ |
| CLI | click 8.1+ |
| Terminal output | rich 13+ |
| SQL templates | jinja2 3.1+ (SandboxedEnvironment) |
| Database reflection | sqlalchemy 2.0+ |
| PostgreSQL driver | psycopg2-binary |
| MySQL driver | pymysql |
| YAML parsing | pyyaml 6+ (safe_load only) |
| Test runner | pytest 7.4+ |
| Integration tests | testcontainers (real PostgreSQL + MySQL) |
| Static analysis | bandit, pip-audit |
| Formatter | black |
| Linter | ruff |
| Containerization | Docker + Docker Compose |
deltadb treats every input as hostile. Nine independent layers of defense, each annotated with # [SECURITY] in the source.
yaml.safe_load() exclusively. yaml.load() is prohibited and detected by bandit. Files are validated against an allowlist of root keys (tables, dialect) and rejected if they exceed 10 MB or contain unexpected structure.
# [SECURITY] yaml.safe_load ONLY β yaml.load() executes arbitrary Python code
data = yaml.safe_load(f)Table and column names from external databases are untrusted input. Names are validated against an allowlist regex ([a-zA-Z_][a-zA-Z0-9_]*) and a blocklist of dangerous patterns (;, --, /*, \n, \x00, xp_ prefix).
# [SECURITY] Allowlist approach β only explicitly allowed characters pass
VALID_IDENTIFIER_RE = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*$")All SQL identifiers in generated output are wrapped with quote_identifier(), which applies dialect-appropriate quoting:
| Dialect | Quoting |
|---|---|
| PostgreSQL | "table_name" |
| MySQL | `table_name` |
| SQLite | "table_name" |
Templates run inside a SandboxedEnvironment that blocks access to Python internals. All interpolated values pass through security filters registered on the environment:
| Filter | Purpose |
|---|---|
quote_id |
Identifier validation + dialect quoting |
validate_type |
Column type allowlist |
safe_default |
Reject SQL metacharacters in DEFAULT values |
safe_on_action |
FK action allowlist (CASCADE, SET NULL, ...) |
Templates are static files bundled with the package. User-supplied template paths are never accepted.
Hierarchy: Docker Secrets (files) > env vars > CLI arg (last resort).
mask_url() is applied before any log entry, error message, or terminal output. The masking algorithm uses rfind("@") to correctly handle passwords containing @, #, ?, and other special characters.
# [SECURITY] Credential masking β handles passwords with special chars (@, #, ?)
mask_url("postgresql://admin:S3cr3t!@host/db")
# β "postgresql://admin:****@host/db"Connection strings are never written to: logs, generated SQL, diff output, or error messages. Raw SQLAlchemy exceptions (which may embed the URL) are caught at the loader boundary and replaced with a safe message.
Output paths are resolved with Path.resolve() and checked for .. components. Only .sql, .json, .yml, and .yaml extensions are accepted.
# [SECURITY] Path traversal prevention β allowlist extensions
validate_output_path("../../etc/cron.d/evil.sql") # β SecurityError
validate_output_path("output.sh") # β SecurityErrorScheme allowlist: postgresql, mysql, sqlite. Any other scheme raises SecurityError before a connection is attempted. Connection timeout is enforced at 10 seconds.
| Measure | Detail |
|---|---|
| Pinned image versions | postgres:16.3-alpine3.20, mysql:8.0.37 β no :latest |
| Docker Secrets | Passwords via files (POSTGRES_PASSWORD_FILE), never env vars |
no-new-privileges: true |
Prevents privilege escalation inside container |
read_only: true |
Immutable root filesystem at runtime |
cap_drop: ALL |
All Linux capabilities dropped |
| localhost bind only | Ports on 127.0.0.1:XXXX:XXXX β never 0.0.0.0 |
| Non-root user | uid 1001 in the image β USER deltadb |
| Multi-stage build | Minimal runtime image β dev tools never shipped |
All exceptions from database operations are caught at the loader boundary, credentials masked, and a safe message returned to the caller. Stack traces appear only with --verbose.
make security-check # bandit -r deltadb/ -ll && pip-auditBoth pass with zero findings before every merge.
| Tool | Version | Purpose |
|---|---|---|
| Python | 3.11+ | Runtime |
| Docker | 24+ | Run PostgreSQL + MySQL for integration tests |
| Docker Compose | v2+ | Orchestrate database services |
| Make | any | Run Makefile targets |
| Platform | What to install |
|---|---|
| Linux (native) | Docker Engine |
| Windows (WSL2) | Docker Desktop for Windows β Settings β Resources β WSL Integration β enable your distro |
| macOS | Docker Desktop for Mac |
WSL2 note: the legacy
docker-composebinary is not available in WSL2. deltadb detects this automatically and falls back todocker compose(plugin form).
git clone https://github.com/devpedrois/deltadb.git
cd deltadb
make installmake setup-secretsGenerates cryptographically random 32-byte passwords in secrets/pg_password.txt, secrets/mysql_password.txt, and secrets/mysql_root_password.txt. These files are gitignored and never enter the image.
make docker-upBrings up PostgreSQL 16 and MySQL 8 with healthchecks. Waits until both are accepting connections.
docker compose psBoth services should show healthy.
deltadb diff tests/fixtures/schema_a.yml tests/fixtures/schema_b.yml --dialect postgresqldeltadb diff schema_a.yml schema_b.yml \
--dialect postgresql \
--output migrations/001_initial.sqldeltadb snapshot "postgresql://user:pass@localhost/mydb" \
--output snapshots/current.ymldeltadb diff SOURCE TARGET [OPTIONS]
Compares SOURCE schema to TARGET schema and outputs the SQL migration.
| Option | Default | Description |
|---|---|---|
--dialect |
auto-detected | postgresql, mysql, or sqlite |
--output FILE |
stdout | Write SQL to file (.sql) |
--json-output FILE |
β | Write diff summary as JSON (.json) |
--no-destructive |
false | Omit DROP TABLE and DROP COLUMN |
--detect-renames |
false | Enable rename heuristic |
--rename-threshold FLOAT |
0.7 |
Similarity threshold (0.0β1.0] |
--verbose |
false | Show stack traces on error |
Examples:
# YAML to YAML
deltadb diff schema_a.yml schema_b.yml
# Live DB to YAML, skip destructive ops
deltadb diff "postgresql://user:pass@localhost/db" schema_target.yml --no-destructive
# Detect renames with custom threshold
deltadb diff schema_a.yml schema_b.yml --detect-renames --rename-threshold 0.8
# Output SQL + JSON summary
deltadb diff schema_a.yml schema_b.yml \
--output migrations/001.sql \
--json-output reports/diff.jsondeltadb snapshot SOURCE [OPTIONS]
Reflects a live database schema and writes it as a YAML file.
| Option | Default | Description |
|---|---|---|
--output FILE |
stdout | Write YAML to file (.yml or .yaml) |
Examples:
deltadb snapshot "postgresql://user:pass@localhost/db" --output snapshots/pg.yml
deltadb snapshot "mysql://user:pass@localhost/db" --output snapshots/mysql.ymldialect: postgresql # postgresql | mysql | sqlite
tables:
users:
columns:
- name: id
type: uuid
primary_key: true
nullable: false
- name: email
type: varchar(255)
nullable: false
- name: created_at
type: timestamp
nullable: false
default: "now()"
indexes:
- name: idx_users_email
columns: [email]
unique: true
unique_constraints:
- name: uq_users_email
columns: [email]
orders:
columns:
- name: id
type: serial
primary_key: true
nullable: false
- name: user_id
type: uuid
nullable: false
foreign_keys:
- name: fk_orders_user
columns: [user_id]
referred_table: users
referred_columns: [id]
on_delete: CASCADEAllowed root keys: dialect, tables
Allowed on_delete / on_update values: CASCADE, SET NULL, SET DEFAULT, RESTRICT, NO ACTION
All table names, column names, index names, FK names, and unique constraint names are validated as SQL identifiers before parsing completes.
| Feature | PostgreSQL | MySQL | SQLite |
|---|---|---|---|
| Identifier quoting | "name" |
`name` |
"name" |
ALTER COLUMN TYPE |
ALTER COLUMN ... TYPE |
MODIFY COLUMN |
Not supported (comment generated) |
DROP COLUMN |
Full support | Full support | Not supported (comment generated) |
RENAME TABLE |
ALTER TABLE ... RENAME TO |
RENAME TABLE |
ALTER TABLE ... RENAME TO |
RENAME COLUMN |
ALTER TABLE ... RENAME COLUMN |
ALTER TABLE ... RENAME COLUMN |
ALTER TABLE ... RENAME COLUMN |
| Foreign keys | Full support | Full support | Partial (no ALTER TABLE ADD FK) |
SQLite limitations are rendered as comments in the migration with a -- NOT SUPPORTED IN SQLITE: prefix so the file remains executable without errors.
Every DROP TABLE or DROP COLUMN in the generated SQL is preceded by:
-- WARNING: DESTRUCTIVE operation
DROP TABLE "orders";Omit all destructive operations from the generated migration. Useful for applying schema additions safely in a first pass:
# Phase 1: add new tables and columns only
deltadb diff schema_a.yml schema_b.yml --no-destructive --output phase1.sql
# Phase 2: after verifying, generate and review the drops separately
deltadb diff schema_a.yml schema_b.yml --output phase2_full.sqlPass --detect-renames to enable heuristic rename detection using difflib.SequenceMatcher. A column or table is considered renamed when its similarity score meets --rename-threshold (default 0.7) and no other candidate scores higher (1:1 matching).
Detected renames generate RENAME TABLE / RENAME COLUMN statements instead of DROP + ADD pairs β preserving data rather than destroying it.
Without --detect-renames, behavior is unchanged: removed names produce DROP, added names produce ADD.
The migration generator builds a dependency graph from foreign keys and applies topological sort:
- UP block: tables with no FK dependencies first; dependents after
- DOWN block: reverse order β dependents dropped before the tables they reference
This guarantees the generated SQL executes without FK constraint violations.
make test-unit
# pytest -m "not integration" -vNo Docker required. Covers: model, YAML loader, diff engine, rename detection, SQL generator, security (30 adversarial tests), CLI.
make docker-test
# docker-up β pytest -m integration β docker-downUses testcontainers to spin up real PostgreSQL and MySQL instances. Tests SQLAlchemy reflection, type normalization, and FK handling against actual engines.
pytest -m docker_securityVerifies running containers: docker-compose config validity, no plaintext password in docker inspect, containers not running as root. Skipped automatically when Docker is not available.
make test
# Generates coverage report β current: 93%make security-check
# bandit -r deltadb/ -ll (zero findings)
# pip-audit (zero CVEs)| Variable | Default | Description |
|---|---|---|
DELTADB_SOURCE_URL |
β | Source connection string (alternative to CLI arg) |
DELTADB_TARGET_URL |
β | Target connection string (alternative to CLI arg) |
DELTADB_PG_PASSWORD_FILE |
β | Path to Docker Secret file for PostgreSQL password |
DELTADB_MYSQL_PASSWORD_FILE |
β | Path to Docker Secret file for MySQL password |
DELTADB_LOG_LEVEL |
INFO |
Log level: DEBUG, INFO, WARNING, ERROR |
Connection strings in env vars follow the same allowlist and masking rules as CLI args. Passwords in Docker Secret files (via *_FILE vars) take precedence over the connection string.
deltadb/
βββ deltadb/
β βββ cli.py # Click commands: diff, snapshot
β βββ config.py # Constants: timeouts, limits, dialects
β βββ exceptions.py # DeltaDbError, LoaderError, SecurityError, GeneratorError
β β
β βββ model/ # Frozen dataclasses β immutable schema representation
β β βββ column.py # Column
β β βββ table.py # Table
β β βββ constraint.py # PrimaryKey, ForeignKey, UniqueConstraint
β β βββ index.py # Index
β β βββ schema.py # SchemaModel
β β βββ types.py # normalize_type(raw, dialect) β str
β β
β βββ loader/ # Schema loading from DB or YAML
β β βββ base.py # BaseLoader(ABC)
β β βββ yaml_loader.py # YamlLoader β uses safe_load_yaml()
β β βββ db_loader.py # DbLoader β SQLAlchemy reflection + identifier validation
β β βββ factory.py # create_loader(source) β BaseLoader
β β
β βββ diff/ # Schema comparison β pure functions, no I/O
β β βββ changes.py # ChangeType enum + Change dataclass
β β βββ comparators.py # Per-object comparators (columns, indexes, FKs, UCs)
β β βββ engine.py # DiffEngine.diff(source, target) β list[Change]
β β βββ rename.py # RenameDetector β SequenceMatcher heuristic
β β
β βββ generator/ # Changes β SQL
β β βββ dialects.py # Dialect enum + detect_dialect()
β β βββ sql_generator.py # SqlGenerator β SandboxedEnvironment + security filters
β β βββ topological.py # topological_sort_up / topological_sort_down
β β βββ templates/ # Jinja2 templates per dialect
β β βββ postgresql/ # 14 templates (.sql.j2)
β β βββ mysql/ # 14 templates
β β βββ sqlite/ # 14 templates (unsupported ops rendered as comments)
β β
β βββ output/ # Writing results
β β βββ rich_printer.py # Colored terminal diff summary
β β βββ sql_writer.py # write_sql() β validates output path before writing
β β βββ json_writer.py # write_json() β diff summary as JSON
β β
β βββ security/ # Defense-in-depth security primitives
β βββ credentials.py # mask_url(), get_credential() (Docker Secrets > env vars)
β βββ identifiers.py # validate_identifier(), quote_identifier(), validate_column_type()
β βββ path_safety.py # validate_output_path() β no .. , allowlist extensions
β βββ yaml_safety.py # safe_load_yaml() β size check + structure validation
β
βββ tests/
β βββ conftest.py # pg_url / mysql_url fixtures via testcontainers
β βββ test_security.py # 30 adversarial tests β injection, YAML, credentials, paths
β βββ test_docker_security.py # Container hardening verification (skipif Docker unavailable)
β βββ test_cli.py # End-to-end CLI tests with CliRunner
β βββ test_diff_engine.py # Diff engine scenarios
β βββ test_sql_generator.py # SQL generation per dialect
β βββ test_rename.py # Rename detection
β βββ test_db_loader.py # DbLoader (SQLite unit + PG/MySQL integration)
β βββ fixtures/
β βββ schema_a.yml # Source schema fixture
β βββ schema_b.yml # Target schema fixture (with FKs, indexes)
β βββ malicious_schema.yml # !!python/object exploit attempt β must be rejected
β
βββ secrets/ # Docker Secrets β real values never committed
β βββ .gitignore # Ignores *.txt
β βββ *.txt.example # Placeholder files showing expected format
β
βββ docker-compose.yml # Dev: PostgreSQL 16 + MySQL 8 with full hardening
βββ docker-compose.test.yml # Test override: tmpfs, no ports, ephemeral
βββ Dockerfile # Multi-stage build, non-root user, read-only filesystem
βββ Makefile # install, test, lint, format, security-check, docker-*
βββ pyproject.toml # Project metadata, pytest markers, ruff/black config
βββ .env.example # Example environment variables
Frozen dataclasses for SchemaModel β All model objects (Column, Table, SchemaModel, etc.) use frozen=True. The diff engine takes two immutable schemas and produces a list of Change objects. There is no mutation, no in-place patching. This makes the diff logic pure and trivially testable.
Zero Trust on reflected names β When DbLoader reflects a real database, every table name, column name, index name, and FK name passes through validate_identifier() before entering SchemaModel. A compromised or adversarially-named database cannot inject SQL identifiers into the generated migration.
Jinja2 SandboxedEnvironment with filters β Security-sensitive operations (identifier quoting, type validation, FK action validation) are implemented as Jinja2 filters registered on the environment. Templates never call Python functions directly. A template author cannot bypass security by writing {{ table.__class__ }} or accessing builtins.
mask_url with rfind β Passwords containing @ confuse urlparse because @ is the authority delimiter. The implementation uses rfind("@") on the raw authority section β the last @ before the path is the correct delimiter. This correctly masks mysql://root:p@ss#w0rd@host/db.
Topological sort for FK safety β The generator builds a directed graph where an edge A β B means "table A has a FK referencing table B". topological_sort_up produces a linear order where B always appears before A (creates before dependents). topological_sort_down reverses this β dependents dropped first, then referenced tables.
Rename detection as opt-in β Rename detection changes semantics: a detected rename generates RENAME COLUMN (data-preserving) instead of DROP COLUMN + ADD COLUMN (data-destroying). Since false positives would silently corrupt data, it is opt-in with a configurable similarity threshold.
Docker Secrets over env vars β Passwords stored in env vars appear in docker inspect, /proc/<pid>/environ, and ps aux. Docker Secrets mount as files in /run/secrets/ with 0400 permissions, readable only by the container process. get_credential() checks the *_FILE env var path first.
- Fork the repo and create your branch from
main - Follow the branch naming convention:
feat/<slug>,fix/<slug>,docs/<slug> - Follow Conventional Commits:
feat(scope): description,fix(scope): description - Write or update tests for any changed code β run
make testbefore opening a PR - Run
make security-checkβbanditandpip-auditmust pass - Open a pull request against
main
Accepted commit types: feat, fix, docs, refactor, test, perf, build, ci, chore
This project is licensed under the MIT License.