Org-EthereaLogic.ai is the open-source home of the Enterprise Data Trust portfolio — a suite of Databricks-native data quality and governance controls built for Medallion Architectures. Every repository runs on Databricks Free Edition, every claim is backed by passing tests, and every control pattern is production-reproducible.
"Is your data pipeline trustworthy — or just running without errors?"
Benchmark headline: perfect challenger recall (1.00 vs. an industry baseline of 0.8767), validated across 3 public datasets totaling 6.6M rows (Census ACS, NYC TLC Taxi, UCI Adult). In these benchmarks, every injected case of silent data corruption was detected before it could reach executive dashboards. Methodology and preregistered experiments: From Theory to Evidence →
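For readers unfamiliar with the metric, the sketch below shows how recall and precision are computed against ground truth. It is an illustration only: the function name, scenario IDs, and toy data are hypothetical and are not taken from the benchmark or from any repository in the portfolio.

```python
def precision_recall(flagged, corrupted):
    """Return (precision, recall) for flagged scenario IDs against ground truth."""
    flagged, corrupted = set(flagged), set(corrupted)
    true_positives = len(flagged & corrupted)
    # Precision: share of raised flags that were real failures.
    precision = true_positives / len(flagged) if flagged else 1.0
    # Recall: share of real failures that were flagged.
    recall = true_positives / len(corrupted) if corrupted else 1.0
    return precision, recall


# Hypothetical toy data: four injected failure scenarios (IDs are illustrative).
injected = {"s1", "s2", "s3", "s4"}
challenger_flags = {"s1", "s2", "s3", "s4", "s9"}  # catches all four, one false alarm
baseline_flags = {"s1", "s2", "s3"}                # misses s4

print(precision_recall(challenger_flags, injected))  # (0.8, 1.0)
print(precision_recall(baseline_flags, injected))    # (1.0, 0.75)
```

A recall of 1.00 means no injected failure went undetected; it says nothing about false alarms, which is why precision is reported alongside it.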
| Chapter | Repository | Description | Tests |
|---|---|---|---|
| Ch 1 | Trusted Source Intake | Certifies every record before downstream consumption. 7 contract checks, replay detection, schema drift handling, and quarantine with explicit reasons. | 56 |
| Ch 2 | Silent Failure Prevention | Detects when business columns collapse despite healthy schema and row counts. Distribution stability scoring, 6 publication gates, blocked Gold refresh on degradation. | 50 |
| Ch 3 | Measurable Control Effectiveness | Scores data controls against known failure scenarios with precision, recall, and ground truth. Perfect recall where industry baselines missed injected drift. | 37 |
| Ch 4 | DriftSentinel | Unified platform — intake certification, drift gating, and control benchmarking in a single governed pipeline. Operator dashboard included. | 397 |
| Ch 5 | AetheriaForge | Coherence-scored transformation engine — entity resolution, temporal reconciliation, and schema enforcement with append-only evidence. Published on PyPI. | 304 |
Chapters 4 and 5 are full Databricks-deployable applications with operator dashboards, Asset Bundle deployment, and PyPI packages.
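To make the "silent failure" idea concrete, here is a minimal sketch of entropy-based drift gating: a business column collapses to a single value while the row count stays healthy. This is not the DriftSentinel API. The function names, the entropy-ratio score, and the 0.8 threshold are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy in bits of the empirical distribution of `values`."""
    total = len(values)
    if total == 0:
        return 0.0
    counts = Counter(values)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def stability_score(baseline, current):
    """Hypothetical score: current entropy over baseline entropy, capped at 1.0."""
    base = shannon_entropy(baseline)
    if base == 0:
        return 1.0  # a constant baseline column cannot collapse further
    return min(shannon_entropy(current) / base, 1.0)


def gate_allows_publish(baseline, current, threshold=0.8):
    """Return True if the column is stable enough to refresh downstream tables."""
    return stability_score(baseline, current) >= threshold


# Both batches have 100 rows and the same schema; only the distribution differs.
baseline = ["east", "west", "north", "south"] * 25
collapsed = ["east"] * 100

print(stability_score(baseline, baseline))       # 1.0
print(stability_score(baseline, collapsed))      # 0.0
print(gate_allows_publish(baseline, collapsed))  # False
```

A row-count or schema check would pass the collapsed batch unchanged, whereas the entropy ratio drops to zero and the gate blocks the refresh. A real control would likely combine several such signals per column rather than rely on one ratio.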
Quick install:
```bash
pip install etherealogic-driftsentinel   # Ch 4: Shannon entropy drift detection (355+ tests)
pip install etherealogic-aetheriaforge   # Ch 5: coherence-scored transformation engine (300+ tests)
```

Both packages are Databricks-deployable via Asset Bundles. See each repo's README for the bootstrap workflow. All five Data Trust chapter repos are MIT-licensed.
Core Platform
Data Quality & Governance
CI/CD & Code Quality
Infrastructure
| 🛡️ Data Trust | 🧪 Test-Driven | 🔓 Open Source | 📊 Production-Ready |
|---|---|---|---|
| Every control pattern is evidence-backed | 844+ passing tests across 5 repos | All repos run on Databricks Free Edition | Operator dashboards, CI/CD, and security scanning |
All repositories are open source and reproducible. ⭐ Star them if you find them useful — it helps others in the community find them too.
