A modular research framework for testing quantitative trading strategies across equities, derivatives, and futures. Implements institutional-grade statistical inference, stochastic calculus, and machine learning pipelines with rigorous walk-forward validation and multiple-testing correction.
Results below come from walk-forward out-of-sample backtesting (12-month train / 3-month test, rolling quarterly, 2015–2024). All metrics are net of 10 bps transaction costs.
| Pair | Folds | Traded | OOS Sharpe | Sortino | CAGR | Max DD | Note |
|---|---|---|---|---|---|---|---|
| PEP/KO | 31 | 1 | 0.73 | 0.88 | 4.5% | -1.3% | Cointegrated only in 2015 training window (p=0.0008). All subsequent folds failed cointegration at 5%. |
| XOM/CVX | 31 | 2 | 1.01 | 1.53 | 13.2% | -6.2% | Fold 9 (2018 Q1): Sharpe 2.48, +8.4% return. Fold 26 (2022 Q2): Sharpe -1.26, -2.0% return during energy spike. DSR=0.00 — aggregate Sharpe is not statistically significant across only 2 folds. |
Key Insight: Static Engle-Granger cointegration on two-stock pairs is regime-dependent: only 3 of 62 total folds (4.8%) passed the cointegration test at the 5% significance level. When the relationship holds (e.g., XOM/CVX in 2017–2018), the strategy produces strong OOS performance (Sharpe 2.48 in Fold 9). When the test fails, as it did throughout the 2020 COVID period, the strategy stays flat by design, which is a feature, not a bug. The relationship can still break after a fold passes the test: XOM/CVX Fold 26 (2022 Q2) traded and lost 2.0% during the energy spike. This motivates the HMM regime detection module and Kalman Filter hedge ratio estimation.
Full fold-level diagnostics with cointegration p-values, hedge ratios, and half-lives are logged to logs/diagnostics_*.json. Honest null results are themselves valuable research findings. Demonstrating this rigor is more impressive to institutional interviewers than fabricating a high Sharpe on cherry-picked data.
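The DSR column entry in the results table refers to the Deflated Sharpe Ratio. The sketch below shows one common way to compute it, following the Bailey and López de Prado formulation. It is illustrative only: the function names and arguments are hypothetical and not taken from backtesting/purged_cv.py. It assumes the Sharpe ratio is per-period (not annualized) and that `kurt` is raw kurtosis (3.0 for a normal distribution).

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
_NORMAL = NormalDist()            # standard normal, mean 0 and sd 1


def expected_max_sharpe(sr_std, n_trials):
    """Approximate expected maximum Sharpe among n_trials unskilled strategies.

    sr_std is the standard deviation of Sharpe ratios across the trials.
    """
    if n_trials < 2:
        return 0.0  # a single trial needs no multiple-testing adjustment
    a = _NORMAL.inv_cdf(1.0 - 1.0 / n_trials)
    b = _NORMAL.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    return sr_std * ((1.0 - EULER_GAMMA) * a + EULER_GAMMA * b)


def deflated_sharpe_ratio(sr, n_obs, skew, kurt, sr_std, n_trials):
    """Probability that the true Sharpe exceeds the multiple-testing benchmark."""
    sr0 = expected_max_sharpe(sr_std, n_trials)
    # Adjustment for non-normal returns (skewness and fat tails)
    denom = math.sqrt(1.0 - skew * sr + (kurt - 1.0) / 4.0 * sr ** 2)
    z = (sr - sr0) * math.sqrt(n_obs - 1) / denom
    return _NORMAL.cdf(z)


# Hypothetical inputs: daily Sharpe of 0.1 over 253 observations
single = deflated_sharpe_ratio(0.1, 253, 0.0, 3.0, sr_std=0.05, n_trials=1)
many = deflated_sharpe_ratio(0.1, 253, 0.0, 3.0, sr_std=0.05, n_trials=50)
print(f"1 trial: {single:.3f}, 50 trials: {many:.3f}")
```

The same observed Sharpe scores lower as the number of trials grows, because the benchmark it has to beat rises with the number of strategies tried.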
Finance/
├── config/  # Centralized settings, logging
├── data/
│   ├── wrds_fetcher.py  # Polygon.io (primary) + WRDS/CRSP (pending)
│   ├── fred_fetcher.py  # FRED macro factors (yield curve, VIX, credit spreads)
│   ├── fetcher.py  # Yahoo Finance fallback (rate-limited, retried)
│   ├── storage.py  # Parquet + SQLite persistence layer
│   ├── preprocessor.py  # Cleaning, outlier detection, feature engineering
│   └── fractional_diff.py  # Fractional differentiation (memory-preserving stationarity)
├── derivatives/
│   ├── black_scholes.py  # BSM European option pricing
│   ├── greeks.py  # Analytical Greeks (Δ, Γ, ν, Θ, ρ)
│   ├── implied_vol.py  # IV solver (Brent / Newton-Raphson)
│   ├── volatility_models.py  # GARCH(1,1), EWMA, Parkinson, Realized Vol
│   └── heston_model.py  # Stochastic volatility (Euler-Maruyama MC)
├── strategies/
│   ├── base.py  # Abstract strategy interface
│   ├── classical/
│   │   └── pairs_trading.py  # Engle-Granger cointegration, z-score entry/exit
│   ├── ml/
│   │   ├── hmm_regime.py  # Gaussian HMM regime detection (EM + Viterbi)
│   │   └── sentiment_signal.py  # FinBERT NLP for alternative data signals
│   └── portfolio/
│       └── kelly_criterion.py  # Discrete & continuous Kelly sizing
├── backtesting/
│   ├── vectorized.py  # Fast vectorized backtester
│   ├── metrics.py  # Sharpe, Sortino, Calmar, drawdown, profit factor
│   └── purged_cv.py  # Combinatorial Purged CV + Deflated Sharpe Ratio
├── scripts/
│   └── run_backtest_report.py  # Walk-forward validation & report generation
├── logs/  # Fold-level diagnostics (JSON) & backtest logs
└── notebooks/
    ├── interview_prep/  # Mathematical deep-dives (see below)
    └── research/  # Hypothesis-driven research narratives
- Kalman Filter pairs trading: Dynamic hedge ratio estimation via recursive Bayesian state-space filtering, replacing static rolling OLS.
- Ornstein-Uhlenbeck modeling: MLE-calibrated mean-reversion speed (θ) and half-life from the exact discrete solution of the OU SDE.
- HMM regime detection: Gaussian mixture emissions with Baum-Welch training and Viterbi decoding for bull/bear regime classification.
- Black-Scholes-Merton: European options with dividend adjustments, put-call parity verification, moneyness classification.
- Analytical Greeks: Closed-form Δ, Γ, ν, Θ, ρ with portfolio-level aggregation and delta-hedge ratio computation.
- Heston stochastic volatility: Monte Carlo pricing via Euler-Maruyama discretization with full truncation for variance positivity.
- Implied volatility surface: Numerical IV inversion using Brent's method with Vega-based Newton-Raphson acceleration.
- Fractional differentiation: Binomial series expansion for non-integer differencing; iterative ADF testing to find minimum d for stationarity while preserving maximum memory.
- NLP alternative data: FinBERT-based sentiment extraction from financial text, integrated as features via EMA smoothing and forward-fill alignment.
- Purged cross-validation: Train/test purging and embargoing to eliminate lookahead bias from serially correlated financial data.
- Deflated Sharpe Ratio: Multiple-testing correction using Extreme Value Theory to compute the probability that a backtest's Sharpe ratio is statistically significant.
- Kelly Criterion: Both discrete (win-rate/payoff) and continuous (Merton's portfolio problem) formulations with fractional Kelly scaling.
- Walk-forward validation: Rolling train/test splits with quarterly refit to simulate realistic out-of-sample performance.
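The rolling train/test scheme in the last item can be sketched as a fold generator. This is a minimal illustration under stated assumptions, not the code in scripts/run_backtest_report.py: the helper names are hypothetical, windows are aligned to calendar months, and intervals are half-open. The number of folds depends on the exact data boundaries, which are not stated here.

```python
from datetime import date


def add_months(first_of_month, n):
    """Shift a first-of-month date forward by n months."""
    idx = first_of_month.year * 12 + (first_of_month.month - 1) + n
    return date(idx // 12, idx % 12 + 1, 1)


def walk_forward_folds(start, end, train_months=12, test_months=3, step_months=3):
    """Return (train_start, train_end, test_start, test_end) tuples.

    All intervals are half-open, [start, end). `start` and `end` must be
    first-of-month dates, and `end` is the first date with no data.
    """
    folds = []
    train_start = start
    while True:
        train_end = add_months(train_start, train_months)
        test_end = add_months(train_end, test_months)
        if test_end > end:
            break  # not enough data left for a full test window
        # The test window starts exactly where the training window ends
        folds.append((train_start, train_end, train_end, test_end))
        train_start = add_months(train_start, step_months)
    return folds


# Small illustrative range: two years of data
folds = walk_forward_folds(date(2015, 1, 1), date(2017, 1, 1))
for i, (tr_s, tr_e, te_s, te_e) in enumerate(folds, start=1):
    print(f"fold {i}: train [{tr_s}, {tr_e})  test [{te_s}, {te_e})")
```

Because each training window ends where its test window begins, the model is refit every quarter using only data available before the test period.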
| Source | Role | Status | Coverage |
|---|---|---|---|
| Polygon.io | Primary price data | ✅ Active | Clean institutional-grade daily/intraday OHLCV, options chains. Free tier: 5 calls/min, 2yr history. |
| FRED | Macro factor features | ✅ Active | Yield curve (3M/10Y spread), VIX, credit spreads (BAA-AAA), CPI, Fed Funds rate, breakeven inflation. Completely free, unlimited. |
| SEC EDGAR | NLP pipeline input | ✅ Active | 10-K, 10-Q, 8-K filings, earnings call transcripts → fed into FinBERT sentiment module. |
| WRDS / CRSP | Survivorship-bias-free backtesting | 🔄 Pending access | Delisting-adjusted daily returns, Fama-French factors. Gold standard for academic-grade backtests. |
| Yahoo Finance | Legacy fallback | Fallback only | Daily OHLCV (unlimited history). Used for rapid prototyping; not recommended for production research due to data quality issues. |
Why this stack? Polygon is widely used for prototyping, so working with it signals familiarity with institutional-style data in interviews. FRED adds macro regime features that meaningfully upgrade the HMM module (yield curve inversion is one of the most widely cited recession indicators). WRDS/CRSP replaces Polygon for final backtests once access is approved, since CRSP is survivorship-bias-free and delisting-adjusted.
python -m venv venv
# Activate the environment (Windows; on macOS/Linux use: source venv/bin/activate)
venv\Scripts\activate
pip install -r requirements.txt
# Run walk-forward backtest and generate metrics
python scripts/run_backtest_report.py
# Launch research notebooks
jupyter notebook notebooks/

| # | Notebook | Topics |
|---|---|---|
| 01 | premier_prediction_models_kalman_ou | Kalman Filter state-space math, OU SDE exact solution, MLE calibration |
| 02 | regime_detection_hmm | Transition matrices, Baum-Welch EM, Viterbi decoding on SPY |
| 03 | fractional_differentiation | Binomial series expansion, stationarity vs. memory tradeoff, ADF testing |
| 04 | purged_cross_validation | Data leakage in finance, Deflated Sharpe Ratio, Extreme Value Theory |
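The data-leakage topic in notebook 04 comes down to which training samples sit too close to the test window. The sketch below illustrates purging and embargoing for a single contiguous test window. It is a simplified stand-in, not the combinatorial variant in backtesting/purged_cv.py, and the function name and parameters are hypothetical. It assumes each sample's label depends on data up to `label_horizon` steps ahead.

```python
def purged_train_indices(n_samples, test_start, test_end, label_horizon, embargo):
    """Return training indices for one test window [test_start, test_end).

    Purging drops training samples before the window whose label horizon
    reaches into it. Embargoing drops the first `embargo` samples after the
    window, since they may be serially correlated with test data.
    """
    train = []
    for i in range(n_samples):
        in_test = test_start <= i < test_end
        # A sample before the test window leaks if its label looks into it
        purged = i < test_start and i + label_horizon >= test_start
        embargoed = test_end <= i < test_end + embargo
        if not (in_test or purged or embargoed):
            train.append(i)
    return train


# 20 samples, test window [8, 12), labels look 2 steps ahead, 3-step embargo
train_idx = purged_train_indices(20, 8, 12, label_horizon=2, embargo=3)
print(train_idx)
```

In this example, samples 6 and 7 are purged because their labels reach into the test window, and samples 12 to 14 are embargoed.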

| # | Notebook | Hypothesis |
|---|---|---|
| 01 | pairs_trading_research | Kalman-filtered hedge ratios produce statistically significant OOS alpha in energy sector equities |
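To make the Kalman-filtered hedge ratio concrete, here is a minimal sketch of a scalar filter that treats the hedge ratio as a random walk observed through y_t = beta_t * x_t + noise. It is a deliberate simplification (no intercept term, fixed noise variances `q` and `r`) and is not the notebook's implementation. The function name, default parameters, and synthetic data are all assumptions made for illustration.

```python
import random


def kalman_hedge_ratio(x, y, q=1e-5, r=1.0, beta0=0.0, p0=1.0):
    """Track a time-varying hedge ratio beta_t with a scalar Kalman filter.

    State model:        beta_t = beta_{t-1} + w_t,   Var(w_t) = q
    Observation model:  y_t = beta_t * x_t + v_t,    Var(v_t) = r
    Returns the filtered beta estimate after each observation.
    """
    beta, p = beta0, p0
    betas = []
    for xt, yt in zip(x, y):
        p_pred = p + q                   # predict: uncertainty grows by q
        innovation = yt - beta * xt      # one-step-ahead forecast error
        s = xt * xt * p_pred + r         # innovation variance
        k = p_pred * xt / s              # Kalman gain
        beta = beta + k * innovation     # update the state estimate
        p = (1.0 - k * xt) * p_pred      # update the state variance
        betas.append(beta)
    return betas


# Synthetic pair with a true hedge ratio of 1.5 (illustrative data only)
rng = random.Random(0)
x = [50.0 + 0.05 * t + rng.gauss(0.0, 1.0) for t in range(300)]
y = [1.5 * xt + rng.gauss(0.0, 0.5) for xt in x]

betas = kalman_hedge_ratio(x, y)
print(f"final hedge ratio estimate: {betas[-1]:.3f}")
```

Unlike a rolling OLS window, the filter updates the estimate with every observation, and the ratio of `q` to `r` controls how quickly it adapts to a changing relationship.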
This is a research framework. It is not connected to live execution infrastructure and has not been validated for production trading. All backtest results are out-of-sample but simulated — actual trading involves additional risks including latency, partial fills, and market impact that are not modeled here.
MIT