GamePointAnalytics/QuantTradingSystem

QuantTradingSystem

A modular research framework for testing quantitative trading strategies across equities, derivatives, and futures. Implements institutional-grade statistical inference, stochastic calculus, and machine learning pipelines with rigorous walk-forward validation and multiple-testing correction.

Key Findings

Walk-forward out-of-sample backtesting (12-month train / 3-month test, rolling quarterly, 2015–2024). All metrics net of 10bps transaction costs.

| Pair | Folds | Traded | OOS Sharpe | Sortino | CAGR | Max DD | Note |
|---|---|---|---|---|---|---|---|
| PEP/KO | 31 | 1 | 0.73 | 0.88 | 4.5% | -1.3% | Cointegrated only in 2015 training window (p=0.0008). All subsequent folds failed cointegration at 5%. |
| XOM/CVX | 31 | 2 | 1.01 | 1.53 | 13.2% | -6.2% | Fold 9 (2018 Q1): Sharpe 2.48, +8.4% return. Fold 26 (2022 Q2): Sharpe -1.26, -2.0% return during energy spike. DSR=0.00: aggregate Sharpe is not statistically significant across only 2 folds. |

Key Insight: Static Engle-Granger cointegration on two-stock pairs is regime-dependent: only 3 of 62 total folds (4.8%) passed the cointegration test at the 5% significance level. When the relationship holds (e.g., XOM/CVX in 2017–2018), the strategy can produce strong OOS returns (Sharpe 2.48 in Fold 9). When the test fails, as it did throughout the 2020 COVID shock, the strategy stays flat by design, which is a feature, not a bug. The filter is not a guarantee, however: XOM/CVX passed the test ahead of Fold 26 (2022 Q2) and then lost 2.0% during the energy spike. Both observations motivate the HMM regime detection module and Kalman Filter hedge ratio estimation.
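
The "stay flat unless cointegrated" behaviour described above can be sketched as a gate in front of a z-score rule. This is a minimal illustration, not the code in `pairs_trading.py`: the function name `positions_for_fold` and the entry/exit thresholds (2.0 and 0.5) are assumptions, and the cointegration p-value is taken as an input computed elsewhere on the training window.

```python
from typing import List, Sequence


def positions_for_fold(
    coint_pvalue: float,
    test_zscores: Sequence[float],
    alpha: float = 0.05,   # significance level quoted in the text
    entry_z: float = 2.0,  # hypothetical entry threshold
    exit_z: float = 0.5,   # hypothetical exit threshold
) -> List[int]:
    """Spread positions for one test window: +1 long, -1 short, 0 flat."""
    # Gate: if the training window failed the cointegration test,
    # do not trade this fold at all.
    if coint_pvalue >= alpha:
        return [0] * len(test_zscores)

    position = 0
    positions = []
    for z in test_zscores:
        if position == 0:
            if z > entry_z:
                position = -1  # spread is rich: short it
            elif z < -entry_z:
                position = 1   # spread is cheap: go long
        elif abs(z) < exit_z:
            position = 0       # spread has reverted: close out
        positions.append(position)
    return positions


# A fold that failed the test stays flat regardless of the z-scores.
flat = positions_for_fold(0.20, [3.0, 0.0, -3.0])
# A fold that passed trades the z-score rule.
traded = positions_for_fold(0.0008, [0.0, 2.5, 1.0, 0.3, -2.2, -0.1])
```

Under this rule a fold can pass the gate and still lose money, which is the Fold 26 case reported in the table.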

Full fold-level diagnostics (cointegration p-values, hedge ratios, and half-lives) are logged to logs/diagnostics_*.json. Honest null results are themselves valuable research findings: reporting them demonstrates more rigor to institutional interviewers than a high Sharpe obtained from cherry-picked data.
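
The walk-forward schedule used for these results (12-month train, 3-month test, rolling quarterly) can be sketched with month arithmetic alone. The helper names below are hypothetical, and the sketch assumes the first training window starts in January 2015. The number of folds depends on the exact data range, which is not stated here, so the sketch is not expected to reproduce the 31 folds reported above.

```python
from typing import Dict, List, Tuple

Month = Tuple[int, int]  # (year, month), month in 1..12


def add_months(start: Month, n: int) -> Month:
    """Shift a (year, month) pair by n months (n may be negative)."""
    year, month = start
    index = year * 12 + (month - 1) + n
    return index // 12, index % 12 + 1


def walk_forward_folds(
    first_train: Month,
    last_month: Month,
    train_months: int = 12,
    test_months: int = 3,
) -> List[Dict[str, Month]]:
    """Rolling folds; every bound is an inclusive (year, month)."""
    folds = []
    train_start = first_train
    while True:
        test_start = add_months(train_start, train_months)
        test_end = add_months(test_start, test_months - 1)
        if test_end > last_month:  # tuples compare as (year, month)
            break
        folds.append(
            {
                "train_start": train_start,
                "train_end": add_months(test_start, -1),
                "test_start": test_start,
                "test_end": test_end,
            }
        )
        # Roll forward by one test window (quarterly refit).
        train_start = add_months(train_start, test_months)
    return folds


folds = walk_forward_folds((2015, 1), (2024, 12))
fold_9 = folds[8]    # test window starts January 2018
fold_26 = folds[25]  # test window starts April 2022
```

Under the January 2015 assumption, the ninth fold tests on 2018 Q1 and the twenty-sixth on 2022 Q2, which agrees with the fold labels in the results table.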

Architecture

Finance/
├── config/              # Centralized settings, logging
├── data/
│   ├── wrds_fetcher.py    # Polygon.io (primary) + WRDS/CRSP (pending)
│   ├── fred_fetcher.py    # FRED macro factors (yield curve, VIX, credit spreads)
│   ├── fetcher.py         # Yahoo Finance fallback (rate-limited, retried)
│   ├── storage.py         # Parquet + SQLite persistence layer
│   ├── preprocessor.py    # Cleaning, outlier detection, feature engineering
│   └── fractional_diff.py # Fractional differentiation (memory-preserving stationarity)
├── derivatives/
│   ├── black_scholes.py    # BSM European option pricing
│   ├── greeks.py           # Analytical Greeks (Δ, Γ, ν, Θ, ρ)
│   ├── implied_vol.py      # IV solver (Brent / Newton-Raphson)
│   ├── volatility_models.py  # GARCH(1,1), EWMA, Parkinson, Realized Vol
│   └── heston_model.py    # Stochastic volatility (Euler-Maruyama MC)
├── strategies/
│   ├── base.py             # Abstract strategy interface
│   ├── classical/
│   │   └── pairs_trading.py  # Engle-Granger cointegration, z-score entry/exit
│   ├── ml/
│   │   ├── hmm_regime.py     # Gaussian HMM regime detection (EM + Viterbi)
│   │   └── sentiment_signal.py  # FinBERT NLP for alternative data signals
│   └── portfolio/
│       └── kelly_criterion.py  # Discrete & continuous Kelly sizing
├── backtesting/
│   ├── vectorized.py      # Fast vectorized backtester
│   ├── metrics.py         # Sharpe, Sortino, Calmar, drawdown, profit factor
│   └── purged_cv.py       # Combinatorial Purged CV + Deflated Sharpe Ratio
├── scripts/
│   └── run_backtest_report.py  # Walk-forward validation & report generation
├── logs/                       # Fold-level diagnostics (JSON) & backtest logs
└── notebooks/
    ├── interview_prep/         # Mathematical deep-dives (see below)
    └── research/               # Hypothesis-driven research narratives

Quantitative Methods

Statistical Inference & Signal Generation

  • Kalman Filter pairs trading: Dynamic hedge ratio estimation via recursive Bayesian state-space filtering, replacing a static or rolling-window OLS estimate.
  • Ornstein-Uhlenbeck modeling: MLE-calibrated mean-reversion speed (θ) and half-life from the exact discrete solution of the OU SDE.
  • HMM regime detection: Gaussian emissions with Baum-Welch training and Viterbi decoding for bull/bear regime classification.
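
As an illustration of the Ornstein-Uhlenbeck calibration mentioned above: the exact discrete solution of the OU SDE is an AR(1) process, X[t+1] = a + b·X[t] + noise with b = exp(-θ·Δt) and a = μ·(1 - b), so the conditional Gaussian MLE of (a, b) coincides with ordinary least squares. The sketch below uses only the standard library; `fit_ou` and `simulate_ou` are hypothetical names, not functions from this repository.

```python
import math
import random
from typing import List, Sequence, Tuple


def fit_ou(series: Sequence[float], dt: float = 1.0) -> Tuple[float, float, float]:
    """Return (theta, mu, half_life) estimated from an observed spread."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx              # AR(1) slope, equals exp(-theta * dt)
    a = mean_y - b * mean_x    # AR(1) intercept, equals mu * (1 - b)
    if not 0.0 < b < 1.0:
        raise ValueError("no mean reversion detected (need 0 < b < 1)")
    theta = -math.log(b) / dt
    mu = a / (1.0 - b)
    half_life = math.log(2.0) / theta
    return theta, mu, half_life


def simulate_ou(theta: float, mu: float, sigma: float, n: int,
                dt: float = 1.0, seed: int = 0) -> List[float]:
    """Simulate an OU path with the exact discrete transition."""
    rng = random.Random(seed)
    b = math.exp(-theta * dt)
    noise_sd = sigma * math.sqrt((1.0 - b * b) / (2.0 * theta))
    path = [mu]
    for _ in range(n):
        path.append(mu + (path[-1] - mu) * b + noise_sd * rng.gauss(0.0, 1.0))
    return path


theta_hat, mu_hat, half_life = fit_ou(simulate_ou(0.05, 0.0, 0.1, 5000))
```

The half-life ln(2)/θ is expressed in the same time units as `dt`, so with daily data and `dt=1.0` it is a number of trading days.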

Derivatives Pricing

  • Black-Scholes-Merton: European options with dividend adjustments, put-call parity verification, moneyness classification.
  • Analytical Greeks: Closed-form Δ, Γ, ν, Θ, ρ with portfolio-level aggregation and delta-hedge ratio computation.
  • Heston stochastic volatility: Monte Carlo pricing via Euler-Maruyama discretization with full truncation for variance positivity.
  • Implied volatility surface: Numerical IV inversion using Brent's method with Vega-based Newton-Raphson acceleration.
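
A compact sketch of the pricing and inversion steps listed above, using only the standard library. It prices European options under Black-Scholes-Merton with a continuous dividend yield `q`, and inverts for implied volatility with Newton-Raphson steps that fall back to bisection whenever a step leaves the bracket. Note that this swaps in a bisection safeguard in place of Brent's method, and the function names are illustrative rather than the API of `black_scholes.py` or `implied_vol.py`.

```python
import math
from statistics import NormalDist

_N = NormalDist()  # standard normal


def _d1(s, k, t, r, sigma, q):
    return (math.log(s / k) + (r - q + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))


def bsm_price(s, k, t, r, sigma, q=0.0, kind="call"):
    """European option price with continuous dividend yield q."""
    d1 = _d1(s, k, t, r, sigma, q)
    d2 = d1 - sigma * math.sqrt(t)
    disc_s, disc_k = s * math.exp(-q * t), k * math.exp(-r * t)
    if kind == "call":
        return disc_s * _N.cdf(d1) - disc_k * _N.cdf(d2)
    return disc_k * _N.cdf(-d2) - disc_s * _N.cdf(-d1)


def bsm_vega(s, k, t, r, sigma, q=0.0):
    """Sensitivity of the price to sigma (identical for calls and puts)."""
    return s * math.exp(-q * t) * _N.pdf(_d1(s, k, t, r, sigma, q)) * math.sqrt(t)


def implied_vol(price, s, k, t, r, q=0.0, kind="call",
                lo=1e-4, hi=5.0, tol=1e-8, max_iter=100):
    """Invert bsm_price for sigma; assumes price is within no-arbitrage bounds."""
    sigma = 0.2  # initial guess inside [lo, hi]
    for _ in range(max_iter):
        diff = bsm_price(s, k, t, r, sigma, q, kind) - price
        if abs(diff) < tol:
            break
        # The price is increasing in sigma, so the sign of diff tightens the bracket.
        if diff > 0:
            hi = sigma
        else:
            lo = sigma
        vega = bsm_vega(s, k, t, r, sigma, q)
        newton = sigma - diff / vega if vega > 1e-12 else None
        # Accept the Newton step only if it stays inside the bracket.
        sigma = newton if newton is not None and lo < newton < hi else 0.5 * (lo + hi)
    return sigma


call = bsm_price(100.0, 110.0, 0.5, 0.03, 0.35, q=0.01, kind="call")
put = bsm_price(100.0, 110.0, 0.5, 0.03, 0.35, q=0.01, kind="put")
# Put-call parity: C - P = S*exp(-q*T) - K*exp(-r*T)
parity_gap = (call - put) - (100.0 * math.exp(-0.01 * 0.5) - 110.0 * math.exp(-0.03 * 0.5))
recovered = implied_vol(call, 100.0, 110.0, 0.5, 0.03, q=0.01)
```

Here `parity_gap` should be zero up to floating-point error and `recovered` should be close to the 0.35 used to generate the price.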

Machine Learning Pipeline

  • Fractional differentiation: Binomial series expansion for non-integer differencing; iterative ADF testing to find minimum d for stationarity while preserving maximum memory.
  • NLP alternative data: FinBERT-based sentiment extraction from financial text, integrated as features via EMA smoothing and forward-fill alignment.
  • Purged cross-validation: Train/test purging and embargoing to eliminate lookahead bias from serially correlated financial data.
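  • Deflated Sharpe Ratio: Multiple-testing correction that uses Extreme Value Theory to estimate the expected maximum Sharpe ratio across all strategy trials, then computes the probability that the observed Sharpe ratio exceeds that benchmark after adjusting for sample length, skewness, and kurtosis.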
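
The Deflated Sharpe Ratio calculation can be sketched as follows, following the Bailey and López de Prado formulation. This is a standalone illustration, not the code in `purged_cv.py`; the function names are hypothetical, Sharpe ratios are per-period (not annualized), and `kurt` is raw kurtosis (3 for a normal distribution).

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def expected_max_sharpe(sr_variance: float, n_trials: int) -> float:
    """Expected maximum Sharpe ratio across n_trials under the null of zero skill."""
    if n_trials < 2:
        return 0.0  # a single trial needs no multiple-testing correction
    norm = NormalDist()
    return math.sqrt(sr_variance) * (
        (1.0 - EULER_GAMMA) * norm.inv_cdf(1.0 - 1.0 / n_trials)
        + EULER_GAMMA * norm.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    )


def deflated_sharpe_ratio(
    sr: float,           # observed per-period Sharpe ratio
    n_obs: int,          # number of return observations
    skew: float,         # skewness of returns
    kurt: float,         # raw kurtosis of returns
    sr_variance: float,  # variance of Sharpe ratios across trials
    n_trials: int,       # number of strategy configurations tried
) -> float:
    """Probability that the observed Sharpe exceeds the expected maximum."""
    sr0 = expected_max_sharpe(sr_variance, n_trials)
    denom = math.sqrt(1.0 - skew * sr + (kurt - 1.0) / 4.0 * sr ** 2)
    z = (sr - sr0) * math.sqrt(n_obs - 1) / denom
    return NormalDist().cdf(z)


# The same observed Sharpe becomes less convincing as more trials are run.
dsr_1 = deflated_sharpe_ratio(0.1, 1250, 0.0, 3.0, 0.0025, 1)
dsr_10 = deflated_sharpe_ratio(0.1, 1250, 0.0, 3.0, 0.0025, 10)
dsr_100 = deflated_sharpe_ratio(0.1, 1250, 0.0, 3.0, 0.0025, 100)
```

A short track record also drives the statistic down through the `n_obs - 1` term, which is one reason an aggregate Sharpe built from very few traded folds may fail to register as significant.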

Risk Management

  • Kelly Criterion: Both discrete (win-rate/payoff) and continuous (Merton's portfolio problem) formulations with fractional Kelly scaling.
  • Walk-forward validation: Rolling train/test splits with quarterly refit to simulate realistic out-of-sample performance.
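
The two Kelly formulations above reduce to short closed-form expressions. The sketch below is illustrative and does not mirror the signatures in `kelly_criterion.py`; the half-Kelly default for `scale` is an assumption, not a documented setting of this project.

```python
def kelly_discrete(win_prob: float, payoff_ratio: float) -> float:
    """Discrete Kelly: f* = p - (1 - p) / b, where b is the win/loss payoff ratio."""
    return win_prob - (1.0 - win_prob) / payoff_ratio


def kelly_continuous(mu: float, r: float, sigma: float) -> float:
    """Continuous Kelly (log-utility Merton weight): f* = (mu - r) / sigma^2."""
    return (mu - r) / sigma ** 2


def fractional_kelly(full_fraction: float, scale: float = 0.5) -> float:
    """Scale the full Kelly fraction down; scale=0.5 is 'half Kelly' (assumed default)."""
    return scale * full_fraction


# 55% win rate at even odds gives a 10% full-Kelly stake.
stake = kelly_discrete(0.55, 1.0)
# 8% drift, 2% risk-free rate, 20% volatility gives 1.5x leverage at full Kelly.
leverage = kelly_continuous(0.08, 0.02, 0.20)
half_kelly_leverage = fractional_kelly(leverage)
```

A negative result from `kelly_discrete` means the bet has negative expected value, so the Kelly-optimal stake is zero.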

Data Sources

| Source | Role | Status | Coverage |
|---|---|---|---|
| Polygon.io | Primary price data | ✅ Active | Clean institutional-grade daily/intraday OHLCV, options chains. Free tier: 5 calls/min, 2yr history. |
| FRED | Macro factor features | ✅ Active | Yield curve (3M/10Y spread), VIX, credit spreads (BAA-AAA), CPI, Fed Funds rate, breakeven inflation. Completely free, unlimited. |
| SEC EDGAR | NLP pipeline input | ✅ Active | 10-K, 10-Q, 8-K filings, earnings call transcripts → fed into FinBERT sentiment module. |
| WRDS / CRSP | Survivorship-bias-free backtesting | 🔄 Pending access | Delisting-adjusted daily returns, Fama-French factors. Gold standard for academic-grade backtests. |
| Yahoo Finance | Legacy fallback | ⚠️ Fallback only | Daily OHLCV (unlimited history). Used for rapid prototyping; not recommended for production research due to data quality issues. |

Why this stack? Polygon is a common choice for prototyping, so working with it signals familiarity with institutional-style data in interviews. FRED adds macro regime features that strengthen the HMM module (yield curve inversion is one of the most widely cited recession predictors). WRDS/CRSP will replace Polygon for final backtests once access is approved, since CRSP is survivorship-bias-free and delisting-adjusted.

Quick Start

python -m venv venv
# Activate the environment (Windows):
venv\Scripts\activate
# On macOS/Linux, use instead:
# source venv/bin/activate
pip install -r requirements.txt

# Run walk-forward backtest and generate metrics
python scripts/run_backtest_report.py

# Launch research notebooks
jupyter notebook notebooks/

Research Notebooks

Interview Preparation Series

| # | Notebook | Topics |
|---|---|---|
| 01 | premier_prediction_models_kalman_ou | Kalman Filter state-space math, OU SDE exact solution, MLE calibration |
| 02 | regime_detection_hmm | Transition matrices, Baum-Welch EM, Viterbi decoding on SPY |
| 03 | fractional_differentiation | Binomial series expansion, stationarity vs. memory tradeoff, ADF testing |
| 04 | purged_cross_validation | Data leakage in finance, Deflated Sharpe Ratio, Extreme Value Theory |

Research Narratives

| # | Notebook | Hypothesis |
|---|---|---|
| 01 | pairs_trading_research | Kalman-filtered hedge ratios produce statistically significant OOS alpha in energy sector equities |

Disclaimer

This is a research framework. It is not connected to live execution infrastructure and has not been validated for production trading. All backtest results are out-of-sample but simulated — actual trading involves additional risks including latency, partial fills, and market impact that are not modeled here.

License

MIT

About

Quant trading system for research and learning, featuring options pricing, statistical arbitrage, machine learning, risk management, and backtesting across equities, derivatives, and futures.
