A Python backtesting framework for statistical arbitrage on cointegrated asset pairs. Implements mean-reversion signal generation, ATR-based risk management, and comprehensive performance analysis.
Works with any two assets available through yfinance: stocks, ETFs, futures (e.g., CL=F), and forex pairs.
Given two historically cointegrated assets, the system:
- Validates the pair statistically (Engle-Granger, Johansen, ADF, half-life)
- Computes a hedge-ratio-adjusted log spread
- Generates long/short signals when the z-score deviates beyond a threshold
- Applies regime detection and volatility-adjusted entry thresholds
- Sizes positions using ATR-based risk budgeting
- Runs a backtest with transaction costs and slippage
- Outputs performance metrics, charts, and a trade journal CSV
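
The spread and z-score steps above can be sketched as follows. This is a minimal illustration, not the code in `data_handler.py` or `strategy.py`: it assumes the hedge ratio is the OLS slope of log(asset1) on log(asset2) and that the z-score uses a plain rolling mean and standard deviation. The function names `log_spread` and `rolling_zscore` are hypothetical, and the prices are synthetic.

```python
from typing import Tuple

import numpy as np
import pandas as pd


def log_spread(p1: pd.Series, p2: pd.Series) -> Tuple[pd.Series, float]:
    """Return (spread, hedge_ratio); the hedge ratio is an OLS slope (assumption)."""
    y, x = np.log(p1), np.log(p2)
    beta, _intercept = np.polyfit(x, y, 1)  # slope of log(p1) on log(p2)
    return y - beta * x, float(beta)


def rolling_zscore(spread: pd.Series, window: int = 30) -> pd.Series:
    """Z-score of the spread against its own rolling mean and standard deviation."""
    mean = spread.rolling(window).mean()
    std = spread.rolling(window).std()
    return (spread - mean) / std


# Synthetic pair: log(asset1) tracks 1.5 * log(asset2) plus stationary noise.
rng = np.random.default_rng(0)
log_p2 = np.log(100.0) + np.cumsum(rng.normal(0.0, 0.01, 500))
log_p1 = 1.5 * log_p2 + rng.normal(0.0, 0.01, 500)
p1, p2 = pd.Series(np.exp(log_p1)), pd.Series(np.exp(log_p2))

spread, beta = log_spread(p1, p2)
z = rolling_zscore(spread, window=30)
print(f"estimated hedge ratio: {beta:.2f}")
print(z.dropna().tail())
```

The first `window - 1` z-score values are `NaN` because the rolling statistics are not yet defined, so a signal generator built on this would have to skip them.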
| File | Purpose |
|---|---|
| `config.py` | All configuration parameters (pair, strategy, risk, backtest, output) |
| `data_handler.py` | Data download, cleaning, cointegration tests, hedge ratio, spread computation |
| `strategy.py` | Signal generation: z-score, regime detection, momentum filter, position management |
| `backtester.py` | Return calculation, transaction costs, equity curve, drawdown, performance metrics |
| `risk_manager.py` | ATR-based position sizing, stop-loss/take-profit, portfolio heat limits |
| `visualization.py` | Price charts, spread analysis, equity curve, trade distribution, monthly heatmap |
| `main.py` | Orchestration: runs the full pipeline end-to-end, interactive or programmatic |
| `tests/test_strategy.py` | Unit tests for spread math, z-score, cointegration detection, signal validity |

```bash
pip install -r requirements.txt
```

Requirements: pandas, numpy, matplotlib, seaborn, yfinance, statsmodels, scipy, pytest. Python 3.8+.

```bash
python main.py
```

Prompts you to pick from several pre-configured examples (SPY-QQQ, GLD-SLV, CL-HO, AAPL-MSFT) or enter a custom pair.

```python
from main import run_pairs_strategy

result = run_pairs_strategy(
    asset1='GLD',
    asset2='GDX',
    pair_name='Gold-Miners',
    start_date='2015-01-01',
    initial_capital=500_000,
    asset_type='etf'
)
metrics = result['metrics']
print(metrics['Sharpe_Ratio'], metrics['Total_Return_Pct'])
```

```python
from main import run_multiple_pairs

pairs = [
    ('SPY', 'QQQ', 'SPY-QQQ'),
    ('GLD', 'SLV', 'Gold-Silver'),
    ('GLD', 'GDX', 'Gold-Miners'),
]
df = run_multiple_pairs(pairs, start_date='2015-01-01', initial_capital=500_000)
```

```bash
pytest tests/ -v
```

All tests use synthetic data; no network access is required.
Strategy and risk parameters live in config.py and can be overridden after construction:

```python
from config import Config

config = Config(asset1='SPY', asset2='QQQ', start_date='2015-01-01')

# Strategy
config.strategy.window = 30           # z-score rolling window
config.strategy.z_entry_long = -2.0   # long entry threshold
config.strategy.z_entry_short = 2.0   # short entry threshold
config.strategy.z_exit = 0.5          # exit threshold

# Risk
config.risk.risk_per_trade = 0.02     # 2% capital at risk per trade
config.risk.max_position_size = 0.30  # max 30% capital per position
config.risk.atr_stop_multiple = 2.5   # stop at 2.5x ATR from entry

# Costs
config.backtest.transaction_cost_pct = 0.0005    # 5 bps commission
config.backtest.slippage_pct = 0.0002            # 2 bps slippage
config.backtest.commission_per_contract = 2.50   # futures only
```

Presets are available for common pairs:

```python
config = Config.from_preset('oil-crack')  # CL=F / HO=F
config = Config.from_preset('gold-silver')
config = Config.from_preset('spy-qqq')
```

Results are saved to `results/<pair_name>/`:
- `01_prices.png`: Asset1 and Asset2 price series
- `02_spread.png`: Spread with Bollinger bands and z-score with signals
- `03_equity.png`: Equity curve with drawdown
- `04_trades.png`: Trade P&L distribution, MAE/MFE scatter
- `05_monthly.png`: Monthly returns heatmap
- `06_cointegration.png`: Rolling cointegration p-value over time
- `trade_journal.csv`: One row per trade with full metadata
Entry: z-score crosses entry threshold AND regime is mean-reverting AND momentum confirms direction. Optionally restricted to periods where the rolling cointegration test is significant.
Exit: z-score crosses back through the exit threshold, or regime shifts to Volatile_Trending.
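
The entry and exit rules can be read as a small state machine over the current position. The sketch below is one interpretation, not the logic in `strategy.py`. The function name `next_position`, the regime label `"Mean_Reverting"`, and the boolean `momentum_ok` input are assumptions for illustration (only `Volatile_Trending` is named above). "Crosses" is simplified to a level check on the current z-score, the exit band is assumed to be symmetric around zero, and the optional rolling-cointegration filter is left out.

```python
def next_position(position, z, regime, momentum_ok,
                  z_entry_long=-2.0, z_entry_short=2.0, z_exit=0.5):
    """Return the new position: +1 long spread, -1 short spread, 0 flat."""
    if position == 0:
        # Entry needs all three conditions: threshold, regime, momentum.
        if regime == "Mean_Reverting" and momentum_ok:
            if z <= z_entry_long:
                return 1
            if z >= z_entry_short:
                return -1
        return 0

    # Exit on a regime shift regardless of the z-score.
    if regime == "Volatile_Trending":
        return 0
    # Exit once the z-score has reverted to within the exit band.
    if position == 1 and z >= -z_exit:
        return 0
    if position == -1 and z <= z_exit:
        return 0
    return position


# Walk a long trade from entry to exit.
position = 0
history = []
for z in (-2.3, -1.1, -0.4):
    position = next_position(position, z, "Mean_Reverting", momentum_ok=True)
    history.append(position)
print(history)
```

Keeping the rule as a pure function of the previous position and the current inputs makes it straightforward to unit test without price data.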
Position sizing: risk amount / (ATR * stop_multiple), capped by capital percentage and portfolio heat limits. Position is halved during extreme volatility (ATR > 90th percentile).
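
As a worked example of the sizing formula, the sketch below computes a position in units. It is an illustration under stated assumptions rather than the code in `risk_manager.py`: the function name `position_size` and its arguments are hypothetical, ATR is assumed to be in the same price units as `price`, the capital cap is applied before the volatility halving, and portfolio heat limits are omitted. Defaults mirror the config values shown earlier.

```python
def position_size(capital, price, atr, atr_p90,
                  risk_per_trade=0.02, atr_stop_multiple=2.5,
                  max_position_size=0.30):
    """Units sized so that a stop at atr_stop_multiple * ATR risks
    roughly risk_per_trade of capital."""
    risk_amount = capital * risk_per_trade
    stop_distance = atr * atr_stop_multiple
    units = risk_amount / stop_distance
    # Cap notional exposure at a fraction of capital.
    units = min(units, capital * max_position_size / price)
    # Halve the position when ATR exceeds its 90th percentile.
    if atr > atr_p90:
        units *= 0.5
    return units


# $500,000 capital, $50 asset, ATR of 2.0 against a 90th-percentile ATR of 3.0:
# risk amount = 10,000; stop distance = 5.0; uncapped size = 2,000 units.
normal = position_size(500_000, 50.0, atr=2.0, atr_p90=3.0)

# Same asset with ATR of 4.0: 10,000 / 10.0 = 1,000 units, halved to 500.
stressed = position_size(500_000, 50.0, atr=4.0, atr_p90=3.0)
print(normal, stressed)
```

With a higher-priced asset the capital cap binds first: at $180 per unit the 30% cap allows about 833 units, below the 2,000 implied by the risk budget.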
Alexander Robbins
University of Florida — Math, CS, Economics
robbins.a@ufl.edu
https://github.com/XanderRobbins
License: MIT