`swing_highs_lows()` uses a centered window that looks into future bars:

    # smc.py line 153-163
    ohlc["high"].shift(-(swing_length // 2)).rolling(swing_length).max()

`shift(-(swing_length // 2))` pulls each value back by `swing_length // 2` rows, so the rolling maximum at bar `i` covers bars up to `i + swing_length // 2`. The function therefore sees price data that would not exist yet in real-time trading. This is look-ahead bias, and it inflates backtest results significantly.
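A minimal sketch of the effect on a toy series, using only the `shift`/`rolling` expression quoted above. The helper name `rolling_max_pair` and the sample values are made up for illustration and are not part of the library. It compares the centered window with a plain trailing window, then changes a single future bar to show which one reacts.

```python
import pandas as pd


def rolling_max_pair(high: pd.Series, swing_length: int):
    """Return (centered, trailing) rolling maxima for comparison.

    Hypothetical helper for illustration only.
    """
    # Same expression as in the issue: a negative shift pulls future rows back.
    centered = high.shift(-(swing_length // 2)).rolling(swing_length).max()
    # Trailing window: bar i only sees bars i - swing_length + 1 .. i.
    trailing = high.rolling(swing_length).max()
    return centered, trailing


high = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
centered, trailing = rolling_max_pair(high, swing_length=4)

# Change only bar 5, which lies in the future relative to bar 3.
future_changed = high.copy()
future_changed.iloc[5] = 100.0
centered_after, trailing_after = rolling_max_pair(future_changed, swing_length=4)

print(centered.iloc[3], centered_after.iloc[3])  # value at bar 3 changes
print(trailing.iloc[3], trailing_after.iloc[3])  # value at bar 3 is unchanged
```

With `swing_length=4`, the centered value at bar 3 is taken over bars 2 to 5, so editing bar 5 changes what bar 3 "knew". The trailing value at bar 3 only covers bars 0 to 3 and is unaffected.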
I tested the same trading logic with and without this bias on XAUUSD M15 (10 years, 280k bars):
| Swing Method | Trades | Win Rate | Profit Factor |
|---|---|---|---|
| Centered window (current) | 177 | 81.4% | 7.32 |
| No look-ahead (confirm bars) | 106 | 52.8% | 1.82 |
PF drops from 7.32 to 1.82 when the bias is removed. Anyone backtesting with this function will get inflated results.
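A simple fix using confirm bars (only past data, no future):

    # assumes highs is a NumPy array, n = len(highs), swing_high is the output array
    for i in range(lookback + confirm_bars, n):
        candidate = i - confirm_bars  # bar under test, confirm_bars behind i
        window = highs[candidate - lookback:candidate + 1]
        if highs[candidate] == window.max():
            # every bar after the candidate, up to and including i, must be lower
            if all(highs[candidate + cb] < highs[candidate] for cb in range(1, confirm_bars + 1)):
                swing_high[i] = highs[candidate]  # recorded at confirmation bar i

The candidate bar must be the highest of itself and the previous `lookback` bars, and is then confirmed by the next `confirm_bars` bars all being lower. The swing is written at bar `i`, the bar on which confirmation completes, so no future data is used.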
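For anyone who wants to try this, here is a self-contained sketch of the same loop wrapped in a function. The name `confirmed_swing_highs`, the default parameter values and the NaN-filled output are my own assumptions for illustration, not the library's API. The last lines check causality directly: running on a truncated series gives the same values as the matching prefix of the full run.

```python
import numpy as np


def confirmed_swing_highs(highs, lookback=5, confirm_bars=2):
    """Mark swing highs using only bars at or before the bar being written.

    Hypothetical helper: returns an array of NaN, with the swing price placed
    at the bar where confirmation completes (not at the swing bar itself).
    """
    highs = np.asarray(highs, dtype=float)
    n = len(highs)
    swing_high = np.full(n, np.nan)
    for i in range(lookback + confirm_bars, n):
        candidate = i - confirm_bars
        window = highs[candidate - lookback:candidate + 1]
        # Candidate must be the highest of itself and the previous lookback bars.
        if highs[candidate] != window.max():
            continue
        # The confirm_bars bars after the candidate (up to i) must all be lower.
        if np.all(highs[candidate + 1:i + 1] < highs[candidate]):
            swing_high[i] = highs[candidate]
    return swing_high


highs = [1, 2, 3, 4, 5, 10, 6, 5, 4, 3]
full = confirmed_swing_highs(highs, lookback=3, confirm_bars=2)

# Causality check: bars 8 and 9 must not influence the output for bars 0..7.
prefix = confirmed_swing_highs(highs[:8], lookback=3, confirm_bars=2)
is_causal = np.allclose(full[:8], prefix, equal_nan=True)
print(full, is_causal)
```

In this toy series the high of 10 at bar 5 is reported at bar 7, two bars later, which is the price paid for not looking ahead.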
I saw PR #95 adds a `causal` parameter that shifts outputs forward. That prevents using future data for signals, but the detection itself still uses the centered window. The confirm bars approach only uses past data from the start.
Would be nice to have a `causal=True` option that uses a genuinely bias-free algorithm. Happy to submit a PR if there's interest.