We built a regime-filter backtester this week. The framework takes a trade log and a market time series, labels each trade with the regime that was visible at entry (no look-ahead), splits the trades into Group A (regime-allowed) and Group B (regime-blocked), and computes the full performance comparison. The headline numbers look excellent.

Max Drawdown: −9.7% → −2.1% (78% reduction)
Sharpe Ratio: 3.4 → 7.6 (+124% improvement)
Group B Net EV: +25 (filtered trades were profitable)
// Equity Curve: ALL trades vs Regime-Filtered (A only)
[Chart: equity, y-axis ticks 100k to 165k, x-axis 2022-07 through 2024-07. Blue: ALL trades (baseline). Orange: A only (regime-allowed). Annotations: "Gate blocks entries, curve flattens"; "MaxDD −9.7%".]
Orange line stops compounding during 2023 H2 — the gate is working. But blue baseline ends higher in absolute terms. Both facts are true simultaneously.
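Both statements in the caption can be checked mechanically from the two equity curves. What follows is a minimal sketch, assuming each curve is a plain list of positive equity values. The function name `max_drawdown` and all numbers are illustrative and are not taken from the framework or the backtest.

```python
def max_drawdown(equity):
    """Return the worst peak-to-trough decline as a negative fraction (0.0 if none)."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                  # highest equity seen so far
        worst = min(worst, value / peak - 1.0)   # decline from that peak
    return worst


# Illustrative curves only (not the backtest data).
baseline = [100.0, 120.0, 108.0, 150.0]  # draws down, then recovers to a new high
filtered = [100.0, 120.0, 120.0, 130.0]  # gate closed: flat through the drawdown

print(f"baseline: MaxDD {max_drawdown(baseline):.1%}, final {baseline[-1]:.0f}")
print(f"filtered: MaxDD {max_drawdown(filtered):.1%}, final {filtered[-1]:.0f}")
```

In this toy case the filtered curve has the smaller drawdown and the lower final equity, which is the same pattern the chart shows.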

The Honest Mirror

Standard regime-filter evaluation stops at the headline stats. MaxDD down, Sharpe up — ship it. We added one more metric that most practitioners skip: the net expectancy of Group B, the trades the gate blocked.

// Group B Net Expectancy — The Honest Mirror
+25
The filtered-out trades were profitable on average. The gate was not blocking losers; it was blocking winners. Every unit of MaxDD reduction came at a real cost in absolute return, and this number, taken per blocked trade, tells you how much.

Look at the chart again. In 2023 H2, the orange line runs flat while the blue line draws down 9.7% and then recovers above its prior high. The gate prevented the drawdown, and it also prevented the recovery. The orange line ends the backtest lower in absolute P&L. Both facts are true simultaneously. The gate is doing exactly what it was designed to do. The question is whether you want that trade.

This is what honest risk distribution management looks like. Not free alpha. Not a filter that only blocks losing trades. A defensible trade-off with a known cost. If Group B expectancy is negative on your real data, the gate is pure upside. If it is positive, you are paying a defensive premium. Know which one you are buying.
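The sign test described above fits in a few lines. This is a sketch under stated assumptions rather than the framework's own code: trades are dicts with a `group` label ("A" or "B") and a `net_pnl` value, and both field names, both function names, and the sample numbers are hypothetical.

```python
from statistics import mean


def net_expectancy(trades, group):
    """Mean net P&L per trade for one group, or None if the group has no trades."""
    pnls = [t["net_pnl"] for t in trades if t["group"] == group]
    return mean(pnls) if pnls else None


def gate_verdict(group_b_ev):
    """Translate Group B expectancy into the trade-off described above."""
    if group_b_ev is None:
        return "no blocked trades"
    if group_b_ev < 0:
        return "pure upside"
    if group_b_ev > 0:
        return "defensive premium"
    return "neutral"


# Illustrative trades only; "A" = regime-allowed, "B" = regime-blocked.
trades = [
    {"group": "A", "net_pnl": 100.0},
    {"group": "A", "net_pnl": -50.0},
    {"group": "B", "net_pnl": 40.0},
    {"group": "B", "net_pnl": -10.0},
    {"group": "B", "net_pnl": 30.0},
]

ev_b = net_expectancy(trades, "B")
print(ev_b, gate_verdict(ev_b))
```

Multiplying the per-trade expectancy by the number of blocked trades gives the total P&L the gate gave up (or avoided) over the backtest.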

Three Non-Negotiable Design Decisions

These are the choices that determine whether the backtest result is credible. Get them wrong and the numbers are meaningless regardless of how good they look.

// Decision 01
No Look-Ahead Is Written Into the Join, Not Enforced by Discipline
Trades on day D are labeled with the regime known at the prior close: the regime sequence is shifted forward by one day before the backward as-of join, so a trade can only match a regime computed strictly before D. The join cannot match a trade to the EMA of its own entry day's close, let alone a later one. The default is regime_known_at='prior_close'. Switch to same_close only when you can confirm the entry decision happened after that day's close.
// Decision 02
This Is Attribution, Not Walk-Forward Optimization
The EMA length is fixed. We are answering a single question: did this specific rule help historically? Walk-forward (rolling-window re-optimization of EMA length in-sample, then test out-of-sample) is a separate and later step. Do not grid-search parameters against historical results at this stage. There are too few independent regime-shift events in any realistic backtest period, and you will overfit. Validate the process first.
// Decision 03
Group B Net Expectancy Is the Acceptance Gate
This is the metric that tells you whether the filter is earning its keep. Negative Group B EV: the gate is blocking trades that would have lost money regardless — pure upside, consider deploying it. Positive Group B EV: the gate is blocking profitable trades to buy curve smoothness — that is a conscious trade-off, not a free lunch. This framework computes and prints this number explicitly. It is not optional.
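Decision 01's offset-then-join can be illustrated without any dataframe library. The sketch below is not the framework's implementation. It assumes trades and regimes are keyed by calendar date, that the regime list is sorted by close date, and that the offset is one calendar day. The function name `label_trades` is hypothetical; only the `regime_known_at` values come from the text above.

```python
from bisect import bisect_right
from datetime import date, timedelta


def label_trades(trade_dates, regimes, regime_known_at="prior_close"):
    """Backward as-of join: latest regime available on or before each entry date.

    regimes: list of (close_date, label) tuples sorted by close_date.
    """
    if regime_known_at not in ("prior_close", "same_close"):
        raise ValueError(f"unknown regime_known_at: {regime_known_at!r}")
    # Shift availability forward one day so a regime computed at the close
    # of day D first becomes matchable on D+1.
    offset = timedelta(days=1 if regime_known_at == "prior_close" else 0)
    available = [close_date + offset for close_date, _ in regimes]
    labels = []
    for entry in trade_dates:
        i = bisect_right(available, entry) - 1  # last availability date <= entry
        labels.append(regimes[i][1] if i >= 0 else None)
    return labels


# Illustrative regime sequence and trade dates.
regimes = [
    (date(2023, 7, 10), "allowed"),
    (date(2023, 7, 11), "blocked"),
    (date(2023, 7, 12), "allowed"),
]
entries = [date(2023, 7, 10), date(2023, 7, 11), date(2023, 7, 12)]

print(label_trades(entries, regimes))                # [None, 'allowed', 'blocked']
print(label_trades(entries, regimes, "same_close"))  # ['allowed', 'blocked', 'allowed']
```

With the default, the first trade gets no label because no earlier close exists, and every later trade sees the previous day's regime. A trade with no label should be dropped or handled explicitly and never silently assigned to a group.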

What Your Real Data Will Tell You

The demo runs on synthetic data with a clear bull-bear-bull price path — constructed explicitly so both regimes are well-represented. Real data is messier. When you run this on an actual trade log, split the Group B analysis by strategy.

The three equity strategies in QuantKernel almost certainly have different regime sensitivities. EMA_RSI_MACD is trend-following: bear regimes should genuinely hurt it, and Group B EV is likely negative (the gate is helping). ORB is intraday and less sensitive to multi-week regime, so Group B EV could go either way. PAIRS is market-neutral by design: a directional EMA200 gate has the least to offer it, and Group B EV is likely positive (the gate is incorrectly penalizing a non-directional strategy). These are hypotheses. The data will tell you.
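A per-strategy split of the blocked trades might look like the sketch below. It assumes each trade is a dict with hypothetical `strategy`, `group`, and `net_pnl` fields. The P&L values are invented to show the output shape and are not evidence for or against the hypotheses above. The blocked-trade count is reported next to each expectancy because a mean over a handful of trades says very little.

```python
from collections import defaultdict
from statistics import mean


def group_b_by_strategy(trades):
    """Return {strategy: (blocked_trade_count, mean_net_pnl)} for Group B trades."""
    pnls = defaultdict(list)
    for t in trades:
        if t["group"] == "B":
            pnls[t["strategy"]].append(t["net_pnl"])
    return {name: (len(p), mean(p)) for name, p in pnls.items()}


# Illustrative trades only.
trades = [
    {"strategy": "EMA_RSI_MACD", "group": "B", "net_pnl": -30.0},
    {"strategy": "EMA_RSI_MACD", "group": "B", "net_pnl": -10.0},
    {"strategy": "ORB", "group": "B", "net_pnl": 5.0},
    {"strategy": "PAIRS", "group": "B", "net_pnl": 20.0},
    {"strategy": "PAIRS", "group": "A", "net_pnl": 15.0},  # allowed trade, ignored
]

for name, (n, ev) in sorted(group_b_by_strategy(trades).items()):
    print(f"{name}: n={n}, Group B EV={ev:+.1f}")
```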

// Data Sources

SPY/QQQ daily close for strategic regime: pull from Alpaca data API, export as date,close CSV. VIX term structure (F1/F2) for tactical circuit breaker: CBOE or futures source required — Alpaca does not carry this. If the VIX columns are absent, the tactical layer auto-skips and only the strategic filter runs. Recommended first-pass: deploy strategic EMA filter only, confirm it adds value on real data, then wire in the VIX layer.
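The auto-skip behaviour can be pictured as a column check at load time. This is only a sketch: the `date` and `close` columns come from the CSV format described above, while the VIX column names `vix_f1` and `vix_f2` and the function name `load_market` are assumptions, since the text does not name them.

```python
import csv
import io

VIX_COLUMNS = ("vix_f1", "vix_f2")  # hypothetical names for the F1/F2 columns


def load_market(fileobj):
    """Read a market CSV; return (rows, has_tactical).

    has_tactical is True only when every VIX column is present.
    """
    reader = csv.DictReader(fileobj)
    fields = reader.fieldnames or []
    missing = {"date", "close"} - set(fields)
    if missing:
        raise ValueError(f"market CSV is missing columns: {sorted(missing)}")
    has_tactical = all(col in fields for col in VIX_COLUMNS)
    rows = []
    for record in reader:
        row = {"date": record["date"], "close": float(record["close"])}
        if has_tactical:
            for col in VIX_COLUMNS:
                row[col] = float(record[col])
        rows.append(row)
    return rows, has_tactical


# Made-up values in the date,close layout.
sample = io.StringIO("date,close\n2024-01-02,100.0\n2024-01-03,101.5\n")
rows, has_tactical = load_market(sample)
print(len(rows), "rows; tactical layer", "on" if has_tactical else "skipped")
```

Returning the flag alongside the rows lets the caller run the strategic filter alone on a first pass and enable the tactical layer later without changing the loader.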


The framework is ready. --demo mode runs out of the box on synthetic data. Switch to --trades log.csv --market spy.csv for real data. The honest mirror will tell you what the gate is actually doing — not what you hope it is doing.