
Backtest Framework for Prop Firm Traders

A structured backtest framework built for prop firm traders. Validate edge, control drawdown, and pass evaluations with data-backed strategy proof.

Fewer than 10% of traders pass prop firm evaluations on the first attempt. The primary failure mode is not poor entries — it is strategies that were never stress-tested against the specific constraints prop firms impose: daily loss limits, maximum drawdown caps, and minimum trading day requirements.

Prop firm capital is not retail capital. The rules governing how you use it reshape every strategy parameter that matters — position sizing, holding periods, recovery behavior after losing days. A backtest built on generic retail assumptions will produce results that are functionally useless inside a funded account context.

This framework gives prop firm traders a structured backtesting process calibrated to evaluation rules. You will learn how to encode firm-specific constraints into your test logic, validate edge under realistic conditions, and produce a performance record that demonstrates rule-compliant consistency before you risk a single challenge fee.

Why Standard Backtests Fail Prop Firm Traders

A standard backtest optimizes for return. A prop firm evaluation optimizes for return within a hard constraint envelope — typically a 5% daily loss limit and a 10% maximum drawdown, with profit targets between 8% and 10%. These are not soft guidelines. Breach one parameter on day 14 of a 30-day challenge and the account terminates regardless of overall performance.

Most retail backtesting frameworks do not model these boundary conditions. They allow equity to draw down freely, recover freely, and compound without restriction. The result is a strategy that looks strong on a P&L chart but collapses in evaluation because the position sizing or risk-per-trade logic was never pressure-tested against a hard floor.

The fix is not a better strategy. It is a better testing environment — one that mirrors the exact rule structure of the firm you are targeting before you place a single live order.

Daily loss limits reset at midnight server time — your backtest must model intraday drawdown, not just closing equity
Maximum drawdown at most firms is calculated from peak balance, not starting balance, so the loss floor ratchets upward as you profit
Minimum trading days (typically 4-10) mean strategies that concentrate profits in 2-3 sessions will fail even with strong returns
News trading restrictions eliminate certain high-expectancy setups entirely — your dataset must exclude those windows
Lot size scaling must stay within per-instrument position limits or accounts face automatic review
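The first two rules in this list can be sketched as a single pass over intraday equity marks. This is a minimal illustration, not any firm's actual calculation: the function name `find_breach`, the `(day, equity)` input format, and the choice to measure the daily loss from the prior day's last equity mark are all assumptions, and firms differ on whether the daily reference is balance or equity.

```python
from typing import Iterable, Optional, Tuple


def find_breach(
    marks: Iterable[Tuple[str, float]],
    start_balance: float,
    daily_loss_pct: float = 0.05,
    max_dd_pct: float = 0.10,
    trailing: bool = True,
) -> Optional[Tuple[str, str]]:
    """Return (day, rule) for the first breach, or None if the run is clean.

    marks: intraday equity snapshots as (day_label, equity), in time order.
    Assumptions: both limits are a percentage of the starting balance, and
    the daily loss is measured from the last equity mark of the prior day.
    """
    daily_limit = start_balance * daily_loss_pct
    dd_limit = start_balance * max_dd_pct
    peak = start_balance
    last_equity = start_balance
    current_day = None
    day_open = start_balance

    for day, equity in marks:
        if day != current_day:
            # New session: the daily reference resets to where equity stood.
            current_day = day
            day_open = last_equity
        if day_open - equity >= daily_limit:
            return day, "daily_loss"
        # Trailing basis measures from the running peak; static from the start.
        reference = peak if trailing else start_balance
        if reference - equity >= dd_limit:
            return day, "max_drawdown"
        peak = max(peak, equity)
        last_equity = equity
    return None
```

Feeding this function only end-of-day equity would miss a session that dips through the daily limit and recovers before the close, which is why the input is a stream of intraday marks rather than daily closes.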

Step 1 — Map the Rule Set Before You Open a Chart

Before any historical data is loaded, document the exact evaluation parameters of your target firm. FTMO, MyForexFunds successors, The Funded Trader, and Apex Futures each carry different constraint structures. A backtest calibrated for one firm is not transferable to another without revision.

Create a constraint sheet with seven fields: daily loss limit (in absolute dollars), maximum drawdown limit (in absolute dollars), drawdown calculation basis (static or trailing peak), profit target, minimum trading days, restricted instruments, and news trading policy. These seven fields define the outer boundary of every decision your strategy is permitted to make.

With this sheet in hand, you can convert each rule into a quantitative filter. A 5% daily loss limit on a $100,000 account is a $5,000 intraday floor. Every position size in your backtest must be sized so that a max adverse excursion on any single trade does not breach that floor in isolation — before aggregating correlated positions.
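One way to keep the constraint sheet next to your test code is a small record type. The sketch below is illustrative only: `ConstraintSheet`, `sheet_from_percentages`, and `single_trade_ok` are hypothetical names, and the 8% target and 4-day minimum are example values picked from the ranges mentioned earlier, not any specific firm's rules.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ConstraintSheet:
    """One attribute per field of the constraint sheet."""
    daily_loss_limit: float                 # absolute dollars
    max_drawdown_limit: float               # absolute dollars
    drawdown_basis: str                     # "static" or "trailing"
    profit_target: float                    # absolute dollars
    min_trading_days: int
    restricted_instruments: Tuple[str, ...] = ()
    news_policy: str = ""                   # free text copied from the firm's rules

    def __post_init__(self) -> None:
        if self.drawdown_basis not in ("static", "trailing"):
            raise ValueError("drawdown_basis must be 'static' or 'trailing'")

    def single_trade_ok(self, max_adverse_excursion: float) -> bool:
        """True if one trade's worst-case dollar loss stays inside the daily floor."""
        return max_adverse_excursion < self.daily_loss_limit


def sheet_from_percentages(account_size: float, daily_loss_pct: float,
                           max_dd_pct: float, target_pct: float,
                           min_days: int, basis: str = "trailing") -> ConstraintSheet:
    """Convert percentage rules into the absolute-dollar fields of the sheet."""
    return ConstraintSheet(
        daily_loss_limit=account_size * daily_loss_pct,
        max_drawdown_limit=account_size * max_dd_pct,
        drawdown_basis=basis,
        profit_target=account_size * target_pct,
        min_trading_days=min_days,
    )


# Example values only: a $100,000 account with a 5% daily limit, 10% drawdown,
# 8% target and 4 minimum days. Substitute your firm's published rules.
sheet = sheet_from_percentages(100_000, 0.05, 0.10, 0.08, 4)
```

The `single_trade_ok` check covers only the isolated-trade test described above; aggregate exposure across correlated positions needs a separate check.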

You are a prop firm trading analyst. I am targeting a [FIRM NAME] Phase 1 evaluation with the following parameters: $[ACCOUNT SIZE], [X]% daily loss limit, [X]% max drawdown (trailing from peak), [X]% profit target, minimum [X] trading days, restricted instruments: [LIST]. My strategy is [DESCRIBE: e.g., London session breakout on EUR/USD, 1H timeframe, ATR-based stops]. Build a backtesting constraint checklist that maps each evaluation rule to a specific variable in my strategy — position sizing formula, max concurrent positions, daily trade cutoff logic, and news filter windows. Flag any structural conflict between my strategy type and the firm's rule set.

Step 2 — Build a Rule-Compliant Position Sizing Model

Position sizing is where most prop firm backtest frameworks break down. Fixed fractional sizing (risking 1% per trade) sounds conservative until you run correlated positions across three pairs simultaneously during a high-volatility session. Under a standard variance-based aggregation, three 1% risk positions with 0.7 pairwise correlation behave like a single position risking roughly 2.7%, and if all three stop out together the realized loss is the full 3%. Against a 5% daily limit, that leaves little buffer for adverse sequences.
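To see how correlation inflates combined exposure, you can aggregate per-position risk the way portfolio volatility is usually aggregated. The article does not specify a formula, so the variance-based approach below is an assumption, as is the simplification that every pair of positions shares the same correlation; `aggregate_risk` is a hypothetical helper name.

```python
import math
from typing import Sequence


def aggregate_risk(risks: Sequence[float], correlation: float) -> float:
    """Combined risk of positions that share one pairwise correlation.

    Computes sqrt(sum_i sum_j r_i * r_j * rho_ij), with rho_ii = 1 and
    rho_ij = correlation for i != j. Units follow the inputs (percent or dollars).
    """
    total = 0.0
    for i, r_i in enumerate(risks):
        for j, r_j in enumerate(risks):
            rho = 1.0 if i == j else correlation
            total += r_i * r_j * rho
    # Clamp at zero: strongly negative correlations can push the sum below 0.
    return math.sqrt(max(total, 0.0))


# Three positions risking 1% each at 0.7 pairwise correlation.
combined = aggregate_risk([1.0, 1.0, 1.0], 0.7)
```

At a correlation of 1.0 the result equals the plain sum of the individual risks, which is also the realized loss whenever every stop is hit in the same session, so treat the aggregated figure as a typical-case estimate rather than a worst case.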

The correct model for prop firm backtesting uses a daily risk budget, not a per-trade risk percentage. Set the daily budget at 50-60% of the daily loss limit. Divide that budget across the maximum number of concurrent positions your strategy generates. Only after this allocation is fixed does the per-trade lot size get calculated — derived from stop distance and allocated risk, not applied as a universal percentage.

Run this model across your full historical sample and log the maximum daily loss registered in any single session. If that figure exceeds your daily budget at any point in the dataset, your position sizing is not evaluation-safe, regardless of how the overall equity curve looks.

Daily risk budget = 50% of daily loss limit (leaves buffer for slippage and spread expansion)
Max concurrent positions = derived from strategy signal frequency, not set arbitrarily
Per-trade risk = daily budget divided by max concurrent positions
Lot size = (per-trade risk in dollars) divided by (stop distance in pips × pip value per lot)
Correlation check: reduce aggregate exposure when holding two or more positively correlated instruments
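The sizing model above, together with the worst-session check, can be sketched in a few functions. Treat this as a sketch under stated assumptions: the function names are hypothetical, the 0.01 lot step is a common broker increment rather than a universal one, and `worst_session_loss` sums closed trade P&L by day, so it does not capture floating intraday loss.

```python
import math
from collections import defaultdict
from typing import Iterable, Tuple


def per_trade_risk(daily_loss_limit: float, max_concurrent: int,
                   budget_fraction: float = 0.5) -> float:
    """Dollars at risk per position: daily budget split across concurrent slots."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    daily_budget = daily_loss_limit * budget_fraction
    return daily_budget / max_concurrent


def lot_size(risk_dollars: float, stop_pips: float,
             pip_value_per_lot: float, lot_step: float = 0.01) -> float:
    """Lot size, rounded down to lot_step, sized so a full stop-out loses
    no more than risk_dollars."""
    if stop_pips <= 0 or pip_value_per_lot <= 0:
        raise ValueError("stop distance and pip value must be positive")
    raw = risk_dollars / (stop_pips * pip_value_per_lot)
    # Round down to the lot step; the tiny epsilon guards against float error
    # such as 0.3 / 0.01 evaluating to 29.999999999999996.
    steps = math.floor(raw / lot_step + 1e-9)
    return round(steps * lot_step, 8)


def worst_session_loss(trades: Iterable[Tuple[str, float]]) -> float:
    """Largest realized loss on any single day, as a positive dollar figure.

    trades: (day_label, closed_pnl) pairs from the backtest trade log.
    """
    by_day = defaultdict(float)
    for day, pnl in trades:
        by_day[day] += pnl
    return max(0.0, -min(by_day.values(), default=0.0))


# Example: $5,000 daily limit, 3 concurrent positions, 25-pip stop, and $10 per
# pip per standard lot (the usual EUR/USD figure on a USD account).
risk = per_trade_risk(5_000, 3)
lots = lot_size(risk, stop_pips=25, pip_value_per_lot=10)
```

Rounding down rather than to the nearest step is a deliberate choice here: it keeps the realized risk at or below the allocation, which matters more than hitting the target risk exactly when the constraint is a hard floor.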

BACKTEST TOOL

Assistly's backtester lets prop firm traders encode evaluation constraints directly into the test environment — daily loss limits, trailing drawdown caps, and Monte Carlo simulation across 1,000 random windows. Run your strategy against the actual rules before you pay for the challenge.

Step 3 — Validate Edge Across Evaluation-Realistic Sample Sizes

A 30-day evaluation window is approximately 20-22 trading sessions. That is not enough to confirm statistical edge: most intraday strategies generate only 60-150 signals in that window. What it is enough for is measuring behavioral compliance: did the strategy stay within bounds on every session, meet the minimum trading day requirement, and reach the profit target before the window expired?

To validate actual edge, you need a minimum of 200-300 trades from your backtest sample. Run the full historical dataset, then simulate 1,000 random 30-day windows using Monte Carlo draws from your trade log. For each window, record: final P&L, maximum drawdown reached, daily loss limit breaches, and whether the profit target was hit. The output is a probability distribution — not a single equity curve — showing pass rate, average drawdown consumed, and failure mode frequency.
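A minimal version of this simulation fits in the standard library. The sketch below makes several simplifying assumptions that you should adjust to your own strategy: trades are drawn independently with replacement (which discards any clustering of wins and losses), every session takes a fixed number of trades, limits are checked on closed-trade equity only, and the profit target is checked at each session close. All names and default values are illustrative.

```python
import random
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class WindowResult:
    outcome: str          # "pass", "daily_breach", "drawdown_breach" or "timeout"
    final_pnl: float
    max_drawdown: float


def simulate_window(trade_pnls: Sequence[float], rng: random.Random, *,
                    sessions: int = 21, trades_per_day: int = 4,
                    start_balance: float = 100_000.0,
                    daily_limit: float = 5_000.0, dd_limit: float = 10_000.0,
                    target: float = 8_000.0, min_days: int = 4) -> WindowResult:
    """Replay one synthetic evaluation window drawn from the trade log."""
    equity = peak = start_balance
    max_dd = 0.0
    for day in range(1, sessions + 1):
        day_open = equity
        for _ in range(trades_per_day):
            equity += rng.choice(trade_pnls)      # draw with replacement
            max_dd = max(max_dd, peak - equity)   # trailing from peak equity
            peak = max(peak, equity)
            if day_open - equity >= daily_limit:
                return WindowResult("daily_breach", equity - start_balance, max_dd)
            if max_dd >= dd_limit:
                return WindowResult("drawdown_breach", equity - start_balance, max_dd)
        # Target is checked at the session close, once enough days are logged.
        if day >= min_days and equity - start_balance >= target:
            return WindowResult("pass", equity - start_balance, max_dd)
    return WindowResult("timeout", equity - start_balance, max_dd)


def pass_statistics(trade_pnls: Sequence[float], n_windows: int = 1000,
                    seed: int = 0, **rules) -> Dict[str, float]:
    """Outcome frequencies and average drawdown across simulated windows."""
    if not trade_pnls:
        raise ValueError("trade log is empty")
    rng = random.Random(seed)  # fixed seed keeps the report reproducible
    results = [simulate_window(trade_pnls, rng, **rules) for _ in range(n_windows)]
    stats = {f"{name}_rate": sum(r.outcome == name for r in results) / n_windows
             for name in ("pass", "daily_breach", "drawdown_breach", "timeout")}
    stats["avg_max_drawdown"] = sum(r.max_drawdown for r in results) / n_windows
    return stats
```

Calling `pass_statistics(my_trade_pnls)` with a list of per-trade dollar results returns the pass rate alongside the frequency of each failure mode, which is the distribution view described above. If your trade log shows streaky behavior, consider resampling whole days or blocks of trades instead of single trades.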

A strategy with a 60% win rate and a 1.5R average winner may show a simulated pass rate of only 34% if the drawdown distribution is wide. That number, not the equity curve, is the honest assessment of your evaluation odds.

Step 4 — Stress Test Against Regime Shifts

Prop firm evaluations do not occur in stable market conditions on schedule. A 30-day challenge window may overlap with a central bank policy shift, a geopolitical shock, or a sustained low-volatility compression that suppresses your breakout signals entirely. Backtests that only sample favorable historical periods produce strategies that pass in calm conditions and fail precisely when evaluation pressure is highest.

Segment your historical data by volatility regime using ATR percentile bands. Classify each month as low, medium, or high volatility. Run separate performance reports for each regime. If your strategy’s expectancy collapses in low-volatility periods, you need either a regime filter to reduce position size during those windows or a supplementary strategy that performs in range conditions.
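The regime split can be sketched with plain dictionaries. The code assumes you have already computed one ATR value per month, and it uses tercile cut-offs (bottom third low, top third high) as one reasonable choice of percentile bands; `classify_months` and `expectancy_by_regime` are hypothetical names.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def classify_months(monthly_atr: Dict[str, float],
                    low_pct: float = 1 / 3,
                    high_pct: float = 2 / 3) -> Dict[str, str]:
    """Label each month "low", "medium" or "high" by ATR percentile rank.

    Percentiles are computed within the supplied sample, so the labels use
    hindsight; a live regime filter would need a rolling lookback instead.
    """
    ordered = sorted(monthly_atr, key=monthly_atr.get)
    n = len(ordered)
    labels = {}
    for rank, month in enumerate(ordered):
        pct = (rank + 0.5) / n  # mid-rank percentile, strictly between 0 and 1
        if pct < low_pct:
            labels[month] = "low"
        elif pct >= high_pct:
            labels[month] = "high"
        else:
            labels[month] = "medium"
    return labels


def expectancy_by_regime(trades: Iterable[Tuple[str, float]],
                         labels: Dict[str, str]) -> Dict[str, float]:
    """Average R multiple per regime; trades are (month, r_multiple) pairs."""
    buckets = defaultdict(list)
    for month, r_multiple in trades:
        buckets[labels[month]].append(r_multiple)
    return {regime: mean(values) for regime, values in buckets.items()}
```

A regime whose average R is negative or near zero is the candidate for a size reduction or a supplementary range strategy. Regimes with no trades simply do not appear in the output, which is itself a signal that the sample does not cover them.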

Also backtest through known stress events: March 2020, September 2022 GBP flash moves, and Q4 2023 rate decision sequences. If your drawdown model held within prop firm limits through those periods, the strategy has demonstrated genuine robustness — not curve-fitted performance on a benign sample.

I have completed a backtest of my [STRATEGY NAME] across [DATE RANGE] on [INSTRUMENT]. Total trades: [N], Win rate: [X]%, Average R: [X], Maximum drawdown: [X]%. Segment this performance by volatility regime using monthly ATR percentile (low = bottom 33%, high = top 33%). For each regime, calculate: win rate, average R, maximum consecutive losses, and maximum intraday drawdown as a percentage of a $100,000 prop firm account with a 5% daily loss limit and 10% trailing max drawdown. Identify which regimes expose the account to rule breach risk and recommend position size adjustments per regime.

Building a Repeatable Pre-Challenge Checklist

No strategy should enter a funded evaluation without a structured pre-challenge sign-off. The cost of a failed evaluation ranges from $150 to $600 depending on account size and firm. At three attempts per strategy, that is $450 to $1,800 in fees, a meaningful capital drain before live funding is reached. A documented checklist converts the backtesting process from a one-time exercise into a repeatable quality gate.

The checklist should verify seven conditions: minimum trade sample size met, daily loss limit never breached in backtest, maximum drawdown stayed below 80% of the firm’s limit (buffer for live execution variance), profit target achieved in at least 40% of simulated 30-day windows, strategy has been tested through at least one high-volatility regime, position sizing model documented and reproducible, and news filter applied consistently across the sample.

File the completed checklist alongside your backtest report before activating the evaluation. If a challenge fails, the checklist becomes the diagnostic tool — comparing live session behavior against the backtest baseline to identify whether the failure was a model error, an execution error, or a regime mismatch.

Minimum 200 historical trades in sample before any evaluation attempt
Zero daily loss limit breaches in the full backtest dataset
Max drawdown consumed less than 80% of firm limit across all simulated windows
Profit target hit in 40%+ of Monte Carlo simulation windows
Strategy tested through at least one high-volatility calendar period
Position sizing formula written down and version-controlled
News event exclusion log attached to backtest report
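If you want the sign-off to be mechanical rather than a judgment call, the seven conditions above can be encoded as a gate that returns whatever failed. The record type and field names below are hypothetical; the thresholds are the ones stated in the checklist.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BacktestSummary:
    trade_count: int
    daily_limit_breaches: int
    worst_drawdown_fraction: float   # worst simulated drawdown / firm limit
    target_hit_rate: float           # share of simulated windows hitting target
    high_vol_period_tested: bool
    sizing_version_controlled: bool
    news_log_attached: bool


def failed_checks(s: BacktestSummary) -> List[str]:
    """Names of checklist items that did not pass; an empty list means go."""
    checks = [
        ("minimum 200 historical trades", s.trade_count >= 200),
        ("zero daily loss limit breaches", s.daily_limit_breaches == 0),
        ("max drawdown under 80% of firm limit", s.worst_drawdown_fraction < 0.80),
        ("profit target hit in 40%+ of windows", s.target_hit_rate >= 0.40),
        ("tested through a high-volatility period", s.high_vol_period_tested),
        ("position sizing formula version-controlled", s.sizing_version_controlled),
        ("news event exclusion log attached", s.news_log_attached),
    ]
    return [name for name, passed in checks if not passed]
```

Returning the failed item names, rather than a single pass/fail flag, keeps the output usable as the diagnostic record described above when a live challenge later diverges from the backtest.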


Pass the evaluation with a backtest that reflects the actual rules.

Stop testing in a vacuum. Build your next strategy inside a framework calibrated to your specific prop firm's constraints — and know your pass probability before day one of the challenge.