How Many Trades for a Valid Backtest?
A backtest with 30 trades tells you almost nothing. A backtest with 500 trades can still lie. Here is how to know when your sample size is large enough to trust the numbers.
The 100-trade rule of thumb
The 100-trade minimum is the closest thing trading has to a consensus standard. It is loosely grounded in the Central Limit Theorem: as a rule of thumb, you need roughly 30 to 100 observations before the sampling distribution of the mean starts to look approximately normal. Trading returns are usually skewed and fat-tailed rather than normally distributed, so the higher end of that range is safer.
I learned this the hard way. My first serious TradingView strategy had 38 trades over 3 years of ES futures data. The Sharpe ratio came back at 1.8, the profit factor was 2.3, and the equity curve looked beautiful. I funded a small account with it. Within two months I was down 14 percent. When I went back and calculated the Sharpe on a rolling basis, the 95 percent confidence interval ran from -0.4 to 2.1. Those 38 trades were never enough to tell me anything useful.
This is where Pineify's Backtest Report helps directly. Drop in your CSV and the Monte Carlo tab runs 1,000 bootstrap simulations. If you have fewer than 100 trades, the confidence bands will be so wide that the tool effectively tells you "this data set is too small to draw conclusions from." I have seen this happen with my own 40-trade strategy: the 95 percent confidence interval on total return spanned from -30 percent to +65 percent. A coin flip would give you a narrower range.
Why small samples overstate your edge
Small backtests suffer from selection bias by construction. You found a set of parameters that work on this particular slice of history. With fewer trades, the odds that you found a real pattern rather than a random coincidence drop fast.
Consider a strategy with a true win rate of 50 percent. Over 20 trades, the probability of seeing a 65 percent win rate or better by chance alone is roughly 13 percent. Over 100 trades, that probability drops below 0.5 percent. A small sample lets randomness masquerade as skill.
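The numbers in the paragraph above can be checked with an exact binomial tail sum. This is a minimal sketch that assumes independent trades with a fixed win probability; the function name `prob_win_rate_at_least` is made up for illustration.

```python
from math import ceil, comb


def prob_win_rate_at_least(n_trades, observed_rate, true_rate=0.5):
    """Exact binomial probability of at least `observed_rate` wins by chance alone."""
    # Smallest whole number of wins that reaches the observed win rate.
    # The tiny offset guards against floating-point overshoot (e.g. 13.000000001).
    min_wins = ceil(n_trades * observed_rate - 1e-9)
    return sum(
        comb(n_trades, k) * true_rate**k * (1 - true_rate) ** (n_trades - k)
        for k in range(min_wins, n_trades + 1)
    )


for n in (20, 50, 100):
    p = prob_win_rate_at_least(n, 0.65)
    print(f"{n:>3} trades: P(win rate >= 65%) = {p:.4f}")
```

For 20 trades this comes out at about 0.13, and for 100 trades it falls well under 0.005, matching the figures above. Real trades are rarely fully independent, so treat the result as a lower bound on how often luck can look like skill.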
The same problem applies to drawdown. A strategy that survived 30 trades with a 5 percent max drawdown might have a true max drawdown of 25 percent, simply because the worst sequence has not shown up yet. The smaller the sample, the more likely you are underestimating your real risk. I ran a test on a breakout strategy with 200 trades total, then sampled random subsets of 40 trades. The max drawdown across those subsets varied from 4 percent to 22 percent. The 40-trade samples consistently underestimated the real drawdown.
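The subset experiment described above can be reproduced in outline. The sketch below uses synthetic trades (a hypothetical 45 percent win rate with +2 percent winners and -1 percent losers), not my breakout strategy's data, and it assumes the subsets are contiguous 40-trade windows so that trade order is preserved.

```python
import random


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


rng = random.Random(42)
# Hypothetical trade list for illustration only: 200 synthetic trades.
trades = [0.02 if rng.random() < 0.45 else -0.01 for _ in range(200)]

full_dd = max_drawdown(trades)

# Measure max drawdown on 500 randomly placed 40-trade windows.
subset_dds = []
for _ in range(500):
    start = rng.randrange(0, len(trades) - 40 + 1)
    subset_dds.append(max_drawdown(trades[start:start + 40]))

print(f"Full 200-trade max drawdown: {full_dd:.1%}")
print(f"40-trade windows: {min(subset_dds):.1%} to {max(subset_dds):.1%}")
```

With contiguous windows, a subset can never show a deeper drawdown than the full series, only an equal or shallower one. That is the mechanical reason small samples lean toward understating risk.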
How sample size affects each metric
Not all metrics suffer equally from small samples. Some converge faster than others, and knowing which ones to trust (and which to ignore) helps you read a backtest honestly.
- Win rate stabilizes relatively fast. You can get a rough estimate from 50 to 60 trades, though the range of uncertainty is still wide.
- Profit factor needs 80 to 100 trades before it becomes reliable. A single large winner can distort profit factor badly in smaller samples.
- Sharpe ratio and Sortino ratio need at least 100 trades and ideally 200+. These metrics depend on standard deviation, which is very noisy at small sample sizes.
- Max drawdown is the worst-behaved metric on small samples. It almost always gets worse as you add trades, because you are exposing the strategy to more sequences. A strategy that looks like it has a 5 percent drawdown over 30 trades can easily show 20 percent over 300 trades.
- Monte Carlo percentiles are only meaningful above 100 trades. Below that, the bootstrap resampling reuses the same tiny pool, and the output reflects the input distribution more than any genuine uncertainty estimate.
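To put a number on how wide the uncertainty around a win rate still is at these sample sizes, here is a rough sketch using the normal approximation to the binomial. It assumes independent trades, and `win_rate_margin` is an illustrative helper, not part of any tool mentioned here.

```python
from math import sqrt


def win_rate_margin(n_trades, win_rate=0.5, z=1.96):
    """Approximate 95% margin of error for an observed win rate (normal approximation)."""
    return z * sqrt(win_rate * (1 - win_rate) / n_trades)


for n in (30, 60, 100, 300):
    margin = win_rate_margin(n)
    print(f"{n:>3} trades: win rate known to within +/- {margin * 100:.1f} percentage points")
```

At 100 trades the margin is still about 9.8 percentage points around a 50 percent win rate. The margin shrinks with the square root of the trade count, so halving it takes four times as many trades.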
Monte Carlo and the sample size trap
Monte Carlo simulation seems like a magic fix for small sample problems. You have 40 trades, so you bootstrap 1,000 sequences from them and suddenly you have what looks like rich statistical output. But this is a trap.
The bootstrap resamples from your original 40 trades with replacement. Every simulated sequence is a recombination of those same 40 data points. If those 40 trades happened to include a few lucky outliers, those same outliers show up in nearly every bootstrap run. The simulation tells you what could happen if the same 40 trades were drawn again in different combinations, not what the strategy would do on new unseen data.
I have tested this directly. I took a strategy with 240 trades and sampled 40-trade subsets, running 1,000 Monte Carlo simulations on each subset. The 95 percent confidence interval for total return ranged from -42 percent to +78 percent across the subsets. On the full 240-trade set, the interval narrowed to -18 percent to +31 percent. The small samples were not giving me useful information. They were giving me the illusion of information.
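The same effect shows up on synthetic data. The sketch below is not the implementation behind any tool mentioned here. It draws 240 hypothetical per-trade returns from a normal distribution, splits them into six 40-trade subsets, and bootstraps a 95 percent percentile interval for the mean return per trade on each subset. I use the mean per trade rather than total return so that intervals from samples of different sizes are directly comparable.

```python
import random
from statistics import mean


def bootstrap_mean_interval(trades, n_sims=1000, seed=0):
    """95% percentile interval for mean return per trade, via resampling with replacement."""
    rng = random.Random(seed)
    sims = sorted(mean(rng.choices(trades, k=len(trades))) for _ in range(n_sims))
    return sims[int(0.025 * n_sims)], sims[int(0.975 * n_sims) - 1]


rng = random.Random(1)
# Hypothetical per-trade returns, for illustration only.
all_trades = [rng.gauss(0.002, 0.02) for _ in range(240)]
subsets = [all_trades[i:i + 40] for i in range(0, 240, 40)]

subset_intervals = []
for i, subset in enumerate(subsets, start=1):
    low, high = bootstrap_mean_interval(subset, seed=i)
    subset_intervals.append((low, high))
    print(f"subset {i} (40 trades):  {low:+.2%} to {high:+.2%} per trade")

full_low, full_high = bootstrap_mean_interval(all_trades)
print(f"full set (240 trades): {full_low:+.2%} to {full_high:+.2%} per trade")
```

Each subset's interval is centered on that subset's own average, so a lucky 40-trade window produces a confidently optimistic interval. The bootstrap can only describe the trades it was given.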
Our Monte Carlo implementation flags this. The tool shows a warning when your trade count is too low for reliable simulation results, because pretending 40 trades can support 1,000 simulations helps nobody. For a deeper look at your Monte Carlo output, check the CVaR and expected shortfall calculator and the Value at Risk calculator, which both rely on having enough data for stable tail-risk estimates.
How to tell if your edge is real or noise
A large sample is necessary but not sufficient. Here are the practical checks I run on every backtest before I consider deploying it with capital. These checks are exactly what the Pineify Backtest Report provides, which is why I use it on my own strategies.
- Check the Sharpe ratio confidence interval. If the 95 percent interval includes zero or negative values, your strategy does not have a statistically significant edge. Pineify's KPI dashboard shows the exact Sharpe along with Sortino and Calmar ratios for comparison.
- Run a Monte Carlo simulation and look at the worst percentile. If the 5th percentile outcome shows a drawdown larger than your maximum acceptable risk, the strategy is too fragile, even if the median looks good.
- Split your data in half. Run the strategy on the first half, then check the second half without re-optimizing. If performance drops sharply, the strategy is probably overfit. This is a simple out-of-sample test, the same idea that walk-forward analysis repeats over rolling windows, and our recovery factor calculator helps quantify how well a strategy recovers from drawdowns across different periods.
- Check the MFE/MAE scatter. A healthy strategy shows trades clustered in the upper-right quadrant (good entries, good exits). Random noise shows a diffuse cloud with no pattern. Use Pineify's MFE/MAE analysis tool to visualize this.
- Look at rolling performance. A strategy whose 20-trade rolling Sharpe bounces from +2 to -1 is not reliable. Consistent positive rolling metrics suggest a real edge.
None of these checks replace having enough trades in the first place. But when you do have a solid sample, they help you tell the difference between a strategy with a genuine edge and one that just got lucky in backtesting. The Backtest Report runs all of these checks automatically from your TradingView CSV export.
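For readers who want to see what two of these checks involve, here is a minimal sketch of a bootstrap confidence interval for the Sharpe ratio and a 20-trade rolling Sharpe. It is an illustration under stated assumptions, not the Backtest Report's implementation: the Sharpe here is per trade, not annualized, with the risk-free rate taken as zero, and the 38 trades are synthetic.

```python
import random
from statistics import mean, stdev


def sharpe(returns):
    """Per-trade Sharpe ratio: mean / sample standard deviation (assumed risk-free rate of 0)."""
    sd = stdev(returns)
    # A resample of identical values has zero spread; report 0.0 rather than divide by zero.
    return mean(returns) / sd if sd > 0 else 0.0


def bootstrap_sharpe_interval(returns, n_sims=1000, seed=0):
    """95% percentile interval for the Sharpe ratio, resampling trades with replacement."""
    rng = random.Random(seed)
    sims = sorted(sharpe(rng.choices(returns, k=len(returns))) for _ in range(n_sims))
    return sims[int(0.025 * n_sims)], sims[int(0.975 * n_sims) - 1]


def rolling_sharpe(returns, window=20):
    """Sharpe ratio over each consecutive window of `window` trades."""
    return [sharpe(returns[i - window:i]) for i in range(window, len(returns) + 1)]


rng = random.Random(7)
# Hypothetical 38-trade backtest, for illustration only.
trades = [rng.gauss(0.004, 0.02) for _ in range(38)]

point = sharpe(trades)
low, high = bootstrap_sharpe_interval(trades)
rolling = rolling_sharpe(trades, window=20)

print(f"Per-trade Sharpe: {point:.2f}, 95% interval {low:.2f} to {high:.2f}")
print(f"Interval excludes zero: {low > 0}")
print(f"Rolling 20-trade Sharpe range: {min(rolling):.2f} to {max(rolling):.2f}")
```

If the lower bound sits at or below zero, the sample cannot distinguish the strategy from one with no edge. A wide spread in the rolling values points the same way.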
Practical rules for sample size
- Aim for 100 trades as the absolute minimum for any strategy you plan to deploy. Below this, treat all metrics as exploratory.
- For strategies with high win rates (70+ percent) and tight stop-losses, 100 to 150 trades may be sufficient because the trade outcomes are less variable.
- For mean reversion or trend-following strategies with wide variation in trade outcomes, 200 to 300 trades gives much more reliable estimates.
- If your strategy generates fewer than 1 trade per month on a daily timeframe, consider testing it on multiple instruments or over a longer backtest period to increase the trade count before drawing conclusions.
- Run the same instrument through the Strategy Optimizer to find parameter combinations that maximize your trade count without sacrificing edge quality.
One concession worth stating directly: Pineify is not a trading journal, so it does not track your live trades or sync with your broker. You need a separate journaling system for that. But for analyzing the backtest you already ran in TradingView, the trade-count validation, Monte Carlo simulation, and rolling analysis together give you a much clearer picture of whether your sample is large enough to trust.
Check Your Backtest Sample Size
Upload your TradingView CSV to Pineify Backtest Report and see whether you have enough trades for a statistically meaningful analysis. Monte Carlo simulation, Sharpe ratio, and sample size warnings included.