Is My Trading Strategy Good? A Backtest Quality Checklist

You ran the backtest in TradingView. Now you need to know if the result is real or luck. Here is a 6-point checklist to evaluate backtest quality and decide whether to trust the strategy.

When I first started evaluating strategies, I assumed a high win rate and a nice equity curve were enough. I was wrong. A strategy I pulled from TradingView showed a 68% win rate and a 2.1 profit factor over 300 trades. It looked like a winner. When I uploaded the CSV to Pineify, the Monte Carlo simulation showed that 32% of the randomized runs lost money. The Sharpe Ratio came back at 0.54, not the 1.2 I had guessed from the win rate alone. That strategy never went live, and six months later, watching the market, I could see why: the edge was thin.

This checklist covers the six dimensions that separate a reliable backtest from a misleading one. Each section includes the numbers you should look for and links to the specific Pineify calculator tools that compute them automatically from your TradingView CSV.

Checklist Item 1

Sample size: how many trades matter

Sample size is the foundation. Without enough trades, every other metric is unreliable. Statistical significance requires a minimum number of observations, and backtesting is no different.

I use four tiers. Below 50 trades, the result is essentially random. Between 50 and 150 trades, you get directional signals but the margin of error is large. Above 150 trades, risk-adjusted metrics like Sharpe and Sortino start to converge. Above 300 trades, the Monte Carlo confidence intervals tighten enough that I trust the median outcome.

My own cutoff is 100 trades for a preliminary pass. Below that, I do not run Monte Carlo simulation because the bootstrap range is too wide to act on. I once tested a breakout strategy with 78 trades that showed a 1.8 profit factor. The 95% confidence interval from Monte Carlo ranged from a 22% loss to a 45% gain. That range tells you the sample is not large enough for a verdict.
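
The tiers and the 100-trade cutoff can be written down as a small helper. This is only a sketch: the function names are made up for illustration, and the handling of the exact boundaries (150 and 300 trades) is an assumption, since the tiers are rules of thumb rather than hard limits.

```python
def sample_size_tier(n_trades: int) -> str:
    """Map a trade count to one of the four rough reliability tiers."""
    if n_trades < 50:
        return "unreliable"
    if n_trades < 150:
        return "directional"
    if n_trades <= 300:  # assumption: exactly 300 still counts as "stabilizing"
        return "stabilizing"
    return "high confidence"


def worth_running_monte_carlo(n_trades: int, cutoff: int = 100) -> bool:
    """Apply the preliminary-pass cutoff before spending time on Monte Carlo."""
    return n_trades >= cutoff
```

For the 78-trade breakout example, this returns "directional" and skips Monte Carlo, which matches the conclusion that the sample was too small for a verdict.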

Quick check

  • Under 50 trades: unreliable, treat with extreme caution
  • 50 to 150 trades: directional signal, wide error margin
  • 150 to 300 trades: metrics start to stabilize
  • Over 300 trades: high confidence, narrow bands

Checklist Item 2

Risk-adjusted return: beyond win rate

Win rate is the most misleading metric in trading. A strategy with a 35% win rate and a 1:3 risk-reward ratio can match or beat a strategy with a 70% win rate and a 1:1 ratio. Risk-adjusted returns account for this by measuring return per unit of risk.
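
The arithmetic behind that comparison is expectancy per trade, measured in units of risk (R). The sketch below assumes every loss is exactly 1R and every win is exactly the stated multiple, which real trades never are. The function name is illustrative.

```python
def expectancy_r(win_rate: float, reward_to_risk: float) -> float:
    """Expected profit per trade in R, assuming each loss costs exactly 1R."""
    return win_rate * reward_to_risk - (1.0 - win_rate)


low_win_rate = expectancy_r(0.35, 3.0)   # 35% winners at 1:3
high_win_rate = expectancy_r(0.70, 1.0)  # 70% winners at 1:1
slightly_better = expectancy_r(0.40, 3.0)  # 40% winners at 1:3
```

With these exact numbers the two strategies come out level at roughly 0.40R per trade, and the low-win-rate version pulls ahead as soon as its win rate rises a few points, to roughly 0.60R at 40%.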

The three ratios I check first are Sharpe, Sortino, and Calmar. Sharpe tells you return per unit of total volatility. Sortino tells you return per unit of downside volatility only, which matters more for most traders because upside volatility is not really risk. Calmar divides annualized return by maximum drawdown, giving you a direct measure of drawdown efficiency.

I look for Sharpe above 1.0 for a strategy worth considering. Above 1.5 is strong. Below 0.5, the strategy is not compensating you for the risk taken. Sortino should be higher than Sharpe because it excludes upside volatility. With roughly symmetric returns it lands well above Sharpe, so if Sortino is not clearly above Sharpe, most of the volatility is coming from the downside, which is a yellow flag.
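
For reference, here is a minimal sketch of the three ratios computed from a list of per-period returns (0.01 means +1%). It assumes daily returns, 252 periods per year, and a zero risk-free rate and target. Calculators differ on these conventions, so the numbers may not match any particular tool exactly.

```python
import math
import statistics


def sharpe(returns, periods_per_year=252, risk_free=0.0):
    """Annualized mean excess return divided by total volatility."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)


def sortino(returns, periods_per_year=252, target=0.0):
    """Like Sharpe, but the denominator only counts returns below the target."""
    excess = [r - target for r in returns]
    downside_dev = math.sqrt(sum(min(e, 0.0) ** 2 for e in excess) / len(excess))
    return statistics.mean(excess) / downside_dev * math.sqrt(periods_per_year)


def calmar(returns, periods_per_year=252):
    """Annualized compound return divided by maximum drawdown."""
    equity, peak, max_dd = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        max_dd = max(max_dd, 1.0 - equity / peak)
    annualized = equity ** (periods_per_year / len(returns)) - 1.0
    return annualized / max_dd  # raises ZeroDivisionError if there was no drawdown
```

The sketch does not guard against a series with no losing periods, where the Sortino and Calmar denominators are zero.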

Checklist Item 3

Drawdown: how bad does it get

The maximum drawdown number in TradingView's summary tells you the peak-to-trough drop, but it does not tell you how often deep drawdowns happen or how long they last. A strategy with one deep 30% drawdown over 5 years is very different from a strategy that hits 30% drawdown every 6 months.

I look at three drawdown measures. Average drawdown tells you the typical peak-to-trough decline. Maximum drawdown tells you the worst case. The Ulcer Index measures drawdown depth and duration combined, penalizing strategies that stay underwater for a long time. A strategy I tested on ES futures had a maximum drawdown of 12%, which looked acceptable, until the Ulcer Index flagged how long it stayed underwater: it spent 40% of the backtest period below the peak equity. That signaled a strategy that recovered slowly.
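
These measures can all be derived from the equity curve. The sketch below is one way to do it, not necessarily how any given tool defines them: average drawdown here is the mean of the deepest point of each separate drawdown episode, and the Ulcer Index is the root mean square of the percentage drawdown at every point. The time-underwater figure is an extra added for illustration.

```python
import math


def drawdown_stats(equity):
    """equity: list of account values, oldest first."""
    peak = equity[0]
    depths = []           # percent below the running peak at each point
    episode_troughs = []  # deepest point of each separate drawdown episode
    current = 0.0
    for value in equity:
        if value >= peak:
            if current > 0.0:  # a drawdown episode just ended
                episode_troughs.append(current)
                current = 0.0
            peak = value
        depth = 100.0 * (peak - value) / peak
        current = max(current, depth)
        depths.append(depth)
    if current > 0.0:  # still underwater at the end of the data
        episode_troughs.append(current)
    underwater = sum(1 for d in depths if d > 0.0)
    return {
        "max_drawdown_pct": max(depths),
        "avg_drawdown_pct": (sum(episode_troughs) / len(episode_troughs)
                             if episode_troughs else 0.0),
        "ulcer_index": math.sqrt(sum(d * d for d in depths) / len(depths)),
        "time_underwater_pct": 100.0 * underwater / len(depths),
    }
```

Two curves with the same maximum drawdown can produce very different Ulcer Index values, because every period spent below the peak adds to the sum.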

For my own trading, I reject any strategy with maximum drawdown above 25% unless the Sharpe is above 2.0. I also check whether the drawdowns cluster in specific market regimes. If all the deep drawdowns happen in trending markets, the strategy is essentially a range-bound system that fails when the market trends.

Checklist Item 4

Return distribution: fat tails and skew

A strategy that looks good on average can be hiding a problematic return distribution. If most trades are small winners and a few trades are massive losers, the average return tells a misleading story. The returns distribution histogram with a normal curve overlay shows you exactly this.

Skewness tells you whether the distribution is asymmetric. Negative skew means more outliers on the losing side, which is dangerous because extreme losses can wipe out months of small gains. Kurtosis tells you about tail thickness. High kurtosis means more extreme outcomes than a normal distribution predicts. I have seen strategies with positive average returns and kurtosis above 8, which suggested the strategy was quietly building risk that could show up as a single catastrophic trade.
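
Both numbers come from the moments of the trade returns. A minimal sketch using population moments follows. It reports excess kurtosis (a normal distribution scores 0), which is an assumption: some tools report raw kurtosis, where normal is 3, so check which one you are reading before comparing against a threshold.

```python
import statistics


def skew_and_excess_kurtosis(returns):
    """Return (skewness, excess kurtosis) using population moments."""
    n = len(returns)
    mean = statistics.fmean(returns)
    m2 = sum((r - mean) ** 2 for r in returns) / n  # variance
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis
```

Nine small winners of +1 and one loser of -9 average out to zero, yet this pattern scores a skew of about -2.7 and clearly positive excess kurtosis, which is the "small winners, massive loser" shape described above.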

Pineify's returns distribution tab plots the histogram with a normal curve overlay, and the Sharpe and Sortino ratios automatically reflect the risk from non-normal distributions. Value at Risk (VaR) gives you the dollar loss that only the worst 5% of trading days exceed, and Conditional VaR gives you the average loss across those worst days. Both are numbers you should know before you commit capital.
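
To make the two tail measures concrete, here is a sketch of the simplest (historical) way to estimate them from a list of daily profit and loss figures. It is an illustration under that assumption, not a description of how Pineify computes them. Parametric and other estimators exist and give different numbers.

```python
def historical_var_cvar(daily_pnl, tail=0.05):
    """Return (VaR, CVaR) as positive loss amounts from historical daily P&L."""
    ordered = sorted(daily_pnl)               # worst day first
    k = max(1, int(len(ordered) * tail))      # number of days in the tail
    worst_days = ordered[:k]
    var = -worst_days[-1]                     # mildest loss inside the tail
    cvar = -sum(worst_days) / k               # average loss across the tail
    return var, cvar
```

CVaR is never smaller than VaR, and a large gap between them is another sign of a fat left tail.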

Checklist Item 5

Stress testing: does the strategy hold up

A strategy that survives Monte Carlo simulation is one you can trust far more than one backed by a single equity curve. Monte Carlo takes your actual trades and reshuffles them into 1000 random sequences. If the strategy depends on a specific trade order (three winners followed by a loser, repeated), the reshuffled sequences will reveal that dependency.

Pineify runs 1000 bootstrap simulations and shows you the equity curve fan chart with 95% and 99% confidence intervals. I look for two things. First, the median Monte Carlo result should be positive. Second, fewer than 20% of the 1000 runs should be unprofitable. Between 20% and 30% is a gray zone. If more than 30% of simulations lose money, the strategy edge is too thin to trust.
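
A bare-bones version of the bootstrap idea is sketched below, assuming the input is a list of per-trade profit and loss values. It resamples trades with replacement, which is what lets the final result vary from run to run. A plain reshuffle keeps the same total and only changes the path and the drawdowns. This is an illustration, not Pineify's implementation, and the function name and seed are arbitrary.

```python
import random


def bootstrap_monte_carlo(trade_pnls, runs=1000, seed=42):
    """Resample trades with replacement and summarize the simulated totals."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(trade_pnls)
    totals = sorted(sum(rng.choices(trade_pnls, k=n)) for _ in range(runs))
    return {
        "median_total": totals[runs // 2],
        "losing_share": sum(1 for t in totals if t < 0) / runs,
        "ci95": (totals[runs * 25 // 1000], totals[runs * 975 // 1000 - 1]),
    }
```

Applied to the two checks above, the median total should be positive and the losing share should stay under 0.20.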

I also use the System Quality Number (SQN), developed by Van Tharp, which combines expectancy, standard deviation of returns, and number of trades into a single score. SQN above 3.0 is excellent. Below 2.0, the strategy needs more testing before you trade it with real money. When I ran SQN on my own portfolio of strategies, two that I thought were solid came in at 1.7 and 1.9. Both had looked fine on win rate and profit factor alone.
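
The commonly cited form of SQN is the square root of the trade count times the average trade result divided by its standard deviation. A sketch, assuming results are expressed as R-multiples (profit divided by initial risk) and using the sample standard deviation:

```python
import math
import statistics


def sqn(r_multiples):
    """System Quality Number: sqrt(N) * mean(R) / stdev(R)."""
    n = len(r_multiples)
    expectancy = statistics.mean(r_multiples)
    return math.sqrt(n) * expectancy / statistics.stdev(r_multiples)
```

Because of the square root of N, the same expectancy and volatility score higher with more trades, which ties SQN back to the sample size check in item 1. Some versions cap N at 100 so that very long backtests do not inflate the score. This sketch does not.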

Checklist Item 6

Out-of-sample and walk-forward validation

This is the closest thing to a truth test. Take the first 70% of your data, optimize the strategy on it, then apply the same parameters to the remaining 30% without re-optimizing. If the out-of-sample results are significantly worse, the optimization was fitting to noise.

The walk-forward efficiency ratio compares out-of-sample performance to in-sample performance. Above 80% is excellent. 60 to 80% is acceptable. Below 60%, discard the strategy. I have personally discarded a mean reversion strategy that showed 2.8 profit factor in-sample but scored 41% on walk-forward. The out-of-sample equity curve was flat for two years.
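
The split and the efficiency ratio can be sketched as follows. Definitions of walk-forward efficiency vary (annualized return is a common basis). This version compares average profit per trade, which is an assumption made to keep the example short, and it only makes sense when the in-sample average is positive. All names are illustrative.

```python
def split_in_out(trades, in_sample_pct=70):
    """Split chronologically ordered trades into in-sample and out-of-sample."""
    cut = len(trades) * in_sample_pct // 100
    return trades[:cut], trades[cut:]


def walk_forward_efficiency(in_sample, out_sample):
    """Out-of-sample average trade as a percentage of in-sample average trade."""
    in_avg = sum(in_sample) / len(in_sample)
    out_avg = sum(out_sample) / len(out_sample)
    return 100.0 * out_avg / in_avg


def wfe_verdict(wfe_pct):
    """Apply the thresholds from the text."""
    if wfe_pct > 80:
        return "excellent"
    if wfe_pct >= 60:
        return "acceptable"
    return "discard"
```

The mean reversion strategy that scored 41% falls in the "discard" band under these thresholds.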

One more check I add to this list: test the strategy on a different market or a different timeframe. If a strategy works on Apple and fails on Microsoft, it is not a strategy you can rely on; it is a stock-specific pattern miner. I use Pineify's Rolling Window Analysis with 20-trade windows to check whether performance holds up across different market regimes within the same dataset.
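
A rolling-window check is easy to approximate by hand. The sketch below computes profit factor over every consecutive 20-trade window. It is a generic illustration of the idea, not Pineify's Rolling Window Analysis, and the choice of profit factor as the metric is an assumption.

```python
def rolling_profit_factor(trade_pnls, window=20):
    """Profit factor (gross profit / gross loss) for each sliding window of trades."""
    results = []
    for start in range(len(trade_pnls) - window + 1):
        chunk = trade_pnls[start:start + window]
        gross_profit = sum(p for p in chunk if p > 0)
        gross_loss = -sum(p for p in chunk if p < 0)
        # A window with no losing trades has an undefined (infinite) profit factor.
        results.append(gross_profit / gross_loss if gross_loss > 0 else float("inf"))
    return results
```

If the values swing from well above 1 to well below 1 depending on where the window sits, the overall profit factor is averaging over regimes where the strategy does not work.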

Run this checklist on your backtest

Upload your TradingView CSV to Pineify and get all 16+ KPI metrics in one dashboard. Sharpe, Sortino, Monte Carlo, drawdown analysis, and more. Everything runs in your browser, nothing leaves your device, and no account is required.

FAQ