A holdout test is a controlled experiment in which a randomly selected group of users (the holdout, or control, group) is deliberately excluded from receiving a marketing campaign, product feature, or site change. The holdout group's behavior is then compared with the exposed group's behavior to measure the causal lift created by the treatment. Where a typical A/B test compares two versions of something against each other, a holdout test measures whether an activity adds value compared with doing nothing at all.
Holdout Lift = (Conversion Rate Exposed − Conversion Rate Holdout) / Conversion Rate Holdout × 100
If your exposed group converts at 5.4% and your holdout group converts at 4.5%:
Holdout Lift = (5.4% − 4.5%) / 4.5% × 100 = 20% lift
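The same calculation can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation; the function name `holdout_lift` is an assumption made for this example, and rates are passed as fractions (0.054 for 5.4%).

```python
def holdout_lift(exposed_rate: float, holdout_rate: float) -> float:
    """Return the relative lift of the exposed group over the holdout, in percent.

    Both rates are fractions, e.g. 0.054 for a 5.4% conversion rate.
    """
    if holdout_rate <= 0:
        # Relative lift is undefined when the baseline rate is zero.
        raise ValueError("holdout_rate must be positive")
    return (exposed_rate - holdout_rate) / holdout_rate * 100


# The worked example from the text: 5.4% exposed vs. 4.5% holdout.
lift = holdout_lift(0.054, 0.045)
print(f"{lift:.1f}% lift")  # prints "20.0% lift"
```

Note that this is a relative lift: the absolute difference in the example is only 0.9 percentage points, which is the figure to use when estimating how many extra orders the campaign produced.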
Why Holdout Tests Matter for Ecommerce
Attribution models tell you which channel received credit for a conversion; holdout tests tell you whether removing that channel would have reduced conversions. These are different questions, and holdout tests give you the answer that actually matters for budget decisions.
They're particularly important for channels like retargeting (which targets people already inclined to buy), loyalty emails (sent to customers who shop regularly anyway), and push notifications (sent to engaged users). Attributed performance on all these channels looks great — but if your holdout group converts nearly as often without them, you're paying for conversions that would have happened for free.
For D2C brands managing tight acquisition budgets, understanding which spending is truly incremental can free up 15–30% of channel budget to redirect toward genuinely growth-driving activities.
Real-World Example
The Man Company runs a weekly push notification to its app users about new grooming bundles priced around ₹999. The attributed click-to-purchase rate is 4.2% and the campaign looks profitable. To validate this, they run a holdout test: 15% of eligible users receive no push notification for four weeks. The holdout group's purchase rate over the same period is 3.7%. Taking 4.2% as the exposed group's purchase rate, incremental lift from push is (4.2% − 3.7%) / 3.7% × 100 ≈ 13.5%. This is meaningful but lower than the attributed rate suggests, because many users who click the notification would have bought anyway that week via direct app opens. The finding doesn't justify killing push notifications, but it calibrates how aggressively to invest in new push capabilities.
How to Improve / Optimize Holdout Tests
- Size your holdout group carefully: Too small and you'll lack statistical power; too large and you're unnecessarily withholding potentially valuable marketing from customers. 10–20% holdout is typical for channel-level tests.
- Run holdouts for complete purchase cycles: If your average customer takes 14 days from first touch to purchase, run the holdout for at least 30 days (about two full cycles) so that users who are first touched partway through the test still have time to convert.
- Randomize at the user level, not session level: Holdout groups should be stable — a user is either always in the holdout or never in it during the test period. Session-level randomization creates contamination.
- Control for seasonality: Measure the holdout and exposed groups over the same calendar window, and don't compare results from a sale period against results from a normal period. Lift measured during a promotion may not hold at other times of year.
- Document and share holdout results internally: Holdout findings often challenge conventional wisdom about which channels are "working." Make sure the results reach the decision-makers who control budget.
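One common way to get stable, user-level assignment is to hash the user ID instead of drawing a random number per session. The sketch below assumes string user IDs and a 15% holdout. The function name `in_holdout` and the test name `"push-weekly-bundles"` are hypothetical, chosen only for illustration.

```python
import hashlib


def in_holdout(user_id: str, test_name: str, holdout_fraction: float = 0.15) -> bool:
    """Deterministically decide whether a user belongs to the holdout group.

    The same (user_id, test_name) pair always yields the same answer, so a
    user never switches groups between sessions during the test.
    """
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be between 0 and 1")
    # Salt the hash with the test name so that different tests get
    # independent splits of the same user base.
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return bucket < holdout_fraction


# Hypothetical user IDs, for illustration only.
users = [f"user-{i}" for i in range(10_000)]
holdout = [u for u in users if in_holdout(u, "push-weekly-bundles")]
share = len(holdout) / len(users)
print(f"{share:.1%} of users are held out")
```

Because assignment depends only on the user ID and the test name, no lookup table is needed and every system that sends the campaign can apply the same exclusion rule independently.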
Holdout Tests in A/B Testing
Holdout tests are a form of A/B test where the "B" variant is the absence of a treatment. The same principles apply: random assignment, adequate sample size, a clearly defined success metric, and a pre-registered hypothesis. Many A/B testing platforms can run holdout experiments natively; the key is ensuring your control group is genuinely excluded from the treatment, not just a subset that happens to see it less often.
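To check whether an observed lift is distinguishable from noise, a two-proportion z-test is one standard option. The sketch below uses only the Python standard library. The function name and the group sizes and conversion counts in the usage example are hypothetical, chosen to match the 5.4% and 4.5% rates used earlier; they are not data from a real test.

```python
import math


def two_proportion_z_test(conv_exposed: int, n_exposed: int,
                          conv_holdout: int, n_holdout: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_exposed = conv_exposed / n_exposed
    p_holdout = conv_holdout / n_holdout
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_exposed + conv_holdout) / (n_exposed + n_holdout)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_exposed + 1 / n_holdout))
    if se == 0:
        raise ValueError("standard error is zero; no conversions or all converted")
    z = (p_exposed - p_holdout) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 85,000 exposed users (5.4% convert) and
# 15,000 holdout users (4.5% convert).
z, p_value = two_proportion_z_test(4590, 85_000, 675, 15_000)
print(f"z = {z:.2f}, p = {p_value:.2g}")
```

With groups this large the difference is clearly significant. With a much smaller holdout the same rates could easily fail to reach significance, which is why holdout size and test duration should be decided before the test starts.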