A Type II error (also called a false negative or beta error) occurs in hypothesis testing when you fail to reject a false null hypothesis — meaning you conclude there is no difference between control and variant when a real difference actually exists. In A/B testing: you declare "no winner" or stop the test when the variant genuinely improves (or harms) conversion. The probability of a Type II error is beta (β); statistical power is (1 − β). At 80% power, β = 0.20 — a 20% chance of missing a real effect.
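To make the definition concrete, the following is a minimal simulation sketch, not a production test harness. It assumes a pooled two-proportion z-test and made-up numbers (a 10% control rate, a 12% variant rate, 2,000 visitors per arm); the function names are hypothetical. Because the variant truly differs from control in every simulated test, each non-significant result is a Type II error, and the share of such results estimates β.

```python
import random
from statistics import NormalDist


def two_sided_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    if se == 0:
        return 1.0  # both arms all-zero or all-one: no evidence of a difference
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulated_type_ii_rate(p_control, p_variant, n_per_arm,
                           alpha=0.05, runs=500, seed=0):
    """Estimate beta: the share of simulated tests that miss a real effect."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(runs):
        # Draw conversions for each arm from the true (different) rates.
        conv_a = sum(rng.random() < p_control for _ in range(n_per_arm))
        conv_b = sum(rng.random() < p_variant for _ in range(n_per_arm))
        # Failing to reject here is a Type II error, since the rates differ.
        if two_sided_p_value(conv_a, n_per_arm, conv_b, n_per_arm) >= alpha:
            misses += 1
    return misses / runs


# Illustrative inputs only; these rates and sample sizes are assumptions.
beta_hat = simulated_type_ii_rate(0.10, 0.12, n_per_arm=2000)
print(f"Estimated beta: {beta_hat:.2f}, estimated power: {1 - beta_hat:.2f}")
```

With these assumed inputs the test misses the real effect in a large share of runs, which is what an underpowered test looks like in practice.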
Why Type II Error Matters for Ecommerce
Type II errors are arguably more costly than Type I errors for growth-stage D2C brands, because they cause you to discard improvements that would have driven real revenue. If your variant genuinely lifts checkout completion by 6% but your test is underpowered, you conclude "no difference" and revert to control — giving up ₹X lakh/month in additional revenue indefinitely.
Underpowered testing is endemic in ecommerce because teams underestimate how much traffic their tests need. A team testing on a product category page getting 800 visitors/day will struggle to reach adequate power for anything below a 15–20% relative lift within a typical test window. To detect a 5% relative lift on a 4% baseline conversion rate at 80% power and α = 0.05, they would need on the order of 150,000 visitors per variant, which is about a year of traffic at that volume, yet they stop at 7 days when results look "flat" and move on.
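The traffic requirement in the category-page scenario can be estimated with the standard normal-approximation formula for comparing two proportions. The sketch below assumes a two-sided test, an even 50/50 split, and the 4% baseline and 5% relative lift from the text; `required_n_per_variant` is a hypothetical helper name, and a dedicated power calculator may give slightly different numbers.

```python
import math
from statistics import NormalDist


def required_n_per_variant(baseline, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant for a two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / (p2 - p1) ** 2)


# Scenario from the text: 4% baseline, 5% relative lift, 800 visitors/day.
n = required_n_per_variant(baseline=0.04, relative_lift=0.05)
days = math.ceil(2 * n / 800)  # both variants share the daily traffic
print(f"{n:,} visitors per variant, about {days} days at 800 visitors/day")
```

Under these assumptions the requirement comes out at roughly 150,000 visitors per variant, so a 7-day test on this page covers only a small fraction of the sample it needs.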
The hidden cost of habitual Type II errors: teams conclude their tests don't work, lose confidence in A/B testing, and stop experimenting, permanently forgoing the discovery of real improvements.
Real-World Example
A Mumbai-based fashion brand tests a new checkout flow on its website, expecting a 7% relative lift in checkout completion (an in-house estimate based on UX research). The checkout page gets 1,200 visitors/day. A power analysis shows they need 22,000 visitors per variant (about 37 days at a 50/50 split) to detect a 7% relative lift at 80% power and α = 0.05. Instead, the team stops the test after 14 days with 8,400 visitors per variant. At that sample size, the test is powered at only 47% to detect the target effect: if the true lift really is 7%, they have a 53% chance of committing a Type II error. The test shows no significance, and they discard the new checkout. Six months later, a competitor launches an identical checkout redesign and publicly shares a 9% lift. The brand realises they likely discarded a real winner.
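The cost of stopping early can be approximated without knowing the baseline rate, because the standard error shrinks with the square root of the sample size. The sketch below assumes a two-sided z-test and ignores the negligible chance of rejecting in the wrong direction; `power_at_partial_sample` is a hypothetical helper, not a function from any library.

```python
from statistics import NormalDist


def power_at_partial_sample(n_actual, n_planned, alpha=0.05, planned_power=0.80):
    """Approximate power when stopping at n_actual instead of n_planned.

    Assumes n_planned was sized to reach planned_power for the target effect.
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    z_power = nd.inv_cdf(planned_power)
    # The standardized effect scales with sqrt(n), so shrink it accordingly.
    shift = (z_alpha + z_power) * (n_actual / n_planned) ** 0.5
    return nd.cdf(shift - z_alpha)


# Numbers from the example: stopped at 8,400 of the planned 22,000 per variant.
power = power_at_partial_sample(8_400, 22_000)
print(f"Approximate power: {power:.2f}, approximate beta: {1 - power:.2f}")
```

This approximation gives a power of roughly 41%, somewhat below the 47% quoted in the example; the gap may reflect a different test setup or rounding in the original power analysis, which the example does not specify. Either way, the team was more likely to miss a real 7% lift than to detect it.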
How to Reduce Type II Error
- Run power analysis before every test. Calculate the minimum sample size required to detect your target MDE at 80% (or 90%) power. Never stop a test before reaching this sample size.
- Increase traffic to reduce required test duration. Temporarily increase paid traffic, or concentrate a test on your highest-traffic pages, to reach adequate sample size faster.
- Increase statistical power. Set power at 0.90 instead of 0.80 for high-stakes tests. The required sample size increases by roughly a third (for a two-sided test at α = 0.05), but the chance of missing a real effect drops from 20% to 10%.
- Use variance reduction techniques. CUPED (Controlled-experiment Using Pre-Experiment Data) reduces outcome metric variance using pre-experiment covariate data, improving effective power without more traffic.
- Accept higher MDEs for low-traffic pages. Acknowledge that small-traffic pages cannot reliably detect small effects. Focus low-traffic page tests on hypotheses with expected large lifts (10%+ relative), and accept that smaller improvements won't be detectable.
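The CUPED adjustment mentioned in the list can be sketched in a few lines. This is a minimal illustration on synthetic data, assuming one pre-experiment covariate per user (for example, prior-period spend); the data-generating numbers are invented, and `cuped_adjust` is a hypothetical helper. In a real experiment the coefficient theta is typically estimated on control and variant data pooled together, and the covariate must be measured before exposure so the treatment cannot affect it.

```python
import random
from statistics import fmean, variance


def cuped_adjust(y, x):
    """Return CUPED-adjusted outcomes: y - theta * (x - mean(x))."""
    x_bar, y_bar = fmean(x), fmean(y)
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (len(x) - 1)
    theta = cov_xy / variance(x)  # regression slope of y on x
    return [yi - theta * (xi - x_bar) for xi, yi in zip(x, y)]


# Synthetic data (an assumption): in-experiment spend correlates with prior spend.
rng = random.Random(42)
pre = [rng.gauss(100, 20) for _ in range(5000)]      # pre-experiment covariate
post = [0.8 * xi + rng.gauss(0, 10) for xi in pre]   # in-experiment metric

adjusted = cuped_adjust(post, pre)
reduction = 1 - variance(adjusted) / variance(post)
print(f"Variance reduction: {reduction:.0%}")
```

The adjusted metric keeps the same mean as the original, so the estimated lift is unchanged, while its variance falls in proportion to the squared correlation between the covariate and the outcome. Lower variance means the same traffic yields more power.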
Type II Error in A/B Testing
Type II error is the most common source of missed opportunity in ecommerce CRO programmes. It is caused almost entirely by insufficient sample size relative to the effect size being tested. Systematic power analysis before every test, combined with discipline about not stopping early, prevents most Type II errors. The long-run payoff — discovering real improvements that underpowered tests would have missed — is substantial.