Power analysis is a pre-test statistical calculation that determines the sample size required for an A/B test to reliably detect a specified effect size (the minimum detectable effect) at a given significance level and statistical power. Statistical power is the probability of correctly detecting a real difference when one exists — the probability of avoiding a Type II error. A power analysis ensures your test is neither underpowered (missing real effects) nor run longer than necessary.
The four inputs to a power analysis are:
- Alpha (α): The significance level (typically 0.05). The maximum acceptable false positive rate.
- Power (1 − β): Typically set at 0.80 (80%) or 0.90 (90%). The probability of detecting a real effect.
- Baseline conversion rate: The current conversion rate of your control.
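- Minimum Detectable Effect (MDE): The smallest lift you care about detecting, stated either as a relative lift (for example, 10% of the baseline rate) or as an absolute difference in percentage points.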
The output is the required sample size per variant. Running a test with this sample size gives you the stated power to detect the stated MDE at the stated significance level.
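For reference, one common closed-form approximation ties these inputs to the per-variant sample size. It is the normal-approximation formula for comparing two proportions with a two-sided test; individual calculators may use slightly different variants, and the symbols below are notation introduced here rather than taken from any particular tool:

```latex
n \approx \frac{\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2}\,\left[\,p_1(1-p_1) + p_2(1-p_2)\,\right]}{(p_2 - p_1)^{2}}
```

Here p_1 is the baseline conversion rate, p_2 = p_1 × (1 + MDE) is the variant rate implied by a relative MDE, and z_q is the q-th quantile of the standard normal distribution (about 1.96 for α = 0.05 two-sided and about 0.84 for 80% power). Because the MDE enters as a squared difference in the denominator, halving the MDE roughly quadruples the required sample.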
Why Power Analysis Matters for Ecommerce
Running underpowered A/B tests is one of the most common and costly mistakes in CRO. An underpowered test, one with too few visitors, has a low probability of detecting a real effect. If your variant genuinely improves conversion by 8% but your test has only 40% power, you have a 60% chance of concluding "no effect" and not shipping a real winner. Over many tests, underpowered programmes miss most of the improvements their variants would deliver.
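To see how sample size drives the chance of missing a real winner, the power of a planned test can be estimated before launch. The following is a minimal sketch, assuming a two-sided two-proportion z-test with unpooled variance; the function name `achieved_power` and the inputs (5% baseline, a true 8% relative lift) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def achieved_power(baseline, relative_lift, n_per_variant, alpha=0.05):
    """Approximate power of a two-sided, two-proportion z-test."""
    norm = NormalDist()
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    # Standard error of the difference between the two conversion rates.
    se = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_variant)
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Probability that the test statistic clears the critical value in the
    # direction of the true effect (the tiny opposite-tail term is ignored).
    return norm.cdf((p2 - p1) / se - z_crit)


# Hypothetical scenario: 5% baseline, variant truly 8% better (relative).
for n in (18_000, 48_400):
    power = achieved_power(0.05, 0.08, n)
    print(f"n={n:>6,} per variant: power={power:.0%}, "
          f"chance of missing the winner={1 - power:.0%}")
```

Under these assumptions, the smaller sample lands near 40% power and the larger one near 80%, which is the gap between missing a real winner most of the time and missing it about one time in five.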
For Indian D2C brands with limited traffic, power analysis is essential for making realistic decisions about which tests are feasible. A brand getting 500 visitors/day to a PDP cannot reliably test a 2% relative lift: at typical ecommerce conversion rates, the required sample runs to hundreds of thousands of visitors per variant, which would take years to accumulate. Power analysis reveals this early, prompting the team to test bolder hypotheses (with bigger expected effects) on limited-traffic pages.
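For a low-traffic page, it often helps to turn the question around and ask what lift is detectable within a realistic time budget. The sketch below does that by bisection over the same normal-approximation sample-size formula. The 3% baseline is an assumption (the text above gives none), and the function names `required_n` and `detectable_relative_lift` are hypothetical.

```python
from statistics import NormalDist


def required_n(p1, p2, alpha=0.05, power=0.80):
    """Per-variant sample size, two-sided two-proportion normal approximation."""
    norm = NormalDist()
    z = norm.inv_cdf(1 - alpha / 2) + norm.inv_cdf(power)
    return z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2


def detectable_relative_lift(baseline, n_per_variant, alpha=0.05, power=0.80):
    """Smallest relative lift detectable with n_per_variant, found by bisection."""
    lo, hi = 1e-4, (1 - baseline) / baseline  # hi caps the variant rate at 100%
    for _ in range(60):
        mid = (lo + hi) / 2
        if required_n(baseline, baseline * (1 + mid), alpha, power) > n_per_variant:
            lo = mid  # this lift needs more visitors than we have; look higher
        else:
            hi = mid
    return hi


daily_visitors = 500   # from the example above
baseline = 0.03        # assumed baseline conversion rate
for weeks in (2, 4, 8):
    n = daily_visitors * 7 * weeks // 2  # 50/50 split between two variants
    lift = detectable_relative_lift(baseline, n)
    print(f"{weeks} weeks ({n:,} per variant): detectable relative lift ≈ {lift:.0%}")
```

If the detectable lift that comes back is far larger than anything the planned change could plausibly deliver, the test is not feasible on that page as designed.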
Power analysis also sets realistic stakeholder expectations. When a founder asks "why isn't this test done yet?", showing the pre-calculated required sample size demonstrates that the timeline is driven by traffic volume and statistical requirements, not team slowness.
Real-World Example
The Man Company gets 4,500 unique daily visitors to their grooming product PDPs. Their baseline add-to-cart rate is 7.2%. They want to test a new product image carousel format and consider a 10% relative lift (7.2% → 7.92%) commercially meaningful. Running a power analysis at α = 0.05 and 80% power:
Required sample per variant ≈ 21,200 visitors
With 4,500 daily visitors split 50/50 (2,250 per variant per day), they need approximately 10 days to reach that sample. At 90% power, the requirement rises to roughly 28,300 per variant, or about 13 days. They set a 14-day test duration, which covers two full weekly cycles, finish with 31,500 per variant, and observe a 9.8% relative lift in add-to-cart rate, just below the 10% MDE but visible. With the planned sample reached, they evaluate significance once, at the end of the test, rather than extending it until a threshold is crossed. The power analysis also prevented them from stopping at Day 3, when they had only 6,750 per variant (far short of the required sample) and the result was not yet reliable.
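The sample-size and duration arithmetic for this scenario can be re-derived with a short script. This is a sketch under stated assumptions: a two-sided test, the unpooled normal-approximation formula for two proportions, and hypothetical function names `sample_size_per_variant` and `days_to_reach`. Calculators built on other formulas will return somewhat different figures.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_mde, alpha=0.05, power=0.80):
    """Visitors needed per variant to detect a relative lift (two-sided test)."""
    norm = NormalDist()
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    z_alpha = norm.inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = norm.inv_cdf(power)           # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_reach(n_per_variant, daily_visitors, variants=2):
    """Whole days of traffic needed when visitors are split evenly."""
    return ceil(n_per_variant * variants / daily_visitors)


# Inputs from the example: 7.2% baseline, 10% relative MDE, 4,500 visitors/day.
for power in (0.80, 0.90):
    n = sample_size_per_variant(0.072, 0.10, power=power)
    print(f"power={power:.0%}: {n:,} per variant, "
          f"{days_to_reach(n, 4500)} days at 4,500 visitors/day")
```

Rounding the duration up to whole weeks is a common practical choice, since it keeps weekday and weekend traffic evenly represented in both variants.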
How to Improve / Optimize Power Analysis
- Use an online calculator or your testing platform's built-in tool. Evan Miller's sample size calculator, AB Testguide, and most testing platforms (CustomFit.ai included) provide power analysis tools. No need to calculate manually.
- Set power at 80% as the minimum; 90% for high-stakes tests. 80% means a 20% chance of missing a real effect. For tests on your highest-traffic pages or biggest revenue drivers, use 90% power.
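- Incorporate multiple testing corrections if you have multiple variants or metrics. Each additional comparison raises the chance of a false positive, and correcting for it increases the required sample size. With three variants each compared against control, a Bonferroni correction divides alpha by the number of comparisons (0.05 / 3 ≈ 0.017), and the power analysis should be run at that adjusted alpha.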
- Recalculate when baseline metrics change. If your conversion rate drops from 5% to 3.5% due to seasonal traffic shifts, your original power analysis is invalid. Rerun it with the current baseline.
- Consider one-tailed vs. two-tailed tests. If you only care about detecting a positive lift (not a negative one), a one-tailed test requires a smaller sample size. Use one-tailed only when you are certain the variant cannot perform worse.
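The effect of these choices on required sample size can be compared side by side. The following is a minimal sketch, assuming the unpooled normal-approximation formula for two proportions, hypothetical inputs (5% baseline, 10% relative MDE), a hypothetical helper `sample_size_per_variant`, and three comparisons for the Bonferroni case.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_mde, alpha=0.05, power=0.80,
                            two_tailed=True, comparisons=1):
    """Per-variant sample size with optional Bonferroni and one-tailed settings."""
    norm = NormalDist()
    alpha_adj = alpha / comparisons  # Bonferroni: split alpha across comparisons
    tail_alpha = alpha_adj / 2 if two_tailed else alpha_adj
    z_alpha = norm.inv_cdf(1 - tail_alpha)
    z_beta = norm.inv_cdf(power)
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Hypothetical inputs: 10% relative MDE in every scenario.
base = sample_size_per_variant(0.05, 0.10)
scenarios = {
    "two-tailed, 80% power": base,
    "90% power": sample_size_per_variant(0.05, 0.10, power=0.90),
    "Bonferroni, 3 comparisons": sample_size_per_variant(0.05, 0.10, comparisons=3),
    "baseline drops to 3.5%": sample_size_per_variant(0.035, 0.10),
    "one-tailed": sample_size_per_variant(0.05, 0.10, two_tailed=False),
}
for label, n in scenarios.items():
    print(f"{label:<28} {n:>7,} per variant ({n / base:.2f}x)")
```

Under these assumptions, only the one-tailed setting lowers the requirement. Higher power, a Bonferroni-adjusted alpha, and a lower baseline all raise it, which is why the analysis should be rerun whenever any of these inputs changes.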
Power Analysis in A/B Testing
Power analysis is the mandatory first step of every well-designed experiment. Without it, you are testing blind — you don't know if your test can find what you're looking for. In a well-structured CRO programme, power analysis outputs (required sample, expected duration) are documented in the experiment brief before any test goes live.