Run time (also called test duration) is the planned length of time an A/B experiment must remain active to collect enough data to produce statistically valid results. It is calculated before the test launches, based on the current baseline conversion rate, the minimum lift you want to detect, the desired statistical power, and the daily visitor volume entering the test. Running a test for less than its planned duration is one of the most common causes of false positive results in ecommerce experimentation.
Estimated Run Time (days) = Required Sample Size per Variant / (Daily Visitors × Traffic Allocation %)
Where required sample size is derived from a power calculation using:
- Baseline conversion rate
- Minimum detectable effect (MDE)
- Statistical power (typically 80%)
- Significance level (typically 95%)
Example: If a test requires 20,000 visitors per variant, you're allocating 100% of traffic (split 50/50), and your page gets 2,000 daily visitors:
- Required visitors per variant: 20,000
- Daily visitors per variant: 1,000
- Run time: 20,000 / 1,000 = 20 days
Most practitioners round up to cover at least 2–3 full business cycles (weeks).
Why Run Time Matters for Ecommerce
Stopping a test early — even when results look compelling — dramatically increases the chance of declaring a false winner. Day-of-week effects, weekend browsing patterns, and weekly promotional cycles mean that data collected over fewer than two weeks can be heavily skewed. For Shopify stores that run frequent discounts or festive promotions, an experiment that catches a sale period in the first week will appear to have much higher conversion rates than it will sustain in normal conditions. Proper run time planning accounts for this by requiring the test to span at least one full weekly cycle, and preferably two.
Real-World Example
A mid-size D2C apparel brand on Shopify launched a product page test on a Monday. By Thursday, the variant was showing a 22% lift at 91% confidence. The team were eager to ship but their testing plan had specified 21 days. They waited. By the end of week two, the variant's lead had narrowed to 7% lift at 96% confidence — still a solid win, but very different from the premature read. The final result was reliable and the shipped version maintained its lift over the following month.
How to Set Run Time Correctly
- Calculate run time before launching the test, not by watching the confidence meter.
- Run for at least two full business weeks regardless of what the calculator says, to capture weekly traffic pattern variation.
- Avoid running tests over major sale events (Diwali, Big Billion Days) unless the test is specifically designed for those conditions.
- Increase traffic allocation to shorten run time if you have a backlog of experiments waiting — just ensure both variants receive traffic from the same sources.
- Do not extend tests indefinitely hoping for significance — if the test ends inconclusive, call it and move on.
Run Time in A/B Testing
Run time is the primary guardrail against the peeking problem. Most experimentation platforms allow you to set a target duration, after which the system prompts you to review results. Treating the planned run time as a hard stop — not a soft suggestion — is a core discipline of a well-run optimization program.
Run smarter A/B tests with CustomFit.ai — 14-day free trial, no credit card required.