What Is Epsilon-Greedy? Definition & Guide

Epsilon-greedy is a simple multi-armed bandit algorithm that balances exploration (trying different variants to gather data) and exploitation (sending traffic to the current best variant to maximise reward). The algorithm works as follows: with probability epsilon (ε), it selects a variant uniformly at random from all variants for exploration; with probability 1 − ε, it selects the variant with the highest observed conversion rate for exploitation. Epsilon is a fixed value between 0 and 1, commonly 0.1 (10% exploration) or 0.2 (20% exploration).

Decision rule: Select random variant with probability ε; select best-known variant with probability (1 − ε).
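The decision rule can be sketched in a few lines of Python. This is a minimal illustration rather than a production implementation: the function names, the `stats` dictionary layout, and the choice to treat a variant with no impressions as having a 0.0 conversion rate are all assumptions made for the example.

```python
import random


def choose_variant(stats, epsilon, rng=random):
    """Pick a variant name using the epsilon-greedy rule.

    stats maps variant name -> (conversions, impressions).
    This layout is hypothetical, chosen only for the sketch.
    """
    names = list(stats)
    # Explore: with probability epsilon, pick uniformly among all variants.
    if rng.random() < epsilon:
        return rng.choice(names)

    # Exploit: otherwise pick the highest observed conversion rate.
    def rate(name):
        conversions, impressions = stats[name]
        # Assumption: a variant with no data yet counts as 0.0.
        return conversions / impressions if impressions else 0.0

    return max(names, key=rate)


def record_outcome(stats, name, converted):
    """Update the counts after a visitor has seen variant `name`."""
    conversions, impressions = stats[name]
    stats[name] = (conversions + int(converted), impressions + 1)
```

In use, each visitor triggers one `choose_variant` call, and the observed result (converted or not) is fed back through `record_outcome` so later choices reflect the new data.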

Why Epsilon-Greedy Matters for Ecommerce

Epsilon-greedy is the simplest practical bandit algorithm, making it the easiest to implement and explain to non-technical stakeholders. For ecommerce teams running personalisation experiments or homepage variant tests, it offers a straightforward way to exploit winners while maintaining ongoing exploration — ensuring you don't permanently miss a better variant that underperforms early.

The fixed exploration rate (epsilon) is both its strength and weakness. At ε = 0.1, the algorithm guarantees that 10% of traffic is always spent exploring, even once a clear winner is established. This is consistent and predictable, but it means a fixed slice of traffic keeps going to potentially inferior experiences (slightly under 10%, because a random pick sometimes lands on the leader anyway). In high-revenue contexts (a ₹50 crore/year ecommerce site), that exploration tax is a meaningful ongoing cost.
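A rough way to size this exploration cost: if the random pick is uniform over all K variants, leader included, the leader receives (1 − ε) + ε/K of traffic and the non-leading variants together receive ε(K − 1)/K. The helper below is a hypothetical sketch of that arithmetic; the function name and the three-variant example are assumptions, not figures from a real test.

```python
def traffic_shares(n_variants, epsilon):
    """Return (leader_share, non_leader_total) under epsilon-greedy.

    Assumes exploration picks uniformly among all variants,
    including the current leader.
    """
    if n_variants < 1:
        raise ValueError("n_variants must be at least 1")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be between 0 and 1")
    explore_share_each = epsilon / n_variants
    leader_share = (1.0 - epsilon) + explore_share_each
    non_leader_total = explore_share_each * (n_variants - 1)
    return leader_share, non_leader_total


# Hypothetical example: three variants at epsilon = 0.1.
leader, others = traffic_shares(3, 0.1)
print(f"leader: {leader:.1%}, all non-leading variants: {others:.1%}")
```

With three variants at ε = 0.1, roughly 6.7% of traffic ends up on non-leading variants rather than the full 10%.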

For Indian D2C brands experimenting with push notification copy, email subject lines, or landing page headlines at scale, epsilon-greedy is a practical first bandit algorithm because it requires almost no statistical expertise to tune — just set epsilon and let it run.

Real-World Example

A Shopify seller of home furnishings tests three product image styles (lifestyle photos, white background, 360-degree view) using epsilon-greedy at ε = 0.15. Over 10 days, the algorithm observes that lifestyle photos drive the highest add-to-cart rate (6.2% vs. 4.8% and 4.1%). From then on, about 90% of traffic goes to lifestyle photos (the 85% exploitation share plus one third of the 15% exploration traffic), while the other two styles receive roughly 5% each for continued exploration. The seller keeps the algorithm running because their catalogue has 200+ products: the ongoing exploration keeps feeding data on which style works for different product categories (textiles, decor, furniture), allowing ongoing category-level personalisation.

How to Improve / Optimize Epsilon-Greedy

Tune epsilon to your traffic volume. High-traffic pages (10,000+ daily visitors) can afford ε = 0.05 — less exploration is needed because each variant accumulates data quickly. Low-traffic pages need ε = 0.20–0.30 to gather enough signal on non-leading variants.
Decay epsilon over time. Epsilon-decreasing variants reduce ε as the algorithm matures, starting with heavy exploration (ε = 0.3) and settling at a low exploration rate (ε = 0.05) as the best variant becomes clear. This typically wastes less traffic than a fixed epsilon once a leader has emerged.
Compare against Thompson Sampling before choosing. Epsilon-greedy is simpler but typically less sample-efficient than Thompson Sampling. If your engineering team can implement either, Thompson Sampling usually produces better results with the same traffic.
Don't use epsilon-greedy for very short campaigns. Flash sales lasting 24–48 hours don't give the algorithm time to accumulate meaningful data before the campaign ends. Use a predetermined allocation or traditional A/B test for ultra-short windows.
Monitor for non-stationarity. If the conversion rate of all variants suddenly drops (site bug, payment gateway issue), epsilon-greedy will continue exploiting the "best" (least bad) variant. Add monitoring alerts for absolute performance drops, not just relative comparisons.
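The epsilon-decay tip above can be sketched as a simple schedule. This is one possible form (exponential decay toward a floor), not the only one; the function name and the `half_life` parameter, measured in visitors served, are assumptions for illustration, while the 0.3 start and 0.05 floor come from the tip itself.

```python
def decayed_epsilon(step, start=0.3, floor=0.05, half_life=5000):
    """Exponentially decay epsilon from `start` toward `floor`.

    step      -- number of visitors served so far (hypothetical counter)
    half_life -- visitors needed to halve the gap between the current
                 epsilon and the floor (an assumed tuning knob)
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    decay = 0.5 ** (step / half_life)
    return floor + (start - floor) * decay


# Epsilon shrinks as traffic accumulates but never drops below the floor.
for visitors in (0, 5000, 20000, 100000):
    print(visitors, round(decayed_epsilon(visitors), 4))
```

Keeping a non-zero floor means some exploration always continues, which helps the algorithm notice if a previously weaker variant starts to outperform.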

Epsilon-Greedy in A/B Testing

Epsilon-greedy occupies the middle ground between a pure A/B test (fixed 50/50 split, no adaptation) and a fully adaptive algorithm (Thompson Sampling, UCB). It is best thought of as A/B testing with a traffic rebalancing rule: a predetermined portion of traffic always explores, and the majority always exploits the current leader. For teams starting with bandit-style optimisation, epsilon-greedy is a practical entry point before graduating to more sophisticated methods.
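To make the contrast with a fixed split concrete, here is a small simulation sketch. Everything in it is hypothetical: the two "true" conversion rates, the visitor count, and the seed are made-up inputs, and a fixed rotation stands in for a classic even A/B split.

```python
import random


def simulate(true_rates, visitors, epsilon=None, seed=0):
    """Return total conversions over `visitors` simulated visits.

    epsilon=None means a fixed, even split (classic A/B test);
    otherwise traffic is allocated with epsilon-greedy.
    """
    rng = random.Random(seed)
    n = len(true_rates)
    conversions = [0] * n
    impressions = [0] * n

    def observed_rate(i):
        # Assumption: variants with no data yet count as 0.0.
        return conversions[i] / impressions[i] if impressions[i] else 0.0

    for visitor in range(visitors):
        if epsilon is None:
            arm = visitor % n            # fixed rotation, no adaptation
        elif rng.random() < epsilon:
            arm = rng.randrange(n)       # explore: uniform random variant
        else:
            arm = max(range(n), key=observed_rate)  # exploit the leader
        impressions[arm] += 1
        if rng.random() < true_rates[arm]:
            conversions[arm] += 1
    return sum(conversions)


# Hypothetical true conversion rates for two variants.
rates = [0.05, 0.10]
ab_total = simulate(rates, 20_000)
bandit_total = simulate(rates, 20_000, epsilon=0.1)
print("fixed split:", ab_total, "epsilon-greedy:", bandit_total)
```

When the gap between variants is this large, the epsilon-greedy run should collect noticeably more conversions, because most traffic shifts to the stronger variant once enough data has accumulated. The trade-off is that the uneven allocation makes a classical significance test on the results harder to interpret.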


