Put this into practice
Run A/B tests and personalize your store without code. 14-day free trial, no credit card.
Start free trial →
Multi-armed bandit (MAB) is an adaptive experimentation algorithm that explores multiple variants while exploiting the best-performing one, dynamically shifting traffic allocation as results come in. The name comes from the analogy of a gambler facing a row of slot machines ("one-armed bandits") with unknown payoff probabilities: the goal is to maximise total reward by learning which machine pays best while still occasionally trying the others to avoid missing a better option. Unlike a fixed A/B test, which holds a set split (typically 50/50) until a winner is declared, MAB continuously updates traffic weights based on observed performance.
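The explore/exploit loop described above can be sketched with epsilon-greedy, one of the simplest bandit strategies. This is a minimal illustration rather than how any particular testing tool works: the class name, the `epsilon` value, and the simulated conversion rates are all assumptions chosen for the example.

```python
import random


class EpsilonGreedyBandit:
    """Illustrative epsilon-greedy bandit over named variants (hypothetical helper)."""

    def __init__(self, variants, epsilon=0.1, seed=None):
        self.epsilon = epsilon  # share of traffic reserved for exploration
        self.rng = random.Random(seed)
        self.shows = {v: 0 for v in variants}
        self.conversions = {v: 0 for v in variants}

    def rate(self, variant):
        # Observed conversion rate so far; 0.0 until the variant has been shown.
        shown = self.shows[variant]
        return self.conversions[variant] / shown if shown else 0.0

    def choose(self):
        # Explore: occasionally pick a variant at random.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.shows))
        # Exploit: otherwise pick the variant with the best observed rate.
        return max(self.shows, key=self.rate)

    def record(self, variant, converted):
        self.shows[variant] += 1
        self.conversions[variant] += int(converted)


def simulate(true_rates, visitors=10_000, epsilon=0.1, seed=42):
    """Run simulated visitors through the bandit; returns impressions per variant."""
    bandit = EpsilonGreedyBandit(list(true_rates), epsilon=epsilon, seed=seed)
    world = random.Random(seed + 1)  # separate RNG for simulated visitor behaviour
    for _ in range(visitors):
        variant = bandit.choose()
        bandit.record(variant, world.random() < true_rates[variant])
    return bandit.shows
```

Calling `simulate({"control": 0.05, "variant_b": 0.20})` returns how many simulated visitors each variant received. When one variant converts clearly better, it should end up with most of the traffic, while the `epsilon` share keeps the other variants in rotation.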
Traditional A/B testing has a known opportunity cost: during the test, 50% of traffic is sent to a variant that may be significantly worse than control. For high-traffic, high-stakes pages like a product listing page or a flash sale landing page, this "regret" (the lost revenue from running an inferior variant) can be substantial.
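To make "regret" concrete, the sketch below estimates the revenue lost when some traffic goes to a variant that converts worse than the best one. The function name, the conversion rates, and the revenue-per-conversion figure are hypothetical values for illustration, not data from any real test.

```python
def expected_regret(traffic_share, conversion_rate, visitors, value_per_conversion):
    """Expected revenue lost versus sending every visitor to the best variant.

    traffic_share and conversion_rate are dicts keyed by variant name.
    All inputs here are assumed, illustrative numbers.
    """
    best = max(conversion_rate.values())
    regret = 0.0
    for variant, share in traffic_share.items():
        # Conversions lost per visitor by showing this variant instead of the best.
        shortfall = best - conversion_rate[variant]
        regret += visitors * share * shortfall * value_per_conversion
    return regret


# Hypothetical scenario: the variant converts at 2% against control's 3%.
rates = {"control": 0.03, "variant": 0.02}
fixed_split = expected_regret({"control": 0.5, "variant": 0.5}, rates, 100_000, 1000)
shifted_split = expected_regret({"control": 0.9, "variant": 0.1}, rates, 100_000, 1000)
```

Under these assumed numbers, the 50/50 split sends 50,000 visitors to the weaker variant and loses about 500 conversions, or roughly ₹5 lakh at ₹1,000 per conversion. A 90/10 split cuts that to roughly ₹1 lakh.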
MAB minimises this regret by sending progressively more traffic to the winning variant as evidence accumulates. If Variant B is clearly outperforming control after 2 days, a bandit algorithm might shift to 70% Variant B / 30% control automatically — capturing more revenue from the winner while still gathering data on the loser. This is particularly valuable during short, high-intensity events like festive sales where the cost of running an inferior experience for 7+ days at full traffic is high.
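One common way to turn accumulating evidence into traffic weights is Thompson sampling. The source does not say which algorithm a given tool uses, so treat this as a sketch under that assumption. The function name, the uniform Beta(1, 1) prior, and the sample counts are illustrative choices.

```python
import random


def thompson_allocation(stats, draws=10_000, seed=0):
    """Estimate traffic shares from observed results via Thompson sampling.

    stats maps variant name -> (conversions, visitors). Each variant's share
    is the fraction of posterior draws in which it had the highest sampled
    conversion rate. Assumes a uniform Beta(1, 1) prior on every rate.
    """
    rng = random.Random(seed)
    wins = {variant: 0 for variant in stats}
    for _ in range(draws):
        sampled = {
            variant: rng.betavariate(1 + conv, 1 + (n - conv))
            for variant, (conv, n) in stats.items()
        }
        wins[max(sampled, key=sampled.get)] += 1
    return {variant: wins[variant] / draws for variant in stats}


# Hypothetical two-day results: (conversions, visitors) per variant.
allocation = thompson_allocation({"control": (50, 1000), "variant_b": (65, 1000)})
```

The returned shares sum to 1 and favour whichever variant is more likely to be the best given the data so far. With little data the split stays close to even, and it tilts further toward the leader as evidence accumulates.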
For Indian D2C brands with a mix of bestselling SKUs and experimental products, MAB is well-suited to recommendation systems and dynamic content personalisation — continuously optimising which product or offer each user sees.
Pilgrim (an Indian skincare brand) runs a multi-armed bandit test on the "Recommended for You" section of its cart page, comparing three recommendation algorithms. Control shows bestsellers; Variant B shows recently viewed items; Variant C shows "frequently bought together" bundles. The MAB algorithm starts each variant at roughly a third of traffic. Within 48 hours, Variant C (bundle recommendations) shows a ₹180 lift in average order value (AOV). The algorithm shifts traffic to 60% Variant C, 25% control, 15% Variant B. By Day 7, Variant C receives 80% of traffic and is generating an estimated additional ₹2.1 lakh/month in bundle revenue. The team ships Variant C as the default.
MAB sits at the intersection of A/B testing and machine learning. It is most appropriate when speed and revenue optimisation matter more than statistical purity, and when you have multiple variants to evaluate simultaneously. CustomFit.ai supports bandit-style testing alongside traditional A/B experiments, letting teams choose the right approach for each experiment's goals.
Run smarter A/B tests with CustomFit.ai — 14-day free trial, no credit card required.