What Is T-Test? Definition, Formula & Guide

A t-test is a statistical test that compares the means of two groups to determine whether the difference between them is statistically significant. In A/B testing, t-tests are most appropriate for continuous metrics — average order value, revenue per visitor, time on page, or items per cart — where you are comparing averages rather than proportions. The t-test accounts for the variability within each group (the spread of individual order values) when calculating whether the mean difference is meaningful or just noise.

Formula / How a T-Test Works

For an independent samples t-test (the type used in A/B testing):

t = (Mean₁ − Mean₂) / √[(s₁²/n₁) + (s₂²/n₂)]

Where:

Mean₁, Mean₂ = sample means for control and variant
s₁², s₂² = sample variances
n₁, n₂ = sample sizes

The t-statistic is converted to a p-value using a t-distribution with degrees of freedom ≈ n₁ + n₂ − 2. If p < 0.05, the difference in means is statistically significant at the 95% confidence level.

Why T-Tests Matter for Ecommerce

Conversion rate tests use chi-square or z-tests (binary outcomes). But when your experiment's success metric is average order value, cart size, or revenue per visitor, those tests don't apply — you need a t-test. AOV data in ecommerce is notoriously noisy: most orders cluster around a typical value, but a small number of high-value orders skew the mean. The t-test handles this by incorporating variance directly into the significance calculation, meaning it's less likely to be fooled by a few outlier orders into declaring a spurious winner.

Real-World Example

Sugar Cosmetics ran a bundle recommendation experiment where the variant showed "Frequently Bought Together" sets on the cart page. Their primary metric was AOV. Control AOV: ₹1,140 (standard deviation ₹680, n = 8,200). Variant AOV: ₹1,290 (SD ₹710, n = 8,400). Running a two-sample t-test gave t = 8.7, p < 0.001 — a highly significant result. The high SD (large spread in order values) was expected for a cosmetics brand with a wide price range; the t-test correctly accounted for this variance and confirmed the result was genuine.

When to Use a T-Test vs. Other Tests

Use t-test for continuous metrics: AOV, RPV, session duration, items per cart.
Use chi-square or z-test for binary metrics: conversion rate, click-through rate, form completion rate.
Use Mann-Whitney U test (non-parametric alternative) when AOV or revenue data is heavily skewed — it's more accurate when distributions are very non-normal.
Use Welch's t-test (the default in most software) when the two groups may have unequal variances.

How to Apply T-Tests Correctly

Check that your sample sizes are sufficient — t-tests require reasonably large samples (typically 30+ per group) to be reliable.
Look for and cap extreme outliers in revenue data before running the test, or consider a log transformation.
Treat AOV as a secondary metric in most tests; use revenue per visitor as the primary metric since it includes both conversion rate and AOV effects.
Use two-tailed t-tests unless you have a strong prior justification for a one-tailed test.

T-Test in A/B Testing

Advanced testing platforms allow you to specify revenue or AOV as an experiment metric and will apply a t-test automatically. In Python, scipy.stats.ttest_ind runs an independent samples t-test; in R, t.test() does the same. Understanding which test your platform uses for which metric type helps you interpret results accurately.

Run smarter A/B tests with CustomFit.ai — 14-day free trial, no credit card required.

Put this into practice

Run A/B tests and personalize your store without code. 14-day free trial, no credit card.

Start free trial →

Articles about What Is T-Test? Definition, Formula & Guide

In-depth guides and case studies where this concept is put to work.

← Back to Conversion Glossary

Formula / How a T-Test Works

For an independent samples t-test (the type used in A/B testing):

t = (Mean₁ − Mean₂) / √[(s₁²/n₁) + (s₂²/n₂)]

Where:

Mean₁, Mean₂ = sample means for control and variant

s₁², s₂² = sample variances

n₁, n₂ = sample sizes

Why T-Tests Matter for Ecommerce

Real-World Example

When to Use a T-Test vs. Other Tests

Use t-test for continuous metrics: AOV, RPV, session duration, items per cart.

Use chi-square or z-test for binary metrics: conversion rate, click-through rate, form completion rate.

Use Mann-Whitney U test (non-parametric alternative) when AOV or revenue data is heavily skewed — it's more accurate when distributions are very non-normal.

Use Welch's t-test (the default in most software) when the two groups may have unequal variances.

How to Apply T-Tests Correctly

Check that your sample sizes are sufficient — t-tests require reasonably large samples (typically 30+ per group) to be reliable.

Look for and cap extreme outliers in revenue data before running the test, or consider a log transformation.

Treat AOV as a secondary metric in most tests; use revenue per visitor as the primary metric since it includes both conversion rate and AOV effects.

Use two-tailed t-tests unless you have a strong prior justification for a one-tailed test.

T-Test in A/B Testing

Formula / How a T-Test Works

Why T-Tests Matter for Ecommerce

Real-World Example

When to Use a T-Test vs. Other Tests

How to Apply T-Tests Correctly

T-Test in A/B Testing

Related Terms

Put this into practice

Articles about What Is T-Test? Definition, Formula & Guide

Built for every D2C category

Formula / How a T-Test Works

Why T-Tests Matter for Ecommerce

Real-World Example

When to Use a T-Test vs. Other Tests

How to Apply T-Tests Correctly

T-Test in A/B Testing

Related Terms