
From the conversion glossary
Concepts referenced in this article, defined.

An A/B test hypothesis library is a structured backlog of test ideas, each written with evidence, expected impact, and implementation notes, so that your CRO program runs continuously rather than stalling between tests. Without a library, CRO programs fail not because of bad tools or low traffic, but because teams spend more time figuring out what to test next than actually testing. A well-maintained hypothesis library can sustain 2–4 tests per week indefinitely.
Most A/B testing programs follow the same arc:
The fix is systematic: build your hypothesis library before you need it. Treat it like a product backlog: continuously filled, regularly prioritized, and always a few steps ahead of your current test.
A hypothesis is not "test the CTA button color." That's a change, not a hypothesis. A proper hypothesis has four components:
- Evidence: What data or research supports this idea?
- Change: What are you proposing to change?
- Expected outcome: What metric do you expect to improve, by approximately how much?
- Audience: Who does this affect?
Template:
"Because [evidence], we believe [change] will cause [outcome] for [audience]."
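One way to enforce the template in a tracking tool is to store the four components separately and render the statement from them. The following is a minimal sketch in Python; the `Hypothesis` class and its field names are assumptions made for illustration, not part of any specific tool.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """Hypothetical container for the four components of a hypothesis."""
    evidence: str
    change: str
    outcome: str
    audience: str

    def statement(self) -> str:
        """Render the components with the 'Because ..., we believe ...' template."""
        parts = (self.evidence, self.change, self.outcome, self.audience)
        # Refuse to render if any component is missing: an idea without
        # evidence or an audience is a change, not a hypothesis.
        if not all(part.strip() for part in parts):
            raise ValueError("all four components are required")
        return (
            f"Because {self.evidence}, we believe {self.change} "
            f"will cause {self.outcome} for {self.audience}."
        )


# Values taken from the good hypothesis example in this article.
example = Hypothesis(
    evidence="heatmap data shows 60% of mobile users on our PDP "
             "don't scroll past the first image",
    change="moving the add-to-cart button above the product description",
    outcome="a 15-20% increase in mobile add-to-cart rate",
    audience="first-time visitors",
)
print(example.statement())
```

An entry such as "Test the add-to-cart button in orange" cannot be rendered this way until someone supplies the evidence, outcome, and audience, which is the point of the template.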
Bad hypothesis:
"Test the add-to-cart button in orange."
Good hypothesis:
"Because heatmap data shows 60% of mobile users on our PDP don't scroll past the first image, we believe moving the add-to-cart button above the product description will increase mobile add-to-cart rate by 15–20% for first-time visitors."
The difference matters because:
Source 1: Heatmap and Session Recording Analysis
Tools like Microsoft Clarity (free) or Hotjar show exactly where buyers click, scroll, and drop off. Common findings that generate hypotheses:
Each finding is a hypothesis waiting to be written.
Source 2: Customer Support Tickets
Mine your last 3–6 months of support tickets. Categorize by theme. The most common themes (size questions, delivery questions, ingredient questions) are direct evidence of information gaps that tests can address.
For example, if 30% of support tickets are size questions, your hypothesis might be: "Because size uncertainty drives 30% of support tickets, we believe adding a size recommendation quiz to PDPs will reduce size-related questions and increase conversion for first-time buyers."
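A rough first pass at categorizing tickets by theme can be automated. The sketch below uses simple keyword matching in Python; the `THEME_KEYWORDS` lists and the function names are hypothetical, and real ticket data will usually need a manual review pass on top.

```python
from collections import Counter

# Hypothetical keyword lists; tune these to your own ticket vocabulary.
# Order matters: a ticket is assigned to the first theme that matches.
THEME_KEYWORDS = {
    "size": ("size", "sizing"),
    "delivery": ("delivery", "shipping", "arrive"),
    "ingredients": ("ingredient", "allergen"),
}


def theme_of(ticket_text: str) -> str:
    """Return the first theme whose keywords appear in the ticket text."""
    text = ticket_text.lower()
    for theme, keywords in THEME_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return theme
    return "other"


def theme_shares(tickets: list[str]) -> dict[str, float]:
    """Return each theme's share of all tickets, most common first."""
    if not tickets:
        return {}
    counts = Counter(theme_of(ticket) for ticket in tickets)
    return {theme: n / len(tickets) for theme, n in counts.most_common()}


# Made-up tickets, for illustration only.
sample_tickets = [
    "What size should I order?",
    "When will my delivery arrive?",
    "Is the sizing true to fit?",
    "Do you gift wrap?",
]
for theme, share in theme_shares(sample_tickets).items():
    print(f"{theme}: {share:.0%}")
```

The resulting shares are the kind of figure that goes into the evidence slot of a hypothesis statement.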
Source 3: Post-Purchase Surveys
A simple 3-question survey after purchase captures: why buyers almost didn't buy, what information they wished they had earlier, and what almost made them buy from a competitor. These are gold for generating high-confidence hypotheses.
Source 4: Exit Intent Surveys
Survey visitors who are about to leave without purchasing. "What stopped you from completing your purchase?" The most common answers become hypotheses.
Source 5: Failed Test Analysis
When a test loses, ask why the hypothesis was wrong. The analysis often generates the next hypothesis. A failed "add urgency badge" test might reveal that buyers respond to social proof instead, which generates a new test.
Source 6: Competitor and Industry Research
Review competitor stores and industry case studies. CXL, Baymard Institute, and Nielsen Norman Group publish research on ecommerce UX patterns. A finding from Baymard's checkout research ("43% of US adults have abandoned a checkout due to required account creation") can be validated against your own data to generate a hypothesis.
Step 1: Set up a shared document or tracking tool
Use Notion, Airtable, Google Sheets, or a dedicated experiment-tracking tool. The format matters less than consistency. Every hypothesis entry should have:
Step 2: Run a hypothesis generation sprint
Set aside 2 hours with your team. Review:
Generate 15–20 hypothesis candidates without filtering. Write rough versions first.
Step 3: Refine and write proper hypotheses
Take the raw ideas and write each as a proper hypothesis using the template. This forces you to find or acknowledge missing evidence.
Step 4: Score and prioritize
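Use the PIE framework: rate each hypothesis on Potential (how much improvement is possible), Importance (how valuable the affected traffic is), and Ease (how simple the test is to implement).
Or use ICE: rate each hypothesis on Impact, Confidence, and Ease.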
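Either framework reduces to simple arithmetic. The sketch below assumes a common convention, rating each factor from 1 to 10 and averaging the three ratings; PIE stands for Potential, Importance, Ease, and ICE for Impact, Confidence, Ease. Some teams multiply the ratings instead, so treat the averaging and the function names as assumptions.

```python
def _average_rating(ratings: dict[str, float]) -> float:
    """Validate that each rating is on a 1-10 scale, then average them."""
    for name, value in ratings.items():
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return round(sum(ratings.values()) / len(ratings), 1)


def pie_score(potential: float, importance: float, ease: float) -> float:
    """PIE: Potential, Importance, Ease (assumed 1-10 each, averaged)."""
    return _average_rating(
        {"potential": potential, "importance": importance, "ease": ease}
    )


def ice_score(impact: float, confidence: float, ease: float) -> float:
    """ICE: Impact, Confidence, Ease (assumed 1-10 each, averaged)."""
    return _average_rating(
        {"impact": impact, "confidence": confidence, "ease": ease}
    )


# Illustrative ratings only, not taken from real test data.
print(pie_score(potential=8, importance=9, ease=7))
print(ice_score(impact=7, confidence=6, ease=9))
```

Whichever framework you choose, use the same one for every entry so that scores in the library stay comparable.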
Step 5: Maintain the library weekly
Assign someone to review the library weekly. New hypotheses should be added as evidence emerges. Completed tests should be documented with results. The library should never run empty: if you're running 4–6 tests per week, you need 4–6 new ideas coming in each week.
Use this structure for each entry:
HYPOTHESIS #[number]
Status: [Backlog / In Design / Running / Complete]
Priority Score: [PIE/ICE score]
Hypothesis Statement:
Because [evidence], we believe [change] will cause [outcome] for [audience].
Evidence Sources:
- [Source 1: e.g., Heatmap data showing 55% mobile drop-off at image scroll]
- [Source 2: e.g., Support ticket analysis: 20% of tickets are size questions]
Change Description:
- Control: [What exists today]
- Variant: [What you'll test]
Success Metric: [Primary KPI, e.g., add-to-cart rate]
Secondary Metrics: [e.g., CVR, session duration]
Estimated Traffic Required: [From sample size calculator]
Estimated Implementation Time: [Hours]
Test Results (after completion):
- Duration:
- Winner/Loser/Inconclusive:
- Result magnitude:
- Learnings:
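The "Estimated Traffic Required" field can be filled with a back-of-the-envelope calculation when no calculator is at hand. The sketch below uses the standard normal-approximation formula for comparing two conversion rates; the defaults (two-sided significance level of 0.05, 80% power, equal traffic split) and the function names are assumptions, and a dedicated sample size calculator may use a slightly different formula.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(
    baseline_rate: float,
    relative_lift: float,
    alpha: float = 0.05,
    power: float = 0.8,
) -> int:
    """Approximate visitors needed per variant to detect a relative lift.

    baseline_rate: current conversion rate, e.g. 0.05 for 5%.
    relative_lift: expected relative improvement, e.g. 0.15 for +15%.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("rates must be in (0, 1) and the lift must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def estimated_days(per_variant: int, daily_visitors: int, variants: int = 2) -> int:
    """Days needed if daily traffic is split evenly across all variants."""
    return ceil(per_variant * variants / daily_visitors)


# Illustrative inputs: 5% baseline add-to-cart rate, +15% expected lift,
# 2,000 eligible visitors per day.
needed = sample_size_per_variant(baseline_rate=0.05, relative_lift=0.15)
print(needed, "visitors per variant")
print(estimated_days(needed, daily_visitors=2000), "days")
```

Smaller expected lifts need far more traffic, which is one reason to write the expected outcome into the hypothesis before scoring it.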
Suppose you have three hypotheses for your Shopify PDP:
Hypothesis A: Add size guide tooltip near size selector
Hypothesis B: Add video testimonials below the fold
Hypothesis C: Reorder PDP sections: put ingredients before product description
Test order: A, then C, then B.
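An ordering like this falls out of sorting the backlog by priority score. The sketch below is a minimal Python illustration; the `Entry` class is hypothetical, and the scores are made-up values chosen only so that the result matches the order A, C, B.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical library entry holding only the fields needed for ranking."""
    label: str
    description: str
    status: str            # "Backlog", "In Design", "Running", or "Complete"
    priority_score: float  # PIE or ICE score


def next_tests(library: list[Entry], limit: int = 3) -> list[Entry]:
    """Return the highest-scoring backlog entries, best first."""
    backlog = [entry for entry in library if entry.status == "Backlog"]
    ranked = sorted(backlog, key=lambda entry: entry.priority_score, reverse=True)
    return ranked[:limit]


# Scores below are illustrative assumptions, not measured values.
library = [
    Entry("A", "Add size guide tooltip near size selector", "Backlog", 8.3),
    Entry("B", "Add video testimonials below the fold", "Backlog", 5.7),
    Entry("C", "Put ingredients before product description", "Backlog", 7.0),
]
print([entry.label for entry in next_tests(library)])
```

Entries that are already running or complete are skipped, so the same function can be called at the start of every sprint.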
Your hypothesis library connects to your testing roadmap. Each sprint (typically 2–4 weeks), you pull the top-scoring hypotheses from the library, design the test, implement it via CustomFit.ai or your chosen platform, and run it.
The test results feed back into the library:
A mature hypothesis library becomes a record of your brand's conversion intelligence: every test, every result, every learning documented in one place.