Choosing the Right Statistical Test for Your Experiment
Chi-squared vs t-test vs z-test: understand which statistical method applies to your conversion metrics and why it matters for valid results.
Why the Test Matters
Using the wrong statistical test is like using a hammer when you need a screwdriver—you might get the job done, but you risk damaging something along the way.
Different metrics require different statistical tests because they have different properties. Use the wrong test and your results might be invalid, leading to false conclusions.
The Decision Tree
Here's how to choose the right test:
1. What Type of Data Do You Have?
Binary data (yes/no, converted/didn't convert): Use chi-squared test or z-test for proportions
Continuous data (revenue, time on site, cart value): Use t-test
2. How Large Is Your Sample?
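Large samples (>1000 per variant): Chi-squared or z-test for binary data; t-test for continuous data
Small to medium samples (<1000 per variant): Check the assumptions first. For binary data, switch to an exact test (such as Fisher's) if any expected cell count falls below 5; for continuous data, a t-test remains appropriate if the values are roughly normal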
3. Are You Comparing Proportions or Means?
Proportions (conversion rates, click rates): Chi-squared or z-test
Means (average order value, session duration): T-test
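The three questions above can be folded into one small helper. This is a minimal sketch rather than a definitive rule set: the function name `choose_test`, its parameters, and the returned labels are hypothetical, and the small-sample check uses the expected-count rule of thumb (at least 5 per cell) listed under the chi-squared requirements.

```python
def choose_test(data_type: str, n_variants: int = 2,
                smallest_expected_count: float = 5.0) -> str:
    """Suggest a statistical test for an experiment metric.

    data_type: "binary" (converted / did not convert) or "continuous".
    n_variants: number of groups compared, including the control.
    smallest_expected_count: smallest expected cell count in the
        contingency table; only consulted for binary metrics.
    """
    if n_variants < 2:
        raise ValueError("need at least two variants to compare")
    if data_type == "binary":
        # Too few expected observations per cell: prefer an exact test.
        if smallest_expected_count < 5:
            return "exact test"
        return "chi-squared test"
    if data_type == "continuous":
        # Two groups: t-test. More than two: ANOVA first.
        return "t-test" if n_variants == 2 else "ANOVA"
    raise ValueError(f"unknown data type: {data_type!r}")


print(choose_test("binary"))                     # chi-squared test
print(choose_test("continuous"))                 # t-test
print(choose_test("continuous", n_variants=4))   # ANOVA
```

Treat the output as a starting point: the requirements listed for each test still need to be checked against your data.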
Chi-Squared Test
When to Use It
The chi-squared test is your go-to for most CRO experiments because most CRO metrics are binary:
- Conversion rate (converted yes/no)
- Click-through rate (clicked yes/no)
- Sign-up rate (signed up yes/no)
- Add-to-cart rate (added yes/no)
Requirements
- Binary outcome (success/failure)
- Independent observations
- Expected count of at least 5 in every cell (the chi-squared approximation becomes unreliable below that)
What It Tests
"Is the distribution of successes and failures different between variants beyond what random chance would explain?"
Example
Testing Button Color
Control (Blue):
- Visitors: 5,000
- Conversions: 250
- Rate: 5.0%
Variant (Green):
- Visitors: 5,000
- Conversions: 300
- Rate: 6.0%
Chi-squared result: χ² ≈ 4.81, p ≈ 0.028 (significant at the 5% level; computed without a continuity correction)
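To check a result like this yourself, the statistic can be computed directly from the four counts. The sketch below assumes Pearson's chi-squared test without a continuity correction; the function name `chi_squared_2x2` is made up for illustration. For the counts in this example it evaluates to roughly χ² = 4.81 with p ≈ 0.028. Tools that apply a continuity correction report a somewhat smaller statistic.

```python
import math


def chi_squared_2x2(conv_a, n_a, conv_b, n_b):
    """Pearson chi-squared test for two variants with a binary outcome.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    a, b = conv_a, n_a - conv_a   # control: conversions, non-conversions
    c, d = conv_b, n_b - conv_b   # variant: conversions, non-conversions
    n = n_a + n_b
    # Shortcut formula for a 2x2 table.
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


stat, p = chi_squared_2x2(250, 5000, 300, 5000)
print(f"chi2 = {stat:.2f}, p = {p:.3f}")   # chi2 = 4.81, p = 0.028
```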
Use our chi-squared calculator to analyze your tests
Two-Sample T-Test
When to Use It
The t-test is for continuous metrics where you're comparing averages:
- Average order value (AOV)
- Revenue per visitor
- Time on site
- Pages per session
- Customer lifetime value (CLV)
Requirements
- Continuous numerical data
- Independent samples
- Roughly normal distribution (or large enough samples for CLT)
- Similar variances between groups (or use Welch's t-test)
What It Tests
"Is the difference in means between the two groups larger than what random chance would explain?"
Example
Testing Upsell Flow
Control:
- Visitors: 500
- Mean AOV: $75.30
- Std Dev: $22.10
Variant:
- Visitors: 500
- Mean AOV: $82.50
- Std Dev: $24.30
T-test result: t ≈ 4.90, p < 0.001 (highly significant)
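The t statistic for this example can be reproduced from the summary numbers alone. The following is a sketch assuming Welch's t-test (which does not require equal variances); the helper name `welch_t_from_summary` is hypothetical. The exact p-value needs the t distribution from a statistics library, so the code falls back on a normal approximation, which is reasonable only because the degrees of freedom here are large.

```python
import math
from statistics import NormalDist


def welch_t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic and degrees of freedom from summary statistics."""
    var_a = sd_a ** 2 / n_a   # squared standard error of mean A
    var_b = sd_b ** 2 / n_b   # squared standard error of mean B
    t = (mean_b - mean_a) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t, df


t, df = welch_t_from_summary(75.30, 22.10, 500, 82.50, 24.30, 500)
# Approximate two-sided p-value from the normal tail (not exact).
p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
print(f"t = {t:.2f}, df = {df:.0f}, approx p = {p_approx:.1e}")
```

With these inputs the statistic comes out at about 4.90 on roughly 989 degrees of freedom, and the approximate p-value is far below 0.001.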
Use our t-test calculator to analyze continuous metrics
Z-Test for Proportions
When to Use It
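The z-test for proportions is very similar to chi-squared for 2×2 tables. Use it when:
- Comparing conversion rates (like chi-squared)
- Samples are large enough for the normal approximation (a common rule of thumb is at least 5 to 10 expected conversions and non-conversions per group)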
- You want direct comparison of two proportions
Chi-Squared vs Z-Test
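For 2×2 comparisons (2 groups, binary outcome), the chi-squared test without a continuity correction and the two-sided z-test with a pooled standard error give the same p-value, because the chi-squared statistic is exactly the square of the z statistic. The z-test is more intuitive if you're directly comparing two proportions, while chi-squared extends more naturally to multiple groups.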
Practical advice: For standard A/B tests, chi-squared and z-test are interchangeable. Use chi-squared (it's more common in CRO).
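The equivalence is easy to confirm numerically. The sketch below runs both tests on the button color counts from the chi-squared example, assuming a pooled standard error for the z-test and no continuity correction for chi-squared; both function names are illustrative.

```python
import math


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for two proportions with a pooled standard error."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail
    return z, p_value


def pearson_chi_squared(conv_a, n_a, conv_b, n_b):
    """Pearson chi-squared statistic built from expected cell counts."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    stat = 0.0
    for conv, n in ((conv_a, n_a), (conv_b, n_b)):
        cells = ((conv, n * pooled), (n - conv, n * (1 - pooled)))
        for observed, expected in cells:
            stat += (observed - expected) ** 2 / expected
    return stat


z, p_z = two_proportion_z(250, 5000, 300, 5000)
chi2 = pearson_chi_squared(250, 5000, 300, 5000)
print(f"z = {z:.3f}, z^2 = {z * z:.3f}, chi2 = {chi2:.3f}")
```

Squaring z reproduces the chi-squared statistic (about 4.81 for these counts), so the two tests reach the same conclusion.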
Common Scenarios and Test Choices
Metric → Test Mapping
| Metric | Test |
|---|---|
| Conversion rate | Chi-squared |
| Click-through rate | Chi-squared |
| Sign-up rate | Chi-squared |
| Average order value | T-test |
| Revenue per visitor | T-test |
| Time on site | T-test |
| Cart abandonment rate | Chi-squared |
| Items per order | T-test |
What About Multiple Variants?
If you're testing more than 2 variants (A/B/C/D test), you need to adjust your approach:
For Proportions (Conversion Rates)
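Use a chi-squared test on the full variants-by-outcome table; with k variants it has k - 1 degrees of freedom, so the test extends naturally to multiple groups. A significant result only says that at least one variant differs, so follow up with pairwise comparisons corrected for multiple testing.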
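As a rough illustration of how the statistic generalizes, the sketch below computes Pearson's chi-squared and its degrees of freedom for any number of variants. The function name is hypothetical, and the third variant's counts are invented purely for the example. Turning the statistic into a p-value requires the chi-squared distribution from a statistics library, which is left out here.

```python
def chi_squared_table(table):
    """Pearson chi-squared statistic and degrees of freedom.

    table: one row per variant, each row (conversions, non_conversions).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if every variant shared the same rate.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Control and variant from the button color example, plus a made-up
# third variant with 275 conversions out of 5,000 visitors.
stat3, df3 = chi_squared_table([(250, 4750), (300, 4700), (275, 4725)])
print(f"chi2 = {stat3:.2f}, df = {df3}")   # chi2 = 4.81, df = 2
```

Note that the same statistic is judged against a different distribution once the degrees of freedom change, so results from a two-variant test do not carry over directly.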
For Continuous Metrics
Use ANOVA (Analysis of Variance) to test if any groups differ. If significant, follow up with pairwise t-tests with correction for multiple comparisons (Bonferroni or similar).
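The Bonferroni correction mentioned above is simple arithmetic: divide the significance threshold by the number of pairwise comparisons. A minimal sketch, with a hypothetical helper name and an assumed overall threshold of 0.05:

```python
from itertools import combinations


def bonferroni_pairs(variants, alpha=0.05):
    """Return every pairwise comparison and the Bonferroni-adjusted alpha."""
    pairs = list(combinations(variants, 2))
    return pairs, alpha / len(pairs)


pairs, adjusted_alpha = bonferroni_pairs(["A", "B", "C", "D"])
print(len(pairs), round(adjusted_alpha, 4))   # 6 0.0083
```

With four variants there are six pairs, so each pairwise test would need p below roughly 0.0083 to count as significant.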
Common Mistakes
Mistake 1: Using T-Test for Conversion Rates
Wrong: Treating conversion rate as a continuous variable and using t-test
Right: Conversion rate is based on binary outcomes—use chi-squared
Mistake 2: Using Chi-Squared for Revenue
Wrong: Trying to test average revenue per visitor with chi-squared
Right: Revenue is continuous—use t-test
Mistake 3: Ignoring Sample Size Requirements
Wrong: Running chi-squared with expected cell counts < 5
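Right: Collect enough data that every expected cell count is at least 5 before analyzing, or use an exact test (such as Fisher's) when counts stay small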
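Checking the expected-count requirement before running a chi-squared test takes only a few lines. This is a sketch for the two-variant case; the function name is an assumption, and the low-traffic numbers are invented for illustration.

```python
def smallest_expected_count(conv_a, n_a, conv_b, n_b):
    """Smallest expected cell count in a 2x2 table under the null hypothesis."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    cells = [n * rate for n in (n_a, n_b) for rate in (pooled, 1 - pooled)]
    return min(cells)


# Counts from the button color example: far above the threshold of 5.
large = smallest_expected_count(250, 5000, 300, 5000)
# Hypothetical low-traffic test: 3 and 5 conversions from 80 visitors each.
small = smallest_expected_count(3, 80, 5, 80)
print(large >= 5, small >= 5)   # True False
```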
Key Takeaways
- Binary outcomes (conversion rate): Chi-squared test
- Continuous metrics (AOV, revenue): Two-sample t-test
- Chi-squared and the z-test for proportions give the same p-value for two-group comparisons
- Match your statistical test to your data type—this matters for validity
- Use our calculators to ensure you're analyzing correctly
Need Help Analyzing Your Tests?
Wise Uplift ensures your experiments use the correct statistical methods for valid, reliable results.