Sequential Testing: When to Stop Your A/B Test Early
Learn how sequential sampling methodology lets you monitor tests continuously while maintaining statistical validity. Save time without sacrificing accuracy.
The Problem with Traditional A/B Testing
Traditional fixed-sample A/B testing has a painful requirement: decide on a sample size upfront and don't look at results until you hit that number.
But that's not how real businesses work. You need to monitor tests. You want to stop early winners to capitalize faster and stop clear losers to minimize damage.
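The dilemma: if you peek at results and act on them, you inflate your false positive rate. With repeated checks against a fixed p < 0.05 threshold, the nominal 5% can climb to 20-30% or higher, depending on how often you look. Your "winners" are often just noise.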
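A small simulation makes the inflation concrete. The sketch below is illustrative and not taken from any particular tool: it models an A/A test (no true difference) as a running sum of standard-normal batch results, applies the usual |z| > 1.96 rule after every batch, and counts how often any check comes up "significant". The function name and the batch model are assumptions made for the example.

```python
import math
import random

def peeking_false_positive_rate(looks, simulations=2000, z_crit=1.96, seed=0):
    """Share of A/A tests (no true effect) declared significant at any look."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(simulations):
        cumulative = 0.0
        for k in range(1, looks + 1):
            cumulative += rng.gauss(0.0, 1.0)  # one more batch of null data
            z = cumulative / math.sqrt(k)      # z-statistic after k batches
            if abs(z) > z_crit:                # naive fixed-threshold check
                false_positives += 1
                break
    return false_positives / simulations

for looks in (1, 5, 20):
    rate = peeking_false_positive_rate(looks)
    print(f"{looks:>2} looks: false positive rate ~ {rate:.1%}")
```

With a single look the rate stays near the nominal 5%. With 20 looks it typically lands in the neighborhood of the 20-30% figure, even though nothing about the variants differs.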
Enter sequential testing—a methodology that lets you look as often as you want while maintaining proper error rates.
What Is Sequential Testing?
Sequential testing is a family of statistical methods, the best known being Wald's sequential probability ratio test (SPRT), that allows you to:
- Check test results continuously
- Stop as soon as you have strong evidence either way
- Maintain proper false positive and false negative rates
It adjusts the significance threshold based on how much data you've collected, compensating for multiple looks.
How It Works: The Boundaries
Unlike fixed-sample testing where you have a single significance threshold (p < 0.05), sequential testing uses boundaries that change over time.
Imagine a graph:
- X-axis: Number of visitors tested
- Y-axis: Cumulative difference between variants
- Upper boundary: If cumulative difference crosses this, variant B wins
- Lower boundary: If cumulative difference crosses this, control wins
- Between boundaries: Keep testing, no conclusion yet
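Early in the test, the bar for evidence is high, so it is harder to declare a winner. In many designs, when the boundaries are expressed on the standardized (z-score) scale, they start wide apart and narrow as the sample size grows, making it easier to reach a conclusion.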
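One family of designs with this shape is the O'Brien-Fleming group-sequential boundary, where the z-score threshold at look k of K is proportional to sqrt(K / k). The sketch below only illustrates the shape. The function name is hypothetical, and the constant 2.04 (for five equally spaced looks at two-sided alpha = 0.05) is an assumed value that should be confirmed with a group-sequential design package.

```python
import math

def obrien_fleming_style_thresholds(num_looks, final_z):
    """Return the |z| threshold at each of `num_looks` equally spaced looks.

    The threshold scales as final_z * sqrt(K / k), so early looks demand
    far stronger evidence than the final one.
    """
    return [final_z * math.sqrt(num_looks / k) for k in range(1, num_looks + 1)]

# final_z = 2.04 is an assumption for 5 looks; verify before relying on it.
for look, z in enumerate(obrien_fleming_style_thresholds(5, 2.04), start=1):
    print(f"look {look}: stop if |z| > {z:.2f}")
```

Under these assumptions the threshold falls from about 4.56 at the first look to 2.04 at the last, which is the "wide early, narrow late" pattern described above.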
Benefits of Sequential Testing
1. Stop Tests Faster
Sequential tests can reach a conclusion with substantially less traffic than fixed-sample tests, often around 50% less on average when the true effect is at least as large as the one you planned for. Clear winners or losers become obvious faster.
2. Peek Without Penalty
Check results daily, hourly, or in real-time without inflating your false positive rate. The methodology accounts for continuous monitoring.
3. Minimize Exposure to Losing Variants
If a variant is clearly underperforming, you can stop early rather than forcing 50% of traffic through a bad experience for weeks.
4. Capitalize on Winners Faster
When a clear winner emerges, implement it immediately rather than waiting arbitrary time periods. Every day you delay costs money.
When to Use Sequential Testing
Sequential testing is ideal when:
- You need to monitor tests continuously for business reasons
- You want flexibility to stop early if results are clear
- You're testing something with meaningful business risk (can't wait weeks)
- You have variable traffic and can't predict exact test duration
Stick with fixed-sample testing when:
- You can easily wait for a predetermined sample size
- You won't be tempted to peek (discipline!)
- You're testing something low-risk where speed doesn't matter
Implementing Sequential Testing
Step 1: Set Your Parameters
You need to specify:
- Baseline conversion rate: Your current performance
- Minimum detectable effect: Smallest lift you care about
- Alpha (false positive rate): Typically 5%
- Beta (false negative rate): Typically 20% (80% power)
Step 2: Calculate Boundaries
The sequential boundaries are calculated using the sequential probability ratio. The math is complex, but our sequential testing calculator handles it for you.
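For the classic SPRT, the stopping thresholds follow directly from alpha and beta using Wald's approximations. The sketch below shows only that step, on the log-likelihood-ratio scale. The function name is made up for illustration, and a production calculator may use a different boundary family.

```python
import math

def wald_thresholds(alpha=0.05, beta=0.20):
    """Wald's approximate SPRT thresholds on the log-likelihood-ratio scale."""
    upper = math.log((1 - beta) / alpha)  # cross above: evidence favors the alternative
    lower = math.log(beta / (1 - alpha))  # cross below: evidence favors the null
    return lower, upper

lower, upper = wald_thresholds()
print(f"continue while {lower:.2f} < log-likelihood ratio < {upper:.2f}")
```

With alpha = 5% and beta = 20%, the upper threshold is log(16), roughly 2.77, and the lower threshold is roughly -1.56.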
Step 3: Monitor Continuously
As data comes in, plot your cumulative difference against the boundaries:
- If cumulative difference crosses the upper boundary → Variant wins
- If cumulative difference crosses the lower boundary → Control wins
- If between boundaries → Continue testing
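The three-way rule above can be sketched as a single function. This is one possible formulation, not necessarily the one any given calculator uses. It assumes a 50/50 traffic split and runs a one-sided SPRT on the stream of conversions: under "no lift" each conversion is equally likely to come from either arm, while under the target lift it comes from the variant with probability (1 + lift) / (2 + lift). Crossing the lower threshold here means the data favor "no lift" over the target lift, which is treated as keeping control. All names are hypothetical.

```python
import math

def sprt_decision(control_conversions, variant_conversions,
                  relative_lift=0.15, alpha=0.05, beta=0.20):
    """Classify the current state of a 50/50 test from its conversion counts."""
    theta0 = 0.5                                         # P(conversion is from variant), no lift
    theta1 = (1 + relative_lift) / (2 + relative_lift)   # same probability under the target lift
    llr = (variant_conversions * math.log(theta1 / theta0)
           + control_conversions * math.log((1 - theta1) / (1 - theta0)))
    upper = math.log((1 - beta) / alpha)                 # Wald's upper threshold
    lower = math.log(beta / (1 - alpha))                 # Wald's lower threshold
    if llr >= upper:
        return "variant wins"
    if llr <= lower:
        return "keep control"
    return "continue"

print(sprt_decision(control_conversions=120, variant_conversions=131))
```

Calling the function after every new conversion, or on each dashboard refresh, gives the monitor-and-stop behavior described in this step.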
Step 4: Make Decisions
When a boundary is crossed, you have statistical justification to stop and implement the winner.
Real Example: Sequential vs Fixed-Sample
Scenario: Testing a new pricing page
- Baseline conversion rate: 5%
- New variant true effect: 15% lift (5% → 5.75%)
- Desired power: 80%, significance level: 5% (95% confidence)
Fixed-sample approach:
- Required sample size: about 14,000 per variant = 28,000 total
- With 2,000 visitors/day: 14 days to complete
- Must wait full 14 days regardless of early signals
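The fixed-sample figure can be reproduced approximately with the standard two-proportion z-test formula. The sketch below assumes a two-sided test and equal allocation. The function name is illustrative, and other calculators may use slightly different variance formulas and therefore return slightly different numbers.

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors per variant for a two-sided two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_power = NormalDist().inv_cdf(power)          # critical value for power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)

n = sample_size_per_variant(0.05, 0.15)
print(n, "per variant,", 2 * n, "total")
```

For a 5% baseline and a 15% relative lift, this comes out a little above 14,000 per variant, in line with the rounded figure used in the scenario.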
Sequential approach:
- Same error rates, but boundaries allow early stopping
- Clear signal emerged after 9,000 visitors (day 4.5)
- Test stopped early, saving 9.5 days
- Winner implemented faster, capturing additional revenue
Business impact: If the lift generates $500/day in extra revenue, stopping 9.5 days early captured an additional $4,750.
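The arithmetic behind that estimate is simple enough to check in a few lines. The helper below is a hypothetical convenience function that uses only the numbers from the scenario.

```python
def early_stop_savings(fixed_total, stopped_at, visitors_per_day, lift_per_day):
    """Days saved and extra revenue captured by stopping before the fixed sample."""
    days_saved = (fixed_total - stopped_at) / visitors_per_day
    return days_saved, days_saved * lift_per_day

days, revenue = early_stop_savings(28_000, 9_000, 2_000, 500)
print(f"{days} days saved, ${revenue:,.0f} extra revenue")
```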
Common Misconceptions
Misconception 1: "Sequential testing lets you stop whenever you want"
Reality: You can only stop when boundaries are crossed. Just because you're "significant" at p < 0.05 doesn't mean you've crossed the sequential boundary.
Misconception 2: "Sequential testing always requires less data"
Reality: On average yes, but for ambiguous tests with small effects, you might need more data than a fixed-sample test would. Sequential testing accepts a larger maximum sample size in exchange for the chance to stop early.
Misconception 3: "Any testing tool supports sequential testing"
Reality: Most platforms show p-values designed for fixed-sample testing. You need proper sequential boundaries, which require custom implementation or specialized tools.
Limitations and Considerations
Still Need Minimum Sample Sizes
Don't stop after 100 visitors just because a boundary was crossed. Set a minimum sample size (e.g., 1,000 per variant) before acting on sequential boundaries, since the approximations behind most boundary calculations are unreliable on very small samples.
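In a monitoring script, this rule can be a simple guard in front of the stopping decision. The sketch below is a minimal example. The function name is hypothetical, and the default of 1,000 per variant is just the illustrative figure from this section.

```python
def may_stop(boundary_crossed, visitors_a, visitors_b, min_per_variant=1_000):
    """Allow a stop only if a boundary was crossed AND both arms meet the minimum."""
    enough_data = min(visitors_a, visitors_b) >= min_per_variant
    return boundary_crossed and enough_data

print(may_stop(True, 100, 100))      # boundary crossed, but too little data
print(may_stop(True, 1_500, 1_400))  # boundary crossed with enough data
```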
Works Best for Clear Effects
If the true effect is barely above your MDE, sequential testing won't help much. It shines when effects are larger than expected.
Requires Proper Implementation
You can't just check p-values daily and call it sequential testing. You need actual sequential boundaries calculated correctly.
Sequential Testing vs Bayesian Methods
You might also hear about Bayesian A/B testing, which allows continuous monitoring. How does it compare?
Sequential testing (Frequentist):
- Clear decision boundaries
- Controlled error rates (type I and II)
- Widely accepted and understood
- Conservative stopping rules
Bayesian methods:
- Probability statements ("95% chance B is better")
- Incorporates prior beliefs
- More intuitive interpretation for some
- Can be more aggressive on early stopping
Both are valid. Sequential testing is more conservative and has a longer track record in scientific research. Bayesian testing is newer to mainstream CRO but growing in popularity.
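To show what a Bayesian probability statement looks like in practice, here is a minimal sketch that estimates the chance that B's conversion rate exceeds A's. It assumes uniform Beta(1, 1) priors and uses Monte Carlo draws from the two Beta posteriors. The function name and the counts in the usage line are made up for illustration.

```python
import random

def prob_variant_beats_control(conv_a, n_a, conv_b, n_b, draws=20_000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)  # posterior draw for A
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)  # posterior draw for B
        wins += rate_b > rate_a
    return wins / draws

p = prob_variant_beats_control(conv_a=500, n_a=10_000, conv_b=575, n_b=10_000)
print(f"Estimated chance B is better: {p:.1%}")
```

Note that this number is a posterior probability, not a controlled error rate. Stopping whenever it crosses 95% is a different rule from crossing a sequential boundary.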
How to Get Started
Ready to implement sequential testing? Here's your action plan:
- Use our calculator: Start with our sequential testing calculator to understand the boundaries for your specific test.
- Document methodology: Create a testing protocol that specifies when you'll check results and how you'll interpret boundaries.
- Implement monitoring: Set up a dashboard to plot cumulative differences against boundaries in real-time.
- Establish minimums: Define minimum sample sizes before boundaries can be trusted (e.g., 1,000 per variant).
- Create decision rules: Document exactly what happens when a boundary is crossed (QA, rollout plan, etc.).
Key Takeaways
- Sequential testing allows continuous monitoring without inflating false positives
- Tests can conclude substantially faster than fixed-sample approaches, often around 50% faster on average
- Uses dynamic boundaries that account for sample size
- Ideal for high-risk tests or when business needs require flexibility
- Requires proper implementation—can't just check p-values repeatedly
- Use our sequential testing calculator to get started correctly
Need Sequential Testing Guidance?
Wise Uplift implements advanced testing methodologies including sequential testing for clients who need maximum efficiency.