A/B Testing Sample Size: How Many Visitors Do You Really Need?
Sample size is the quiet reason most A/B tests fail to teach you anything. Too few visitors and you can't tell a real lift from random noise; chase the noise and you'll "win" tests that don't replicate. This guide explains what drives the number, how to estimate it before you launch, and what to do when your traffic simply isn't enough.
If you're just getting started, read what is A/B testing first. Sample size and run time are tightly linked — see also how long to run an A/B test.
The three inputs that set your sample size
You can't pick a sample size in a vacuum. It falls out of three decisions you make up front:
- Baseline conversion rate: the current rate of the metric you're testing. The lower the baseline, the more traffic you need to detect the same relative lift.
- Minimum detectable effect (MDE): the smallest lift worth caring about. Detecting a 2% relative lift takes far more visitors than detecting a 20% one.
- Confidence and power: conventionally 95% confidence (5% false-positive tolerance) and 80% power (20% chance of missing a real effect). Tighter values cost more samples.
The relationship is unforgiving: required sample size scales roughly with the inverse square of your MDE. Halve the effect you want to detect and you roughly quadruple the visitors you need. This is why "test everything" fails on low-traffic pages — you can only detect large effects there.
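To make the inverse-square relationship concrete, here is a minimal sketch of the textbook normal-approximation formula for comparing two conversion rates. It assumes a two-sided test and an even traffic split; the function name and defaults are illustrative, and a given calculator may use a slightly different formula and return slightly different numbers.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_mde, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant (two-sided two-proportion z-test)."""
    p1 = baseline                        # control conversion rate
    p2 = baseline * (1 + relative_mde)   # variant rate if the lift is real
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 at 95% confidence
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 at 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# 3% baseline: halving the MDE from 10% to 5% roughly quadruples the requirement.
n_10 = sample_size_per_variant(0.03, 0.10)
n_05 = sample_size_per_variant(0.03, 0.05)
print(n_10, n_05, round(n_05 / n_10, 1))
```

At a 3% baseline this gives roughly 53,000 visitors per variant for a 10% relative lift and roughly four times that for a 5% lift.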
A practical way to estimate before launch
Plug your baseline rate and target MDE into any standard sample-size calculator to get the required visitors per variant. Then sanity-check against reality: multiply that number by the number of variants and divide by your page's weekly traffic. If the answer is "11 weeks," your MDE is too ambitious for this page, or the page is too low-traffic to test small changes on. Better to learn that before you launch than three weeks into an inconclusive run.
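The sanity check is simple arithmetic. This sketch assumes every visitor to the page enters the test and traffic is split evenly across arms; the function name and the example figures are hypothetical.

```python
import math


def weeks_to_run(visitors_per_variant, weekly_traffic, arms=2):
    """Whole weeks needed for every arm to reach its required sample size."""
    total_needed = visitors_per_variant * arms  # the requirement is per variant
    return math.ceil(total_needed / weekly_traffic)


# Example: 53,000 visitors per variant on a page with 10,000 visitors a week.
print(weeks_to_run(53_000, 10_000))  # 11
```

If that number is longer than you are willing to wait, raise the MDE or pick a busier page before launching.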
Common sample-size mistakes
The most expensive one is stopping when the counter "feels" big enough instead of when you hit the pre-computed number. That's just peeking with extra steps. Close behind: testing tiny changes (a button-color tweak) whose realistic effect is far below what your traffic can ever detect, then declaring a result anyway. And forgetting that sample size is per variant, not total: a four-arm test needs roughly twice the total traffic of a simple two-arm A/B, and more if you correct for the extra comparisons.
What to do when you don't have enough traffic
Low-traffic pages aren't hopeless — you just have to change strategy. Test bolder, more visible changes that produce larger effects (a full hero rewrite, not a microcopy tweak). Test higher up the funnel where volume is greatest. Pick a more frequent proxy metric (clicks instead of purchases) when it genuinely predicts the downstream goal. Or let an AI continuous loop accumulate evidence across many rounds rather than betting everything on one underpowered test.
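One way to see why these strategies help is to turn the question around: given the traffic you have, how large a lift could you detect at all? The sketch below is a rough approximation that assumes 95% confidence, 80% power, and the baseline variance in both arms; the function name and example numbers are illustrative only.

```python
import math
from statistics import NormalDist


def detectable_relative_mde(baseline, visitors_per_variant, alpha=0.05, power=0.80):
    """Rough smallest relative lift detectable with a fixed sample per variant."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    # Approximation: both arms are assumed to have the baseline's variance.
    abs_delta = z * math.sqrt(2 * baseline * (1 - baseline) / visitors_per_variant)
    return abs_delta / baseline


# 2,000 visitors per variant: purchases at a 3% baseline vs clicks at 20%.
print(round(detectable_relative_mde(0.03, 2_000), 2))
print(round(detectable_relative_mde(0.20, 2_000), 2))
```

With 2,000 visitors per variant, a 3% purchase rate only supports detecting lifts of around 50%, while a 20% click rate brings that down to under 20%. That is the case for bolder changes and more frequent proxy metrics.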
How abTestBot handles sample size
abTestBot's Continuous Loops enforce a 500-samples-per-arm floor on top of the 95% probability-to-win threshold, so no round promotes a winner on thin data. If a page can't produce a detectable signal within the round budget — usually a traffic or effect-size problem — the loop auto-pauses and tells you, instead of manufacturing significance from noise. The mechanics are in the Continuous Loops documentation, and the statistical model is covered in Bayesian vs frequentist testing.
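For intuition only, here is a sketch of what a "sample floor plus probability-to-win" gate can look like. This is not abTestBot's implementation, which this article does not describe. It assumes uniform Beta(1, 1) priors and a Monte Carlo estimate, and all function names are hypothetical. Only the 500-sample floor and the 95% threshold come from the text above.

```python
import random

MIN_SAMPLES_PER_ARM = 500  # floor quoted above
WIN_THRESHOLD = 0.95       # probability-to-win threshold quoted above


def probability_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20_000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under Beta(1, 1) priors."""
    rng = random.Random(seed)  # fixed seed so repeated calls agree
    wins = 0
    for _ in range(draws):
        a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += b > a
    return wins / draws


def can_promote(conv_a, n_a, conv_b, n_b):
    """Promote B only if both arms clear the floor and B is a likely winner."""
    if min(n_a, n_b) < MIN_SAMPLES_PER_ARM:
        return False  # thin data: refuse to call a winner, however big the gap
    return probability_b_beats_a(conv_a, n_a, conv_b, n_b) >= WIN_THRESHOLD


print(can_promote(10, 200, 30, 200))    # below the floor
print(can_promote(30, 1000, 60, 1000))  # clear difference, enough data
print(can_promote(30, 1000, 32, 1000))  # enough data, no clear winner
```

The floor matters most in the first case: a 5% versus 15% split looks decisive, but on 200 visitors per arm it is not treated as evidence.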
Frequently asked questions
How many visitors do I need for an A/B test?
It depends on your baseline conversion rate and the minimum lift you want to detect — not a fixed number. Use a sample-size calculator with those inputs to get visitors per variant, then check it against your weekly traffic.
Is 1,000 visitors enough for an A/B test?
Sometimes, for large effects on a high-baseline metric — but often not. At a 3% baseline, detecting a modest relative lift can require tens of thousands of visitors per variant. Always compute it for your specific case.
Does sample size apply per variant or in total?
Per variant. Each arm needs to independently reach the required size, so adding variants multiplies the total traffic you need.
Stop guessing at significance
abTestBot enforces sample-size and confidence floors automatically and flags pages that can't be tested — so every winner you ship is real. Paste your URL to begin.
Get started free →