Free experimentation tool

Free A/B Test Sample Size Calculator

An A/B test sample size calculator tells you how many visitors per variant your experiment needs to reliably detect a real difference between control and treatment. Enter your baseline conversion rate, the minimum detectable effect, statistical power, and significance level, and the calculator instantly returns the required sample size and expected test duration.

Want to plan tests off the back of customer signal? Use FeatureVote to gather feedback, prioritize feature requests, and decide what to test next.

Test parameters

Adjust any input and the sample size, total traffic, and duration estimate update instantly. Nothing is sent to a server.

Required sample size

Live results for a two-proportion z-test based on your inputs.

Visitors per variant

31,243

Across 2 variants: 62,486 total visitors.

Estimated test duration

62.49 days

Based on 1,000 total visitors/day split evenly across variants.

Conversion targets

Baseline: 5%

Treatment target: 5.5%

Interpretation

You need approximately 31,243 visitors per variant to detect a 10% relative lift from a 5% baseline with 80% power and 95% confidence (two-tailed).

Always run for at least one full business cycle and avoid peeking at results before reaching the planned sample size.
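The interpretation above can be reproduced with the standard two-proportion z-test sample size formula. Here is a minimal sketch in Python using only the standard library; the exact figure shown on this page may differ by a few visitors depending on rounding and the precise z-values a given calculator uses:

```python
from math import sqrt, ceil
from statistics import NormalDist

def sample_size_per_variant(p1, p2, alpha=0.05, power=0.80):
    """Visitors per variant for a two-sided two-proportion z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)            # ~0.84 for 80% power
    p_bar = (p1 + p2) / 2                # pooled proportion under H0
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

n = sample_size_per_variant(0.05, 0.055)  # 5% baseline, 10% relative lift
total = 2 * n                             # both variants combined
days = total / 1000                       # assumed 1,000 visitors/day overall
```

With these inputs the formula lands within a few visitors of the 31,243 shown above, and dividing the total by 1,000 daily visitors gives the roughly 62-day estimate.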

How to use the A/B test sample size calculator

  1. Enter your baseline conversion rate

    Pull the current conversion rate of the page or flow you plan to test. Even a 0.1 percentage point change matters at low baselines.

  2. Set your minimum detectable effect

    Pick the smallest lift that would change a real product decision. Use relative % for revenue and conversion lifts, absolute pp for retention or other tightly bounded rates.

  3. Choose statistical power and significance

    Stick with 80% power and 95% confidence unless you have a specific reason to change them. Higher values mean a much larger sample.

  4. Read the sample size and duration

    The calculator shows the required visitors per variant, total across variants, and the estimated number of days to reach that volume.
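As a quick illustration of step 2, a relative MDE and an absolute MDE can describe the same treatment target. The numbers below mirror the 5% baseline used elsewhere on this page:

```python
baseline = 0.05

# Relative MDE: a 10% lift over the baseline
target_relative = baseline * 1.10       # 5% * 1.10 = 5.5%

# Absolute MDE: 0.5 percentage points (pp) added to the baseline
target_absolute = baseline + 0.005      # 5% + 0.5pp = 5.5%
```

At a 5% baseline these happen to coincide; at a 1% baseline, a 10% relative lift is only 0.1pp, which is why the distinction matters.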

A/B test sample size FAQ

Common questions about sample size, statistical power, significance, and minimum detectable effect.

What is statistical significance in A/B testing?

Statistical significance measures how unlikely your observed difference would be if no real difference existed between variants. A 95% significance level (alpha = 0.05) means that when there is truly no difference, you will falsely declare a winner at most 5% of the time. It is controlled by the alpha (significance level) input in this calculator.

How do I choose minimum detectable effect (MDE)?

MDE is the smallest improvement you want the test to reliably detect. Smaller MDEs require dramatically more traffic. A common starting point is a 5-10% relative lift over the baseline conversion rate. Pick a value tied to a real business outcome: a tiny lift can reach statistical significance yet be practically irrelevant, while too large an MDE will miss real but modest wins.

What is statistical power and why does 80% matter?

Statistical power is the probability that your test will detect a real effect of the size you specified as the MDE. 80% power is the industry default - it means that if a true lift of the MDE size exists, you will catch it 80% of the time. Higher power (90% or 95%) reduces the risk of false negatives but requires a larger sample.
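Power can also be checked empirically. The sketch below (an illustration, not this calculator's implementation) repeatedly simulates an A/B test whose true lift equals the MDE and counts how often a two-proportion z-test declares significance; with a sample size chosen for 80% power, roughly 80% of simulated tests should come out significant:

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(7)

def z_test_significant(x1, x2, n, alpha=0.05):
    """Two-sided two-proportion z-test on conversion counts x1, x2 out of n each."""
    p1, p2 = x1 / n, x2 / n
    p_pool = (x1 + x2) / (2 * n)
    se = sqrt(2 * p_pool * (1 - p_pool) / n)
    return abs(p1 - p2) / se > NormalDist().inv_cdf(1 - alpha / 2)

def simulate_power(p1, p2, n, trials=2000):
    # Normal approximation to the binomial keeps this fast; exact
    # Bernoulli draws give the same answer, just much more slowly.
    def draw(p):
        return round(random.gauss(n * p, sqrt(n * p * (1 - p))))
    hits = sum(z_test_significant(draw(p1), draw(p2), n) for _ in range(trials))
    return hits / trials

# 5% baseline, 5.5% treatment, at the sample size this page computes
power_hat = simulate_power(0.05, 0.055, n=31234)
```

`power_hat` should land close to 0.80, with some simulation noise from the finite number of trials.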

How long should an A/B test run?

Run your test until you reach the required sample size per variant, and ideally for at least one full business cycle (often a week or two) to capture day-of-week and weekend effects. Do not stop early just because results look significant - peeking inflates false positive rates. The duration estimate in this calculator divides the required total sample by your daily traffic to give a baseline expectation.

What if my conversion rate is very low or very high?

Very low or very high baseline conversion rates require larger sample sizes for the same relative MDE. A 10% relative lift on a 1% baseline is just a 0.1 percentage point absolute change, which is hard to detect. Either accept a longer test, target a larger MDE, or test higher up the funnel where the baseline conversion is greater.
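To see how much the baseline matters, the hypothetical helper below applies the same z-test sample size formula at two baselines with an identical 10% relative lift:

```python
from math import sqrt, ceil
from statistics import NormalDist

def n_per_variant(p1, p2, alpha=0.05, power=0.80):
    """Visitors per variant for a two-sided two-proportion z-test."""
    z = NormalDist()
    za, zb = z.inv_cdf(1 - alpha / 2), z.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    num = (za * sqrt(2 * p_bar * (1 - p_bar))
           + zb * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p2 - p1) ** 2)

# The same 10% relative lift at two different baselines:
n_low  = n_per_variant(0.01, 0.011)   # 1% baseline -> 0.1pp absolute change
n_high = n_per_variant(0.05, 0.055)   # 5% baseline -> 0.5pp absolute change
```

The 1% baseline requires roughly five times the traffic per variant, which is exactly why testing higher up the funnel can be worthwhile.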

Decide what to test next, with real user signal

Sample size math is only useful if you are testing the right thing. FeatureVote helps product teams collect feature requests, prioritize feedback, and ship the changes customers actually want.