A/B Test Significance Calculator
Check whether the difference between two conversion rates is statistically significant — with p-value, z-score, and a plain-language recommendation.
Sample size calculator
Enter a baseline conversion rate, expected lift, and power to calculate the required sample size.
What is statistical significance in A/B testing?
Statistical significance tells you whether the difference between two conversion rates is likely a real effect or just random noise. A Z-test for proportions compares the conversion rates of Variant A and Variant B, accounting for sample size, and produces a p-value — the probability of seeing a difference this large (or larger) if there were actually no difference between the variants. If the p-value is below your significance threshold (e.g. 0.05 for 95% confidence), the result is considered statistically significant.
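As an illustration of the test described above, here is a minimal Python sketch of a pooled two-proportion z-test with a two-sided p-value. It is not necessarily how this calculator is implemented: the function name, the pooled standard error, the two-sided alternative, and the visitor and conversion counts are all assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(visitors_a, conversions_a, visitors_b, conversions_b):
    """Two-sided z-test for the difference between two conversion rates."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled rate: the best estimate of the shared rate if there is no real difference.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value: chance of a |z| at least this large if the rates are equal.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 500 of 10,000 visitors convert on A, 580 of 10,000 on B.
z, p_value = two_proportion_z_test(10_000, 500, 10_000, 580)
confidence = 0.95
significant = p_value < 1 - confidence  # threshold is 1 minus the confidence level
print(f"z = {z:.2f}, p = {p_value:.4f}, significant at 95%: {significant}")
```

The sketch does not guard against a zero standard error, which occurs when neither variant has any conversions or when every visitor converts.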
How to use the A/B Significance Calculator
Enter the number of visitors and conversions for Variant A and Variant B, choose a confidence level (90%, 95%, or 99%), and the calculator instantly shows conversion rates, lift, p-value, z-score, and whether the result is statistically significant. Use the Sample Size Calculator below to estimate how many visitors per variant you need before starting a test, based on your baseline conversion rate and the minimum lift you want to detect.
A/B testing best practices
Decide on your sample size and test duration before launching, using the Sample Size Calculator. Stopping a test early because it 'looks significant' inflates your false-positive rate (a problem known as 'peeking'). Run tests for at least one full business cycle (typically 1-2 weeks) to account for day-of-week effects. A 95% confidence level is the standard default; use 90% only for low-stakes tests, and reserve 99% for changes that are expensive or risky to roll back.
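To see why peeking is a problem, the following Python sketch simulates A/A tests, where both variants share the same true conversion rate, so any "significant" result is a false positive. It compares checking once at the planned end with checking at every interim look. The simulation parameters (500 runs, 2,000 visitors per variant, 10 looks, a 10% conversion rate) and the function names are assumptions chosen only for the demonstration.

```python
import random
from math import sqrt
from statistics import NormalDist


def p_value(n_a, conv_a, n_b, conv_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 1.0  # no conversions (or all converted): nothing to compare
    z = (conv_b / n_b - conv_a / n_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_aa_tests(runs=500, visitors=2000, looks=10, rate=0.10, alpha=0.05, seed=1):
    """Return (false-positive rate with peeking, false-positive rate without)."""
    rng = random.Random(seed)
    step = visitors // looks
    peeking_hits = 0  # significant at ANY look, i.e. the test would have been stopped
    final_hits = 0    # significant at the single planned final look
    for _ in range(runs):
        conv_a = conv_b = 0
        seen_significant = False
        last_significant = False
        for look in range(1, looks + 1):
            # Both variants draw from the same true rate: there is no real effect.
            conv_a += sum(rng.random() < rate for _ in range(step))
            conv_b += sum(rng.random() < rate for _ in range(step))
            n = look * step
            last_significant = p_value(n, conv_a, n, conv_b) < alpha
            seen_significant = seen_significant or last_significant
        peeking_hits += seen_significant
        final_hits += last_significant
    return peeking_hits / runs, final_hits / runs


peeking_rate, fixed_rate = simulate_aa_tests()
print(f"False positives with peeking: {peeking_rate:.1%}")
print(f"False positives at the planned end only: {fixed_rate:.1%}")
```

Because the final look is one of the looks, the peeking rate can never be lower than the fixed-horizon rate; in practice it is usually several times higher.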
FAQ
What is a p-value in A/B testing?
The p-value is the probability of observing a difference between Variant A and Variant B at least as large as the one measured, assuming there's actually no real difference between them. A smaller p-value means the observed data would be more surprising if the variants truly performed the same; it is not the probability that the result is due to chance. A p-value below 0.05 is conventionally treated as statistically significant at the 95% confidence level.
What does 'statistically significant' actually mean?
A result is statistically significant when the p-value falls below your chosen significance threshold (1 minus your confidence level). At 95% confidence, that threshold is 0.05. It means the observed difference between variants is unlikely to be explained by random sampling variation alone — but it doesn't guarantee the effect size will hold at the same magnitude in the future.
How many visitors do I need for an A/B test?
Use the Sample Size Calculator: enter your baseline conversion rate and the minimum lift you want to be able to detect (e.g. 10%), and it returns the required number of visitors per variant at your chosen statistical power (80% is the conventional choice). Smaller expected lifts and lower baseline conversion rates both require larger sample sizes.
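For a rough sense of where such numbers come from, here is a Python sketch of a common textbook approximation for the per-variant sample size of a two-sided two-proportion z-test. It is not necessarily the formula this calculator uses. The function name is hypothetical, and the sketch assumes the lift is relative to the baseline (a 10% lift on a 5% baseline means 5.5%).

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_lift, power=0.80, confidence=0.95):
    """Approximate visitors per variant for a two-sided two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)  # assumption: lift is relative, not absolute
    # Critical values for the significance threshold and the desired power.
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Hypothetical inputs: 5% baseline, 10% relative lift, 80% power, 95% confidence.
n = sample_size_per_variant(0.05, 0.10)
print(f"Visitors needed per variant: {n}")
```

Calling the function with a smaller lift or a lower baseline returns a larger number, which matches the rule of thumb above.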
Why does the calculator say my result isn't significant even though Variant B looks better?
A higher conversion rate alone doesn't mean the difference is real — with small sample sizes, random variation can easily produce a difference that looks meaningful but isn't statistically reliable. Collect more data (see the Sample Size Calculator) before drawing conclusions.
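The effect of sample size can be shown with a short Python sketch that runs the same pooled two-proportion z-test on identical conversion rates (5% vs 6%) at two traffic levels. The counts are made up for illustration, and the helper name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def p_value(visitors_a, conversions_a, visitors_b, conversions_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (conversions_b / visitors_b - conversions_a / visitors_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same observed rates (5% vs 6%), very different amounts of evidence.
small = p_value(1_000, 50, 1_000, 60)
large = p_value(100_000, 5_000, 100_000, 6_000)

print(f"1,000 visitors per variant:   p = {small:.3f}")
print(f"100,000 visitors per variant: p = {large:.3g}")
```

With 1,000 visitors per variant the p-value stays well above 0.05, while with 100,000 visitors per variant the same one-point gap is far below it.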
What confidence level should I use — 90%, 95%, or 99%?
95% is the standard default for most A/B tests. Use 90% only for low-risk, easily reversible tests where you're comfortable with a higher chance of a false positive. Use 99% for high-stakes changes — such as pricing or checkout flow changes — where acting on a false positive would be costly.