By: Deborah O'Malley, M.Sc | Last updated January 2023
Google the question, "How long should you let an A/B test run?" and you'll get a variety of responses. Most of them are incorrect.
In actuality, how long you need to run your A/B test is determined by your sample size requirements.
To run a properly-powered test, you need to begin by calculating your sample size requirements AHEAD of running the study.
You can use a sample size calculator, like this one, to calculate your required sample size. (See this GuessTheTest article on how to best use the calculator).
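If you'd like to see the math behind these calculators, here's a minimal Python sketch using statsmodels. All the inputs (a 5% baseline conversion rate, a 20% relative lift you want to detect, 95% confidence, and 80% power) are illustrative assumptions, not figures from this article; swap in your own numbers:

```python
# Minimal sample-size sketch for a two-proportion A/B test.
# Assumed (illustrative) inputs: 5% baseline conversion, 20% relative lift,
# alpha = 0.05, power = 0.80 -- replace with your own figures.
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

baseline_rate = 0.05                      # current conversion rate (control)
expected_rate = baseline_rate * 1.20      # rate you hope the variant achieves

# Cohen's h effect size for two proportions
effect_size = proportion_effectsize(expected_rate, baseline_rate)

# Visitors required PER VARIANT for a two-sided, two-sample z-test
n_per_variant = NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,        # 5% false-positive risk (95% confidence)
    power=0.80,        # 80% chance of detecting the lift if it's real
    ratio=1.0,         # equal traffic split between A and B
    alternative="two-sided",
)

print(f"Required sample size: ~{round(n_per_variant):,} visitors per variant")
```

With these assumed numbers, the result lands in the thousands of visitors per variant, which is why knowing your traffic level matters so much for the next step.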
You can then calculate approximately how long it'll take to reach this sample size requirement. This calculation can be most easily done by using a test duration calculator like this one.
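The duration math itself is just division: the total sample size you need, spread across the average daily traffic entering the test. Here's a rough sketch, assuming a hypothetical 1,500 eligible visitors per day and an illustrative per-variant requirement carried over from the sketch above:

```python
import math

# All figures below are illustrative assumptions, not numbers from this article.
n_per_variant = 8_000          # e.g. roughly the output of the sample-size sketch above
num_variants = 2               # control + one variant
daily_visitors = 1_500         # assumed average daily traffic entering the test

total_sample_needed = n_per_variant * num_variants
days_needed = math.ceil(total_sample_needed / daily_visitors)

print(f"Estimated test duration: ~{days_needed} days")
# With these assumed numbers, that's roughly 11 days -- in practice, round up
# to at least 2 full weeks so every day of the week is represented.
```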
Once calculated, don't stop your test ahead of reaching the pre-calculated sample size requirement – even if results appear significant sooner.
Prematurely declaring a winner, or loser, before meeting sample size requirements is a dangerous testing practice that can cause you to make incorrect calls before the results are fully fleshed out.
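To see why peeking is so dangerous, consider an A/A test where both versions are identical, so there is no real winner to find. The rough simulation sketch below (illustrative assumptions throughout: 500 visitors per variant per day, a 5% conversion rate, daily significance checks over 4 weeks) shows how stopping at the first "significant" result produces far more false winners than the 5% you'd expect:

```python
# Rough simulation of the "peeking" problem: both variants are IDENTICAL,
# yet checking significance daily and stopping early declares far more than
# 5% false "winners". All numbers here are illustrative assumptions.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)
true_rate = 0.05           # same conversion rate for A and B (an A/A test)
daily_visitors = 500       # per variant, per day (assumed)
test_days = 28
n_simulations = 2_000

false_positives = 0
for _ in range(n_simulations):
    conv_a = conv_b = n = 0
    for day in range(test_days):
        n += daily_visitors
        conv_a += rng.binomial(daily_visitors, true_rate)
        conv_b += rng.binomial(daily_visitors, true_rate)
        p_a, p_b = conv_a / n, conv_b / n
        pooled = (conv_a + conv_b) / (2 * n)
        se = np.sqrt(2 * pooled * (1 - pooled) / n)
        if se > 0:
            z = (p_b - p_a) / se
            p_value = 2 * (1 - norm.cdf(abs(z)))
            if p_value < 0.05:          # "peek" and stop at first significance
                false_positives += 1
                break

print(f"False-positive rate with daily peeking: {false_positives / n_simulations:.1%}")
# Typically prints a rate several times the nominal 5%.
```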
Assuming sample size requirements can be met, an A/B test should, on average, run between 2 and 6 weeks.
A 2-week timeframe ensures the test runs across all days of the week and smooths out data discrepancies caused by shifts in consumer behavior, for example, over the weekend.
Much longer than 6 weeks, and the data may start to become muddied.
Things like user patterns may shift, or cookies may get deleted, introducing a whole new set of variables into the equation. As a result, you won't know whether it's changing user behavior or something else that's contributing to the test results.
That said, test timing depends not only on sample size requirements, but also on the type of test you're running.
For example, an email test may run just once over 1 hour. As long as the test has a large enough email list to achieve properly-powered, statistically significant results, you're covered.
Other tests may need to run for different durations to take into account factors like seasonality or sales cycles.
In the end, how long your A/B test should run is an "it depends" scenario, but one that can be clearly calculated ahead of starting your study.
Do you have any questions or thoughts? Give your feedback in the comments section below: