
A simple way to accurately calculate Minimum Detectable Effect (MDE)

By: Deborah O'Malley | Last updated December, 2023


What is Minimum Detectable Effect (MDE)?

If you've been in experimentation long enough, you've likely come across the term MDE -- which stands for Minimum Detectable Effect.

The MDE sounds big and fancy, but the concept is actually quite simple when you break it down. It's the:

  • Minimum = smallest
  • Detectable = difference you can reliably see from running the experiment
  • Effect = conversion difference

Why is MDE important?

As this GuessTheTest article explains, in order to run a trustworthy experiment -- one that's properly powered, based on an adequate sample -- it's crucial you calculate the MDE.

But not just calculate it.

Calculate it AHEAD of running the experiment.

The problem is, doing so can feel like a tricky, speculative exercise.

After all, how can you possibly know what effect, or conversion lift you want to detect from the experiment?! If you knew that, you wouldn't need to run the experiment to begin with!

Adding insult to injury, things get even hazier because the MDE is directly tied to your sample size requirements.

The larger the MDE, the smaller the sample size needed to run your experiment. And vice versa. The smaller the MDE, the bigger the sample required for your experiment to be adequately powered.

But if your sample size requirements are tied into your MDE, and you don't know your MDE, how can you possibly know the required sample size either?

The answer is: you calculate them. Both. At the same time.

There are lots of head-spinning ways to do so. This article outlines a few.
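If you're curious what the math under the hood looks like, here's a minimal sketch in Python using the standard normal-approximation formula for comparing two proportions. The function name is made up for illustration, and every pre-test calculator bakes in slightly different assumptions, so treat the numbers as ballpark figures rather than a definitive implementation:

```python
from statistics import NormalDist

def sample_size_per_variant(baseline_rate, relative_mde,
                            confidence=0.95, power=0.80):
    """Rough visitors needed per variant for a two-proportion z-test.

    Illustrative sketch only: real pre-test calculators make slightly
    different assumptions, so their outputs won't match exactly.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)  # the lifted rate you hope to detect
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2  # pooled rate under the null hypothesis
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return numerator / (p2 - p1) ** 2

# Smaller MDE --> much bigger sample, and vice versa:
print(round(sample_size_per_variant(0.0042, 0.4643)))  # tens of thousands
print(round(sample_size_per_variant(0.0042, 0.05)))    # well over a million
```

Notice how shrinking the relative MDE from roughly 46% to 5% balloons the required sample, which is exactly the trade-off described above.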

But, if you're not mathematically inclined, here's the good news. . .

You can use a pre-test analysis calculator, like this one, to do all the hard work for you:

Now, as mentioned, that's the good news!

The bad news is, even a calculator like this one isn't all that intuitive.

So, to help you out, this article breaks down exactly what you need to input into an MDE calculator, with step-by-step directions and screenshots so you'll be completely clear and feel fully confident every step of the way.

Let's dig in:


Working the MDE calculator

To work this calculator, you’ll need to know your average weekly traffic and conversion numbers.

If you’re using an analytics platform, like Google Analytics, you’ll be able to easily find this data by looking at your traffic and conversion trends.

Users

In Google’s current Universal Analytics, traffic data can be obtained by going to the Audience/Overview tab:

It’s typically best to take a snapshot of at least 3 months to get a broader view of your audience over time.

For this example, let’s set our time frame from June 1 - Aug. 31.

Now, you can decide to look at these numbers three ways:

  • Users: the total number of users, or visitors, coming to your site during the date range.
  • New users: those visitors who come to your site for the first time during that date range.
  • Sessions: users who interact with your website within a particular timeframe. As this article explains, the same user can have multiple sessions on your website.

Given these differences, calculating the total number of users will probably give you the most accurate indication of your traffic trends.

With these data points in mind, over the 3-month period, this site saw 67,678 users. There are typically about 13 weeks in 3 months, so to calculate users per week, you’d divide 67,678/13 = 5,206.

In other words, the site received about 5,206 users/week.

You’d then plug this number into the calculator.

Conversions

To calculate the number of conversions over this time period, you’ll need to have already set up conversion goals in Google Analytics. Here’s more information on how to do so.

Assuming you’ve set up conversion goals, you’ll next assess the number of conversions by going to the Goals/Overview tab, selecting the conversion goal you want to measure for your test, and viewing the number of conversions:

In this example, there were 287 conversions over the 3-month time period, which amounts to an average of 287/13 ≈ 22 conversions/week.

Now, imagine you want to test two variants: version A (the control, or original version) and B (the variant).

You’d now plug the traffic, conversion, and variant numbers into the calculator:

Now you can calculate your baseline conversion rate, which is the rate at which your current (control) version is converting.

This calculator will automatically calculate your baseline conversion rate for you, based on the numbers above.

However, if you want to confirm the calculation, simply divide the number of goals completed by the traffic: in this case, 22 conversions per week / 5,206 visitors per week (22/5,206 = 0.0042). To get a percentage, multiply this amount by 100 (0.0042 × 100 = 0.42%).

You’d end up with a baseline conversion rate of 0.42%:
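The arithmetic above can be double-checked in a few lines. Everything here comes straight from the example figures; nothing is specific to any particular tool:

```python
# Baseline conversion rate from the example figures above.
total_users = 67_678       # users over the 3-month window
total_conversions = 287    # goal completions over the same window
weeks = 13                 # roughly 13 weeks in 3 months

users_per_week = total_users / weeks              # = 5,206
conversions_per_week = total_conversions / weeks  # ≈ 22

baseline_rate = conversions_per_week / users_per_week
print(f"{baseline_rate:.2%}")  # → 0.42%
```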

Next, plug in the confidence level and power at which you want to obtain results.

As a general A/B testing best practice, you want a confidence level of at least 95% and statistical power of at least 80%:

Based on these numbers, the pre-test sample size calculator indicates that you’ll want to run your test for:

  • At least 6 weeks
  • With at least 15,618 visitors/variant
  • Based on a relative MDE of at least 46.43%
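You can also run the relationship in the other direction: fix the sample size you can realistically reach, and ask what relative lift that sample can detect. The sketch below is a hypothetical helper using a simplified normal approximation, not the calculator's actual formula, which is why it lands near, but not exactly on, the 46.43% figure above:

```python
from statistics import NormalDist

def detectable_relative_mde(baseline_rate, n_per_variant,
                            confidence=0.95, power=0.80):
    """Approximate smallest relative lift a test of this size can detect.

    Simplified sketch; real calculators use their own internals.
    """
    z = (NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided alpha
         + NormalDist().inv_cdf(power))
    std_err = (2 * baseline_rate * (1 - baseline_rate) / n_per_variant) ** 0.5
    return z * std_err / baseline_rate

# 6 weeks at ~5,206 users/week, split across two variants = 15,618/variant.
print(f"{detectable_relative_mde(0.0042, 15_618):.1%}")  # ≈ 49%, same ballpark
```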

The optimal MDE

As a very basic rule of thumb, some optimization experts, like Ronny Kohavi, suggest setting the relative MDE in the range of 2-5%.

It's important to note that the upper bound of this range is a maximum of 5%.

If the experiment isn't powered enough to detect a 5% effect, the test results can't be considered trustworthy.

However, it's also dangerous to go much beyond 5% because, at least in Ronny's experience, most trustworthy tests don't yield more than a 5% relative conversion lift.

As such, for a mature testing organization with large amounts of traffic and an aggressive optimization program, a relative 1-2% MDE is more reasonable and is still reason to celebrate.

MDE guidelines to follow

In the example shown above, the relative MDE was 46.43%, which is clearly above the 5% best practice.

This MDE indicates traffic is on the lower side and your experiment may not be adequately powered to detect a meaningful effect in a reasonable timeframe.

In this case, if you do decide to proceed with running the test, make sure to follow these guidelines:

  • Calculate the sample size requirements ahead of time. Make sure you have enough traffic to reach the suggested sample size in an adequate timeframe.
  • Don't stop the experiment early before you've reached this calculated sample size target -- even if results appear significant earlier.
  • Run the test for the minimum testing period recommended by the calculator, or at the very least two weeks, to round out any discrepancies in user behavior.
  • Consider whether the test is truly worth running, and use the outcome only as an indicator of results, not gospel. Low sample sizes (traffic or conversion numbers) are tricky to test on.
  • Focus on making more pronounced changes that should, hopefully, create a bigger positive impact and have a larger effect on conversions.
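For the first and third guidelines, a quick sanity check is to convert the required sample into a test duration. Here's a hypothetical helper (the function name is illustrative) assuming traffic is split evenly across variants:

```python
from math import ceil

def weeks_needed(n_per_variant, weekly_traffic, variants=2):
    """Weeks required for every variant to reach the target sample size.

    Assumes an even traffic split across variants.
    """
    visitors_per_variant_per_week = weekly_traffic / variants
    return ceil(n_per_variant / visitors_per_variant_per_week)

# The example above: 15,618 visitors/variant at 5,206 users/week.
print(weeks_needed(15_618, 5_206))  # → 6 weeks
```

If the result comes back as months rather than weeks, that's a strong signal to revisit whether the test is worth running at all.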

Hope this article has been useful for you. Share your thoughts and comments below:

27 Comments
Tamas
2 years ago

A useful article on explaining what's MDE, loved reading it! 🙂 What's not clear to me is that is a 5% MDE a relative change (e.g. from 10% to 10.5%) or a 5 percentage point increase (e.g. from 10% to 15%)?

Tamas
2 years ago
Reply to  Deborah

Thanks for clarifying it! 🙂

When Ronny stated the below, which one was he referring to?

"However, it's also dangerous to go much beyond 5% because, at least in Ronny's experience, most trustworthy tests don't yield more than a 5% conversion lift."

Tamas
1 year ago
Reply to  Deborah

Hi Deborah,

Thank you again! One more question: in the article you write "If the experiment isn't powered enough to detect a 5% effect, the test results can't considered trustworthy."

In general, should we take MDE into consideration when deeming an AB test trustworthy or not? Let's say that the recommended test duration and sample size for each variant are reached, but the conversion uplift is below the MDE (e.g. MDE is 8%, but the conversion uplift is only 7%)? Should we accept the test results or not?

Tamas
1 year ago
Reply to  Deborah

Thank you, Deborah! 🙂 Just to make it absolutely sure that I understand it: if the pre-calculated sample size has been met, but the conversion uplift (7%) is below the pre-calculated MDE (8%), should I consider the test trustworthy or not? Also, a slightly related question: what if the test result is not significant (p-value is above the pre-defined threshold) and the control and the treatment variants convert similarly (let's say 5.74% and 5.76% respectively)? Can I accept the treatment as the winner? I know that the answer is no, but the reasoning could be that the 2 variants convert… Read more »

Tamas
1 year ago
Reply to  Deborah

Deborah - as always, thank you very much, I appreciate your insightful answers! Love your blog and keep up the good work!

Bruna
2 years ago

The article is really good! Congratulations! I have a question about sample size calculation for continuous metric. Do you have any reference on this please?

Bruna
2 years ago
Reply to  Deborah

For example, my experiment consists of optimizing billing. But when using this calculator (https://www.evanmiller.org/ab-testing/sample-size.html), which is great, the input parameters don't make sense anymore, because the "current effect" is, for example $100, and the "expected effect" is $200.

Bruna
2 years ago
Reply to  Deborah

That's right, perfect! Thank you so much!
I thought I should use another formula for calculating the sample size, like as "sample size for a two-sample t-test", so I asked about sample size for continuous metrics.

Angie
1 year ago

Hi Deborah I have a question about calculating the length of time to do the test. Based on the calculations using https://cxl.com/ab-test-calculator/. I've arrived at: At least 6 weeks With at least 72,717 visitors/variant Based on a relative MDE of at least 35.33% Is this test worth doing at all with MDE above 5% (which the recommendation is to keep it below 5%?) I think you suggested calculating the sample size requirements ahead of time. Do we then use this calculator to calculate sample size requirements? https://www.evanmiller.org/ab-testing/sample-size.html If so, after using the calculator, i've arrived at a sample size of… Read more »

AB Test Sample Size Calculators by CXL.png
Melanie
1 year ago

Hi Deborah, Thanks for this insightful article! This is probably a dumb question, but: Does it matter which MDE I enter into my a/b testing tool? Background : Ihave very little experience with a/b testing as I only started very recently. At our company, we use a tool that allows you to enter the following parameters before running your test: runtime, confidence level (which I kept at 95%), power (which I kept at 80%), and MDE. I used the calculator you linked in your article to calculate runtime based on visitor numbers and conversions and I aimed for a very… Read more »

John
8 months ago

Hello, I have a question about whether it is possible to carry out a AB test where there are 2 primary metrics to optimise for. e.g. I have a webpage where users can perform 2 actions, and I want to make a change on the webpage to see if the conversion rate for both actions increases significantly. I'm not sure how to plug the various values into a calculator. for example, my webpage gets 34802 visitors a week, and the baseline conversion rate for the 2 actions are 0.91% and 5.6% respectively. How can I calculate a sample size and… Read more »

Victoria
7 months ago

Hey Deborah, such a good article, thank you!
I wanted to ask you a question:
shown uplift 7% with 92% significance (running for 3 weeks) (required: 90% uplift)
MDE 9%
To get 7% uplift I should run a test for 5 weeks.
Should I extend the experiment for two more weeks?
What if I don't have time? Should I just stop it and say that it's flat? Or that there is a significant uplift, however, its power is lower?

henrique_miamoto@hotmail.com
3 months ago

Hey, congrats on the article! Very informative.

What are the calculations behind MDE shown in the calculator?
In other words, how do I get the MDE from the conversion rate, power, and confidence interval?

Thanks!

Maarten
15 days ago

Hi Deborah. Thanks for the informative article. I have a couple of questions. I wonder what your thoughts are on this. Lets say that you calculated the MDE and sample size ahead of time for a specific test. In this scenario the MDE was 5% with a minimum sample size of 80.000 per variant for a duration of 4 weeks. During the testing period it seems you have less traffic then expected but a higher effect. So after 4 weeks you found a significant result with a sample of  70.000 per variant and a 7% uplift. What would you do?… Read more »
