Minimum Detectable Effect Calculator
Calculate the smallest effect size your experiment can reliably detect based on your sample size and statistical parameters.
Comprehensive Guide: How to Calculate Minimum Detectable Effect (MDE)
The Minimum Detectable Effect (MDE) is a critical concept in experimental design that determines the smallest effect size your test can reliably detect given your sample size and statistical parameters. Understanding and calculating MDE is essential for designing powerful A/B tests, clinical trials, and other experimental studies.
Why MDE Matters in Experimental Design
Before diving into calculations, it’s important to understand why MDE is so crucial:
- Resource Allocation: Helps determine if your sample size is sufficient to detect meaningful changes
- Practical Significance: Ensures you’re testing for effects that matter in real-world applications
- Risk Management: Reduces the chance of false negatives (Type II errors)
- Cost-Benefit Analysis: Helps balance between test duration and detectable effect size
The Mathematical Foundation of MDE
The calculation of Minimum Detectable Effect relies on several statistical concepts:
- Baseline Conversion Rate (p): Your current conversion rate before the experiment
- Sample Size (n): Number of observations in each variation group
- Significance Level (α): Probability of false positive (typically 0.05)
- Statistical Power (1-β): Probability of detecting a true effect (typically 0.80)
- Test Type: One-sided or two-sided test
The core formula for MDE in a two-proportion z-test with equal group sizes is:
MDE = (Z_{1-α/2} + Z_{1-β}) × √[2p(1-p)/n]
Where the Z values are quantiles of the standard normal distribution. The factor of 2 appears because the test compares two groups, and the variance of the difference between two independent proportions is the sum of the two group variances. The formula also approximates the treatment group's variance by the baseline variance p(1-p), which is reasonable when the effect is small relative to the baseline. For a one-sided test, replace Z_{1-α/2} with Z_{1-α}.
Step-by-Step Calculation Process
- Determine Your Baseline: Start with your current conversion rate. For example, if your website has a 5% conversion rate, p = 0.05.
- Set Statistical Parameters: Choose your significance level (typically 0.05) and desired power (typically 0.80).
- Calculate Z-Scores: Find the Z-score for your significance level (Z_{1-α/2}) and for your power (Z_{1-β}). For α = 0.05 (two-sided) and power = 0.80, these are approximately 1.96 and 0.84 respectively.
- Compute Standard Error: Calculate the standard error of the difference between groups, SE = √[2p(1-p)/n], where n is your sample size per variation.
- Calculate MDE: Multiply the standard error by the sum of your Z-scores to get the absolute MDE.
- Convert to Relative Terms: Divide the absolute MDE by your baseline to get the relative lift percentage.
Practical Example Calculation
Let’s work through a concrete example with:
- Baseline conversion rate: 5% (0.05)
- Sample size per variation: 1,000
- Significance level: 0.05 (two-sided)
- Power: 0.80
Step 1: Calculate standard error
SE = √[2 × 0.05 × (1-0.05) / 1000] = √(0.095/1000) ≈ 0.00975
Step 2: Find Z-scores
Z_{1-α/2} = 1.96 (for 95% confidence)
Z_{1-β} = 0.84 (for 80% power)
Step 3: Calculate MDE
MDE = 0.00975 × (1.96 + 0.84) ≈ 0.0273 or 2.73 percentage points
Step 4: Convert to relative lift
Relative lift = (0.0273 / 0.05) × 100 ≈ 54.6%
This means with 1,000 visitors per variation, you can detect a minimum absolute lift of 2.73 percentage points (from 5% to 7.73%), which represents a 54.6% relative improvement.
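The calculation above can be sketched in Python. This is a minimal sketch rather than the calculator's actual implementation: it assumes equal group sizes, takes the standard error of the difference between groups as √[2p(1-p)/n], and uses the standard library's statistics.NormalDist for the Z-scores. The function name mde_two_proportions is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def mde_two_proportions(p, n, alpha=0.05, power=0.80, two_sided=True):
    """Absolute MDE for a two-proportion z-test with n observations per group."""
    if not 0 < p < 1 or n <= 0:
        raise ValueError("p must be in (0, 1) and n must be positive")
    std_normal = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)   # about 1.96 for two-sided alpha = 0.05
    z_beta = std_normal.inv_cdf(power)       # about 0.84 for power = 0.80
    se_diff = sqrt(2 * p * (1 - p) / n)      # assumed SE of the difference between groups
    return (z_alpha + z_beta) * se_diff


absolute = mde_two_proportions(p=0.05, n=1000)
relative = absolute / 0.05
print(f"absolute MDE: {absolute:.4f}, relative lift: {relative:.1%}")
# prints: absolute MDE: 0.0273, relative lift: 54.6%
```

Passing two_sided=False switches to the one-sided critical value, which lowers the MDE slightly for the same sample size.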
| Sample Size per Variation | Baseline 2% | Baseline 5% | Baseline 10% | Baseline 20% |
|---|---|---|---|---|
| 500 | 2.5% (124%) | 3.9% (77%) | 5.3% (53%) | 7.1% (35%) |
| 1,000 | 1.8% (88%) | 2.7% (55%) | 3.8% (38%) | 5.0% (25%) |
| 2,000 | 1.2% (62%) | 1.9% (39%) | 2.7% (27%) | 3.5% (18%) |
| 5,000 | 0.8% (39%) | 1.2% (24%) | 1.7% (17%) | 2.2% (11%) |
Note: Values show absolute lift in percentage points (relative lift in parentheses) for 80% power and a two-sided 5% significance level, computed as (1.96 + 0.84) × √[2p(1-p)/n] and rounded.
Common Mistakes in MDE Calculation
Avoid these pitfalls when working with Minimum Detectable Effect:
- Ignoring Baseline Rate: MDE is highly sensitive to your baseline conversion rate. A 2 percentage point absolute lift means something very different for a 5% baseline (a 40% relative lift) vs. a 50% baseline (a 4% relative lift).
- Confusing Absolute and Relative: Always clarify whether you’re discussing absolute percentage point changes or relative percentage changes.
- Neglecting Test Type: One-sided tests have different critical values than two-sided tests (about 1.645 vs. 1.96 at α = 0.05), affecting your MDE.
- Overlooking Multiple Testing: If you’re running multiple comparisons, you may need to adjust your significance level (e.g., Bonferroni correction), which raises the MDE.
- Assuming Equal Variance: The standard formula assumes equal sample sizes and variances between groups. Violations can affect accuracy.
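To make the test-type and multiple-testing pitfalls concrete, the sketch below compares critical Z-values and the resulting MDE. It assumes a 5% baseline with 1,000 visitors per group as an illustrative scenario, the two-group standard error √[2p(1-p)/n], and a plain Bonferroni split of α. The helper name critical_z is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def critical_z(alpha=0.05, two_sided=True, comparisons=1):
    """Critical z-value, with an optional Bonferroni split of alpha."""
    adjusted_alpha = alpha / comparisons          # Bonferroni: divide alpha by the number of tests
    tail = adjusted_alpha / 2 if two_sided else adjusted_alpha
    return NormalDist().inv_cdf(1 - tail)


z_power = NormalDist().inv_cdf(0.80)
se_diff = sqrt(2 * 0.05 * 0.95 / 1000)   # illustrative: 5% baseline, 1,000 per group

for label, z_crit in [
    ("two-sided", critical_z()),
    ("one-sided", critical_z(two_sided=False)),
    ("two-sided, 3 comparisons", critical_z(comparisons=3)),
]:
    print(f"{label}: z = {z_crit:.3f}, MDE = {(z_crit + z_power) * se_diff:.4f}")
```

A larger critical value always means a larger MDE at a fixed sample size, so a correction for three comparisons makes the test less sensitive than a single two-sided test.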
Advanced Considerations
Unequal Sample Sizes
When your variations have different sample sizes, the MDE calculation becomes more complex. The standard error term changes to:
SE = √[p1(1-p1)/n1 + p2(1-p2)/n2]
Where p1 and p2 are the conversion rates and n1 and n2 are the sample sizes for each group.
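A rough sketch of the unequal-allocation case follows. Because p2 depends on the effect being solved for, this version makes the simplifying assumption that both groups have the baseline variance p(1-p); an exact solution would need iteration. The function name mde_unequal_groups and the traffic split are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def mde_unequal_groups(p, n_control, n_treatment, alpha=0.05, power=0.80):
    """Approximate absolute MDE when the two groups differ in size."""
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    # Simplifying assumption: both groups have variance p(1-p), i.e. p1 ≈ p2 ≈ p.
    se_diff = sqrt(p * (1 - p) / n_control + p * (1 - p) / n_treatment)
    return z_sum * se_diff


balanced = mde_unequal_groups(0.05, 1000, 1000)
skewed = mde_unequal_groups(0.05, 1800, 200)   # same total traffic, 90/10 split
print(f"50/50 split: {balanced:.4f}, 90/10 split: {skewed:.4f}")
```

For the same total traffic, the smaller group dominates the standard error, so an uneven split yields a noticeably larger MDE than an even one.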
Non-Normal Distributions
For very small sample sizes or extreme conversion rates (near 0% or 100%), the normal approximation may not hold. In these cases, consider:
- Exact binomial tests
- Fisher’s exact test
- Bayesian approaches
Sequential Testing
If you’re peeking at results during the test (sequential testing), you need to adjust your significance thresholds to maintain overall Type I error rates. Methods include:
- O’Brien-Fleming boundaries
- Pocock boundaries
- Alpha spending functions
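As a hedged illustration of alpha spending, the sketch below evaluates the Lan-DeMets spending functions that approximate O'Brien-Fleming and Pocock boundaries. It shows only how much cumulative Type I error is spent by each information fraction t (0 < t ≤ 1). Turning these amounts into per-look critical values requires further numerical work that is not shown. The function names are illustrative.

```python
from math import e, log, sqrt
from statistics import NormalDist


def obrien_fleming_spend(t, alpha=0.05):
    """Cumulative two-sided alpha spent at information fraction t (O'Brien-Fleming type)."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)
    return 2 * (1 - std_normal.cdf(z / sqrt(t)))


def pocock_spend(t, alpha=0.05):
    """Cumulative alpha spent at information fraction t (Pocock type)."""
    return alpha * log(1 + (e - 1) * t)


for fraction in (0.25, 0.50, 0.75, 1.00):
    print(f"t = {fraction:.2f}: OBF {obrien_fleming_spend(fraction):.4f}, "
          f"Pocock {pocock_spend(fraction):.4f}")
```

Both functions spend the full α by the end of the test. The O'Brien-Fleming type spends very little early on, while the Pocock type spends it more evenly across looks.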
| Method | When to Use | Pros | Cons |
|---|---|---|---|
| Normal Approximation | Large samples, rates between 10-90% | Simple to calculate, widely understood | Less accurate for extreme rates or small samples |
| Exact Binomial | Small samples, extreme conversion rates | Precise, no approximation errors | Computationally intensive, harder to explain |
| Bayesian | When prior information is available | Incorporates prior knowledge, intuitive interpretation | Requires specifying priors, less familiar to some |
| Permutation Tests | Non-normal data, small samples | No distributional assumptions, exact p-values | Computationally expensive, harder to implement |
Applying MDE in Different Fields
Digital Marketing and A/B Testing
In website optimization, MDE helps determine:
- How long to run a test to detect meaningful improvements
- Whether your traffic volume supports testing small changes
- How to prioritize tests based on detectable effect sizes
For example, if your MDE is 5% relative lift but you’re testing a minor button color change that you expect to improve conversions by only 2%, you either need more traffic or should focus on more impactful changes.
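The "more traffic" side of that trade-off can be estimated by inverting the MDE formula for sample size. The sketch below assumes a 5% baseline conversion rate (not stated in the example above), equal group sizes, a two-sided test, and the same variance p(1-p) in both groups. The function name sample_size_per_group is illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p, relative_lift, alpha=0.05, power=0.80):
    """Observations per variation needed to detect a given relative lift."""
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    absolute_lift = p * relative_lift
    # Assumes two equally sized groups (hence the factor 2), each with variance p(1-p).
    return ceil(2 * p * (1 - p) * z_sum ** 2 / absolute_lift ** 2)


n_for_5_percent = sample_size_per_group(p=0.05, relative_lift=0.05)
n_for_2_percent = sample_size_per_group(p=0.05, relative_lift=0.02)
print(n_for_5_percent, n_for_2_percent)
```

Under these assumptions, a 5% relative lift needs roughly 119,000 visitors per variation and a 2% relative lift roughly 746,000. Required traffic grows with the inverse square of the effect you want to detect.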
Clinical Trials
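In medical research, the MDE is closely related to, but distinct from, the “minimal clinically important difference” (MCID). The MCID is the smallest treatment effect that would change clinical practice, while the MDE is the smallest effect the trial design can reliably detect. Trials are typically sized so that the MDE is no larger than the MCID, and regulatory bodies often require justification of the chosen effect size in trial designs.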
Manufacturing and Quality Control
In industrial settings, MDE helps determine:
- The smallest defect rate increase that can be detected
- Sample sizes needed for quality assurance testing
- Sensitivity of process monitoring systems
Tools and Software for MDE Calculation
While our calculator provides a quick solution, several specialized tools exist:
- Evan’s Awesome A/B Tools: Free online calculator with visualizations
- Optimizely Sample Size Calculator: Integrated with their experimentation platform
- R/Python Packages:
  - R: pwr package
  - Python: statsmodels and scipy.stats
- G*Power: Comprehensive power analysis software
- PASS: Commercial statistical power analysis software
Interpreting and Communicating MDE Results
Effective communication of MDE is crucial for stakeholder buy-in:
- Contextualize the Number: Always present MDE in both absolute and relative terms, and relate it to business metrics. Example: “We can detect a 2 percentage point increase (40% relative lift), which would mean approximately 500 more conversions per month.”
- Visualize the Range: Use charts to show the relationship between sample size and detectable effect.
- Discuss Practical Significance: Help stakeholders understand whether the detectable effect is meaningful for business decisions.
- Highlight Limitations: Be clear about what effects you cannot detect with your current design.
Frequently Asked Questions
How does MDE relate to statistical power?
MDE and statistical power trade off against each other when sample size and significance level are held constant. Increasing your desired power (e.g., from 80% to 90%) will increase your MDE, meaning you can only detect larger effects at that level of reliability. Conversely, if you’re willing to accept lower power, the MDE shrinks, but you are more likely to miss a true effect of that size.
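A short sketch makes the relationship visible. It assumes an illustrative 5% baseline with 1,000 visitors per group, a two-sided α of 0.05, and the two-group standard error √[2p(1-p)/n].

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()
se_diff = sqrt(2 * 0.05 * 0.95 / 1000)        # illustrative: 5% baseline, 1,000 per group
z_alpha = std_normal.inv_cdf(1 - 0.05 / 2)    # two-sided alpha = 0.05

mde_by_power = {
    power: (z_alpha + std_normal.inv_cdf(power)) * se_diff
    for power in (0.70, 0.80, 0.90, 0.95)
}
for power, mde in mde_by_power.items():
    print(f"power {power:.0%}: MDE = {mde:.4f}")
```

With everything else fixed, each step up in required power pushes the MDE higher.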
Can I calculate MDE for continuous metrics like revenue?
Yes, the concept applies to continuous metrics as well. For approximately normally distributed data, the standard deviation σ takes the place of √[p(1-p)] in the calculation. The formula becomes:
MDE = (t_{1-α/2} + t_{1-β}) × σ × √(2/n)
Where σ is the standard deviation, n is the sample size per group, and the t-values come from the t-distribution (for large samples they are close to the corresponding Z-values).
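A minimal sketch of the continuous case follows. As a large-sample assumption, it substitutes standard normal quantiles for the t-quantiles, which keeps it within Python's standard library. The function name mde_continuous and the revenue figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mde_continuous(sigma, n, alpha=0.05, power=0.80):
    """Absolute MDE for a difference in means with n observations per group."""
    std_normal = NormalDist()
    # Large-sample assumption: normal quantiles stand in for the t-quantiles.
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    return z_sum * sigma * sqrt(2 / n)


# Hypothetical revenue-per-visitor metric with a standard deviation of 40.
print(f"{mde_continuous(sigma=40.0, n=5000):.2f}")
# prints: 2.24
```

The result is in the metric's own units, so here the design could detect a shift of about 2.24 in average revenue per visitor.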
How does test duration affect MDE?
Test duration indirectly affects MDE through its impact on sample size. Longer tests generally accumulate more samples, which reduces the MDE. However, other factors like seasonality or novelty effects may complicate longer tests.
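The effect of duration can be sketched by converting days into sample size. The example assumes steady daily traffic, no repeat visitors, a 5% baseline, and the two-group standard error √[2p(1-p)/n]. The function name mde_after_days and the traffic figure are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def mde_after_days(p, daily_visitors_per_group, days, alpha=0.05, power=0.80):
    """Absolute MDE once a test has run for the given number of days."""
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    n = daily_visitors_per_group * days          # assumes steady traffic, no repeat visitors
    return z_sum * sqrt(2 * p * (1 - p) / n)


for days in (7, 14, 28):
    print(f"{days} days: MDE = {mde_after_days(0.05, 500, days):.4f}")
```

Because MDE scales with 1/√n, running four times as long only halves the MDE, which is why extending a test gives diminishing returns.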
What’s the difference between MDE and the effect size I actually observe?
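MDE is what you could detect with your current design, while the observed effect size is what you actually measured in your experiment. The MDE is not the significance threshold: an observed effect is significant when it exceeds roughly Z_{1-α/2} × SE, which is about 70% of the MDE at α = 0.05 and 80% power. Your observed effect might be:
- Larger than MDE (statistically significant)
- Smaller than MDE (may or may not be statistically significant, depending on whether it clears the significance threshold)
- In the opposite direction (potential negative effect)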
How do I choose an appropriate MDE for my test?
Selecting an MDE involves both statistical and business considerations:
- Determine the smallest effect that would change your decision
- Consider your available sample size and test duration
- Balance between Type I and Type II error risks
- Consult historical data on effect sizes in your domain
- Get stakeholder input on what constitutes a meaningful change