Effect Size Calculator
Calculate Cohen’s d, Hedges’ g, or Glass’s Δ for your statistical analysis
Comprehensive Guide: How to Calculate Effect Size in Statistical Analysis
Effect size is a quantitative measure of the magnitude of an experimental effect, a critical complement to p-values in statistical analysis. While p-values indicate whether an effect exists, effect sizes reveal how strong it is, answering “How much?” rather than just “Is there an effect?”
This guide covers:
- Why effect size matters in research
- Three primary effect size metrics: Cohen’s d, Hedges’ g, and Glass’s Δ
- Step-by-step calculation methods with formulas
- Interpretation guidelines (small, medium, large effects)
- Common mistakes and how to avoid them
- Real-world examples across psychology, education, and medicine
1. Why Effect Size Matters More Than p-Values
The overreliance on p-values (a practice dubbed “statistical significance testing”) has contributed to the reproducibility crisis in science. Effect sizes address this by:
- Quantifying practical significance: A p-value of 0.04 doesn’t tell you if the effect is meaningful. An effect size of d = 0.8 does.
- Enabling meta-analyses: Effect sizes allow combining results across studies (e.g., in systematic reviews).
- Sample size planning: Power analyses require effect size estimates to determine necessary sample sizes.
- Comparing across domains: Standardized effect sizes (like Cohen’s d) allow comparisons between studies using different scales.
2. Three Key Effect Size Metrics for Mean Differences
| Metric | Formula | When to Use | Advantages | Limitations |
|---|---|---|---|---|
| Cohen’s d | d = (M₁ − M₂) / s_pooled | Comparing two groups with similar variances | Most common; easy to interpret | Biased in small samples; assumes equal variances |
| Hedges’ g | g = d × J, where J = 1 − 3 / (4·df − 1) | Small sample sizes (< 20 per group) | Corrects Cohen’s d bias; better for meta-analysis | Slightly more complex calculation |
| Glass’s Δ | Δ = (M₁ − M₂) / s_control | Unequal variances or control-group focus | Robust to heterogeneity; useful in education/medicine | Not symmetric; depends on which group is the “control” |
3. Step-by-Step Calculation Guide
3.1 Calculating Cohen’s d
- Compute the difference in means: subtract the mean of Group 2 from Group 1 (M₁ − M₂).
- Calculate the pooled standard deviation:
  - Equal variances assumed: s_pooled = √[((n₁ − 1)s₁² + (n₂ − 1)s₂²) / (n₁ + n₂ − 2)]
  - Unequal variances: the simple average of s₁ and s₂ is sometimes used, but Glass’s Δ (Section 3.3) is generally preferred.
- Divide the mean difference by s_pooled to get d.
Worked example:
| Metric | Treatment Group (n = 30) | Control Group (n = 30) |
|---|---|---|
| Mean (M) | 85.2 | 78.1 |
| Standard Deviation (s) | 12.4 | 11.8 |
| Pooled SD (s_pooled) | 12.1 | |
| Cohen’s d | 0.59 (medium effect) | |
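The steps above can be sketched in a few lines of Python (the function name `cohens_d` is ours, not from any particular library); plugging in the worked example’s numbers reproduces the table’s result:

```python
import math

def cohens_d(m1, s1, n1, m2, s2, n2):
    """Cohen's d using the pooled (equal-variance) standard deviation."""
    s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2)
                         / (n1 + n2 - 2))
    return (m1 - m2) / s_pooled

# Worked example: treatment vs. control, n = 30 each
d = cohens_d(85.2, 12.4, 30, 78.1, 11.8, 30)
print(round(d, 2))  # 0.59
```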
3.2 Calculating Hedges’ g
Hedges’ g adjusts Cohen’s d for small-sample bias using the correction factor J:
- Calculate Cohen’s d as above.
- Compute df = n₁ + n₂ − 2.
- Calculate J = 1 − 3 / (4·df − 1).
- Multiply d × J to get g.
Example: For the worked example above with n = 30 per group, df = 58, J ≈ 0.99, and g = 0.59 × 0.99 ≈ 0.58.
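The correction can be applied directly to a d you have already computed; a minimal sketch (the function name `hedges_g` is our own):

```python
def hedges_g(d, n1, n2):
    """Apply Hedges' small-sample correction factor J to Cohen's d."""
    df = n1 + n2 - 2
    j = 1 - 3 / (4 * df - 1)
    return d * j

# d = 0.59 from the worked example, n = 30 per group
g = hedges_g(0.59, 30, 30)
print(round(g, 2))  # 0.58
```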
3.3 Calculating Glass’s Δ
Glass’s Δ uses only the control group’s standard deviation, making it ideal for:
- Studies where the treatment may affect variability (e.g., therapies reducing symptom variability).
- Single-case designs with a control/comparison group.
Formula: Δ = (M_treatment − M_control) / s_control
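In code this is a one-line division; using the worked example’s numbers with the control group’s SD (function name `glass_delta` is ours):

```python
def glass_delta(m_treatment, m_control, s_control):
    """Glass's delta: standardize by the control group's SD only."""
    return (m_treatment - m_control) / s_control

delta = glass_delta(85.2, 78.1, 11.8)
print(round(delta, 2))  # 0.6
```

Note that Δ (0.60) is slightly larger than d (0.59) here because the control SD is smaller than the pooled SD.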
4. Interpreting Effect Sizes: Rules of Thumb
Jacob Cohen (1988) proposed benchmark interpretations for d-family effect sizes in behavioral sciences:
| Effect Size | Interpretation | Example (Education) |
|---|---|---|
| d = 0.2 | Small | 1-month gain in reading fluency |
| d = 0.5 | Medium | Half a standard deviation improvement in math scores |
| d = 0.8 | Large | One full grade level advancement |
Caveats:
- Benchmarks are context-dependent. A d = 0.3 might be large in physics but small in psychology.
- Always compare to meta-analytic distributions in your field.
- Confidence intervals (CIs) provide more information than point estimates. Our calculator includes 95% CIs.
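A 95% CI for d can be sketched with a standard large-sample approximation for the standard error of d; the function below is an illustrative helper, not a library API:

```python
import math

def d_confidence_interval(d, n1, n2, z=1.96):
    """Approximate 95% CI for Cohen's d via a common
    large-sample formula for its standard error."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    return d - z * se, d + z * se

# CI for the worked example (d = 0.59, n = 30 per group)
lo, hi = d_confidence_interval(0.59, 30, 30)
print(round(lo, 2), round(hi, 2))  # 0.07 1.11
```

Even with n = 30 per group, the interval is wide, which is exactly why reporting the CI matters.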
5. Common Mistakes and How to Avoid Them
- Ignoring directionality: Effect sizes can be negative (e.g., d = −0.4 indicates Group 2 scored higher). Always report the direction.
- Assuming equal variances: Use Welch’s adjustment or Glass’s Δ if Levene’s test shows unequal variances.
- Overinterpreting “large” effects: A d = 1.0 is only meaningful if the measure is valid and the study well-designed.
- Neglecting CIs: A d = 0.5 with a 95% CI of [−0.1, 1.1] is uninformative on its own; always report the interval.
- Mixing metrics: Don’t compare Cohen’s d (standardized mean difference) with η2 (variance explained).
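The equal-variance check and the directionality point above can be combined into one routine; this sketch uses a simple 2:1 variance-ratio heuristic as a rough stand-in for Levene’s test (the cutoff and the function name are our assumptions):

```python
import math
from statistics import mean, stdev

def robust_effect_size(x, y, max_var_ratio=2.0):
    """Pooled-SD Cohen's d when the sample variances are similar;
    otherwise Glass's delta, treating y as the control group.
    The 2:1 variance-ratio cutoff is a heuristic, not Levene's test."""
    n1, n2 = len(x), len(y)
    s1, s2 = stdev(x), stdev(y)
    diff = mean(x) - mean(y)
    if max(s1, s2) ** 2 / min(s1, s2) ** 2 > max_var_ratio:
        return "glass_delta", diff / s2
    s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2)
                         / (n1 + n2 - 2))
    return "cohens_d", diff / s_pooled

# Note the negative sign: group 2 scored higher (report the direction)
name, value = robust_effect_size([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(name, round(value, 2))  # cohens_d -0.63
```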
6. Advanced Topics
6.1 Effect Sizes for Non-Normal Data
For ordinal data or non-normal distributions, consider:
- Rank-biserial correlation (rrb): For Mann-Whitney U tests.
- Cliff’s Δ: A non-parametric effect size for group differences.
- Odds ratios (OR): For binary outcomes (e.g., treatment success vs. failure).
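Cliff’s Δ is simple enough to compute by hand: it is the proportion of cross-group pairs where the first group’s value is larger, minus the proportion where it is smaller. A minimal sketch (the function name is ours):

```python
def cliffs_delta(x, y):
    """Cliff's delta: P(x > y) minus P(x < y) over all cross-group
    pairs. Ranges from -1 to 1; 0 means the groups fully overlap."""
    greater = sum(1 for a in x for b in y if a > b)
    less = sum(1 for a in x for b in y if a < b)
    return (greater - less) / (len(x) * len(y))

print(round(cliffs_delta([3, 4, 5], [1, 2, 3]), 2))  # 0.89
```

Because it only compares ranks, no normality or equal-variance assumption is needed.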
6.2 Multilevel Models and Nested Data
For clustered data (e.g., students within classrooms), use:
- Multilevel Cohen’s d: Accounts for intraclass correlation (ICC).
- Design-adjusted effect sizes: Adjust for clustering in experimental designs.
Tools like R’s effectsize package or HLM software can compute these.
7. Real-World Applications
7.1 Education: Evaluating Tutoring Programs
A 2021 meta-analysis by the Institute of Education Sciences (IES) found that:
- One-on-one tutoring had an average d = 0.38 (small-to-medium).
- Small-group tutoring showed d = 0.22.
- Effects were larger for math (d = 0.41) than reading (d = 0.29).
7.2 Medicine: Clinical Trial Outcomes
The FDA often requires effect sizes for drug approvals. For example:
- A cholesterol drug might report a Glass’s Δ = 0.6 (using placebo group SD).
- Pain reduction studies often use d = 0.5 as a clinically meaningful threshold.
7.3 Psychology: Therapy Efficacy
A 2018 study in JAMA Psychiatry compared CBT vs. medication for anxiety:
- CBT: g = 0.78 (large effect).
- Medication: g = 0.52 (medium effect).
- Combined treatment: g = 0.91.
8. Tools and Resources
- Software:
  - R: the effectsize and compute.es packages.
  - Python: pingouin or scipy.stats.
  - SPSS/JASP: built-in effect size calculators.
- Online Calculators:
- Psychometrica (Cohen’s d, Hedges’ g).
- Campbell Collaboration (meta-analysis tools).
- Books:
- Statistical Power Analysis for the Behavioral Sciences (Cohen, 1988).
- The Handbook of Research Synthesis (Cooper et al., 2009).