How To Calculate An Effect Size

Effect Size Calculator

Calculate Cohen’s d, Hedges’ g, or Glass’s Δ for your statistical analysis

Comprehensive Guide: How to Calculate Effect Size in Statistical Analysis

Effect size is a quantitative measure of the magnitude of an experimental effect, and a critical complement to p-values in statistical analysis. While a p-value indicates whether an effect exists, an effect size reveals the strength of that effect, answering “How much?” rather than merely “Is there an effect?”

This guide covers:

  • Why effect size matters in research
  • Three primary effect size metrics: Cohen’s d, Hedges’ g, and Glass’s Δ
  • Step-by-step calculation methods with formulas
  • Interpretation guidelines (small, medium, large effects)
  • Common mistakes and how to avoid them
  • Real-world examples across psychology, education, and medicine

1. Why Effect Size Matters More Than p-Values

The overreliance on p-values (the practice of null hypothesis significance testing) has contributed to the reproducibility crisis in science. Effect sizes address this by:

  1. Quantifying practical significance: A p-value of 0.04 doesn’t tell you if the effect is meaningful. An effect size of d = 0.8 does.
  2. Enabling meta-analyses: Effect sizes allow combining results across studies (e.g., in systematic reviews).
  3. Sample size planning: Power analyses require effect size estimates to determine necessary sample sizes.
  4. Comparing across domains: Standardized effect sizes (like Cohen’s d) allow comparisons between studies using different scales.

Expert Consensus on Effect Size Reporting

The American Psychological Association (APA) mandates effect size reporting in its Publication Manual (7th ed.), stating:

“Always provide effect sizes… to convey the magnitude of effects, not just their statistical significance.” (APA, 2020, p. 180)

Similarly, the EQUATOR Network includes effect size reporting in guidelines like CONSORT and PRISMA.

2. Three Key Effect Size Metrics for Mean Differences

| Metric | Formula | When to Use | Advantages | Limitations |
| --- | --- | --- | --- | --- |
| Cohen’s d | d = (M₁ − M₂) / s_pooled | Comparing two groups with similar variances | Most common; easy to interpret | Biased in small samples; assumes equal variances |
| Hedges’ g | g = d × J, where J = 1 − 3 / (4df − 1) | Small sample sizes (< 20 per group) | Corrects Cohen’s d bias; better for meta-analysis | Slightly more complex calculation |
| Glass’s Δ | Δ = (M₁ − M₂) / s_control | Unequal variances or control-group focus | Robust to heterogeneity; useful in education/medicine | Not symmetric; depends on which group is the “control” |

3. Step-by-Step Calculation Guide

3.1 Calculating Cohen’s d

  1. Compute the difference in means: Subtract the mean of Group 2 from the mean of Group 1 (M₁ − M₂).
  2. Calculate the pooled standard deviation:
    • Equal variances assumed: s_pooled = √[((n₁ − 1)s₁² + (n₂ − 1)s₂²) / (n₁ + n₂ − 2)]
    • Unequal variances: use the average of s₁ and s₂, or switch to Glass’s Δ (Section 3.3).
  3. Divide the mean difference by s_pooled to get d (a code sketch follows the example below).
Example Calculation: Cohen’s d for a Reading Intervention Study

| Metric | Treatment Group (n = 30) | Control Group (n = 30) |
| --- | --- | --- |
| Mean (M) | 85.2 | 78.1 |
| Standard deviation (s) | 12.4 | 11.8 |

Pooled SD: s_pooled = 12.1. Cohen’s d = (85.2 − 78.1) / 12.1 = 0.59 (medium effect).
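A minimal Python sketch of these steps, using the summary statistics from the table above (the function name is ours):

```python
import math

def cohens_d(m1, s1, n1, m2, s2, n2):
    """Cohen's d from summary statistics, pooling the two SDs
    under the equal-variances assumption."""
    s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2)
                         / (n1 + n2 - 2))
    return (m1 - m2) / s_pooled

# Reading intervention example from the table above
d = cohens_d(85.2, 12.4, 30, 78.1, 11.8, 30)
print(f"d = {d:.2f}")  # d = 0.59
```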

3.2 Calculating Hedges’ g

Hedges’ g adjusts Cohen’s d for small-sample bias using the correction factor J:

  1. Calculate Cohen’s d as above.
  2. Compute df = n₁ + n₂ − 2.
  3. Calculate J = 1 − 3 / (4df − 1).
  4. Multiply: g = d × J.

Example: For the reading study above with n = 30 per group, J = 0.99 and g = 0.59 × 0.99 = 0.58.
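Continuing the sketch above, the correction is a one-line adjustment (function name is ours):

```python
def hedges_g(d, n1, n2):
    """Apply the small-sample correction factor J to Cohen's d."""
    df = n1 + n2 - 2
    j = 1 - 3 / (4 * df - 1)  # J from the formula above
    return d * j

g = hedges_g(0.59, 30, 30)
print(f"g = {g:.2f}")  # g = 0.58
```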

3.3 Calculating Glass’s Δ

Glass’s Δ uses only the control group’s standard deviation, making it ideal for:

  • Studies where the treatment may affect variability (e.g., therapies reducing symptom variability).
  • Single-case designs with a control/comparison group.

Formula: Δ = (M_treatment − M_control) / s_control
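In code this is even simpler; a sketch reusing the reading-study numbers (note that Δ differs slightly from d here because it divides by the control SD alone):

```python
def glass_delta(m_treatment, m_control, s_control):
    """Glass's delta: standardize by the control group's SD only."""
    return (m_treatment - m_control) / s_control

delta = glass_delta(85.2, 78.1, 11.8)
print(f"delta = {delta:.2f}")  # delta = 0.60
```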

4. Interpreting Effect Sizes: Rules of Thumb

Jacob Cohen (1988) proposed benchmark interpretations for d-family effect sizes in behavioral sciences:

| Effect Size | Interpretation | Example (Education) |
| --- | --- | --- |
| d = 0.2 | Small | About one month’s gain in reading fluency |
| d = 0.5 | Medium | Half a standard deviation improvement in math scores |
| d = 0.8 | Large | Roughly one full grade level of advancement |
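A small helper that applies these benchmarks, as our calculator does. The cutoffs between the labeled points (e.g., treating 0.2 ≤ |d| < 0.5 as “small”) are a common convention rather than part of Cohen’s original text:

```python
def interpret_d(d):
    """Label |d| using Cohen's (1988) benchmarks."""
    magnitude = abs(d)
    if magnitude < 0.2:
        return "negligible"
    if magnitude < 0.5:
        return "small"
    if magnitude < 0.8:
        return "medium"
    return "large"

print(interpret_d(0.59))  # medium
```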

Caveats:

  • Benchmarks are context-dependent. A d = 0.3 might be large in physics but small in psychology.
  • Always compare to meta-analytic distributions in your field.
  • Confidence intervals (CIs) provide more information than point estimates. Our calculator includes 95% CIs.
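To reproduce an interval like the one our calculator reports, a common large-sample approximation for the standard error of d can be used. This is a sketch, not the exact method (noncentral-t intervals are more precise):

```python
import math

def ci_for_d(d, n1, n2, z=1.96):
    """Approximate 95% CI for d via the usual large-sample SE:
    SE = sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    return d - z * se, d + z * se

low, high = ci_for_d(0.59, 30, 30)
print(f"95% CI [{low:.2f}, {high:.2f}]")  # 95% CI [0.07, 1.11]
```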

5. Common Mistakes and How to Avoid Them

  1. Ignoring directionality: Effect sizes can be negative (e.g., d = −0.4 indicates Group 2 scored higher). Always report the direction.
  2. Assuming equal variances: Use Welch’s adjustment or Glass’s Δ if Levene’s test shows unequal variances.
  3. Overinterpreting “large” effects: A d = 1.0 is only meaningful if the measure is valid and the study well-designed.
  4. Neglecting CIs: A d = 0.5 with a 95% CI [−0.1, 1.1] is uninformative. Our calculator includes CIs.
  5. Mixing metrics: Don’t compare Cohen’s d (a standardized mean difference) with η² (a proportion of variance explained).

6. Advanced Topics

6.1 Effect Sizes for Non-Normal Data

For ordinal data or non-normal distributions, consider:

  • Rank-biserial correlation (r_rb): For Mann-Whitney U tests.
  • Cliff’s Δ: A non-parametric effect size for group differences (a sketch follows this list).
  • Odds ratios (OR): For binary outcomes (e.g., treatment success vs. failure).
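Cliff’s Δ is simple enough to compute directly: it is the probability that a value from one group exceeds a value from the other, minus the reverse. A minimal sketch (function name is ours):

```python
def cliffs_delta(x, y):
    """Cliff's delta: P(x > y) - P(x < y) over all cross-group pairs.
    Ranges from -1 to +1; 0 means complete overlap."""
    greater = sum(1 for xi in x for yi in y if xi > yi)
    less = sum(1 for xi in x for yi in y if xi < yi)
    return (greater - less) / (len(x) * len(y))

print(cliffs_delta([3, 4, 5, 6], [1, 2, 3, 4]))  # 0.75
```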

6.2 Multilevel Models and Nested Data

For clustered data (e.g., students within classrooms), use:

  • Multilevel Cohen’s d: Accounts for intraclass correlation (ICC).
  • Design-adjusted effect sizes: Adjust for clustering in experimental designs.

Tools like R’s effectsize package or HLM software can compute these.
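As a rough illustration of why clustering matters, the classic design effect DEFF = 1 + (m − 1) × ICC shrinks the effective sample size behind any effect size estimate. A sketch (the full multilevel-d formulas are more involved and best left to the packages above):

```python
def effective_n(n, cluster_size, icc):
    """Effective sample size after deflating by the design effect
    DEFF = 1 + (m - 1) * ICC for clusters of size m."""
    deff = 1 + (cluster_size - 1) * icc
    return n / deff

# 300 students in classrooms of 25 with ICC = 0.10
print(f"{effective_n(300, 25, 0.10):.0f}")  # 88
```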

7. Real-World Applications

7.1 Education: Evaluating Tutoring Programs

A 2021 meta-analysis by the Institute of Education Sciences (IES) found that:

  • One-on-one tutoring had an average d = 0.38 (small-to-medium).
  • Small-group tutoring showed d = 0.22.
  • Effects were larger for math (d = 0.41) than reading (d = 0.29).

7.2 Medicine: Clinical Trial Outcomes

The FDA often requires effect sizes for drug approvals. For example:

  • A cholesterol drug might report a Glass’s Δ = 0.6 (using placebo group SD).
  • Pain reduction studies often use d = 0.5 as a clinically meaningful threshold.

7.3 Psychology: Therapy Efficacy

A 2018 study in JAMA Psychiatry compared CBT vs. medication for anxiety:

  • CBT: g = 0.78 (large effect).
  • Medication: g = 0.52 (medium effect).
  • Combined treatment: g = 0.91.

8. Tools and Resources

  • Software:
    • R: effectsize, compute.es packages.
    • Python: pingouin or scipy.stats.
    • SPSS/JASP: Built-in effect size calculators.
  • Online Calculators: The calculator at the top of this page computes Cohen’s d, Hedges’ g, and Glass’s Δ with 95% CIs.
  • Books:
    • Statistical Power Analysis for the Behavioral Sciences (Cohen, 1988).
    • The Handbook of Research Synthesis (Cooper et al., 2009).
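For a quick check against the hand calculations above, pingouin exposes a single entry point for standardized mean differences. A sketch assuming pingouin is installed (pip install pingouin), using simulated data matching the reading study’s parameters:

```python
import numpy as np
import pingouin as pg

rng = np.random.default_rng(42)
treatment = rng.normal(85.2, 12.4, 30)  # simulated treatment scores
control = rng.normal(78.1, 11.8, 30)    # simulated control scores

print(pg.compute_effsize(treatment, control, eftype="cohen"))
print(pg.compute_effsize(treatment, control, eftype="hedges"))
```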

Key Takeaways from the National Academies

The National Academies of Sciences, Engineering, and Medicine (2019) emphasizes:

“Effect sizes, confidence intervals, and other statistical measures of uncertainty should be reported for all primary outcomes… to enable meta-analysis and improve reproducibility.” (p. 102)

Their report highlights that:

  • 60% of psychology studies fail to report effect sizes.
  • Effect sizes are 3× more likely to be replicated than p-values alone.
  • Journal editors increasingly require effect size reporting (e.g., Psychological Science since 2014).
