Power analysis essentials in medical statistics

Power analysis determines the sample size you need to detect a meaningful effect in your research. Get it wrong, and you risk wasting resources on underpowered studies—or missing important discoveries entirely.

This evidence-based guide covers everything from core concepts to practical implementation, backed by insights from peer-reviewed research.


The Four Pillars of Power Analysis

Every power calculation balances four interconnected parameters. Know three, and you can calculate the fourth:

| Parameter | What It Means | Standard Value |
|---|---|---|
| Effect Size | How big is the difference you expect to find? | Cohen’s d: 0.2 (small), 0.5 (medium), 0.8 (large) |
| Alpha (α) | Risk of false positive (Type I error) | 0.05 (5%) |
| Power (1-β) | Probability of detecting a real effect | 0.80 (80%) minimum |
| Sample Size (n) | Number of participants needed | Calculated from the above |

How Power Relates to Sample Size

The relationship between power, effect size, and sample size is illustrated below:

Figure 1. Power curves showing how sample size requirements change with effect size at α = 0.05
Key Insight: As effect size decreases, the required sample size increases sharply (roughly in proportion to 1/d²). Detecting a small effect (d = 0.2) requires about 15× more participants than detecting a large effect (d = 0.8).
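
You can reproduce this relationship directly with any of the tools discussed later. Here is a minimal sketch, assuming Python's statsmodels package and a two-sided independent-samples t-test at α = 0.05 and 80% power (G*Power or R's pwr package return the same figures):

```python
# Sketch: required sample size per group as the effect size shrinks.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for d in (0.2, 0.5, 0.8):
    n_per_group = analysis.solve_power(effect_size=d, alpha=0.05,
                                       power=0.80, alternative='two-sided')
    print(f"d = {d}: ~{round(n_per_group)} per group "
          f"(~{2 * round(n_per_group)} total)")
# Roughly 394, 64, and 26 per group: the small effect needs about
# 15x the participants of the large effect.
```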

10 Steps to Conduct Power Analysis

  1. Define your research question and hypothesis
    Identify the study design: randomized controlled trial, cohort study, or case-control study.
  2. Select the appropriate statistical test
    Match your test to your data: t-test, ANOVA, chi-square, or logistic regression.
  3. Set the significance level (α)
    Typically 0.05—a 5% probability of Type I error (false positive).
  4. Determine the desired power (1-β)
    Aim for 0.80 or higher—the probability of detecting a true effect.
  5. Estimate the effect size
    Use pilot data, previous studies, or clinical judgment. This is often the hardest step.
  6. Specify allocation ratio
    Equal groups (1:1) are most efficient; adjust if clinical constraints require unequal allocation.
  7. Run power analysis software
    Use G*Power (free), PASS, nQuery, or R packages to calculate sample size. A minimal scripted example follows this list.
  8. Adjust for real-world factors
    Add 10-20% for attrition, non-compliance, or cluster designs (ICC adjustment).
  9. Interpret in context
    Is the calculated sample feasible? Can you recruit enough participants?
  10. Conduct sensitivity analyses
    How does changing assumptions affect your sample size? Document this.
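
As a worked illustration of steps 2-8, here is a minimal sketch assuming a hypothetical two-arm trial (independent-samples t-test, expected effect d = 0.5, α = 0.05, 80% power, 1:1 allocation) with 15% anticipated dropout, again using Python's statsmodels; any of the tools named in step 7 will give the same core numbers:

```python
# Minimal workflow for a hypothetical two-arm trial (steps 2-8).
# Assumptions: t-test, d = 0.5, alpha = 0.05, power = 0.80, 1:1 allocation.
import math
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05,
                                          power=0.80, ratio=1.0,
                                          alternative='two-sided')

dropout = 0.15                                   # step 8: real-world buffer
n_recruit = math.ceil(math.ceil(n_per_group) / (1 - dropout))

print(f"Analyzable per group: {math.ceil(n_per_group)}")    # ~64
print(f"Recruit per group with 15% dropout: {n_recruit}")   # ~76
```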

Type I vs Type II Errors

Power analysis is fundamentally about managing two types of mistakes:

| Error Type | What Happens | Probability | Real-World Impact |
|---|---|---|---|
| Type I (α) | You find an effect that doesn’t exist | Usually 5% | Adopt an ineffective treatment |
| Type II (β) | You miss an effect that does exist | Usually 20% | Abandon an effective treatment |

See a detailed visualization: View Figure → Type I and Type II Errors (PMC)
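
A quick simulation makes these two error rates concrete. The sketch below is an illustration (not from the cited paper): it repeatedly runs a two-sample t-test with 64 participants per group, first with no real difference and then with a true effect of d = 0.5, so the empirical rejection rates land near the 5% and 80% figures in the table above.

```python
# Simulate Type I error (alpha) and power (1 - beta) for a two-sample t-test.
# Hypothetical settings: n = 64 per group, true effect d = 0.5, alpha = 0.05.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(2021)
n, alpha, sims = 64, 0.05, 5000

def rejection_rate(true_d):
    hits = 0
    for _ in range(sims):
        control = rng.normal(0.0, 1.0, n)
        treated = rng.normal(true_d, 1.0, n)
        if ttest_ind(control, treated).pvalue < alpha:
            hits += 1
    return hits / sims

print("Type I error rate (no real effect):", rejection_rate(0.0))  # ~0.05
print("Power (real effect, d = 0.5):      ", rejection_rate(0.5))  # ~0.80
```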

📊 Research Insights: What the Evidence Shows

A comprehensive 2021 review by Serdar et al. in Biochemia Medica analyzed power analysis practices across 100+ studies. Their findings reveal critical problems—and solutions.

Read full paper →

Finding #1: Effect Size Dramatically Affects Sample Requirements

The required sample size grows far faster than linearly as the effect shrinks (roughly in proportion to 1/d²):

| Effect Size (Cohen’s d) | Sample Size Needed* | Typical Studies |
|---|---|---|
| 0.2 (small) | 788 | Large epidemiological studies |
| 0.5 (medium) | 128 | Most clinical trials |
| 0.8 (large) | 52 | Strong intervention effects |
| 1.0 (very large) | 34 | Pre-clinical studies |

*Total sample size across two equal groups for a two-tailed t-test, α=0.05, power=0.80. Data from Serdar et al. (2021)

View the complete relationship: Figure 3 → Effect Size vs Sample Size (PMC)

Finding #2: Animal Studies Face a Power Problem

⚠️ Critical Issue: With typical samples of 8-10 animals per group, you can only detect very large effects (d > 2.0). Most biological effects are smaller than this.

When pilot data isn’t available, use the resource equation:

N = (DF / k) + 1

DF = degrees of freedom (10-20 acceptable)
k  = number of groups
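
As an example, assuming a hypothetical four-group experiment, the resource equation brackets the group size as follows (rounding up to whole animals is the only addition):

```python
# Resource equation: animals per group = DF / k + 1, with 10 <= DF <= 20.
import math

def animals_per_group(dof, k_groups):
    return math.ceil(dof / k_groups + 1)    # round up to whole animals

k = 4                                       # hypothetical number of groups
for dof in (10, 20):
    print(f"DF = {dof}: {animals_per_group(dof, k)} animals per group")
# DF = 10 -> 4 per group (16 total); DF = 20 -> 6 per group (24 total)
```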

Finding #3: 46% Confuse Replication Types

Common Mistake: Nearly half of studies mistake technical replication (measuring the same sample multiple times) for biological replication (independent samples).

Only independent biological samples count toward your N.

See the difference: Figure 5 → Technical vs Biological Replication (PMC)

Effect Size Reference by Test

| Statistical Test | Effect Size Measure | Small | Medium | Large |
|---|---|---|---|---|
| t-test (means) | Cohen’s d | 0.2 | 0.5 | 0.8 |
| Chi-square | Cohen’s ω | 0.1 | 0.3 | 0.5 |
| Correlation | Pearson’s r | 0.1 | 0.3 | 0.5 |
| ANOVA | Cohen’s f | 0.1 | 0.25 | 0.4 |
| Case-control | Odds Ratio | 1.5 | 2.0 | 3.0 |
| Multiple regression | Cohen’s f² | 0.02 | 0.15 | 0.35 |
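
These benchmarks plug straight into standard power calculators. The sketch below shows one possible mapping for the medium-effect column using Python's statsmodels; the specific solver classes and settings (3 ANOVA groups, 1 degree of freedom for the chi-square) are illustrative assumptions:

```python
# Required sample sizes for "medium" effects at alpha = 0.05, power = 0.80.
from statsmodels.stats.power import (FTestAnovaPower, GofChisquarePower,
                                     TTestIndPower)

alpha, power = 0.05, 0.80

# Two-group mean comparison, Cohen's d = 0.5 -> n per group
n_t = TTestIndPower().solve_power(effect_size=0.5, alpha=alpha, power=power)

# One-way ANOVA with 3 groups, Cohen's f = 0.25 -> total N
n_f = FTestAnovaPower().solve_power(effect_size=0.25, alpha=alpha,
                                    power=power, k_groups=3)

# Chi-square with 1 degree of freedom, Cohen's w = 0.3 -> total N
n_w = GofChisquarePower().solve_power(effect_size=0.3, alpha=alpha,
                                      power=power, n_bins=2)

print(f"t-test: ~{n_t:.0f} per group; ANOVA: ~{n_f:.0f} total; "
      f"chi-square: ~{n_w:.0f} total")
```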

Why 80% Power is the Standard

The 80% Convention Explained:
  • Balanced trade-off between Type I (5%) and Type II (20%) errors
  • Resource-efficient—90% power requires roughly 30% more participants (a quick check follows this list)
  • Reproducibility—underpowered studies drive the replication crisis
  • Ethically sound—don’t expose participants to inconclusive research
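
That quick check of the 90% versus 80% trade-off takes a couple of lines, again assuming a medium effect (d = 0.5) and the statsmodels helper used earlier:

```python
# Compare required n per group at 80% vs 90% power (d = 0.5, alpha = 0.05).
from statsmodels.stats.power import TTestIndPower

n80 = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.80)
n90 = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.90)
print(round(n80), round(n90), f"{n90 / n80 - 1:.0%} more")   # ~64, ~85, ~33% more
```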

Power Analysis Software

G*Power (free)

Best for most researchers. User-friendly, covers common tests (t-test, ANOVA, correlation, regression).

Download G*Power →

PASS (commercial)

~200 study designs including survival analysis and equivalence tests. Best for complex clinical trials.

Learn about PASS →

nQuery (commercial)

Clinical trial focus: adaptive designs, non-inferiority tests, dropout adjustments.

Learn about nQuery →

R Packages (free)

Maximum flexibility for statisticians. pwr, MESS, powerMediation packages.

Get pwr package →

| Criteria | G*Power | PASS | nQuery | R |
|---|---|---|---|---|
| Cost | Free | Commercial | Commercial | Free |
| Ease of use | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ |
| Test coverage | Common tests | ~200 designs | Clinical focus | Extensible |
| Best for | Most researchers | Complex trials | Phase II/III | Statisticians |

Common Mistakes to Avoid

🚫 Five Power Analysis Pitfalls:
  1. Post-hoc power analysis—Calculating power AFTER data collection is statistically meaningless
  2. Optimistic effect sizes—Overestimating effects to justify smaller (cheaper) studies
  3. Ignoring attrition—Not adding 10-20% buffer for dropouts
  4. Confusing replication types—Technical replicates don’t increase N
  5. Not reporting—Omitting power analysis from publications hurts reproducibility

Ethical Considerations

Power analysis isn’t just statistics—it’s ethics:

  • Underpowered studies expose participants to risks without generating useful knowledge
  • Overpowered studies waste resources and may expose more participants than necessary
  • IRBs and ethics committees increasingly require formal power analysis
  • CONSORT and STROBE reporting guidelines require that sample size determination be justified in published reports

Frequently Asked Questions

What is power analysis?

Power analysis calculates the sample size needed to detect an effect of a given size with a specified probability (typically 80%). It ensures your study isn’t too small (underpowered) or wastefully large.

Why is 80% power the standard?

It balances practical constraints with scientific rigor. Higher power (90%) requires ~30% more participants, while lower power risks missing real effects.

What if I can’t recruit enough participants?

Consider: multi-site collaboration, more sensitive outcome measures, reducing measurement error, or acknowledging the limitation transparently.

Which software should I use?

Start with G*Power—it’s free, user-friendly, and covers most common tests. Move to PASS or nQuery for complex clinical trial designs.

When should power analysis be done?

Before data collection, during study design. Post-hoc power analysis (after data collection) is statistically invalid and should be avoided.

Need Help With Power Analysis?

Our statisticians handle sample size calculations, effect size estimation, and sensitivity analyses for grant applications and publications.

Explore Statistical Services

References

  1. Serdar CC, Cihan M, Yücel D, Serdar MA. Sample size, power and effect size revisited: simplified and practical approaches in pre-clinical, clinical and laboratory studies. Biochem Med (Zagreb). 2021;31(1):010502. doi:10.11613/BM.2021.010502
  2. Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Lawrence Erlbaum Associates; 1988.
  3. Button KS, et al. Power failure: why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci. 2013;14(5):365-376.
  4. Faber J, Fonseca LM. How sample size influences research outcomes. Dental Press J Orthod. 2014;19(4):27-29.

Your manuscript partner: www.editverse.com

Editverse Publication Support Services ]]>