Educational Research Column: An Introduction to Power Analysis
Medical education research is a thriving and expanding area. However, there is at times real concern regarding the validity of quantitative studies. Problems can arise at any stage of the research process, from design to manuscript. One of the more common errors is inadequate statistical power.
In medical education research, groups are typically compared on the differences that follow a particular event, curriculum, or other intervention. Because a study observes only a sample, there is a risk of finding a difference when none actually exists. This is called a false positive, or Type I error, and its probability is denoted alpha. To limit this risk, alpha is conventionally set at 5%. The opposite risk is a false negative, or Type II error (beta): failing to detect a difference that really exists and instead attributing the result to chance, that is, failing to reject the null hypothesis. Power is the probability of detecting a true difference and equals 1 minus beta. It is conventionally set at 80%, which is to say that there is an 80% chance of detecting a difference if one exists. Another way to think about this is that there is a 20% chance of a Type II error.
Power rises with sample size and with the size of the effect being studied. Medical education studies, often with their lower numbers, fall victim to low power. Studies have shown that education-based research is often underpowered, and it is not uncommon to see studies with fewer than 25 participants. Underpowered studies not only miss real effects; the significant results they do report are also more likely to be false positives or overestimates. While pushing power as high as possible may seem the logical answer, the sample sizes required quickly become impractical, so a balance must be struck.
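To make the link between sample size and power concrete, the following is a minimal sketch in Python. It assumes a two-sided comparison of two equal-sized groups, a standardized effect size (Cohen's d), and a normal approximation in place of the exact t-test calculation. The function name `approx_power` and the example effect size of 0.5 are illustrative choices, not part of the original column.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Assumptions of this sketch: equal group sizes, a standardized
    effect size (Cohen's d), and a normal approximation to the t-test.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)            # critical value for alpha
    shift = effect_size * sqrt(n_per_group / 2)  # expected test statistic
    # Probability that the statistic lands in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Power for a medium effect (d = 0.5) at several group sizes
for n in (10, 25, 64, 100):
    print(f"n per group = {n:3d}  power = {approx_power(0.5, n):.2f}")
```

Under these assumptions, a study with 25 participants per group has well under the conventional 80% power to detect a medium effect, which illustrates why small studies so often come up empty.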
To plan for adequate power, several things must be done. Researchers first need to decide how likely the design should be to identify a difference if one exists, that is, the target power. Next, the minimal detectable effect should be identified. This is the smallest difference the study can reliably detect, and it sets the “floor” of the study. Together with alpha, these choices determine the required sample size. Given the difficulty medical education research has in recruiting large samples, knowing this in advance allows for realistic timelines and resource utilization. It also increases the reliability of reported data and supports correct interpretation of the hypothesis being tested.
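The planning steps can be sketched in code. The example below is an illustration under stated assumptions rather than a definitive method: it uses the standard normal-approximation formula for a two-sided comparison of two equal-sized groups, with the effect expressed as Cohen's d. The function names `n_per_group` and `minimal_detectable_effect` are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Participants needed per group to detect a given Cohen's d.

    Normal approximation: n = 2 * ((z_alpha + z_beta) / d) ** 2
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = z.inv_cdf(power)           # quantile for the target power
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


def minimal_detectable_effect(n_available, alpha=0.05, power=0.80):
    """Smallest Cohen's d detectable with n_available participants per group."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return (z_alpha + z_beta) * sqrt(2 / n_available)


print(n_per_group(0.5))                          # medium effect
print(n_per_group(0.8))                          # large effect
print(round(minimal_detectable_effect(25), 2))   # "floor" with 25 per group
```

Because this sketch relies on the normal approximation, an exact t-test calculation from dedicated software will give a slightly larger sample size. The second function answers the reverse question that small programs often face: given the number of learners actually available, how large must the effect be before the study can reasonably expect to detect it?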
A power analysis should occur prior to data collection. This allows an appropriate sample size to be targeted. It also creates an opportunity to improve the study design and to identify covariates. Often power is instead calculated “post hoc,” using the observed results and their difference to retrospectively calculate power. This should be avoided. Post hoc power has a one-to-one relationship with the p value, i.e., a low p value will always yield a high power estimate, so the calculation adds no information beyond the p value itself.
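The one-to-one relationship between post hoc power and the p value can be shown directly. The sketch below assumes a two-sided z-test and treats the observed effect as if it were the true effect, which is what a post hoc calculation does. The function name `observed_power` is illustrative.

```python
from statistics import NormalDist


def observed_power(p_value, alpha=0.05):
    """Post hoc ("observed") power implied by a two-sided p value.

    Assumes a z-test and plugs the observed effect in as the true
    effect. No other study information enters the calculation.
    """
    z = NormalDist()
    z_obs = z.inv_cdf(1 - p_value / 2)  # test statistic implied by p
    z_crit = z.inv_cdf(1 - alpha / 2)   # critical value for alpha
    return z.cdf(z_obs - z_crit) + z.cdf(-z_obs - z_crit)


for p in (0.001, 0.01, 0.05, 0.20, 0.50):
    print(f"p = {p:<5}  observed power = {observed_power(p):.2f}")
```

The function takes only the p value and alpha as inputs, which is the point: under these assumptions, a result sitting exactly at p = 0.05 always corresponds to an observed power of about 50%, and smaller p values always correspond to higher observed power, whatever the study design.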
It is important to have sufficient power for tests of statistical significance to be informative. Adequate power minimizes Type II errors, makes significant findings more trustworthy, and makes one’s data far more robust. It will also serve to improve the planning and implementation of a study.
Edward Ullman, MD
Harvard Affiliated Emergency Medicine Residency at Beth Israel Deaconess Medical Center
- Picho K, Artino AR. Seven deadly sins in educational research. J Grad Med Educ. 2016;8(4):483-487.
- Bakker M, van Dijk A, Wicherts JM. The rules of the game called psychological science. Perspect Psychol Sci. 2012;7(6):543-554.
- Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. New York: Academic Press; 1977.