CDEM Voice

Educational Research Column: How to Appropriately Analyze a Likert Scale in Medical Education Research

Dec 17, 2018, 13:58 PM by Nick Olah

A common tool in both medical education and medical education research is the Likert scale, an ordinal scale that typically uses 5 or 7 levels. Despite regular use of the scale, its interpretation and statistical analysis continue to be a source of controversy and consternation. Although Likert responses are coded as numbers, they form an ordinal variable, not a continuous one. The question is then how to analyze the data correctly.

In the strictest sense, ordinal data should be analyzed using non-parametric tests, as the assumptions necessary for parametric testing are not necessarily true. Investigators and readers are often more familiar with parametric methods and comfortable with the associated descriptive statistics, which may lead to their inappropriate use. Strictly speaking, the mean and standard deviation are invalid descriptive statistics for ordinal scales, as are parametric analyses based on a normal distribution. Non-parametric statistics do not require a normal distribution and are therefore always appropriate for ordinal data. Common examples of parametric tests are the t-test, ANOVA, and Pearson correlation. The corresponding non-parametric tests are the Wilcoxon rank-sum test, the Kruskal-Wallis test, and Spearman correlation.
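As a rough illustration of the non-parametric route, the sketch below implements a Wilcoxon rank-sum (Mann-Whitney U) comparison of two groups of Likert responses using only the Python standard library. The function names and the response data are hypothetical, and the p-value uses a tie-corrected normal approximation without a continuity correction, so it should be read as approximate for small samples.

```python
import math
from collections import Counter


def midranks(values):
    """Map each distinct value to its rank, averaging ranks across ties."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Positions i..j-1 (0-based) hold ranks i+1..j; their average is (i+1+j)/2.
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return rank_of


def rank_sum_test(group_a, group_b):
    """Return (U for group_a, z, two-sided p) using a normal approximation."""
    n_a, n_b = len(group_a), len(group_b)
    n = n_a + n_b
    pooled = list(group_a) + list(group_b)
    rank_of = midranks(pooled)
    rank_sum_a = sum(rank_of[v] for v in group_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    # Tie correction: Likert data always contain many tied values.
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    var_u = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:  # every response identical: no evidence of a difference
        return u_a, 0.0, 1.0
    z = (u_a - mean_u) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_a, z, p_two_sided


# Hypothetical 1-5 Likert responses, for illustration only.
control = [1, 2, 2, 3, 3]
intervention = [3, 4, 4, 5, 5]
u, z, p = rank_sum_test(control, intervention)
print(f"U = {u}, z = {z:.2f}, approximate two-sided p = {p:.3f}")
```

The test works on ranks only, so it never assumes that the distance between "2" and "3" equals the distance between "4" and "5".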

The confusion and controversy arise because parametric testing may be appropriate, and in fact more powerful than non-parametric testing of ordinal data, provided certain conditions exist. Parametric tests require certain assumptions, such as normally distributed data, equal variance in the population, linearity, and independence. If these assumptions are violated, then a parametric statistic cannot be applied. Care must also be taken to ensure that averaging the data isn’t misleading. This can occur if the data are clustered at the extremes, resulting in a neutral average. For instance, if we used a Likert scale to evaluate the current polarized political climate, responses would likely cluster at the extremes, yet the mean might lead us to believe everyone is neutral.
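The polarized example can be made concrete with a small sketch. The responses below are invented for illustration, and `summarize` is a hypothetical helper name; the point is only that the frequency table shows a pattern the mean hides.

```python
from collections import Counter
from statistics import mean


def summarize(responses, levels=(1, 2, 3, 4, 5)):
    """Return the mean alongside the count of responses at each level."""
    counts = Counter(responses)
    return {
        "mean": mean(responses),
        "counts": {level: counts.get(level, 0) for level in levels},
    }


# Hypothetical polarized sample: half "strongly disagree", half "strongly agree".
polarized = [1] * 10 + [5] * 10
summary = summarize(polarized)
print("mean:", summary["mean"])      # sits at the neutral midpoint
print("counts:", summary["counts"])  # yet nobody actually chose 3
```

Reporting the counts (or percentages) per category alongside any summary statistic guards against this kind of misreading.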

Frequently, the responses on a Likert scale are averaged and the means are compared between the control and intervention groups (or before and after implementation of an educational tool) using a t-test or ANOVA. While these are the correct statistical analyses for comparing means, one cannot calculate a meaningful mean for a single Likert item: it is not a continuous numerical value, and because the distance between adjacent values may not be equal, it is not interval data either. For example, in a study comparing mean arterial blood pressure between an experimental drug and placebo, blood pressure is a continuous numerical variable for which a mean can be calculated in each study group. In contrast, the values 1-5 on a Likert scale are ordinal categories, and there are no responses of 1.1, 2.7, 3.4, or 4.2. Therefore, a mean of 3.42 for the control group and 3.86 for the intervention group does not correspond to any of the pre-defined response categories of the Likert scale.

One approach is to dichotomize the data into “yes” and “no” categories. For example, on a scale from 1-5 with 3 being “average,” one could group responses into >3 or <3. Dichotomizing the data is also a mechanism to increase the power. An exception to the caution against comparing means arises when one uses a series of questions, averages each individual’s responses to create a single composite score, and then compares the composite scores across the groups. Under this scenario, comparing means may be appropriate, since the composite score can be treated as a continuous variable.
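Both ideas in this paragraph can be sketched in a few lines. The helper names and data below are hypothetical. Note one assumption the text leaves open: a response of exactly 3 has to go somewhere, and this sketch counts it as "no" (that is, >3 versus ≤3); an analysis could just as reasonably exclude neutral responses.

```python
from statistics import mean


def dichotomize(responses, cutoff=3):
    """Return (yes, no): responses above the cutoff vs. at or below it.

    Assumption: a response equal to the cutoff is counted as "no".
    """
    yes = sum(1 for r in responses if r > cutoff)
    return yes, len(responses) - yes


def composite_scores(answers_by_respondent):
    """Average each respondent's answers across items into one score."""
    return [mean(answers) for answers in answers_by_respondent]


# Hypothetical single-item responses for two groups.
control = [2, 3, 3, 4, 2, 3, 5, 1]
intervention = [4, 5, 3, 4, 4, 5, 2, 4]

# Rows: intervention, control. Columns: yes, no.
table = [dichotomize(intervention), dichotomize(control)]
print("2x2 table:", table)

# Hypothetical multi-item questionnaire: one inner list per respondent.
print("composite scores:", composite_scores([[4, 5, 3], [2, 2, 2]]))
```

The resulting 2x2 table of counts is the input a test of proportions expects, while the composite scores are per-person values whose group means can be compared.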

After dichotomizing, one can use a Fisher’s exact or a Chi-squared test to analyze the data. Stay tuned for a future explanation of the differences between Fisher’s exact and Chi-squared analysis!
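For readers who want to see the mechanics, here is a minimal standard-library sketch of both tests for a 2x2 table. The function names and the counts are hypothetical; the Fisher p-value is the common two-sided version that sums all tables no more probable than the observed one, and the chi-squared statistic is Pearson's without a continuity correction. Both functions assume every row and column total is non-zero.

```python
import math


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = math.comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / total_tables

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table at least as unlikely as the observed one
    # (small tolerance guards against floating-point ties).
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic and p-value (1 degree of freedom)."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    stat = n * (a * d - b * c) ** 2 / margins
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


# Hypothetical counts: rows = intervention, control; columns = yes, no.
p_fisher = fisher_exact_two_sided(6, 2, 2, 6)
stat, p_chi = chi_squared_2x2(6, 2, 2, 6)
print(f"Fisher's exact p = {p_fisher:.3f}")
print(f"Chi-squared = {stat:.2f}, p = {p_chi:.3f}")
```

With counts this small the two p-values can differ noticeably, which is one reason Fisher's exact test is usually preferred when expected cell counts are low.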

Understanding the statistics can help improve the experimental design and avoid the erroneous conclusions that result from inappropriate statistical analyses.

Jason J. Lewis, MD
Beth Israel Deaconess Medical Center

David Schoenfeld, MD, MPH
Harvard Medical School
