Measures of Central Tendency – The Mode
As we saw in the previous two posts, the mean and the median are used for continuous data, with the choice between them depending on whether the data is normally distributed, and each gives us a midpoint for describing our data set. The mode can also be used with continuous data, and is simply the value that occurs most often in the data set. But how do we judge the midpoint of the data when that data is categorical? Strictly speaking, nominal categories have no true “center”, either distance-wise or value-wise, because they have no inherent order or spacing. (Ordinal categories do have an order, so a median can be found for them, but the distances between categories are still undefined.) Even so, the mode is still considered a measure of central tendency, and it is the usual choice for categorical data: the only option for nominal data, and a common one for ordinal data. Simply put, it identifies the category with the highest frequency.
The mode is an odd duck in that a data set can have two modes (called bimodal) or more than two modes (called multimodal). It can even have no mode at all. For example, what if the question is “Out of these four ice cream flavors, what is your favorite?” It’s certainly possible that two of these categories could tie for the highest number of votes (i.e. the same top frequency). In this case, we would have a bimodal distribution. If each of the four categories had the same frequency, there would be no mode at all, since the very definition of mode is “most common” and no category is more common than the others. This sets the mode apart from the mean and median: a numeric data set always has exactly one mean and one median, never several and never none.
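To make the ice cream example concrete, here is a minimal Python sketch. The function name `modes` and the sample votes are made up for illustration, and the "no mode when every category ties" rule follows the convention described above (some libraries instead report every value as a mode in that case).

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s), or [] when every value ties."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    winners = [value for value, count in counts.items() if count == top]
    # If several distinct values all share the same frequency, nothing is
    # "most common", so we report no mode (this post's convention).
    if len(counts) > 1 and len(winners) == len(counts):
        return []
    return winners

# Hypothetical votes: vanilla and chocolate tie for first place (bimodal).
votes = ["vanilla", "chocolate", "chocolate", "vanilla", "strawberry", "mint"]
print(modes(votes))                                            # ['vanilla', 'chocolate']

# Every flavor gets one vote: no mode at all.
print(modes(["vanilla", "chocolate", "strawberry", "mint"]))   # []
```

A list is returned rather than a single value so that the unimodal, bimodal, multimodal, and no-mode cases all come back in the same shape.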
In a skewed distribution with a single peak, the mean is typically pulled farthest toward the “tail”, the mode sits at the tallest part of the curve, and the median usually falls somewhere between the two, since it marks the point that splits the data set in half.
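A small worked example shows this ordering. The numbers below are invented purely for illustration: most values are small, with a few large ones forming a tail to the right.

```python
import statistics

# Made-up right-skewed data: a cluster of small values plus a long right tail.
data = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

mean_value = statistics.mean(data)      # 45 / 10 = 4.5, pulled toward the tail
median_value = statistics.median(data)  # middle pair is (3, 3), so 3.0
mode_value = statistics.mode(data)      # 2 appears three times, the peak

print(mode_value, median_value, mean_value)  # 2 3.0 4.5
```

Here the mode (2) sits at the peak, the mean (4.5) is dragged toward the large values, and the median (3.0) lands between them.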
All three measures of central tendency have their place in describing data, and in a perfectly normal distribution all three will be the same. If the data is not perfectly normal, but approximates normal (which is considered adequate for analyses that require normality), all three measures will be very similar. In fact, comparing them is a quick first check: if the three differ noticeably, the data is probably not normally distributed. Agreement alone does not prove normality, though, since any symmetric distribution with a single peak will show it too.
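As a rough sketch of that comparison, the snippet below simulates approximately normal data and computes all three measures. The center of 50, spread of 10, sample size, seed, and bin width of 5 are all arbitrary choices for illustration. Because continuous values almost never repeat exactly, the mode here is taken over binned values, which is only one of several reasonable ways to do it.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable
sample = [random.gauss(50, 10) for _ in range(10_000)]

mean_value = statistics.mean(sample)
median_value = statistics.median(sample)

# Round each value to the nearest multiple of bin_width, then take the
# most common bin as an approximate mode.
bin_width = 5
binned = [bin_width * round(x / bin_width) for x in sample]
binned_mode = statistics.mode(binned)

print(mean_value, median_value, binned_mode)
```

All three should land close to 50. Keep in mind that symmetric but non-normal data would pass the same comparison, so treat it as a quick screen rather than a formal test.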