Measures of Central Tendency – The Mean

Measures of central tendency are numerical values that describe the approximate center of a data set. In other words, they are descriptors that help us to “sum up” the data set. The three measures of central tendency are the mean, median, and mode, and each is best suited for specific situations. For example, knowing the average amount of time children spend on homework would help you to judge whether your child is spending more or less time than most children.

The mean is the one you are most likely to run across when you are reading research, and is simply the arithmetic average of a data set. The average amount of time children spend on homework is an example of the mean. As most people know, the mean is derived by summing the data and dividing by the number of values summed. This does not force half the values to be above the mean and half to be below it (that property belongs to the median). Instead, the mean is the balance point of the data: the distances of the values above the mean add up to exactly the same total as the distances of the values below it. When the data are roughly symmetric, the mean also falls close to the midpoint. So if we find that children spend an average of 5 hours a week on homework, and the times are spread fairly evenly around that figure, about half the children will be spending more time than this and about half will be spending less, though the difference in most cases may be slight.
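To make the arithmetic concrete, here is a minimal sketch in Python. The function name and the homework figures are made up for illustration; they are not real survey data.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)

# Hypothetical weekly homework hours for seven children (made-up data).
homework_hours = [3, 4, 5, 5, 5, 6, 7]

average = mean(homework_hours)
print(average)  # 5.0

# The deviations from the mean cancel out, which is why the mean
# acts as the balance point of the data.
deviations = [hours - average for hours in homework_hours]
print(sum(deviations))  # 0.0
```

Python's standard library also provides statistics.mean, which does the same job for everyday use.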

This measure is best used with continuous data that are normally distributed (as opposed to skewed) because it is quite sensitive to outliers. One or two extreme values can pull the mean artificially high or low, depending on whether the extreme value is higher or lower than the others. A good thing about the mean is that it tends to be about the same from one random sample to the next, as long as those samples are taken from the same population. For example, if nothing else changes (i.e., we use the same population of children), the average amount of time the children spend doing homework will be about the same regardless of which sample of school children we use. This property is very handy because it makes it possible for us to be reasonably accurate when generalizing the results of a sample to the population it came from. The reason is that the average of the means of all possible samples of a given size from one population equals the population mean. So if a reasonably large random sample of school children spends an average of 5 hours a week on homework, we can be fairly confident that the children in the population as a whole average about that much time on homework.
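Both points can be checked with a small simulation. The sketch below uses made-up homework hours and a simulated population (normally distributed around 5 hours, which is purely an assumption for the demonstration), along with Python's standard statistics and random modules.

```python
import random
import statistics

# Outlier sensitivity: hypothetical weekly homework hours (made-up data).
typical = [4, 5, 5, 5, 6]
with_outlier = typical + [40]  # one child reports 40 hours

mean_typical = statistics.mean(typical)            # 25 / 5
mean_with_outlier = statistics.mean(with_outlier)  # 65 / 6, about 10.83

# Sample-to-sample stability: simulate a population, then draw random samples.
rng = random.Random(0)  # fixed seed so the run is repeatable
population = [rng.gauss(5, 1.5) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Five random samples of 100 children each, drawn without replacement.
sample_means = [statistics.mean(rng.sample(population, 100)) for _ in range(5)]

print(mean_typical, round(mean_with_outlier, 2))
print(round(population_mean, 2), [round(m, 2) for m in sample_means])
```

A single extreme value more than doubles the mean of the small data set, while the means of the five random samples should all land close to the population mean.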

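Another good thing about the mean is that it can be used for data replacement. Suppose someone left one of the questions unanswered on one of the questionnaires you used in your study. Because the mean represents the center of the data, you can use the sample's mean response to that question as a stand-in for the missing answer. Keep in mind that filling in many missing answers this way makes the data look less spread out than they really are, so this works best when only a few answers are missing.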
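Here is a rough sketch of that kind of replacement in Python. The function name impute_with_mean and the answers are hypothetical, and a missing answer is assumed to be recorded as None.

```python
def impute_with_mean(responses):
    """Replace missing answers (None) with the mean of the answers that were given."""
    observed = [r for r in responses if r is not None]
    if not observed:
        raise ValueError("cannot impute when every response is missing")
    fill_value = sum(observed) / len(observed)
    return [fill_value if r is None else r for r in responses]

# Hypothetical answers to one question on a 1-to-5 scale; the third person skipped it.
answers = [4, 2, None, 5, 5]
filled = impute_with_mean(answers)
print(filled)  # [4, 2, 4.0, 5, 5]
```

The original list is left untouched, so you can still tell afterward which answers were real and which were filled in.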

As useful as the mean can be, it is not very representative of a data set that is skewed or that isn’t continuous. For these situations, we have other measures of central tendency. Next time I’ll talk about the median, which is best used for skewed data.