Measures of Variability (part 3)
Of the common measures of variability, range and mean deviation are both fairly useful for describing the spread of scores in a data set. But as we saw in the last two posts, both also have their problems. Because of this, by far the most commonly used measure of variability for describing a data set is the standard deviation.
Standard deviation, whose symbol is “s”, gives us a value that tells us the average distance the scores are from the mean of their data set. The larger this value is, the further apart the scores can be expected to be. There are exceptions, such as when you have a data set with an extreme high score and an extreme low score, which affects kurtosis, but we will save that for another time. Sometimes students will ask “What is a large standard deviation value and what is a small one?” The answer depends on what you are comparing it to. After all, the terms “large” and “small” are relative and therefore only have meaning when we are comparing two things. In this case, those two things would be the variability of scores in two data sets. As an example, suppose the standard deviation for the first data set is s = 1.36 while the standard deviation for the second one is s = 2.03. From these values we can see that the second data set’s scores are more spread out from their mean than the first data set’s scores are from theirs.
Standard deviation values are arrived at in a similar way as mean deviation (see the previous post), with one key difference. The negative signs accompanying some of the deviations, which were a problem for mean deviation, are not simply deleted. Instead, each deviation is squared, which makes every value positive. This effectively “hides” the negative signs without getting rid of the information they carry, and it is the main thing that makes standard deviation superior to mean deviation. After working out the appropriate formula, the resulting value is “unsquared” by taking the square root (I know, this isn’t really a word but you get my meaning I’m sure), so we have done no damage to the original deviations that carried a negative sign. Because of this, standard deviation can then be used in further analyses with no problems. The formulas for standard deviation are pretty simple, but preparing the data can be somewhat of a tedious job. The basic formulas are:
For the population: σ = √(SS / N)
For the sample: s = √(SS / (n − 1))

Here SS is the “sum of squares”: the sum of the squared deviations of each score from the mean. N is the number of scores in the population, and n is the number of scores in the sample.
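If you want to check your hand calculations, the two formulas can be sketched in a few lines of Python. (These function names are my own, not anything standard, and the scores are just a made-up example.)

```python
import math

def sum_of_squares(scores):
    """SS: the sum of squared deviations of each score from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

def population_sd(scores):
    """Population standard deviation: sigma = sqrt(SS / N)."""
    return math.sqrt(sum_of_squares(scores) / len(scores))

def sample_sd(scores):
    """Sample standard deviation: s = sqrt(SS / (n - 1))."""
    return math.sqrt(sum_of_squares(scores) / (len(scores) - 1))

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # mean is 5, SS is 32
print(population_sd(scores))         # sqrt(32 / 8) = 2.0
print(sample_sd(scores))             # sqrt(32 / 7), about 2.14
```

Notice that the sample formula divides by n − 1 rather than n, so the sample value always comes out a little larger than the population value for the same scores.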
As you can imagine, the standard deviation is pretty useless without its partner, the mean. Since the standard deviation is the average distance scores are from the mean, knowing it without knowing what the mean is would tell us nothing! So when you are writing up your results for publication, be sure to include both.
Variance is another measure of variability, and is the “big brother” of standard deviation. Its symbol is “s²”. Variance is the value you would have if you didn’t take the square root at the end of your formula. Looked at another way, variance is the square of your standard deviation. For example, if the standard deviation for your data set is 4, the variance would be 4² = 16. We don’t use variance to describe a data set, but we do use it in inferential analyses such as analysis of variance. This is a technique where we compare the variability of two or more groups’ scores to decide whether it is natural variation or due to treatment. As much as I know you want to know about this, it will have to wait. But I promise to tell you about this particular excitement in a later post 🙂
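The square relationship works in both directions, and that is easy to demonstrate in code. This is a small sketch using the sample formula, with a made-up set of scores:

```python
import math

def sample_variance(scores):
    """Sample variance: s^2 = SS / (n - 1).

    This is the standard deviation formula stopped one step early,
    before the square root is taken."""
    mean = sum(scores) / len(scores)
    ss = sum((x - mean) ** 2 for x in scores)
    return ss / (len(scores) - 1)

scores = [2, 4, 4, 4, 5, 5, 7, 9]
variance = sample_variance(scores)
sd = math.sqrt(variance)   # "unsquaring" the variance gives the standard deviation

# Squaring the standard deviation gets the variance back (allowing
# for tiny floating-point rounding).
print(abs(sd ** 2 - variance) < 1e-12)
```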