Central Limit Theorem
In the previous post we saw how it could be a problem to have no knowledge of the population mean when trying to decide whether a treated sample mean was significantly different from the population mean. We also saw that the sampling distribution of the mean is the solution to this problem, but that actually constructing one would usually be impossible. The central limit theorem makes constructing one unnecessary anyway, and here is why.
The central limit theorem states that “for any population mean (μ) and standard deviation (σ), the distribution of sample means for size n will have a mean of μ and a standard deviation of σ/√n and will approach a normal distribution as n approaches infinity”. In simpler terms, it means that the sampling distribution will have the same mean as the population it came from, and its standard deviation will be equal to the standard error. Also, as long as your sample size is 30 or more, the distribution of sample means will be approximately normal regardless of the shape of the population distribution.
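We can watch the theorem do its work with a quick simulation. Below is a minimal sketch (the specific population is my own choice for illustration): we draw many samples of size n = 30 from an exponential population, which is strongly skewed and has μ = 1 and σ = 1, and then check that the sample means average out to μ and spread out by about σ/√n.

```python
import random
import statistics

random.seed(42)

# Illustrative skewed population: exponential with rate 1, so mu = 1 and sigma = 1.
n = 30               # size of each sample
num_samples = 20000  # number of samples drawn

# Build the sampling distribution of the mean by brute force.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

# Mean of the sample means should be close to the population mean, mu = 1.
print(statistics.fmean(sample_means))

# Standard deviation of the sample means should be close to sigma / sqrt(n),
# which is 1 / sqrt(30), roughly 0.183.
print(statistics.stdev(sample_means))
```

A histogram of `sample_means` would also look roughly bell-shaped, even though the exponential population itself is anything but.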
So we see that the central limit theorem describes the sampling distribution of the mean in all three of the possible ways: shape, central tendency, and variability.
It talks about the shape of the distribution when it says the sampling distribution of the mean will be normal if the population distribution is normal, OR if the n of each sample is at least 30. Thirty is the gold standard, but sample sizes larger than this are even better because of the effect larger sample sizes have on the standard error (which is the same thing as the standard deviation when you’re talking about the distribution of means).
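That shrinking effect is easy to demonstrate. In this sketch (the normal population with μ = 100 and σ = 15 is just an example I chose), quadrupling the sample size from 30 to 120 cuts the standard error in half, exactly as σ/√n predicts.

```python
import random
import statistics

random.seed(0)

def sampling_sd(n, num_samples=10000):
    """Standard deviation of sample means for samples of size n."""
    means = [
        statistics.fmean(random.gauss(100, 15) for _ in range(n))
        for _ in range(num_samples)
    ]
    return statistics.stdev(means)

# sigma / sqrt(n): quadrupling n halves the standard error.
print(sampling_sd(30))   # close to 15 / sqrt(30),  about 2.74
print(sampling_sd(120))  # close to 15 / sqrt(120), about 1.37
```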
The theorem talks about central tendency when it says the mean of the sampling distribution of the mean is equal to the population mean. You might hear this mean called the “expected value of M”. Notice how the sample means tend to bunch up around the population mean. This is how the distribution gets its bell-curve shape. So any two randomly chosen samples likely won’t have the same means. However, if we could take the grand mean of all of the sample means, it would equal the population mean (or expected value of M).
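Here is a small sketch of that idea, using a made-up population for illustration: two individual samples almost never share the same mean, but the grand mean of many sample means lands right on top of the population mean.

```python
import random
import statistics

random.seed(1)

# Illustrative population: 100,000 values from a normal distribution.
population = [random.gauss(50, 10) for _ in range(100000)]
pop_mean = statistics.fmean(population)

# Two randomly chosen samples will almost certainly have different means...
s1 = statistics.fmean(random.sample(population, 30))
s2 = statistics.fmean(random.sample(population, 30))
print(s1, s2)

# ...but the grand mean of many sample means converges on the population
# mean, i.e. the expected value of M.
grand_mean = statistics.fmean(
    statistics.fmean(random.sample(population, 30)) for _ in range(5000)
)
print(grand_mean, pop_mean)
```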
Finally, the theorem talks about variability. Variability, of course, refers to the standard deviation of our sampling distribution of the mean. This standard deviation is called the standard error of the mean, and is the topic of the next post.