## Sampling Error

So what is this ‘sampling error’? Some mistake we might make when getting our sample? Not quite, but not a bad guess. It does have to do with sampling, but it doesn’t involve mistakes so much as a sort of “inexactness”. *Sampling error* is the difference between a sample’s average and the average of the population it came from, and it is unavoidable whenever we study a sample instead of the entire population.

When we gather a sample for research from a population of interest, we can be pretty sure that our sample will be somewhat different from the population. Think of the population in terms of the normal distribution (i.e. the bell curve), where the average person in that population is represented by the very middle of that curve. The further out toward the “tails” we go, the rarer the people we find. For example, if the average person gets 4 colds a year, you’d find these people in the center of the bell curve. People who get more colds will be to the right of the center, and those who get fewer colds will be to the left of the center. Someone who gets a cold once a month might be found in the far right tail of the curve, while someone who gets a cold only once every couple of years would be in the far left tail.

Now, for our sample to have the same average as the population, we’d have to select the same proportion of folks from the center, right tail, and left tail of the curve as we’d find in the population, and the chances of doing this perfectly are not very high. This is where our “inexactness” comes in. When we take a random sample from a population, we might get a few extra people from one side of the curve or the other, and this will affect the average of our sample. If we have a few extra from the right side, then our sample will have a higher average than the population, and if we have a few extra from the left side, our sample will have a lower average than the population. Every time a sample is chosen from this population, its average will likely differ from the population average: a few samples will land practically on it, but most will come out somewhat higher or lower. This difference between a sample’s average and the population average is known as *sampling error*.
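If you'd like to see this happen, here is a small Python sketch. Everything in it is an assumption made up for illustration: a pretend population of 10,000 people averaging about 4 colds a year with a spread of 1.5, a sample size of 25, and a helper I've called `sampling_errors`. It draws many random samples and records how far each sample's average lands from the population average.

```python
import random
import statistics

def sampling_errors(population, sample_size, n_samples, seed=0):
    """Return (sample mean - population mean) for many random samples."""
    rng = random.Random(seed)
    pop_mean = statistics.fmean(population)
    errors = []
    for _ in range(n_samples):
        # Draw people at random, without replacement, from the population.
        sample = rng.sample(population, sample_size)
        errors.append(statistics.fmean(sample) - pop_mean)
    return errors

# Hypothetical population: colds per year, roughly bell-shaped around 4.
# The mean (4.0) and spread (1.5) are illustrative assumptions, not real data.
pop_rng = random.Random(42)
population = [max(0.0, pop_rng.gauss(4.0, 1.5)) for _ in range(10_000)]

errors = sampling_errors(population, sample_size=25, n_samples=1000)

too_high = sum(e > 0 for e in errors)
too_low = sum(e < 0 for e in errors)
print(f"samples averaging above the population: {too_high}")
print(f"samples averaging below the population: {too_low}")
print(f"largest miss in either direction: {max(abs(e) for e in errors):.2f}")
```

With these made-up numbers you should see the samples split between "too high" and "too low", with hardly any landing exactly on the population average.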

Even though sampling error is normal and unavoidable, one very good way to reduce it is through our sample size. The Law of Large Numbers (a close cousin of the Central Limit Theorem, but a separate result) tells us that the larger our sample, the closer its average is likely to be to the population average. In other words, the more people we have in our sample, the less difference there will tend to be between the sample and the population it came from, and the smaller the sampling error. So who cares, right? WE DO! We like to keep our sampling error small so that whatever the results of our analyses turn out to be, we can be more certain that the same thing would hold true in the population. So go out there and gather a big sample!
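To watch sample size do its work, here is another rough Python sketch under the same kind of made-up assumptions (a hypothetical population averaging about 4 colds a year, and a helper I've named `mean_abs_sampling_error`). It compares the typical size of the sampling error for samples of 10, 100, and 1,000 people.

```python
import random
import statistics

def mean_abs_sampling_error(population, sample_size, n_samples, rng):
    """Average absolute gap between sample mean and population mean."""
    pop_mean = statistics.fmean(population)
    total = 0.0
    for _ in range(n_samples):
        sample = rng.sample(population, sample_size)
        total += abs(statistics.fmean(sample) - pop_mean)
    return total / n_samples

# Hypothetical population: the mean (4.0) and spread (1.5) are assumptions
# chosen only for illustration.
rng = random.Random(7)
population = [max(0.0, rng.gauss(4.0, 1.5)) for _ in range(10_000)]

results = {
    n: mean_abs_sampling_error(population, n, 500, rng)
    for n in (10, 100, 1000)
}

for n, err in results.items():
    print(f"n={n:>4}: typical sampling error ~ {err:.3f} colds/year")
```

The typical error should shrink as the sample grows, though it never quite reaches zero unless the sample is the whole population.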
