The sampling distribution is the “theoretical probability distribution of all possible sample values for the statistics in which we are interested” (Frankfort-Nachmias & Leon-Guerrero, 2015, p. 219). Since this includes every possible combination of units in a sample, the number of possible samples becomes enormous even for small sample sizes; thankfully, the distribution is mainly theoretical and rarely calculated in full. Like any other distribution, the sampling distribution of a statistic can be described by its mean and standard deviation; in this context the standard deviation is called the standard error (the standard error of the mean, when the statistic is the mean). Conclusions drawn from these concepts serve as the basis for inferential statistics.
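To make the definition concrete, here is a minimal sketch that builds a sampling distribution by brute force. The six ages are made-up values for a hypothetical class, not figures from the textbook, and the sample size of three is chosen only to keep the enumeration small.

```python
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical population: ages of a six-person class (made-up values).
population = [18, 19, 20, 21, 22, 26]
sample_size = 3

# Every possible sample of three distinct students (drawn without replacement).
samples = list(combinations(population, sample_size))

# The sampling distribution of the mean: one sample mean per possible sample.
sample_means = [mean(s) for s in samples]

print(len(samples))          # 20 possible samples
print(mean(population))      # population mean
print(mean(sample_means))    # mean of the sampling distribution
print(pstdev(sample_means))  # its standard deviation, i.e. the standard error
```

Even this tiny class yields 20 possible samples, and the count grows quickly with the population, which is why the full distribution is rarely written out in practice. The mean of the sample means lands on the population mean, while the standard error comes out smaller than the spread of the individual ages.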
While largely theoretical, sampling distributions have important implications for sampling techniques and generalizability. For example, the larger the sample size, the more closely the sampling distribution of the mean resembles a normal distribution, whatever the shape of the population distribution; this is known as the central limit theorem. The sampling distribution also narrows as the sample grows, so the larger the sample, the more likely that sample’s mean is to fall close to the actual population mean. Since the goal of inferential statistics is to infer a statistic about an entire population from sample data, this theorem serves as the basis for inferential statistics.
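The effect of sample size can be checked with a small simulation. This is a sketch under stated assumptions: the population is a hypothetical, strongly right-skewed set of exponential values, the sample sizes and number of draws are arbitrary, and `simulate_sample_means` is a helper name invented for this example.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical, strongly right-skewed population (exponential, mean near 10).
population = [random.expovariate(1 / 10) for _ in range(20_000)]
pop_mean = mean(population)
pop_sd = pstdev(population)

def simulate_sample_means(n, draws=2_000):
    """Draw `draws` random samples of size n (with replacement); return their means."""
    return [mean(random.choices(population, k=n)) for _ in range(draws)]

results = {}
for n in (2, 10, 50):
    means = simulate_sample_means(n)
    observed_se = pstdev(means)      # spread of the simulated sample means
    expected_se = pop_sd / n ** 0.5  # standard error predicted by theory
    # Share of sample means within 1.96 standard errors of the population mean;
    # for a normal sampling distribution this share is about 0.95.
    share = sum(abs(m - pop_mean) <= 1.96 * expected_se for m in means) / len(means)
    results[n] = (mean(means), observed_se, expected_se, share)
    print(n, results[n])
```

The simulated sample means stay centered on the population mean while their spread shrinks roughly in proportion to one over the square root of the sample size, even though the population itself is far from normal. A histogram of `means` for each `n` would show the shape becoming more bell-like as well.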
This video introduction explains the concept of a sampling distribution through an intentionally contrived example: a professor wants to know the average age of a college class but is unable to sample more than three students at a time. It serves as a good introduction to sampling distributions for the uninitiated, but I wanted to draw attention to it because I really like its closing thoughts.
The textbook and the video arrive at the same answer to the question of why the reader/viewer should care, but they get there in different ways. The video highlights the ability of sampling distributions to quantify the likelihood that a sample statistic accurately reflects the population statistic. It’s both really cool that one is able to do this, and really important, in that it allows one to provide a metric of generalizability for statistics calculated from a given sample. Of course, this is all implied by the textbook’s thorough explanation of the central limit theorem, but the textbook probably doesn’t spell it out this way since it is part of the next chapter’s topic.
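As a rough illustration of what quantifying that likelihood can look like, the sketch below uses the normal approximation to estimate how often a sample mean lands within a chosen tolerance of the population mean. It assumes the population standard deviation is known and that the sample is large enough for the approximation to hold; `prob_within` and all of the numbers are hypothetical, not taken from the video or the textbook.

```python
from statistics import NormalDist

def prob_within(tolerance, pop_sd, n):
    """Approximate P(|sample mean - population mean| <= tolerance)."""
    standard_error = pop_sd / n ** 0.5
    # Normal approximation to the sampling distribution of the error
    # (sample mean minus population mean), centered on zero.
    error_dist = NormalDist(mu=0, sigma=standard_error)
    return error_dist.cdf(tolerance) - error_dist.cdf(-tolerance)

# Hypothetical inputs: population SD of 3 years, tolerance of 1 year.
for n in (3, 30, 300):
    print(n, round(prob_within(tolerance=1.0, pop_sd=3.0, n=n), 3))
```

The probability climbs toward 1 as the sample size grows, which is the sense in which a larger sample gives a more generalizable estimate. For very small samples such as three students, the normal approximation is itself questionable, so those figures should be read as illustrative only.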