We want to use a sample to learn something about a population, but no sample is perfect!
Sampling error is the error resulting from using a sample to estimate a population characteristic.
If we use a sample mean \(\bar{x}\) to estimate \(\mu\), chances are that \(\bar{x}\ne\mu\) (they might be close but… they might not be!). We will consider
The distribution of a statistic (across all possible samples of size \(n\)) is called the sampling distribution.
For a variable \(x\) and given a sample size \(n\), the distribution of \(\bar{x}\) is called the sampling distribution of the sample mean or the distribution of \(\boldsymbol{\bar{x}}\).
Suppose our population is the five starting players on a particular basketball team. We are interested in their heights (measured in inches). The full population data is
Player | A | B | C | D | E |
---|---|---|---|---|---|
Height | 76 | 78 | 79 | 81 | 86 |
The population mean is \(\mu=80\).
Consider all possible samples of size \(n=2\):
Sample | A,B | A,C | A,D | A,E | B,C | B,D | B,E | C,D | C,E | D,E |
---|---|---|---|---|---|---|---|---|---|---|
\(\bar{x}\) | 77.0 | 77.5 | 78.5 | 81.0 | 78.5 | 79.5 | 82.0 | 80.0 | 82.5 | 83.5 |
There are 10 possible samples of size 2.
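The table of sample means can be reproduced by enumerating every pair of players. The following is a minimal sketch using only the Python standard library; the variable names (`heights`, `sample_means`) are illustrative choices, not part of the original example.

```python
from itertools import combinations

# Population from the table above: player -> height in inches
heights = {"A": 76, "B": 78, "C": 79, "D": 81, "E": 86}

# Population mean
mu = sum(heights.values()) / len(heights)

# Every sample of size 2 (drawn without replacement) with its sample mean
sample_means = {
    ",".join(players): sum(heights[p] for p in players) / 2
    for players in combinations(heights, 2)
}

for sample, x_bar in sample_means.items():
    # Sampling error here is x-bar minus the population mean
    print(f"{sample}: x-bar = {x_bar}, sampling error = {x_bar - mu:+}")

# Average of the sample means across all possible samples
mean_of_means = sum(sample_means.values()) / len(sample_means)
print(len(sample_means), "samples; mean of the sample means =", mean_of_means)
```

Note that only one of the ten samples (C,D) has zero sampling error, yet the sample means average out to the population mean.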
In general, the larger the sample size, the smaller the sampling error tends to be in estimating \(\mu\) using \(\bar{x}\).
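One way to see this in the basketball example is to look at the worst case for each sample size. The sketch below assumes "sampling error" is measured as \(|\bar{x}-\mu|\) and uses the largest such value over all possible samples as a summary; that summary is an illustrative choice, not the only reasonable one.

```python
from itertools import combinations

heights = [76, 78, 79, 81, 86]      # population from the example, in inches
mu = sum(heights) / len(heights)    # population mean

def max_abs_error(n):
    """Largest |x-bar - mu| over all samples of size n (without replacement)."""
    return max(abs(sum(s) / n - mu) for s in combinations(heights, n))

# Worst-case sampling error for each possible sample size
worst_error = {n: max_abs_error(n) for n in range(1, len(heights) + 1)}

for n, err in worst_error.items():
    print(f"n = {n}: largest possible sampling error = {err:.2f}")
```

For this population the worst-case error shrinks at every step as \(n\) grows, and it reaches zero when the sample is the whole population.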
In practice, we have one sample and \(\mu\) is unknown.
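For the distribution of \(\bar{X}\), the mean is \(\mu_{\bar{X}}=\mu\) and the standard deviation is \(\sigma_{\bar{X}}=\sigma/\sqrt{n}\).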
We refer to the standard deviation of a sampling distribution as the standard error.
The mean living space for a detached single-family home in the United States is 1742 square feet with a standard deviation of 568 square feet. For samples of 25 homes, determine the mean and standard error of \(\bar{x}\).
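A short sketch of the calculation, using \(\mu_{\bar{x}}=\mu\) and \(\sigma_{\bar{x}}=\sigma/\sqrt{n}\). The variable names are illustrative.

```python
import math

mu = 1742      # population mean living space, in square feet
sigma = 568    # population standard deviation, in square feet
n = 25         # sample size

mean_x_bar = mu                         # the mean of x-bar equals the population mean
standard_error = sigma / math.sqrt(n)   # sigma / sqrt(n) = 568 / 5

print(f"mean of x-bar = {mean_x_bar} square feet")
print(f"standard error of x-bar = {standard_error:.1f} square feet")
```

So for samples of 25 homes, \(\bar{x}\) has mean 1742 square feet and standard error \(568/\sqrt{25}=113.6\) square feet.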
The plots show (A) a random sample of 1000 from a Normal(100, 25) distribution and (B) the approximate sampling distribution of \(\bar{X}\) when \(X\) is Normal(100, 25).
In fact, if \(X\) is Normal(\(\mu\), \(\sigma\)), then \(\bar{X}\) is Normal(\(\mu_{\bar{X}}=\mu\), \(\sigma_{\bar{X}}=\sigma/\sqrt{n}\)).
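This result can be checked by simulation. The sketch below is only an illustration: it reads the 25 in Normal(100, 25) as the standard deviation, and the sample size `n = 25`, the number of repetitions `reps`, and the seed are assumed values that do not come from the original plots.

```python
import math
import random
import statistics

rng = random.Random(0)     # fixed seed so the run is repeatable
mu, sigma = 100, 25        # X is Normal(100, 25); 25 is taken as the standard deviation
n, reps = 25, 10_000       # assumed sample size and number of simulated samples

# Draw `reps` samples of size n and record the mean of each one
x_bars = [
    sum(rng.gauss(mu, sigma) for _ in range(n)) / n
    for _ in range(reps)
]

sim_mean = sum(x_bars) / reps          # simulated mean of x-bar
sim_se = statistics.stdev(x_bars)      # simulated standard deviation of x-bar
theory_se = sigma / math.sqrt(n)       # sigma / sqrt(n)

print(f"mean of x-bar: simulated {sim_mean:.2f}, theory {mu}")
print(f"sd of x-bar:   simulated {sim_se:.2f}, theory {theory_se:.2f}")
```

The simulated values should land close to \(\mu=100\) and \(\sigma/\sqrt{n}=5\), with small differences due to simulation noise.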
Surprisingly, we see a similar result for \(\bar{X}\) even when \(X\) is not normally distributed!
For relatively large sample sizes, the random variable \(\bar{X}\) is approximately normally distributed regardless of the distribution of \(X\): \[\bar{X}\text{ is approximately Normal}(\mu_{\bar{X}}=\mu, \sigma_{\bar{X}}=\sigma/\sqrt{n}).\]
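A simulation with a clearly non-normal \(X\) makes this concrete. The sketch below assumes \(X\) is exponential with mean 1 (so \(\mu=1\) and \(\sigma=1\)), a strongly right-skewed distribution; that choice, the sample size `n = 40`, the number of repetitions, and the seed are all illustrative assumptions. If \(\bar{X}\) is roughly normal, about 68% of sample means should fall within one standard error of \(\mu\) and about 95% within two.

```python
import math
import random

rng = random.Random(1)     # fixed seed so the run is repeatable
mu, sigma = 1.0, 1.0       # exponential with rate 1 has mean 1 and standard deviation 1
n, reps = 40, 10_000       # assumed sample size and number of simulated samples

se = sigma / math.sqrt(n)  # theoretical standard error of x-bar

# Sample means from `reps` samples of size n drawn from the skewed distribution
x_bars = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
]

# Fraction of sample means within one and two standard errors of mu
within_1 = sum(abs(xb - mu) < se for xb in x_bars) / reps
within_2 = sum(abs(xb - mu) < 2 * se for xb in x_bars) / reps

print(f"within 1 SE: {within_1:.3f} (normal model: about 0.68)")
print(f"within 2 SE: {within_2:.3f} (normal model: about 0.95)")
```

Repeating the experiment with a small `n` (say 2 or 3) should show a noticeably worse match, which is why the statement is restricted to relatively large sample sizes.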
Notes