- Use z-scores to compare observations on different scales.
- Calculate probabilities for a normal distribution using area under the curve.
- Calculate normal distribution percentiles.
We represent the shape of the distribution of a continuous variable using a density curve. This is like a histogram, but drawn as a smooth curve.
Properties:
The proportion of all possible observations that lie within a specified range equals the corresponding area under the density curve.
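As a rough numerical illustration of this area property, the sketch below integrates a density curve over a range. The density used here, \(f(x) = 2x\) on \([0, 1]\), is a hypothetical example chosen only because its areas are easy to check by hand; it does not come from these notes.

```r
# Hypothetical density curve for illustration: f(x) = 2x on [0, 1], 0 elsewhere.
f <- function(x) ifelse(x >= 0 & x <= 1, 2 * x, 0)

# Total area under the density curve (should be 1, up to numerical error).
total_area <- integrate(f, lower = 0, upper = 1)$value

# Proportion of observations between 0.5 and 1 = area under the curve there.
# By hand: 1^2 - 0.5^2 = 0.75.
prop_upper_half <- integrate(f, lower = 0.5, upper = 1)$value

print(c(total_area = total_area, prop_upper_half = prop_upper_half))
```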
Why “normal”? Because it appears so often in practice!
Normal distributions…
To check whether a variable is (approximately) normally distributed,
A z-score tells us how many standard deviations an observation is from the mean.
Example: \(z=-0.23\) is 0.23 standard deviations below the mean.
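A minimal sketch of using z-scores to compare observations on different scales. The means and standard deviations are the ACT and SAT values used later in these notes; the two individual scores (27 and 1300) and the helper name z_score are hypothetical, chosen only for illustration.

```r
# z-score: how many standard deviations an observation lies from the mean.
z_score <- function(x, mu, sigma) (x - mu) / sigma

# Hypothetical observations; mu and sigma are the ACT and SAT values from these notes.
z_act <- z_score(27, mu = 20.8, sigma = 5.8)     # ACT score of 27
z_sat <- z_score(1300, mu = 1100, sigma = 200)   # SAT score of 1300

# The larger z-score is the more exceptional score relative to its own scale.
print(c(z_act = z_act, z_sat = z_sat))
```

Here the ACT score is about 1.07 standard deviations above its mean and the SAT score is exactly 1 standard deviation above its mean, so the ACT score is slightly higher in relative terms.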
For any (approximately) normally distributed variable,
Note: when we z-score a variable, we preserve the area under the curve properties!
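One way to see this numerically, as a sketch: the area below an observation on the original scale should match the area below its z-score on the standard normal scale. The SAT parameters are those used later in these notes; the cutoff of 1300 is an assumption made for illustration.

```r
# SAT parameters from these notes; the cutoff x is a hypothetical value.
mu <- 1100
sigma <- 200
x <- 1300

# Convert the observation to a z-score.
z <- (x - mu) / sigma

# Area below x on the original scale vs. area below z on the standard scale.
p_original <- pnorm(x, mean = mu, sd = sigma)
p_standard <- pnorm(z)

print(c(p_original = p_original, p_standard = p_standard))
```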
Properties:
We work with cumulative probabilities, that is, probabilities of the form \(P(Z < z)\).
We will use the fact that the total area under the curve is 1 to find probabilities like \(P(Z > c)\): \[P(Z > c) = 1 - P(Z < c)\]
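A short sketch of the complement idea in R, assuming a hypothetical cutoff of \(c = 1.5\). It computes the upper-tail area as one minus the cumulative probability and checks it against pnorm's lower.tail = FALSE argument.

```r
# Hypothetical cutoff, chosen only for illustration.
c_val <- 1.5

# Upper-tail area via the complement: P(Z > c) = 1 - P(Z < c).
p_upper <- 1 - pnorm(c_val)

# The same area requested directly from pnorm.
p_upper_direct <- pnorm(c_val, lower.tail = FALSE)

print(c(p_upper = p_upper, p_upper_direct = p_upper_direct))
```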
We can also use this concept to find \(P(a < Z < b)\).
Notice that \[1 = P(Z < a) + P(a < Z < b) + P(Z > b),\] so we can write \[P(a < Z < b) = 1 - P(Z > b) - P(Z < a).\] Since we just found that \[P(Z > b) = 1 - P(Z < b),\] we can replace \(1 - P(Z > b)\) with \(P(Z < b)\) and get \[P(a < Z < b) = P(Z < b) - P(Z < a).\]
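The interval formula can be sketched in R as a difference of two cumulative probabilities. The endpoints \(a = -1\) and \(b = 2\) are hypothetical values chosen for illustration.

```r
# Hypothetical interval endpoints on the z scale.
a <- -1
b <- 2

# P(a < Z < b) = P(Z < b) - P(Z < a)
p_between <- pnorm(b) - pnorm(a)

# Check against the three-piece decomposition: the areas should sum to 1.
total <- pnorm(a) + p_between + pnorm(b, lower.tail = FALSE)

print(c(p_between = p_between, total = total))
```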
Now that we can get all of our probabilities written as cumulative probabilities, we’re ready to use software to find the area under the curve!
ACT scores are well-approximated by a normal distribution with mean 20.8 and standard deviation 5.8.
Use software (the pnorm command in R) to find the associated area.
Find the proportion of SAT-takers who score between 1150 and 1300. Assume that SAT scores are approximately normally distributed with mean \(\mu=1100\) and standard deviation \(\sigma = 200\).
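One possible way to work the SAT example in R, as a sketch: convert both endpoints to z-scores, then take the difference of the cumulative probabilities from pnorm. The variable names are illustrative only.

```r
# SAT parameters from the example.
mu <- 1100
sigma <- 200

# Convert the endpoints 1150 and 1300 to z-scores.
z_low <- (1150 - mu) / sigma    # 0.25
z_high <- (1300 - mu) / sigma   # 1

# P(1150 < X < 1300) = P(Z < z_high) - P(Z < z_low)
p_between <- pnorm(z_high) - pnorm(z_low)

# Same calculation without standardizing by hand.
p_between_direct <- pnorm(1300, mean = mu, sd = sigma) -
  pnorm(1150, mean = mu, sd = sigma)

print(round(c(p_between, p_between_direct), 4))
```

Under these assumptions, roughly 24% of SAT-takers score between 1150 and 1300.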
We can also find the observation associated with a percentage/proportion.
Recall: The \(w\)th percentile \(p_w\) is the observation that is higher than \(w\)% of all observations: \[P(X < p_w) = \frac{w}{100}\]
Note that if \(z = \frac{x-\mu}{\sigma}\), then \(x = \mu + z\sigma\).
SAT scores are approximately Normal(\(\mu=1100\), \(\sigma=200\)). Find the 90th percentile for SAT scores.
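A sketch of the percentile calculation in R. It assumes the qnorm function (the inverse of pnorm, which these notes do not mention by name) to find the z-score with 90% of the area below it, then converts back with \(x = \mu + z\sigma\).

```r
# SAT parameters from the example.
mu <- 1100
sigma <- 200

# z-score with 90% of the standard normal area below it.
z_90 <- qnorm(0.90)

# Convert back to the SAT scale: x = mu + z * sigma.
p_90 <- mu + z_90 * sigma

# Same result by giving qnorm the mean and standard deviation directly.
p_90_direct <- qnorm(0.90, mean = mu, sd = sigma)

print(round(c(z_90 = z_90, p_90 = p_90, p_90_direct = p_90_direct), 2))
```

Under these assumptions the 90th percentile is an SAT score of about 1356.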
Section 5.3 Exercises 1-10