A contingency table is a way to summarize bivariate data, or data from two variables.
| Result | Inoculated: yes | Inoculated: no | Total |
|---|---|---|---|
| lived | 238 | 5136 | 5374 |
| died | 6 | 844 | 850 |
| total | 244 | 5980 | 6224 |
| Result | Inoculated: yes | Inoculated: no | Total |
|---|---|---|---|
| lived | 0.0382 | 0.8252 | 0.8634 |
| died | 0.0010 | 0.1356 | 0.1366 |
| total | 0.0392 | 0.9608 | 1.0000 |
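Each proportion in the second table is the corresponding count divided by the grand total of 6224. A minimal Python sketch of that arithmetic follows; the dictionary layout and variable names are illustrative choices, not part of the original notes.

```python
# Counts from the smallpox table: rows are the result, columns are inoculation status.
counts = {
    "lived": {"yes": 238, "no": 5136},
    "died": {"yes": 6, "no": 844},
}

# Grand total of all observations in the table.
grand_total = sum(sum(row.values()) for row in counts.values())

# Joint proportions: each cell divided by the grand total.
joint = {
    result: {status: n / grand_total for status, n in row.items()}
    for result, row in counts.items()
}

# Marginal proportions: row totals and column totals divided by the grand total.
row_marginal = {
    result: sum(row.values()) / grand_total for result, row in counts.items()
}
col_marginal = {
    status: sum(counts[result][status] for result in counts) / grand_total
    for status in ("yes", "no")
}

# Print the joint proportions with the row marginal at the end of each row.
for result, row in joint.items():
    cells = "  ".join(f"{p:.4f}" for p in row.values())
    print(f"{result:>5}  {cells}  {row_marginal[result]:.4f}")
```

The joint proportions sum to 1 across all four cells, and each marginal proportion is the sum of the joint proportions in its row or column.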
What can we learn about the result of smallpox if we already know something about inoculation status?
Conditional probability: the probability of some event \(A\) if we know that event \(B\) occurred (or is true): \[P(A|B) = \frac{P(A\text{ and }B)}{P(B)}\] where the symbol | is read as “given” and we require \(P(B) > 0\).
If knowing whether event \(B\) occurs tells us nothing about event \(A\), the events are independent. For example, if we know that the first flip of a (fair) coin came up heads, that doesn’t tell us anything about what will happen next time we flip that coin.
We can test for independence by checking if \(P(A|B)=P(A)\).
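As an illustration, the independence check can be applied to the smallpox counts, with \(A=\) (lived) and \(B=\) (inoculated). The sketch below is only one way to set it up; the function name `conditional` and the tolerance used for the comparison are assumptions made for this example.

```python
import math

# Counts taken from the smallpox contingency table.
lived_and_inoculated = 238
inoculated_total = 244
lived_total = 5374
grand_total = 6224


def conditional(p_a_and_b: float, p_b: float) -> float:
    """Return P(A|B) = P(A and B) / P(B); requires P(B) > 0."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b


p_a_and_b = lived_and_inoculated / grand_total  # P(lived and inoculated)
p_b = inoculated_total / grand_total            # P(inoculated)
p_a = lived_total / grand_total                 # P(lived)

p_a_given_b = conditional(p_a_and_b, p_b)       # P(lived | inoculated)

# Independence would require P(A|B) = P(A), up to rounding.
independent = math.isclose(p_a_given_b, p_a, abs_tol=1e-4)

print(f"P(lived | inoculated) = {p_a_given_b:.4f}")  # 0.9754
print(f"P(lived)              = {p_a:.4f}")          # 0.8634
print(f"independent? {independent}")                 # False
```

Because \(P(\text{lived}|\text{inoculated})\) differs from \(P(\text{lived})\), knowing the inoculation status does tell us something about the result, so in this data the two events are not independent.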
If \(A\) and \(B\) are independent events, then \[P(A \text{ and }B) = P(A)P(B).\]
Find the probability of rolling a \(6\) on your first roll of a die and a \(6\) on your second roll.
Let \(A=\) (rolling a \(6\) on first roll) and \(B=\) (rolling a \(6\) on second roll). For each roll, the probability of getting a \(6\) is \(1/6\), so \(P(A) = \frac{1}{6}\) and \(P(B) = \frac{1}{6}\).
Then, because each roll is independent of any other rolls, \[P(A \text{ and }B) = P(A)P(B) = \frac{1}{6}\times\frac{1}{6} = \frac{1}{36}.\]
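The same answer can be checked by listing all 36 equally likely outcomes of two rolls and counting. This is a small sketch that assumes a fair six-sided die; the helper `prob` is a hypothetical name introduced for the example.

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (first roll, second roll); each is equally likely for a fair die.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event) -> Fraction:
    """Probability of an event as (favorable outcomes) / (total outcomes)."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


p_a = prob(lambda o: o[0] == 6)                    # 6 on the first roll
p_b = prob(lambda o: o[1] == 6)                    # 6 on the second roll
p_a_and_b = prob(lambda o: o[0] == 6 and o[1] == 6)  # 6 on both rolls

print(p_a, p_b, p_a_and_b)     # 1/6 1/6 1/36
print(p_a_and_b == p_a * p_b)  # True
```

Using `Fraction` keeps the arithmetic exact, so the comparison \(P(A \text{ and }B) = P(A)P(B)\) does not depend on floating-point rounding.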
If \(A\) and \(B\) are any two events, then \[P(A \text{ and }B) = P(A|B)P(B).\]
Suppose we know that 38.4% of US households have dogs and that among those with dogs, 23.1% have cats. Find the probability that a US household has both dogs and cats.
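One way to set this up is to let \(B=\) (household has dogs) and \(A=\) (household has cats), so that the 23.1% figure is \(P(A|B)\) and the general rule \(P(A \text{ and }B) = P(A|B)P(B)\) applies directly. The sketch below just carries out that multiplication; the variable names are illustrative.

```python
# Values given in the problem statement.
p_dog = 0.384            # P(B): household has dogs
p_cat_given_dog = 0.231  # P(A|B): household has cats, given that it has dogs

# General multiplication rule: P(A and B) = P(A|B) * P(B).
p_dog_and_cat = p_cat_given_dog * p_dog

print(f"P(dogs and cats) = {p_dog_and_cat:.4f}")  # 0.0887
```

Under this reading, about 8.9% of US households have both dogs and cats.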