Standard hypothesis tests are readily available as built-in R functions.
Consider the chickwts dataset:
head(chickwts)
## weight feed
## 1 179 horsebean
## 2 160 horsebean
## 3 136 horsebean
## 4 227 horsebean
## 5 217 horsebean
## 6 168 horsebean
In general, if you know which standard test you want to perform, you can look up the appropriate function and run it with the tools we've already learned. For example, we might perform a one-way ANOVA to see whether mean weight differs among the feed types.
# ?aov
test1 <- aov(weight ~ feed, data = chickwts) # fits a one-way ANOVA and stores the fitted model's components
names(test1) # see what the aov command stored for us
## [1] "coefficients" "residuals" "effects" "rank"
## [5] "fitted.values" "assign" "qr" "df.residual"
## [9] "contrasts" "xlevels" "call" "terms"
## [13] "model"
anova(test1) # examine the ANOVA output
## Analysis of Variance Table
##
## Response: weight
##           Df Sum Sq Mean Sq F value    Pr(>F)
## feed       5 231129   46226  15.365 5.936e-10 ***
## Residuals 65 195556    3009
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
par(mfrow=c(2,2))
plot(test1) # default plots are diagnostic plots
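As a small sketch of working with the stored results, the F statistic and p-value can be pulled out of the ANOVA table by column name. The object names fit, tab, f_value, and p_value are illustrative choices, not part of the original lab.

```r
# Refit the model so this snippet runs on its own
fit <- aov(weight ~ feed, data = chickwts)

# anova() returns a data-frame-like table; columns can be indexed by name
tab <- anova(fit)

# Row 1 is the feed term; row 2 (Residuals) has NA in these columns
f_value <- tab[["F value"]][1]
p_value <- tab[["Pr(>F)"]][1]

f_value
p_value
```

Indexing by column name rather than position keeps the code readable and guards against picking up the wrong column.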
Because statistical inference in R follows so directly from what we've already learned (and some things we will learn in the next labs), we will focus on alternative approaches to inference.
There are lots of nonparametric tests built into R, but let’s take a moment to work through a couple.
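For instance, one built-in nonparametric counterpart to the one-way ANOVA above is the Kruskal-Wallis rank sum test, available as kruskal.test(). A minimal sketch on the same chickwts data (the object name kw is just illustrative):

```r
# Kruskal-Wallis rank sum test: compares the feed groups using ranks
# of weight instead of assuming normally distributed residuals
kw <- kruskal.test(weight ~ feed, data = chickwts)

kw            # prints the test statistic, degrees of freedom, and p-value
kw$p.value    # the p-value alone, for use in later code
```

Like aov(), the function accepts a formula and a data frame, so switching between the two tests mostly means changing the function name.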
What follows is an explanation of a hypothesis test in R, the Wilcoxon signed-rank test. See if you can convert this explanation to R code.
Suppose we have some paired data on dock jumps, a type of dog agility competition.
dognames <- c("Suki","Harvey","Sausage","Heidi","Beans")
jump1 <- c(24.3,26.3,31.2,19.9,23.1)
jump2 <- c(24.6,27.1,30.0,22.5,24.1)
dockjump <- data.frame(dognames, jump1, jump2)
dockjump
## dognames jump1 jump2
## 1 Suki 24.3 24.6
## 2 Harvey 26.3 27.1
## 3 Sausage 31.2 30.0
## 4 Heidi 19.9 22.5
## 5 Beans 23.1 24.1
We wish to determine whether the typical (median) difference within these pairs is nonzero.
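Once you have written your own version, one way to check it is against R's built-in wilcox.test() with paired = TRUE. This is a sketch for comparison only, not the hand-coded solution the exercise asks for. The object name check is illustrative, and the data are re-entered so the snippet runs on its own.

```r
# Paired jump distances, copied from the data frame above
jump1 <- c(24.3, 26.3, 31.2, 19.9, 23.1)
jump2 <- c(24.6, 27.1, 30.0, 22.5, 24.1)

# Built-in Wilcoxon signed-rank test on the paired differences.
# With paired = TRUE the differences are taken as jump1 - jump2,
# and the reported statistic V is the sum of the ranks of the
# positive differences.
check <- wilcox.test(jump1, jump2, paired = TRUE)

check$statistic  # V
check$p.value    # two-sided p-value
```

If your hand-computed statistic uses jump2 - jump1 instead, it will be the complementary rank sum, so compare the p-values rather than the raw statistics.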