We start this analysis by loading packages in r so that we have the needed tools to aid in our data wrangling (i.e., cleaning and organizing the data), matching (i.e., ensuring PAL students with similar and non-PAL students with similar predictors are grouped), and balance checking (i.e., ensuring these groupings are fair and accurate). We do this manually because the functions required for this project are not provided by R in a clean, standard way. By installing them here, we are preventing “function not found” errors later in the report.
library(dplyr)
library(MatchIt)
library(Matching)
library(DataExplorer)
library(cobalt)
library(backports)
library(rbounds)
library(tidyr)
library(knitr)
library(recipes)
library(readr)
knitr::opts_chunk$set(
fig.path = "report_figures/",
dev = "png",
dpi = 300
)
We create helper functions below to check for variables that cause complete separation in logistic regression. It should be noted that, when being compared to numerical predictors, categorical predictors are especially likely to create this problem. The reason for this is due to their continuous nature and how they are often used in statistical analysis.
check_separation <- function(df, outcome, predictor) {
tbl <- table(df[[predictor]], df[[outcome]])
any(tbl[,1] == 0 | tbl[,2] == 0)
}
find_separators <- function(df, outcome) {
# Select only factor columns
factor_vars <- names(df)[sapply(df, is.factor)]
# Exclude outcome if it's a factor
factor_vars <- setdiff(factor_vars, outcome)
separators <- sapply(factor_vars, function(pred) {
check_separation(df, outcome, pred)
})
names(separators[separators])
}
Before fitting the logistic regression model that generates
propensity scores for PAL participation
(pal.flg), we check the the Math 12 dataset in the
code above to make sure there is not complete
separation. This is when a categorical predictor perfectly
predicts PAL versus non-PAL and skews logistic regression models.
Removing variables that cause complete separation is important because
it can lead to unreliable propensity scores and can compromise the
matching step by destabilizing logistic regression (e.g., very large or
effectively infinite coefficient estimates, huge standard errors, and/or
non-convergence). To prevent this, we define two helper functions that
automatically identify problematic predictors so they can be removed or
their categories collapsed before fitting the propensity score
model.
The function we created
(i.e.,check_separation(df, outcome, predictor)) performs a
one-predictor check by creating a contingency table of
predictor by the binary outcome (pal.flg) and
returns TRUE if any predictor level has zero observations
in either the PAL or non-PAL group–translating as a possible risk of
separation. The function find_separators(df, outcome) then
scales this up to the entire dataset by finding all factor variables in
math12.dat (excluding pal.flg itself),
applying check_separation() to each one, and returning the
names of predictors which trigger separation. We then use that returned
list of predictors–for instance,
sepvar12 <- find_separators(math12.dat, "pal.flg")–to
drop or to otherwise handle those predictors before fitting the
propensity score logistic regression, improving model stability and
ensuring the propensity scores are well-defined for matching.
We are now importing the PAL dataset, and it is also here where we convert variables into the right data types (e.g., creating factors for categorical variables) so models and matching routines interpret them correctly.
setwd("~/Downloads/Independent Study/CTL Files")
dat <- read_rds("pal_data_25.rds")
dat[sapply(dat, is.character)] <- lapply(dat[sapply(dat, is.character)],
as.factor)
Since we specifically wish to examine the effect of PAL on the single course MATH 12, the block of code below extracts all information on Math 12 from the dataset. We end by removing the students who did not complete PAL (e.g., PAL “NC/W”) as leaving it in could skew the resulting model by altering the “treatment group.”
math.classes <- paste("MATH", c(12))
math.ind <- which(dat$course == "MATH 12")
math12.dat <- dat[math.ind, ]
math12.dat$course <- droplevels(math12.dat$course)
math12.dat$term <- factor(math12.dat$term, ordered = TRUE,
levels = c("Fall 2021", "Spring 2022 ", "Summer 2022",
"Fall 2022", "Spring 2023",
"Fall 2023", "Spring 2024",
"Fall 2024"))
# remove the students who did not complete PAL
pal.dnf <- which(math12.dat$pal.grade == "NC" | math12.dat$pal.grade == "W")
math12.dat <- math12.dat[-pal.dnf,]
In the code above, the variable
math.ind <- which(dat$course == "MATH 12") finds the row
numbers where the course is MATH 12 using the which() function,
and math12.dat <- dat[math.ind, ] subsets the data to
only those rows. Then the function droplevels() removes
unused course levels so the subset does not keep categories from other
courses. Finally,
math12.dat$term <- factor(math12.dat$term, ordered = TRUE, levels = c("Fall 2021", ..., "Fall 2024"))
makes term an ordered timeline rather than plain text, so
terms sort chronologically in tables/plots and any modeling steps that
use term.
Here, I compute the simple mean grade for PAL vs non-PAL as a baseline “before any correction,” which shows the naive difference. Note that this is a biased comparison of the control and treatment groups but it is done to show us what the raw gap looks like and makes the need for propensity scores and matching self-evident.
raw.table = data.frame(class=character(),
nonPALavg=numeric(),
PALavg=numeric(),
Diff=numeric(),
NonPAL_Num= integer(),
PAL_Num=integer(),
row.names=NULL,
stringsAsFactors = FALSE)
for (i in 1:length(math.classes)){
curr.class = math.classes[i]
pal.avg <- round(mean(math12.dat$grade.num[math12.dat$course==curr.class &
math12.dat$pal.flg == "PAL"],
na.rm=TRUE), 2)
nopal.avg <- round(mean(math12.dat$grade.num[math12.dat$course==curr.class &
math12.dat$pal.flg == "Non-PAL"],
na.rm=TRUE), 2)
raw.table[i, 'class' ] = curr.class
raw.table[i, c(2:4)]=c(nopal.avg, pal.avg, pal.avg-nopal.avg)
raw.table[i, c(5,6)]= c(sum(math12.dat$course==curr.class &
math12.dat$pal.flg == "Non-PAL"),
sum(math12.dat$course==curr.class &
math12.dat$pal.flg == "PAL"))
}
# formatted table
kable(raw.table, caption = "Raw Comparison of PAL and non-PAL Grades
(No Propensity Adjustment)")
| class | nonPALavg | PALavg | Diff | NonPAL_Num | PAL_Num |
|---|---|---|---|---|---|
| MATH 12 | 2.23 | 2.64 | 0.41 | 4222 | 340 |
The unadjusted means are promising - PAL students have higher mean grades in each class than non-PAL students - but propensity score adjustment is necessary to draw causal conclusions about the relationship between PAL participation and course grade.
The line raw.table = data.frame(...) creates an empty
results table; The columns are the sample mean of non-PAL students
(i.e., nonPALavg), the average grade of the PAL students in
the sample (i.e., PALavg), the difference between these
groupings of mean grades (i.e., ‘Diff’), and the size of the sample of
our population (i.e., “Num”). The loop
for (i in 1:length(math.classes)) is included as well to
make it possible for this same code to be re-used; length()
returns how many course labels there are in math.classes,
and 1:length(...) produces the index values the loop will
run through. Inside the loop, curr.class = math.classes[i]
uses indexing ([i]) to pick the current course label, which
then controls all filtering. The line
pal.avg <- round(mean(math12.dat$grade.num[...], na.rm=TRUE), 2)
has several pieces: the square brackets [...] do row
filtering (logical conditions pick only the rows for the current course
AND PAL students), mean() computes the sample average of
grade.num (i.e., an estimate of the typical grade in that
group), na.rm=TRUE tells R to ignore missing grades rather
than letting missing values force the mean to become NA, and
round(...,2) formats the result to two decimal places for a
clean report table. The nopal.avg <- ... line repeats
the same steps but for the Non-PAL group. Next,
raw.table[i, 'class'] = curr.class and
raw.table[i, c(2:4)] = c(nopal.avg, pal.avg, pal.avg-nopal.avg)
fill the results table: c() combines values into a vector,
and c(2:4) selects the relevant columns (Non-PAL mean, PAL
mean, and their difference). The counts
raw.table[i, c(5,6)] = c(sum(...Non-PAL...), sum(...PAL...))
use sum() on logical statements; in code terms,
TRUE counts as 1 and FALSE counts as 0, so
summing a logical condition gives the number of rows satisfying the
condition; statistically, these counts are crucial because they tell you
how many observations each mean is based on (bigger N → more stable
average). Finally, kable(raw.table, ...) formats the
raw.table into a readable table for the knitted report.
Comparing the mean grades for the PAL and non-PAL students, we find that PAL participation can be linked to a higher course grade. Additionally, the sample sizes of 4222 Non-PAL and 340 PAL make the averages in the above table significant as the data collected is from a considerably sized group of students. We add the cautious reminder however that these scores have not in any way been adjusted and should thus be treated only as motivation to dig deeper.
The process of propensity score modeling, matching, and balance checks following is needed to minimize the differences between the covariates so that the end gap in the grades is an accurate indicator of causal relation.
We have written code below to make the given dataset better for modeling. We first reduce missingness problems by removing variables with more than 10% of their data missing. This helps the accuracy of the propensity model by making it easier to group PAL and non-PAL students with similar background variables. Additionally, we have created the variable “cum.percent.units.passed” which measures the proportion of attempted units that a student passed and gives a more standardized picture of academic progress than raw unit counts alone. The functions plot_missing() and profile_missing() are used to identify which variables have missing values and how much missingness these variables contain. This is important because variables with too much missing data can shrink the usable sample, make the model less stable, and reduce the quality of the matching step.
# create feature, proportion of cumulative units taken that were passes
math12.dat$cum.percent.units.passed = math12.dat$unt.passd.prgrss/
math12.dat$unt.taken.prgrss
plot_missing(math12.dat, missing_only = TRUE)
zero_missing_vars <- math12.dat %>%
summarise(across( ,list(function(x) sum(is.na(x))))) %>%
gather() %>%
arrange(value)
zero_missing_vars <- zero_missing_vars %>%
filter(zero_missing_vars$value != 0)
zero_missing_vars$var.name = substr(zero_missing_vars$key,1,
nchar(zero_missing_vars$key)-2)
zero_missing_vars <- math12.dat %>%
dplyr::select(zero_missing_vars$var.name)
missing_vars <- profile_missing(zero_missing_vars)
vars.missing.data = as.character(missing_vars[missing_vars$pct_missing>0.05, "feature"])
math12.dat = math12.dat[,!(names(math12.dat) %in% vars.missing.data)]
In the next step step of this data cleaning and feature engineering step, we discard variables with only one value, since such variables provide no real information for distinguishing PAL from non-PAL students and would only reduce the efficiency of the model. We also collapse sparse categories in several categorical predictors so that small groups do not create unstable estimates or make comparisons unreliable. We note that these steps are significant because logistic regression works better when predictors contain enough variation and when categories are not too rare. We then recode the instructor variable with an anonymous numeric index, which protects confidentiality. Lastly, we identify and remove variables that still cause complete separation in logistic regression. Together, these cleaning steps make the dataset more stable, more interpretable, and better suited for the later matching analysis.
math12.dat$yr.course.taken =
as.numeric(gsub(".*([0-9]{4})","\\1",math12.dat$coh.term))
# remove single-valued variables
single.category.vars = sapply(math12.dat, function(x) length(unique(x))==1)
sum(single.category.vars)
## [1] 7
names(math12.dat)[single.category.vars]
## [1] "course" "course.descr" "unt.taken"
## [4] "acad.group" "dept.abbr" "units.taken.remedial"
## [7] "units.passed.remedial"
keep.vars = names(which(single.category.vars==FALSE))
math12.dat = math12.dat[, keep.vars]
dim(math12.dat)
## [1] 4562 59
# collapse sparse categories
math12.dat=group_category(data = math12.dat, feature = "acad.plan1",
threshold = 0.1, update = TRUE)
math12.dat$acad.plan1 <- as.factor(math12.dat$acad.plan1)
table(math12.dat$acad.plan1, PAL=math12.dat$pal.flg)
## PAL
## Non-PAL PAL
## BIOLBMEDBS 29 3
## BIOLNONEBA 54 4
## BIOLNONEBS 358 17
## BIOLPBIONU 768 105
## CHEMBCHMBS 90 5
## CHEMBIOCBA 40 4
## CHEMFORCBA 71 14
## CHEMNONEBA 26 1
## CHEMNONEBS 46 5
## CIVENONEBS 316 18
## CMPENONEBS 284 11
## CONMPRCMNU 230 21
## CRIMPCRJNU 24 3
## CSCIPCSCNU 667 35
## EENGNONEBS 122 6
## FOODNUTSBS 23 4
## HLSCHLSCBS 69 9
## KINSPEXSNU 40 2
## MATHSUBPBA 39 4
## MECHNONEBS 369 27
## OTHER 432 30
## UNDCNONENU 125 12
math12.dat=group_category(data = math12.dat, feature = "coh",
threshold = 0.05, update = TRUE)
math12.dat$coh <- as.factor(math12.dat$coh)
table(math12.dat$coh, PAL=math12.dat$pal.flg)
## PAL
## Non-PAL PAL
## First-time 3871 298
## OTHER 351 42
math12.dat=group_category(data = math12.dat, feature = "learning.mode",
threshold = 0.05, update = TRUE)
math12.dat$learning.mode <- as.factor(math12.dat$learning.mode)
table(math12.dat$learning.mode, PAL=math12.dat$pal.flg)
## PAL
## Non-PAL PAL
## Asynchronous no meetings AB386 444 7
## Face-to-face 3378 322
## OTHER 400 11
math12.dat=group_category(data = math12.dat, feature = "eth.ipeds",
threshold = 0.05, update = TRUE)
math12.dat$eth.ipeds <- as.factor(math12.dat$eth.ipeds)
table(math12.dat$eth.ipeds, PAL=math12.dat$pal.flg)
## PAL
## Non-PAL PAL
## Asian 999 55
## Black 308 25
## Hispanic 1889 181
## OTHER 223 25
## Two or More Races 231 18
## White 572 36
math12.dat <- math12.dat[math12.dat$gender != "Nonbinary",]
math12.dat$gender <- droplevels(math12.dat$gender)
# code instructor as alphabetic letter for anonymity
math12.dat$instructor1=droplevels(factor(math12.dat$instructor1))
instructor.vec = sort(unique(math12.dat$instructor1))
num.instr = length(instructor.vec)
math12.dat$instructor1 = factor(math12.dat$instructor1, levels = instructor.vec,
labels=as.character(1:num.instr))
key.instr.code = cbind(as.character(instructor.vec),
1:num.instr)
# remove variables causing complete separation in log reg
sepvar12 <- find_separators(math12.dat, "pal.flg")
math12.dat <- math12.dat[,-which(names(math12.dat) %in% sepvar12)]
# Filter to relevant variables
vars.to.keep = c("grade.num","instructor1","leanring.mode","pal.flg","coh",
"base.time","gender","eth.ipeds","mother.ed","father.ed",
"pell.term.flg","term.age","last.school.attended.local.flg",
"school.zip.median.income","qr.cat","acad.plan1","term.units",
"cum.percent.units.passed","cum.gpa.start")
new.vars = intersect(vars.to.keep, names(math12.dat))
math12.dat = math12.dat[ ,new.vars]
# subset on complete cases
math12.dat <- math12.dat[complete.cases(math12.dat), ]
# recheck for complete separation
find_separators(math12.dat, "pal.flg")
## character(0)
In the multiple code blocks below we will produce a propensity score model using logistic regression and proceed to then pull from this the propensity score for each observation.
There is a risk of selection bias as students might have had
different background factors (e.g., work ethic, financial support,
etcetera) which influenced their decision to take PAL. To minimize this
selection bias, we first estimate each student’s propensity
score. The propensity score for a given student is the given
student’s probability of enrolling in PAL and is determined by observed
pre-treatment covariates. Though not always the case, the propensity
scores we obtain are determined by fitting a logistic
regression (i.e., a generalized linear model) where the outcome
is PAL participation (pal.flg) and the predictors are
student background variables which are believed to influence both PAL
enrollment and later course performance. This step does not predict
grades directly; Instead, it summarizes a multivariate covariate profile
of every student into a single scalar score that can be used to form
more comparable treated (PAL) and control (Non-PAL) groups. The code
glm(..., family = binomial) fits the logistic model and
returns coefficient estimates on the log-odds scale; the positive
coefficients correspond to higher odds of PAL enrollment, negative
coefficients correspond to lower odds, holding other covariates
fixed.
We begin with the Regression Model below. This is motivated first by our need to create a propensity model of the data and second by our desire to have it accurately predict PAL enrollment. Broken into its three core components, our code here begins with the ‘min.model’ function which ensures we keep several key variables in the creation of the propensity score. Next, the Upper Model Formula adds all other background variables back in except grade since we are creating this model with the end goal of predicting grade; including it in the propensity model would introduce post-treatment information and undermine causal interpretation. We then use forward stepwise selection, meaning the code tries adding extra variables one by one to the simple model. It uses AIC, a measure that balances model accuracy against unnecessary complexity, to decide whether each added variable should be kept or rejected. Here, the simple starting model performs best and hence stays as the final model. The fitted model then gives each student an estimated probability of joining PAL, which becomes that student’s propensity score.
# Minimal model (starting point)
min.model <- glm(pal.flg ~ cum.percent.units.passed + gender + eth.ipeds,
data = math12.dat, family = binomial)
# Upper model formula (including all but grade.num)
upper.formula <- as.formula("pal.flg ~ . - grade.num")
# Stepwise selection (forward)
math12.step.first.order <- stepAIC(min.model,
direction = "forward",
scope = list(lower = formula(min.model),
upper = upper.formula),
trace = FALSE, k = 2)
model.first.order <- formula(math12.step.first.order)
math12.first.order.prop.model <- glm(model.first.order, data=math12.dat, family=binomial)
summary(math12.first.order.prop.model)
##
## Call:
## glm(formula = model.first.order, family = binomial, data = math12.dat)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.86151 0.28125 -13.730 < 2e-16 ***
## cum.percent.units.passed 1.48065 0.25817 5.735 9.74e-09 ***
## genderMale -0.46337 0.11655 -3.976 7.02e-05 ***
## eth.ipedsBlack 0.37472 0.26509 1.414 0.157492
## eth.ipedsHispanic 0.58746 0.16292 3.606 0.000311 ***
## eth.ipedsOTHER 0.75827 0.26034 2.913 0.003584 **
## eth.ipedsTwo or More Races 0.25110 0.29742 0.844 0.398530
## eth.ipedsWhite -0.01912 0.22950 -0.083 0.933586
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2304.4 on 4242 degrees of freedom
## Residual deviance: 2226.1 on 4235 degrees of freedom
## AIC: 2242.1
##
## Number of Fisher Scoring iterations: 6
math12.final <- math12.dat
math12.final$propensity <- math12.first.order.prop.model$fitted.values
Looking at the fitted model, cum.percent.units.passed has a strong positive relationship with PAL enrollment (estimate = 1.48065, p ≈ 9.74e-09, where a smaller p-value means stronger statistical evidence that the variable is related to PAL enrollment). That is, higher cumulative pass rates corresponded to higher odds of PAL enrollment, holding the other covariates constant. The coefficient for genderMale is negative (−0.46337, p ≈ 7.02e-05), which indicates that male students had lower odds (odds = p/(1−p), where p is probability) of enrolling in PAL than the female reference group, after accounting for the other variables in the model. Ethnicity is interpreted relative to the reference category (the comparison group), which here is Asian; for example, the positive coefficient for eth.ipedsHispanic (0.58746, p ≈ 0.000311) tells us that Hispanic students had higher odds of PAL enrollment than Asian students in this model. We add here that a few ethnicity coefficients are not statistically significant, meaning that the estimated differences for these as compared to the reference category might be because of random variation. We continue by noting, however, that this does not matter here as we are simply looking at whether matching based on the scores leads to similar background characteristics across groups.
Next, we match PAL students to non-PAL students with similar propensity scores so the two groups are comparable on observed characteristics, like a “pseudo-randomized” comparison. This step is an important part of our study because it turns the propensity scores into an actual comparison set for estimating the treatment effect (ATT).
To make use of the propensity scores generated above, we implement
nearest-neighbor propensity score matching to create a
comparison group of Non-PAL students who are similar to PAL students
with respect to the observed covariates. This step targets the
average treatment effect on the treated (ATT): the mean
difference in grades between PAL participants and what would be expected
for those same participants had they not participated, under the
assumption of ignorability. To do this, the variable
pal.flg is first recoded to a logical treatment indicator
(TRUE = PAL, FALSE = Non-PAL) because Match() expects a
binary treatment vector. Following this, we call
Match(Y = grade.num, Tr = pal.flg, X = propensity, estimand = "ATT", M = 1, caliper = 0.25),
which (while enforcing a caliper to prevent poor-quality matches)
matches each treated student to one control student based on closeness
in propensity score. The caliper acts as a guardrail: it limits matches
to sufficiently similar propensity scores so that the matched control
group remains comparable to the treated group.
# propensity score matching
math12.final$pal.flg <- ifelse(math12.final$pal.flg == "PAL", TRUE, FALSE)
match.math12 <- Match(Y=math12.final$grade.num, Tr=math12.final$pal.flg,
X=math12.final$propensity, BiasAdjust = F,
estimand = "ATT", M=1, caliper=0.25)
We now check balance to determine whether matching actually reduced the differences in covariates between PAL and non-PAL groups. If this balance is still poor, then the comparison is still unfair and we should revise the propensity model or matching settings before trusting the effect estimate.
We evaluate the success of matching using covariate balance
diagnostics. The function bal.tab(...) reports
standardized differences in covariates before and after matching (e.g.,
Diff.Un versus Diff.Adj), and the
corresponding love.plot(...) visualizes the same quantities
against a prespecified threshold (commonly |SMD| < 0.10). In these
results, imbalance is large prior to matching for key predictors (e.g.,
cum.percent.units.passed has Diff.Un ≈ 0.5057) and is
substantially reduced after matching (Diff.Adj ≈ −0.0531). Similar
reductions toward zero occur across gender and ethnicity indicators.
Because post-match standardized differences are generally small (well
within typical balance thresholds), the matched sample appears to
represent a substantially fairer comparison between PAL and Non-PAL
students than the unadjusted sample.
# balance check
math12.bal <- bal.tab(match.math12, math12.step.first.order$formula,
data=math12.final, un=TRUE, quick=FALSE)
math12.bal
## Balance Measures
## Type Diff.Un Diff.Adj
## cum.percent.units.passed Contin. 0.5057 -0.0531
## gender_Male Binary -0.1246 -0.0415
## eth.ipeds_Asian Binary -0.0716 0.0023
## eth.ipeds_Black Binary -0.0042 -0.0102
## eth.ipeds_Hispanic Binary 0.0913 0.0051
## eth.ipeds_OTHER Binary 0.0241 -0.0013
## eth.ipeds_Two or More Races Binary -0.0047 -0.0061
## eth.ipeds_White Binary -0.0349 0.0103
##
## Sample sizes
## FALSE TRUE
## All 3916. 327
## Matched (ESS) 1490.36 327
## Matched (Unweighted) 2783. 327
## Unmatched 1133. 0
love.plot(math12.bal, binary = "raw", stars = "std", var.order = "unadjusted",
thresholds = c(m = .1), abs = F)
It should be noted that the adjusted differences are under the 0.01 threshold. Hence, matching successfully made the PAL and non-PAL groups more similar on the observed covariates.
The following code reports the estimated PAL effect on grades from the matched sample, along with a two-sided asymptotic Wald test (\(H_0:\) PAL \(\ne\) Non-PAL). In other words, it summarizes the matched comparison as a treatment-effect estimate together with measures of uncertainty, including the standard error, test statistic, and p-value.
est.effect <- summary(match.math12)
##
## Estimate... 0.095335
## AI SE...... 0.057671
## T-stat..... 1.6531
## p.val...... 0.098318
##
## Original number of observations.............. 4243
## Original number of treated obs............... 327
## Matched number of observations............... 327
## Matched number of observations (unweighted). 72325
##
## Caliper (SDs)........................................ 0.25
## Number of obs dropped by 'exact' or 'caliper' 0
The same test but for a one-sided hypothesis (\(H_0:\) PAL \(>\) Non-PAL) is run in the following block of code.
indx.trt <- match.math12$index.treated
indx.cntr <- match.math12$index.control
wts <- match.math12$weights
mean.trt <- sum(math12.final[indx.trt, 'grade.num']*wts)/sum(wts)
mean.cntrl <- sum(math12.final[indx.cntr, 'grade.num']*wts)/sum(wts)
diff.means <- match.math12$est
std.error <- match.math12$se
num.uniq.contrl.matched <- length(unique(match.math12$index.control))
num.trt <- match.math12$wnobs
p.val <- (1-pnorm(abs(diff.means/std.error))) #right-tail
results <- data.frame(course=c("Math 12"), mean.gpa.PAL=round(mean.trt,2),
mean.gpa.nonPAL=round(mean.cntrl,2),
difference.mean.gpa=round(diff.means,2),
standard.error=round(std.error,2),
p.val=round(p.val,4),
number.control=num.uniq.contrl.matched,
number.treated=num.trt)
kable(results, caption = "Propensity-scored Adjusted Comparison of PAL and non-PAL Grades")
| course | mean.gpa.PAL | mean.gpa.nonPAL | difference.mean.gpa | standard.error | p.val | number.control | number.treated |
|---|---|---|---|---|---|---|---|
| Math 12 | 2.62 | 2.53 | 0.1 | 0.06 | 0.0492 | 2783 | 327 |
Overall, after cleaning the data, estimating propensity scores, matching PAL students to comparable non-PAL students, and confirming good post-match balance, the adjusted analysis suggests a small positive effect of PAL on Math 12 grades. In the matched comparison, PAL students had an average grade of about 2.62 versus 2.53 for matched non-PAL students, corresponding to an estimated ATT of about 0.10 grade points. The two-sided test did not reach the conventional 0.05 significance level (estimate ≈ 0.095, SE ≈ 0.058, p ≈ 0.098), although the pre-specified one-sided test gave borderline evidence in favor of PAL (p ≈ 0.0492). All 327 treated students were matched within the caliper, no treated observations were dropped, and the post-match adjusted covariate differences were small, indicating that matching substantially improved comparability between the groups. Taken together, these results are consistent with a modest possible benefit of PAL in Math 12, but the evidence should still be interpreted cautiously because this remains an observational study rather than a randomized experiment.