Packages

We start this analysis by loading packages in r so that we have the needed tools to aid in our data wrangling (i.e., cleaning and organizing the data), matching (i.e., ensuring PAL students with similar and non-PAL students with similar predictors are grouped), and balance checking (i.e., ensuring these groupings are fair and accurate). We do this manually because the functions required for this project are not provided by R in a clean, standard way. By installing them here, we are preventing “function not found” errors later in the report.

library(dplyr)
library(MatchIt)
library(Matching)
library(DataExplorer)
library(cobalt)
library(backports)
library(rbounds)
library(tidyr)
library(knitr)
library(recipes)
library(readr)
knitr::opts_chunk$set(
  fig.path = "report_figures/",
  dev = "png",
  dpi = 300
)

Create Functions

We create helper functions below to check for variables that cause complete separation in logistic regression. It should be noted that, when being compared to numerical predictors, categorical predictors are especially likely to create this problem. The reason for this is due to their continuous nature and how they are often used in statistical analysis.

check_separation <- function(df, outcome, predictor) {
  tbl <- table(df[[predictor]], df[[outcome]])
  any(tbl[,1] == 0 | tbl[,2] == 0)
}

find_separators <- function(df, outcome) {
  # Select only factor columns
  factor_vars <- names(df)[sapply(df, is.factor)]
  # Exclude outcome if it's a factor
  factor_vars <- setdiff(factor_vars, outcome)
   
  separators <- sapply(factor_vars, function(pred) {
    check_separation(df, outcome, pred)
  })
  
  names(separators[separators])
}

Complete separation screening functions (used before propensity score modeling)

Before fitting the logistic regression model that generates propensity scores for PAL participation (pal.flg), we check the the Math 12 dataset in the code above to make sure there is not complete separation. This is when a categorical predictor perfectly predicts PAL versus non-PAL and skews logistic regression models. Removing variables that cause complete separation is important because it can lead to unreliable propensity scores and can compromise the matching step by destabilizing logistic regression (e.g., very large or effectively infinite coefficient estimates, huge standard errors, and/or non-convergence). To prevent this, we define two helper functions that automatically identify problematic predictors so they can be removed or their categories collapsed before fitting the propensity score model.

The function we created (i.e.,check_separation(df, outcome, predictor)) performs a one-predictor check by creating a contingency table of predictor by the binary outcome (pal.flg) and returns TRUE if any predictor level has zero observations in either the PAL or non-PAL group–translating as a possible risk of separation. The function find_separators(df, outcome) then scales this up to the entire dataset by finding all factor variables in math12.dat (excluding pal.flg itself), applying check_separation() to each one, and returning the names of predictors which trigger separation. We then use that returned list of predictors–for instance, sepvar12 <- find_separators(math12.dat, "pal.flg")–to drop or to otherwise handle those predictors before fitting the propensity score logistic regression, improving model stability and ensuring the propensity scores are well-defined for matching.

Import PAL Data

We are now importing the PAL dataset, and it is also here where we convert variables into the right data types (e.g., creating factors for categorical variables) so models and matching routines interpret them correctly.

setwd("~/Downloads/Independent Study/CTL Files")
dat <- read_rds("pal_data_25.rds")
dat[sapply(dat, is.character)] <- lapply(dat[sapply(dat, is.character)], 
                                         as.factor)

Subset Data

Since we specifically wish to examine the effect of PAL on the single course MATH 12, the block of code below extracts all information on Math 12 from the dataset. We end by removing the students who did not complete PAL (e.g., PAL “NC/W”) as leaving it in could skew the resulting model by altering the “treatment group.”

math.classes <- paste("MATH", c(12))
math.ind <- which(dat$course == "MATH 12")
math12.dat <- dat[math.ind, ]
math12.dat$course <- droplevels(math12.dat$course)

math12.dat$term <- factor(math12.dat$term, ordered = TRUE, 
                       levels = c("Fall 2021", "Spring 2022 ", "Summer 2022",
                                  "Fall 2022", "Spring 2023", 
                                  "Fall 2023", "Spring 2024",
                                  "Fall 2024"))
# remove the students who did not complete PAL
pal.dnf <- which(math12.dat$pal.grade == "NC" | math12.dat$pal.grade == "W")
math12.dat <- math12.dat[-pal.dnf,]

In the code above, the variable math.ind <- which(dat$course == "MATH 12") finds the row numbers where the course is MATH 12 using the which() function, and math12.dat <- dat[math.ind, ] subsets the data to only those rows. Then the function droplevels() removes unused course levels so the subset does not keep categories from other courses. Finally, math12.dat$term <- factor(math12.dat$term, ordered = TRUE, levels = c("Fall 2021", ..., "Fall 2024")) makes term an ordered timeline rather than plain text, so terms sort chronologically in tables/plots and any modeling steps that use term.

Raw Means

Here, I compute the simple mean grade for PAL vs non-PAL as a baseline “before any correction,” which shows the naive difference. Note that this is a biased comparison of the control and treatment groups but it is done to show us what the raw gap looks like and makes the need for propensity scores and matching self-evident.

raw.table = data.frame(class=character(),
                       nonPALavg=numeric(),
                       PALavg=numeric(), 
                       Diff=numeric(), 
                       NonPAL_Num= integer(),
                       PAL_Num=integer(),
                       row.names=NULL,
                       stringsAsFactors = FALSE)

for (i in 1:length(math.classes)){
  curr.class = math.classes[i]
  
  pal.avg <- round(mean(math12.dat$grade.num[math12.dat$course==curr.class & 
                                            math12.dat$pal.flg == "PAL"],
                        na.rm=TRUE), 2)
  nopal.avg <- round(mean(math12.dat$grade.num[math12.dat$course==curr.class & 
                                              math12.dat$pal.flg == "Non-PAL"],
                          na.rm=TRUE), 2)
  
  raw.table[i, 'class' ] = curr.class
  raw.table[i, c(2:4)]=c(nopal.avg, pal.avg, pal.avg-nopal.avg)
  raw.table[i, c(5,6)]= c(sum(math12.dat$course==curr.class &
                                math12.dat$pal.flg == "Non-PAL"),
                          sum(math12.dat$course==curr.class &
                                math12.dat$pal.flg == "PAL"))
  
}
# formatted table
kable(raw.table, caption = "Raw Comparison of PAL and non-PAL Grades 
      (No Propensity Adjustment)")
Raw Comparison of PAL and non-PAL Grades (No Propensity Adjustment)
class nonPALavg PALavg Diff NonPAL_Num PAL_Num
MATH 12 2.23 2.64 0.41 4222 340

The unadjusted means are promising - PAL students have higher mean grades in each class than non-PAL students - but propensity score adjustment is necessary to draw causal conclusions about the relationship between PAL participation and course grade.

Raw Means: Code Breakdown and Statistical Interpretation

The line raw.table = data.frame(...) creates an empty results table; The columns are the sample mean of non-PAL students (i.e., nonPALavg), the average grade of the PAL students in the sample (i.e., PALavg), the difference between these groupings of mean grades (i.e., ‘Diff’), and the size of the sample of our population (i.e., “Num”). The loop for (i in 1:length(math.classes)) is included as well to make it possible for this same code to be re-used; length() returns how many course labels there are in math.classes, and 1:length(...) produces the index values the loop will run through. Inside the loop, curr.class = math.classes[i] uses indexing ([i]) to pick the current course label, which then controls all filtering. The line pal.avg <- round(mean(math12.dat$grade.num[...], na.rm=TRUE), 2) has several pieces: the square brackets [...] do row filtering (logical conditions pick only the rows for the current course AND PAL students), mean() computes the sample average of grade.num (i.e., an estimate of the typical grade in that group), na.rm=TRUE tells R to ignore missing grades rather than letting missing values force the mean to become NA, and round(...,2) formats the result to two decimal places for a clean report table. The nopal.avg <- ... line repeats the same steps but for the Non-PAL group. Next, raw.table[i, 'class'] = curr.class and raw.table[i, c(2:4)] = c(nopal.avg, pal.avg, pal.avg-nopal.avg) fill the results table: c() combines values into a vector, and c(2:4) selects the relevant columns (Non-PAL mean, PAL mean, and their difference). The counts raw.table[i, c(5,6)] = c(sum(...Non-PAL...), sum(...PAL...)) use sum() on logical statements; in code terms, TRUE counts as 1 and FALSE counts as 0, so summing a logical condition gives the number of rows satisfying the condition; statistically, these counts are crucial because they tell you how many observations each mean is based on (bigger N → more stable average). Finally, kable(raw.table, ...) formats the raw.table into a readable table for the knitted report.

Raw Means (interpretation)

Comparing the mean grades for the PAL and non-PAL students, we find that PAL participation can be linked to a higher course grade. Additionally, the sample sizes of 4222 Non-PAL and 340 PAL make the averages in the above table significant as the data collected is from a considerably sized group of students. We add the cautious reminder however that these scores have not in any way been adjusted and should thus be treated only as motivation to dig deeper.

The process of propensity score modeling, matching, and balance checks following is needed to minimize the differences between the covariates so that the end gap in the grades is an accurate indicator of causal relation.

Data Cleaning and Feature Engineering

We have written code below to make the given dataset better for modeling. We first reduce missingness problems by removing variables with more than 10% of their data missing. This helps the accuracy of the propensity model by making it easier to group PAL and non-PAL students with similar background variables. Additionally, we have created the variable “cum.percent.units.passed” which measures the proportion of attempted units that a student passed and gives a more standardized picture of academic progress than raw unit counts alone. The functions plot_missing() and profile_missing() are used to identify which variables have missing values and how much missingness these variables contain. This is important because variables with too much missing data can shrink the usable sample, make the model less stable, and reduce the quality of the matching step.

# create feature, proportion of cumulative units taken that were passes
math12.dat$cum.percent.units.passed =  math12.dat$unt.passd.prgrss/
  math12.dat$unt.taken.prgrss

plot_missing(math12.dat, missing_only = TRUE)

zero_missing_vars <- math12.dat %>% 
  summarise(across( ,list(function(x) sum(is.na(x))))) %>%
  gather() %>%
  arrange(value)

zero_missing_vars <- zero_missing_vars %>%
  filter(zero_missing_vars$value != 0)

zero_missing_vars$var.name = substr(zero_missing_vars$key,1,
                                    nchar(zero_missing_vars$key)-2)

zero_missing_vars <- math12.dat %>%
  dplyr::select(zero_missing_vars$var.name)

missing_vars <- profile_missing(zero_missing_vars)

vars.missing.data = as.character(missing_vars[missing_vars$pct_missing>0.05, "feature"])
math12.dat = math12.dat[,!(names(math12.dat) %in% vars.missing.data)]

In the next step step of this data cleaning and feature engineering step, we discard variables with only one value, since such variables provide no real information for distinguishing PAL from non-PAL students and would only reduce the efficiency of the model. We also collapse sparse categories in several categorical predictors so that small groups do not create unstable estimates or make comparisons unreliable. We note that these steps are significant because logistic regression works better when predictors contain enough variation and when categories are not too rare. We then recode the instructor variable with an anonymous numeric index, which protects confidentiality. Lastly, we identify and remove variables that still cause complete separation in logistic regression. Together, these cleaning steps make the dataset more stable, more interpretable, and better suited for the later matching analysis.

math12.dat$yr.course.taken = 
  as.numeric(gsub(".*([0-9]{4})","\\1",math12.dat$coh.term))

# remove single-valued variables
single.category.vars = sapply(math12.dat, function(x) length(unique(x))==1)
sum(single.category.vars)
## [1] 7
names(math12.dat)[single.category.vars]
## [1] "course"                "course.descr"          "unt.taken"            
## [4] "acad.group"            "dept.abbr"             "units.taken.remedial" 
## [7] "units.passed.remedial"
keep.vars = names(which(single.category.vars==FALSE))
math12.dat = math12.dat[, keep.vars]
dim(math12.dat)
## [1] 4562   59
# collapse sparse categories
math12.dat=group_category(data = math12.dat, feature = "acad.plan1", 
                         threshold = 0.1, update = TRUE)
math12.dat$acad.plan1 <- as.factor(math12.dat$acad.plan1)
table(math12.dat$acad.plan1, PAL=math12.dat$pal.flg)
##             PAL
##              Non-PAL PAL
##   BIOLBMEDBS      29   3
##   BIOLNONEBA      54   4
##   BIOLNONEBS     358  17
##   BIOLPBIONU     768 105
##   CHEMBCHMBS      90   5
##   CHEMBIOCBA      40   4
##   CHEMFORCBA      71  14
##   CHEMNONEBA      26   1
##   CHEMNONEBS      46   5
##   CIVENONEBS     316  18
##   CMPENONEBS     284  11
##   CONMPRCMNU     230  21
##   CRIMPCRJNU      24   3
##   CSCIPCSCNU     667  35
##   EENGNONEBS     122   6
##   FOODNUTSBS      23   4
##   HLSCHLSCBS      69   9
##   KINSPEXSNU      40   2
##   MATHSUBPBA      39   4
##   MECHNONEBS     369  27
##   OTHER          432  30
##   UNDCNONENU     125  12
math12.dat=group_category(data = math12.dat, feature = "coh", 
                       threshold = 0.05, update = TRUE)
math12.dat$coh <- as.factor(math12.dat$coh)
table(math12.dat$coh, PAL=math12.dat$pal.flg)
##             PAL
##              Non-PAL  PAL
##   First-time    3871  298
##   OTHER          351   42
math12.dat=group_category(data = math12.dat, feature = "learning.mode", 
                       threshold = 0.05, update = TRUE)
math12.dat$learning.mode <- as.factor(math12.dat$learning.mode)
table(math12.dat$learning.mode, PAL=math12.dat$pal.flg)
##                                 PAL
##                                  Non-PAL  PAL
##   Asynchronous no meetings AB386     444    7
##   Face-to-face                      3378  322
##   OTHER                              400   11
math12.dat=group_category(data = math12.dat, feature = "eth.ipeds", 
                       threshold = 0.05, update = TRUE)
math12.dat$eth.ipeds <- as.factor(math12.dat$eth.ipeds)
table(math12.dat$eth.ipeds, PAL=math12.dat$pal.flg)
##                    PAL
##                     Non-PAL  PAL
##   Asian                 999   55
##   Black                 308   25
##   Hispanic             1889  181
##   OTHER                 223   25
##   Two or More Races     231   18
##   White                 572   36
math12.dat <- math12.dat[math12.dat$gender != "Nonbinary",]
math12.dat$gender <- droplevels(math12.dat$gender)

# code instructor as alphabetic letter for anonymity
math12.dat$instructor1=droplevels(factor(math12.dat$instructor1))

instructor.vec = sort(unique(math12.dat$instructor1))
num.instr = length(instructor.vec)

math12.dat$instructor1 = factor(math12.dat$instructor1, levels = instructor.vec, 
                             labels=as.character(1:num.instr))

key.instr.code = cbind(as.character(instructor.vec),
                       1:num.instr)

# remove variables causing complete separation in log reg
sepvar12 <- find_separators(math12.dat, "pal.flg")
math12.dat <- math12.dat[,-which(names(math12.dat) %in% sepvar12)]
# Filter to relevant variables
vars.to.keep = c("grade.num","instructor1","leanring.mode","pal.flg","coh",
                 "base.time","gender","eth.ipeds","mother.ed","father.ed",
                 "pell.term.flg","term.age","last.school.attended.local.flg",
                 "school.zip.median.income","qr.cat","acad.plan1","term.units",
                 "cum.percent.units.passed","cum.gpa.start")
new.vars = intersect(vars.to.keep, names(math12.dat))
math12.dat = math12.dat[ ,new.vars]

# subset on complete cases
math12.dat <- math12.dat[complete.cases(math12.dat), ]

# recheck for complete separation
find_separators(math12.dat, "pal.flg")
## character(0)

Propensity Score Models

In the multiple code blocks below we will produce a propensity score model using logistic regression and proceed to then pull from this the propensity score for each observation.

There is a risk of selection bias as students might have had different background factors (e.g., work ethic, financial support, etcetera) which influenced their decision to take PAL. To minimize this selection bias, we first estimate each student’s propensity score. The propensity score for a given student is the given student’s probability of enrolling in PAL and is determined by observed pre-treatment covariates. Though not always the case, the propensity scores we obtain are determined by fitting a logistic regression (i.e., a generalized linear model) where the outcome is PAL participation (pal.flg) and the predictors are student background variables which are believed to influence both PAL enrollment and later course performance. This step does not predict grades directly; Instead, it summarizes a multivariate covariate profile of every student into a single scalar score that can be used to form more comparable treated (PAL) and control (Non-PAL) groups. The code glm(..., family = binomial) fits the logistic model and returns coefficient estimates on the log-odds scale; the positive coefficients correspond to higher odds of PAL enrollment, negative coefficients correspond to lower odds, holding other covariates fixed.

We begin with the Regression Model below. This is motivated first by our need to create a propensity model of the data and second by our desire to have it accurately predict PAL enrollment. Broken into its three core components, our code here begins with the ‘min.model’ function which ensures we keep several key variables in the creation of the propensity score. Next, the Upper Model Formula adds all other background variables back in except grade since we are creating this model with the end goal of predicting grade; including it in the propensity model would introduce post-treatment information and undermine causal interpretation. We then use forward stepwise selection, meaning the code tries adding extra variables one by one to the simple model. It uses AIC, a measure that balances model accuracy against unnecessary complexity, to decide whether each added variable should be kept or rejected. Here, the simple starting model performs best and hence stays as the final model. The fitted model then gives each student an estimated probability of joining PAL, which becomes that student’s propensity score.

# Minimal model (starting point) 
min.model <- glm(pal.flg ~ cum.percent.units.passed + gender + eth.ipeds, 
                 data = math12.dat, family = binomial)

# Upper model formula (including all but grade.num)
upper.formula <- as.formula("pal.flg ~ . - grade.num")

# Stepwise selection (forward)
math12.step.first.order <- stepAIC(min.model,
                                  direction = "forward",
                                  scope = list(lower = formula(min.model), 
                                               upper = upper.formula),
                                  trace = FALSE, k = 2)
model.first.order <- formula(math12.step.first.order)

math12.first.order.prop.model <- glm(model.first.order, data=math12.dat, family=binomial)
summary(math12.first.order.prop.model)
## 
## Call:
## glm(formula = model.first.order, family = binomial, data = math12.dat)
## 
## Coefficients:
##                            Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                -3.86151    0.28125 -13.730  < 2e-16 ***
## cum.percent.units.passed    1.48065    0.25817   5.735 9.74e-09 ***
## genderMale                 -0.46337    0.11655  -3.976 7.02e-05 ***
## eth.ipedsBlack              0.37472    0.26509   1.414 0.157492    
## eth.ipedsHispanic           0.58746    0.16292   3.606 0.000311 ***
## eth.ipedsOTHER              0.75827    0.26034   2.913 0.003584 ** 
## eth.ipedsTwo or More Races  0.25110    0.29742   0.844 0.398530    
## eth.ipedsWhite             -0.01912    0.22950  -0.083 0.933586    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 2304.4  on 4242  degrees of freedom
## Residual deviance: 2226.1  on 4235  degrees of freedom
## AIC: 2242.1
## 
## Number of Fisher Scoring iterations: 6
math12.final <- math12.dat
math12.final$propensity <- math12.first.order.prop.model$fitted.values

Looking at the fitted model, cum.percent.units.passed has a strong positive relationship with PAL enrollment (estimate = 1.48065, p ≈ 9.74e-09, where a smaller p-value means stronger statistical evidence that the variable is related to PAL enrollment). That is, higher cumulative pass rates corresponded to higher odds of PAL enrollment, holding the other covariates constant. The coefficient for genderMale is negative (−0.46337, p ≈ 7.02e-05), which indicates that male students had lower odds (odds = p/(1−p), where p is probability) of enrolling in PAL than the female reference group, after accounting for the other variables in the model. Ethnicity is interpreted relative to the reference category (the comparison group), which here is Asian; for example, the positive coefficient for eth.ipedsHispanic (0.58746, p ≈ 0.000311) tells us that Hispanic students had higher odds of PAL enrollment than Asian students in this model. We add here that a few ethnicity coefficients are not statistically significant, meaning that the estimated differences for these as compared to the reference category might be because of random variation. We continue by noting, however, that this does not matter here as we are simply looking at whether matching based on the scores leads to similar background characteristics across groups.

Propensity Score Matching

Next, we match PAL students to non-PAL students with similar propensity scores so the two groups are comparable on observed characteristics, like a “pseudo-randomized” comparison. This step is an important part of our study because it turns the propensity scores into an actual comparison set for estimating the treatment effect (ATT).

Propensity Score Matching and Balance Checking

To make use of the propensity scores generated above, we implement nearest-neighbor propensity score matching to create a comparison group of Non-PAL students who are similar to PAL students with respect to the observed covariates. This step targets the average treatment effect on the treated (ATT): the mean difference in grades between PAL participants and what would be expected for those same participants had they not participated, under the assumption of ignorability. To do this, the variable pal.flg is first recoded to a logical treatment indicator (TRUE = PAL, FALSE = Non-PAL) because Match() expects a binary treatment vector. Following this, we call Match(Y = grade.num, Tr = pal.flg, X = propensity, estimand = "ATT", M = 1, caliper = 0.25), which (while enforcing a caliper to prevent poor-quality matches) matches each treated student to one control student based on closeness in propensity score. The caliper acts as a guardrail: it limits matches to sufficiently similar propensity scores so that the matched control group remains comparable to the treated group.

# propensity score matching
math12.final$pal.flg <- ifelse(math12.final$pal.flg == "PAL", TRUE, FALSE)
match.math12 <- Match(Y=math12.final$grade.num, Tr=math12.final$pal.flg, 
                    X=math12.final$propensity, BiasAdjust = F, 
                    estimand = "ATT", M=1, caliper=0.25)

We now check balance to determine whether matching actually reduced the differences in covariates between PAL and non-PAL groups. If this balance is still poor, then the comparison is still unfair and we should revise the propensity model or matching settings before trusting the effect estimate.

We evaluate the success of matching using covariate balance diagnostics. The function bal.tab(...) reports standardized differences in covariates before and after matching (e.g., Diff.Un versus Diff.Adj), and the corresponding love.plot(...) visualizes the same quantities against a prespecified threshold (commonly |SMD| < 0.10). In these results, imbalance is large prior to matching for key predictors (e.g., cum.percent.units.passed has Diff.Un ≈ 0.5057) and is substantially reduced after matching (Diff.Adj ≈ −0.0531). Similar reductions toward zero occur across gender and ethnicity indicators. Because post-match standardized differences are generally small (well within typical balance thresholds), the matched sample appears to represent a substantially fairer comparison between PAL and Non-PAL students than the unadjusted sample.

# balance check
math12.bal <- bal.tab(match.math12, math12.step.first.order$formula, 
                    data=math12.final, un=TRUE, quick=FALSE)

math12.bal
## Balance Measures
##                                Type Diff.Un Diff.Adj
## cum.percent.units.passed    Contin.  0.5057  -0.0531
## gender_Male                  Binary -0.1246  -0.0415
## eth.ipeds_Asian              Binary -0.0716   0.0023
## eth.ipeds_Black              Binary -0.0042  -0.0102
## eth.ipeds_Hispanic           Binary  0.0913   0.0051
## eth.ipeds_OTHER              Binary  0.0241  -0.0013
## eth.ipeds_Two or More Races  Binary -0.0047  -0.0061
## eth.ipeds_White              Binary -0.0349   0.0103
## 
## Sample sizes
##                        FALSE TRUE
## All                  3916.    327
## Matched (ESS)        1490.36  327
## Matched (Unweighted) 2783.    327
## Unmatched            1133.      0
love.plot(math12.bal, binary = "raw", stars = "std", var.order = "unadjusted", 
          thresholds = c(m = .1), abs = F)

It should be noted that the adjusted differences are under the 0.01 threshold. Hence, matching successfully made the PAL and non-PAL groups more similar on the observed covariates.

Estimate Differences

The following code reports the estimated PAL effect on grades from the matched sample, along with a two-sided asymptotic Wald test (\(H_0:\) PAL \(\ne\) Non-PAL). In other words, it summarizes the matched comparison as a treatment-effect estimate together with measures of uncertainty, including the standard error, test statistic, and p-value.

est.effect <- summary(match.math12)
## 
## Estimate...  0.095335 
## AI SE......  0.057671 
## T-stat.....  1.6531 
## p.val......  0.098318 
## 
## Original number of observations..............  4243 
## Original number of treated obs...............  327 
## Matched number of observations...............  327 
## Matched number of observations  (unweighted).  72325 
## 
## Caliper (SDs)........................................   0.25 
## Number of obs dropped by 'exact' or 'caliper'  0

The same test but for a one-sided hypothesis (\(H_0:\) PAL \(>\) Non-PAL) is run in the following block of code.

indx.trt <- match.math12$index.treated
indx.cntr <- match.math12$index.control
wts <- match.math12$weights

mean.trt <- sum(math12.final[indx.trt, 'grade.num']*wts)/sum(wts)
mean.cntrl <- sum(math12.final[indx.cntr, 'grade.num']*wts)/sum(wts)

diff.means <- match.math12$est

std.error <- match.math12$se
num.uniq.contrl.matched <- length(unique(match.math12$index.control))

num.trt <- match.math12$wnobs

p.val <- (1-pnorm(abs(diff.means/std.error))) #right-tail

results <- data.frame(course=c("Math 12"), mean.gpa.PAL=round(mean.trt,2), 
                     mean.gpa.nonPAL=round(mean.cntrl,2), 
                     difference.mean.gpa=round(diff.means,2), 
                     standard.error=round(std.error,2),
                     p.val=round(p.val,4),
                     number.control=num.uniq.contrl.matched, 
                     number.treated=num.trt)

kable(results, caption = "Propensity-scored Adjusted Comparison of PAL and non-PAL Grades")
Propensity-scored Adjusted Comparison of PAL and non-PAL Grades
course mean.gpa.PAL mean.gpa.nonPAL difference.mean.gpa standard.error p.val number.control number.treated
Math 12 2.62 2.53 0.1 0.06 0.0492 2783 327

Overall, after cleaning the data, estimating propensity scores, matching PAL students to comparable non-PAL students, and confirming good post-match balance, the adjusted analysis suggests a small positive effect of PAL on Math 12 grades. In the matched comparison, PAL students had an average grade of about 2.62 versus 2.53 for matched non-PAL students, corresponding to an estimated ATT of about 0.10 grade points. The two-sided test did not reach the conventional 0.05 significance level (estimate ≈ 0.095, SE ≈ 0.058, p ≈ 0.098), although the pre-specified one-sided test gave borderline evidence in favor of PAL (p ≈ 0.0492). All 327 treated students were matched within the caliper, no treated observations were dropped, and the post-match adjusted covariate differences were small, indicating that matching substantially improved comparability between the groups. Taken together, these results are consistent with a modest possible benefit of PAL in Math 12, but the evidence should still be interpreted cautiously because this remains an observational study rather than a randomized experiment.