a12_MCAR_MAR_CMCAR.Rmd
TODO: check stat sig of the test below
Will use this to demonstrate the test of MCAR.
Notice that you don’t even need the MMRM - you just need to control for the covariate governing missingness.
library(SPR)
library(MASS)
library(emmeans)
library(nlme)
set.seed(03232021)
out <- sim_dat(N = 2000 ,
number.groups = 2 ,
number.timepoints = 4 ,
cond.mcar = T ,
#reg.formula = formula( ~ Time + Group + Time*Group),
Beta = 2 ,
corr = 'ar1' ,
cor.value = 0.8 ,
var.values = 2 )
dat <- out$dat
dat <- dropout(dat = dat,
type_dropout = c('cmcar'),
prop.miss = 0.3)
Does your score at a previous timepoint predict your score at the next timepoint?
dat1 <- dat[dat$Time == 'Time_3', 'Y_cmcar']
dat2 <- dat[dat$Time == 'Time_4', 'Y_cmcar']
dropped.out <- !is.na(dat1) & is.na(dat2) # these subjects dropped out at timepoint 4
table(dropped.out)
# Does the score at timepoint 3 predict drop-out at timepoint 4?
mod <- glm(dropped.out ~ dat1, family = 'binomial')
summary(mod)
# # TODO 3.23.21 - why is this not stat sig? resolve
Return to the Covariate adjustment later
What is the correctly specified model? Check data generation:
out$reg.formula
# OLS Regression Model
mod.ols1 <- lm(Y_comp ~ Group + Time + Covariate + Time * Group + Covariate * Time, data = dat)
# Correctly specified OLS model with missing data:
mod.ols2 <- lm(Y_cmcar ~ Group + Time + Covariate + Time * Group + Covariate * Time, data = dat)
# Misspecified OLS Model:
mod.ols3 <- lm(Y_cmcar ~ Group + Time + Time * Group, data = dat)
# Compute marginal mean contrasts
# Don't worry about the degrees of freedom, we aren't focused on the Type I error or Power
# Just look at the parameter estimates to check for bias
library(emmeans)
mod.emms.ols1 <- emmeans(mod.ols1, pairwise ~ Group | Time, adjust = 'none', mode = 'df.error')
mod.emms.ols2 <- emmeans(mod.ols2, pairwise ~ Group | Time, adjust = 'none', mode = 'df.error')
mod.emms.ols3 <- emmeans(mod.ols3, pairwise ~ Group | Time, adjust = 'none', mode = 'df.error')
# Examine output:
out$Beta
# Data generation done specifically to demonstrate this point
# Look at drop-out:
aggregate(Y_cmcar ~ Time, FUN = function(x) mean(is.na(x)), dat = dat, na.action = na.pass)
aggregate(Y_cmcar ~ Group + Time, FUN = function(x) mean(is.na(x)), dat = dat, na.action = na.pass)
# Substntially different drop out rates across the treatment arms
# Descriptive Statistics
aggregate(Y_comp ~ Group + Time, FUN = function(x) round(mean(x, na.rm = T),2), dat = dat, na.action = na.pass)
aggregate(Y_cmcar ~ Group + Time, FUN = function(x) round(mean(x, na.rm = T),2), dat = dat, na.action = na.pass)
# You can't adjust for the Covariate with descriptive statistics!
# OLS
mod.emms.ols1$contrasts
mod.emms.ols2$contrasts
mod.emms.ols3$contrasts
# Misspecified model leads to biased estimates
# With conditional MCAR data, the key is to control for the covariate causing dropout
# Once you do that, you're back to MCAR data, and a simple OLS works fine
# Of course, we can confirm that MMRM also yields unbiased estimates (not shown here)
Return to this - adjust for Covariate, recheck Test for MCAR:
dat1 <- dat[dat$Time == 'Time_3', 'Y_cmcar']
dat2 <- dat[dat$Time == 'Time_4', 'Y_cmcar']
dropped.out <- !is.na(dat1) & is.na(dat2) # these subjects dropped out at timepoint 4
table(dropped.out)
# Does the score at the previous timepoint predict drop-out AFTER adjusting for Covariate?
cov <- dat$Covariate[dat$Time == 'Time_4']
mod.mcar <- glm(dropped.out ~ dat1 + cov, family = 'binomial')
summary(mod.mcar)