The standardized (mean) difference is a measure of distance between two group means in terms of one or more variables. In practice it is often used as a balance measure of individual covariates before and after propensity score matching. As it is standardized, comparison across variables on different scales is possible. For definitions see https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3144483/#s11title .
Standardized mean differences can be easily calculated with tableone. All standardized mean differences in this package are absolute values, thus, there is no directionality.
## tableone package itself
library(tableone)
## PS matching
library(Matching)
## Weighted analysis
library(survey)
## Reorganizing data
library(reshape2)
## plotting
library(ggplot2)
The right heart catheterization dataset is available at https://biostat.app.vumc.org/wiki/Main/DataSets . This dataset was originally used in Connors et al. JAMA 1996;276:889-897, and has been made publicly available.
## Right heart cath dataset
<- read.csv("https://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/rhc.csv") rhc
Error in file(file, "rt"): cannot open the connection to 'https://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/rhc.csv'
Out of the 50 covariates, 32 have standardized mean differences of greater than 0.1, which is often considered the sign of important covariate imbalance (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3144483/#s11title ).
## Covariates
<- c("age","sex","race","edu","income","ninsclas","cat1","das2d3pc","dnr1",
vars "ca","surv2md1","aps1","scoma1","wtkilo1","temp1","meanbp1","resp1",
"hrt1","pafi1","paco21","ph1","wblc1","hema1","sod1","pot1","crea1",
"bili1","alb1","resp","card","neuro","gastr","renal","meta","hema",
"seps","trauma","ortho","cardiohx","chfhx","dementhx","psychhx",
"chrpulhx","renalhx","liverhx","gibledhx","malighx","immunhx",
"transhx","amihx")
## Construct a table
<- CreateTableOne(vars = vars, strata = "swang1", data = rhc, test = FALSE) tabUnmatched
Error in is.data.frame(data): object 'rhc' not found
## Show table with SMD
print(tabUnmatched, smd = TRUE)
Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'print': object 'tabUnmatched' not found
## Count covariates with important imbalance
addmargins(table(ExtractSmd(tabUnmatched) > 0.1))
Error in class(x)[1] %in% c("TableOne", "svyTableOne"): object 'tabUnmatched' not found
Usually a logistic regression model is used to estimate individual propensity scores. The model here is taken from “How To Use Propensity Score Analysis” (https://www.mc.vanderbilt.edu/crc/workshop_files/2008-04-11.pdf ). Predicted probabilities of being assigned to right heart catherterization, being assigned no right heart catherterization, being assigned to the true assignment, as well as the smaller of the probabilities of being assigned to right heart catherterization or no right heart catherterization are calculated for later use in propensity score matching and weighting.
$swang1 <- factor(rhc$swang1, levels = c("No RHC", "RHC")) rhc
Error in factor(rhc$swang1, levels = c("No RHC", "RHC")): object 'rhc' not found
## Fit model
<- glm(formula = swang1 ~ age + sex + race + edu + income + ninsclas +
psModel + das2d3pc + dnr1 + ca + surv2md1 + aps1 + scoma1 +
cat1 + temp1 + meanbp1 + resp1 + hrt1 + pafi1 +
wtkilo1 + ph1 + wblc1 + hema1 + sod1 + pot1 + crea1 +
paco21 + alb1 + resp + card + neuro + gastr + renal +
bili1 + hema + seps + trauma + ortho + cardiohx + chfhx +
meta + psychhx + chrpulhx + renalhx + liverhx + gibledhx +
dementhx + immunhx + transhx + amihx,
malighx family = binomial(link = "logit"),
data = rhc)
Error in is.data.frame(data): object 'rhc' not found
## Predicted probability of being assigned to RHC
$pRhc <- predict(psModel, type = "response") rhc
Error in predict(psModel, type = "response"): object 'psModel' not found
## Predicted probability of being assigned to no RHC
$pNoRhc <- 1 - rhc$pRhc rhc
Error in eval(expr, envir, enclos): object 'rhc' not found
## Predicted probability of being assigned to the
## treatment actually assigned (either RHC or no RHC)
$pAssign <- NA rhc
Error in rhc$pAssign <- NA: object 'rhc' not found
$pAssign[rhc$swang1 == "RHC"] <- rhc$pRhc[rhc$swang1 == "RHC"] rhc
Error in eval(expr, envir, enclos): object 'rhc' not found
$pAssign[rhc$swang1 == "No RHC"] <- rhc$pNoRhc[rhc$swang1 == "No RHC"] rhc
Error in eval(expr, envir, enclos): object 'rhc' not found
## Smaller of pRhc vs pNoRhc for matching weight
$pMin <- pmin(rhc$pRhc, rhc$pNoRhc) rhc
Error in pmin(rhc$pRhc, rhc$pNoRhc): object 'rhc' not found
The Matching package can be used for propensity score matching. The logit of propensity score is often used as the matching scale, and the matchign caliper is often 0.2 \(\times\) SD(logit(PS)). See https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3144483/#s5title for suggestions. After matching, all the standardized mean differences are below 0.1.
<- Match(Tr = (rhc$swang1 == "RHC"), # Need to be in 0,1
listMatch ## logit of PS,i.e., log(PS/(1-PS)) as matching scale
X = log(rhc$pRhc / rhc$pNoRhc),
## 1:1 matching
M = 1,
## caliper = 0.2 * SD(logit(PS))
caliper = 0.2,
replace = FALSE,
ties = TRUE,
version = "fast")
Error in Match(Tr = (rhc$swang1 == "RHC"), X = log(rhc$pRhc/rhc$pNoRhc), : object 'rhc' not found
## Extract matched data
<- rhc[unlist(listMatch[c("index.treated","index.control")]), ] rhcMatched
Error in eval(expr, envir, enclos): object 'rhc' not found
## Construct a table
<- CreateTableOne(vars = vars, strata = "swang1", data = rhcMatched, test = FALSE) tabMatched
Error in is.data.frame(data): object 'rhcMatched' not found
## Show table with SMD
print(tabMatched, smd = TRUE)
Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'print': object 'tabMatched' not found
## Count covariates with important imbalance
addmargins(table(ExtractSmd(tabMatched) > 0.1))
Error in class(x)[1] %in% c("TableOne", "svyTableOne"): object 'tabMatched' not found
The matching weight method is a weighting analogue to the 1:1 pairwise algorithmic matching (https://pubmed.ncbi.nlm.nih.gov/23902694/ ). The matching weight is defined as the smaller of the predicted probabilities of receiving or not receiving the treatment over the predicted probability of being assigned to the arm the patient is actually in. After weighting, all the standardized mean differences are below 0.1. The standardized mean differences in weighted data are explained in https://onlinelibrary.wiley.com/doi/full/10.1002/sim.6607 .
## Matching weight
$mw <- rhc$pMin / rhc$pAssign rhc
Error in eval(expr, envir, enclos): object 'rhc' not found
## Weighted data
<- svydesign(ids = ~ 1, data = rhc, weights = ~ mw) rhcSvy
Error in svydesign(ids = ~1, data = rhc, weights = ~mw): object 'rhc' not found
## Construct a table (This is a bit slow.)
<- svyCreateTableOne(vars = vars, strata = "swang1", data = rhcSvy, test = FALSE) tabWeighted
Error in c("svyrep.design", "survey.design2", "survey.design") %in% class(data): object 'rhcSvy' not found
## Show table with SMD
print(tabWeighted, smd = TRUE)
Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'print': object 'tabWeighted' not found
## Count covariates with important imbalance
addmargins(table(ExtractSmd(tabWeighted) > 0.1))
Error in class(x)[1] %in% c("TableOne", "svyTableOne"): object 'tabWeighted' not found
The overlap weight method is another alternative weighitng method (https://amstat.tandfonline.com/doi/abs/10.1080/01621459.2016.1260466). After weighting, all the standardized mean differences are below 0.1.
## Overlap weight
$ow <- (rhc$pAssign * (1 - rhc$pAssign)) / rhc$pAssign rhc
Error in eval(expr, envir, enclos): object 'rhc' not found
## Weighted data
<- svydesign(ids = ~ 1, data = rhc, weights = ~ ow) rhcSvyOw
Error in svydesign(ids = ~1, data = rhc, weights = ~ow): object 'rhc' not found
## Construct a table (This is a bit slow.)
<- svyCreateTableOne(vars = vars, strata = "swang1", data = rhcSvyOw, test = FALSE) tabWeightedOw
Error in c("svyrep.design", "survey.design2", "survey.design") %in% class(data): object 'rhcSvyOw' not found
## Show table with SMD
print(tabWeightedOw, smd = TRUE)
Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'print': object 'tabWeightedOw' not found
## Count covariates with important imbalance
addmargins(table(ExtractSmd(tabWeightedOw) > 0.1))
Error in class(x)[1] %in% c("TableOne", "svyTableOne"): object 'tabWeightedOw' not found
A plot showing covariate balance is often constructed to demonstrate the balancing effect of matching and/or weighting. Given the same propensity score model, the matching weight method often achieves better covariate balance than matching.
## Construct a data frame containing variable name and SMD from all methods
<- data.frame(variable = rownames(ExtractSmd(tabUnmatched)),
dataPlot Unmatched = as.numeric(ExtractSmd(tabUnmatched)),
Matched = as.numeric(ExtractSmd(tabMatched)),
Weighted = as.numeric(ExtractSmd(tabWeighted)),
WeightedOw = as.numeric(ExtractSmd(tabWeightedOw)))
Error in class(x)[1] %in% c("TableOne", "svyTableOne"): object 'tabUnmatched' not found
## Create long-format data for ggplot2
<- melt(data = dataPlot,
dataPlotMelt id.vars = c("variable"),
variable.name = "Method",
value.name = "SMD")
Error in melt(data = dataPlot, id.vars = c("variable"), variable.name = "Method", : object 'dataPlot' not found
## Order variable names by magnitude of SMD
<- as.character(dataPlot$variable)[order(dataPlot$Unmatched)] varNames
Error in eval(expr, envir, enclos): object 'dataPlot' not found
## Order factor levels in the same order
$variable <- factor(dataPlotMelt$variable,
dataPlotMeltlevels = varNames)
Error in factor(dataPlotMelt$variable, levels = varNames): object 'dataPlotMelt' not found
## Plot using ggplot2
ggplot(data = dataPlotMelt,
mapping = aes(x = variable, y = SMD, group = Method, color = Method)) +
geom_line() +
geom_point() +
geom_hline(yintercept = 0.1, color = "black", size = 0.1) +
coord_flip() +
theme_bw() + theme(legend.key = element_blank())
Error in ggplot(data = dataPlotMelt, mapping = aes(x = variable, y = SMD, : object 'dataPlotMelt' not found
To construct a side-by-side table, data can be extracted as a matrix and combined using the print() method, which actually invisibly returns a matrix.
## Column bind tables
<- cbind(print(tabUnmatched, printToggle = FALSE),
resCombo print(tabMatched, printToggle = FALSE),
print(tabWeighted, printToggle = FALSE),
print(tabWeightedOw, printToggle = FALSE))
Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'print': object 'tabUnmatched' not found
## Add group name row, and rewrite column names
<- rbind(Group = rep(c("No RHC","RHC"), 4), resCombo) resCombo
Error in rbind(Group = rep(c("No RHC", "RHC"), 4), resCombo): object 'resCombo' not found
colnames(resCombo) <- c("Unmatched","","Matched","","MW","","OW","")
Error in colnames(resCombo) <- c("Unmatched", "", "Matched", "", "MW", : object 'resCombo' not found
print(resCombo, quote = FALSE)
Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'print': object 'resCombo' not found
The final analysis can be conducted using matched and weighted data. The results from the matching and matching weight are similar. ShowRegTable() function may come in handly.
## Unmatched model (unadjsuted)
<- glm(formula = (death == "Yes") ~ swang1,
glmUnmatched family = binomial(link = "logit"),
data = rhc)
Error in is.data.frame(data): object 'rhc' not found
## Matched model
<- glm(formula = (death == "Yes") ~ swang1,
glmMatched family = binomial(link = "logit"),
data = rhcMatched)
Error in is.data.frame(data): object 'rhcMatched' not found
## Weighted model
<- svyglm(formula = (death == "Yes") ~ swang1,
glmWeighted family = binomial(link = "logit"),
design = rhcSvy)
Error in .svycheck(design): object 'rhcSvy' not found
## Show results together
<- list(Unmatched = ShowRegTable(glmUnmatched, printToggle = FALSE),
resTogether Matched = ShowRegTable(glmMatched, printToggle = FALSE),
Weighted = ShowRegTable(glmWeighted, printToggle = FALSE))
Error in class(model) %in% c("lme"): object 'glmUnmatched' not found
print(resTogether, quote = FALSE)
Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'print': object 'resTogether' not found