fwildclusterboot


The fwildclusterboot package is an R port of Stata’s boottest package.

It implements the fast wild cluster bootstrap algorithm developed in Roodman et al. (2019) for regression objects in R. It currently supports objects of class lm (base R), felm (lfe package) and fixest (fixest package).

The package’s central function is boottest(). It lets the user test two-sided, univariate hypotheses using a wild cluster bootstrap. Importantly, it uses the “fast” algorithm developed in Roodman et al. (2019), which makes it feasible to compute test statistics from a large number of bootstrap draws even in large samples, as long as the number of bootstrapping clusters is not too large.
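To make concrete what a wild cluster bootstrap test computes, here is a naive base R sketch of the restricted (null-imposed) variant with Rademacher weights. It is not the fast algorithm of Roodman et al. (2019) and not the package's implementation: the simulated data, the helper cluster_t() and all variable names are hypothetical, and small-sample corrections are omitted for brevity.

```r
# Naive restricted wild cluster bootstrap (WCR) with Rademacher weights.
# Illustration only: simulated data, hypothetical names, no small-sample corrections.
set.seed(1)

G <- 20                                   # number of clusters
n_g <- 15                                 # observations per cluster
cluster <- rep(seq_len(G), each = n_g)
x <- rnorm(G * n_g) + rnorm(G)[cluster]
y <- 0.5 * x + rnorm(G)[cluster] + rnorm(G * n_g)   # cluster-level shock plus noise

# cluster-robust t statistic for the slope on x (H0: slope = 0)
cluster_t <- function(y, x, cluster) {
  X <- cbind(1, x)
  XtX_inv <- solve(crossprod(X))
  beta <- XtX_inv %*% crossprod(X, y)
  u <- as.vector(y - X %*% beta)
  scores <- rowsum(X * u, cluster)        # G x 2 matrix of within-cluster score sums
  V <- XtX_inv %*% crossprod(scores) %*% XtX_inv
  beta[2] / sqrt(V[2, 2])
}

t_obs <- cluster_t(y, x, cluster)

# impose the null: with slope = 0 the restricted fit is intercept-only
fitted_r <- rep(mean(y), length(y))
resid_r <- y - fitted_r

B <- 999
t_boot <- replicate(B, {
  w <- sample(c(-1, 1), G, replace = TRUE)   # one Rademacher draw per cluster
  y_star <- fitted_r + resid_r * w[cluster]  # bootstrap sample generated under the null
  cluster_t(y_star, x, cluster)
})

# two-sided bootstrap p-value
p_value <- mean(abs(t_boot) >= abs(t_obs))
p_value
```

This loop re-estimates the model once per bootstrap draw, so its cost grows with both the sample size and B. Avoiding that per-draw re-estimation is what the fast algorithm is designed for.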

The fwildclusterboot package currently supports multi-dimensional clustering and one-dimensional, two-sided hypotheses. It supports regression weights, multiple distributions of bootstrap weights, fixed effects, restricted (WCR) and unrestricted (WCU) bootstrap inference, and subcluster bootstrapping for few treated clusters (MacKinnon & Webb, 2018).

The boottest() function

library(fixest)
library(fwildclusterboot)

data(voters)

# fit the model via fixest::feols(), lfe::felm() or stats::lm()
feols_fit <- feols(proposition_vote ~ treatment  + log_income | Q1_immigration + Q2_defense, data = voters)

# bootstrap inference via boottest()
feols_boot <- boottest(feols_fit, clustid = c("group_id1"), B = 9999, param = "treatment")

summary(feols_boot)
#> boottest.fixest(object = feols_fit, clustid = c("group_id1"), 
#>     param = "treatment", B = 9999)
#>  
#>  Hypothesis: 1*treatment = 0
#>  Observations: 300
#>   Bootstr. Type: rademacher
#>  Clustering: 1-way
#>  Confidence Sets: 95%
#>  Number of Clusters: 40
#> 
#>              term estimate statistic p.value conf.low conf.high
#> 1 1*treatment = 0    0.079     4.123       0    0.039     0.118

For a longer introduction to the package’s key function, boottest(), please follow this link.
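The call below sketches how the bootstrap weight distribution and the restricted/unrestricted choice might be varied. It assumes that boottest() accepts arguments named type and impose_null and returns an object of class "boottest", as in the package documentation at the time of writing; check ?boottest for the installed version before relying on these names.

```r
library(fwildclusterboot)

data(voters)

# fit a plain linear model via stats::lm()
lm_fit <- lm(proposition_vote ~ treatment + log_income, data = voters)

# assumption: `type` selects the bootstrap weight distribution and
# `impose_null = FALSE` requests the unrestricted (WCU) bootstrap
lm_boot_wcu <- boottest(
  lm_fit,
  clustid = "group_id1",
  param = "treatment",
  B = 9999,
  type = "mammen",
  impose_null = FALSE
)

summary(lm_boot_wcu)
```

Because the bootstrap weights are drawn at random, p-values and confidence bounds vary slightly between runs.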

Benchmarks

The benchmarks time boottest() on a sample of N = 50,000 observations with k = 19 covariates and a single clustering variable with N_G clusters (10 iterations each).

For a small number of clusters, fwildclusterboot is generally faster than the implementations of the wild cluster bootstrap in the sandwich and clusterSEs packages.

Installation

You can install the released version of fwildclusterboot from CRAN, or the development version from GitHub, by following the steps below:

# from CRAN 
install.packages("fwildclusterboot")

# dev version from GitHub
# note: installation requires Rtools
library(devtools)
install_github("s3alfisc/fwildclusterboot")