Analysing Longitudinal or Panel Data

This vignette explains the rational behind the demean() function. We give recommendations how to analyze multilevel or hierarchical data structures, when macro-indicators (or level-2 predictors, or higher-level units, or more general: group-level predictors) are used as covariates and the model suffers from heterogeneity bias (Bell and Jones 2015).

Sample data used in this vignette

library(parameters)
data("qol_cancer")

Heterogeneity bias

Heterogeneity bias occurs when group-level predictors vary within and across groups, and hence fixed effects may correlate with group (or random) effects. This is a typical situation when analyzing longitudinal or panel data: Due to the repeated measurements of persons, the “person” (or subject-ID) is now a level-2 variable. Predictors at level-1 (“fixed effects”), e.g. self-rated health or income, now have an effect at level-1 (“within”-effect) and at higher-level units (level-2, the subject-level, which is the “between”-effect) (see also this posting). This inevitably leads to correlating fixed effects and error terms - which, in turn, results in biased estimates, because both the within- and between-effect are captured in one estimate.

Adressing heterogeneity bias: the Fixed Effects Regression (FE) approach

Fixed effects regression models (FE) are a popular approach for panel data analysis in particular in econometrics and considered as gold standard. To avoid the problem of heterogeneity bias, in FE all higher-level variance (and thus, any between-effects), “are controlled out using the higher-level entities themselves, included in the model as dummy variables” (Bell and Jones 2015). As a consequence, FE models are only able estimate within-effects.

To remove between-effects and only model within-effects, the data needs some preparation: de-meaning. De-meaning, or person-mean centering, or centering within clusters, takes away the higher-level mean from the regression equation, and as such, FE avoids estimating a parameter for each higher-level unit.

Computing the de-meaned and group-meaned variables

qol_cancer <- cbind(
  qol_cancer,
  demean(qol_cancer, select = c("phq4", "QoL"), group = "ID")
)

Now we have:

A FE model is a classical linear model, where

fe_model1 <- lm(
  QoL ~ 0 + time + phq4_within + ID,
   data = qol_cancer
)
# we use only the first two rows, because the remaining rows are
# the estimates for "ID", which is not of interest here...
model_parameters(fe_model1)[1:2, ]
#> Parameter   | Coefficient |   SE |         95% CI |     t |  df |      p
#> ------------------------------------------------------------------------
#> time        |        1.09 | 0.64 | [-0.17,  2.34] |  1.70 | 374 | 0.089 
#> phq4_within |       -3.66 | 0.41 | [-4.46, -2.86] | -8.95 | 374 | < .001


# instead of removing the intercept, we could also use the 
# de-meaned response...
fe_model2 <- lm(
  QoL_within ~ time + phq4_within + ID,
   data = qol_cancer
)
model_parameters(fe_model2)[2:3, ]
#> Parameter   | Coefficient |   SE |         95% CI |     t |  df |      p
#> ------------------------------------------------------------------------
#> time        |        1.09 | 0.64 | [-0.17,  2.34] |  1.70 | 374 | 0.089 
#> phq4_within |       -3.66 | 0.41 | [-4.46, -2.86] | -8.95 | 374 | < .001


# we compare the results with those from the "lfe"-package for panel data
library(lfe)
fe_model3 <- felm(
  QoL ~ time + phq4_within | ID,
   data = qol_cancer
)
model_parameters(fe_model3)
#> Parameter   | Coefficient |   SE |         95% CI |     t |  df |      p
#> ------------------------------------------------------------------------
#> time        |        1.09 | 0.64 | [-0.16,  2.34] |  1.70 | 374 | 0.089 
#> phq4_within |       -3.66 | 0.41 | [-4.46, -2.86] | -8.95 | 374 | < .001

As we can see, the within-effect of PHQ-4 is -3.66, hence the mean of the change for an average individual case in our sample (or, the “net” effect), is -3.66.

But what about the between-effect? How do people with higher PHQ-4 score differ from people with lower PHQ-4 score? Or what about educational inequalities? Do higher educated people have a higher PHQ-4 score than lower educated people?

This question cannot be answered with FE regression. But: “Can one fit a multilevel model with varying intercepts (or coefficients) when the units and predictors correlate? The answer is yes. And the solution is simple.” (Bafumi and Gelman 2006)

Adressing heterogeneity bias: the Mixed Model approach

Mixed models include different levels of sources of variability (i.e. error terms at each level). Predictors used at level-1 that are varying across higher-level units will thus have residual errors at both level-1 and higher-level units. “Such covariates contain two parts: one that is specific to the higher-level entity that does not vary between occasions, and one that represents the difference between occasions, within higher-level entities” (Bell and Jones 2015). Hence, the error terms will be correlated with the covariate, which violates one of the assumptions of mixed models (iid, independent and identically distributed error terms) - also known and described above as heterogeneity bias.

But how can this issue be addressed outside the FE framework?

There are several ways how to address this using a mixed models approach:

For now, we will follow the last recommendation and use the within- and between-version of phq4.

library(lme4)
mixed_1 <- lmer(
  QoL ~ time + phq4_within + phq4_between + (1 | ID),
   data = qol_cancer
)
model_parameters(mixed_1)
#> Parameter   | Coefficient |   SE |         95% CI |     t |  df |      p
#> ------------------------------------------------------------------------
#> (Intercept) |       71.53 | 1.56 | [68.48, 74.58] | 45.98 | 558 | < .001
#> time        |        1.09 | 0.64 | [-0.16,  2.34] |  1.70 | 558 | 0.088 
#> 
#> # Within-Effects 
#> 
#> Parameter   | Coefficient |   SE |         95% CI |     t |  df |      p
#> ------------------------------------------------------------------------
#> phq4_within |       -3.66 | 0.41 | [-4.46, -2.86] | -8.95 | 558 | < .001
#> 
#> # Between-Effects 
#> 
#> Parameter    | Coefficient |   SE |         95% CI |      t |  df |      p
#> --------------------------------------------------------------------------
#> phq4_between |       -6.28 | 0.50 | [-7.27, -5.30] | -12.53 | 558 | < .001

# compare to FE-model
model_parameters(fe_model1)[1:2, ]
#> Parameter   | Coefficient |   SE |         95% CI |     t |  df |      p
#> ------------------------------------------------------------------------
#> time        |        1.09 | 0.64 | [-0.17,  2.34] |  1.70 | 374 | 0.089 
#> phq4_within |       -3.66 | 0.41 | [-4.46, -2.86] | -8.95 | 374 | < .001

As we can see, the estimates and standard errors are identical. The argument against the use of mixed models, i.e. that using mixed models for panel data will yield biased estimates and standard errors, is based on an incorrect model specification (Mundlak 1978). As such, when the (mixed) model is properly specified, the estimator of the mixed model is identical to the ‘within’ (i.e. FE) estimator.

As a consequence, we cannot only use the above specified mixed model for panel data, we can even specify more complex models including within-effects, between-effects or random effects variation. A mixed models approach can model the causes of endogeneity explicitly by including the (separated) within- and between-effects of time-varying fixed effects and including time-constant fixed effects.

mixed_2 <- lmer(
  QoL ~ time + phq4_within + phq4_between + education + (1 + time | ID),
   data = qol_cancer
)
model_parameters(mixed_2)
#> Parameter        | Coefficient |   SE |         95% CI |     t |  df |      p
#> -----------------------------------------------------------------------------
#> (Intercept)      |       67.36 | 2.48 | [62.49, 72.22] | 27.15 | 554 | < .001
#> time             |        1.09 | 0.66 | [-0.20,  2.38] |  1.65 | 554 | 0.098 
#> education [mid]  |        5.01 | 2.35 | [ 0.41,  9.61] |  2.14 | 554 | 0.033 
#> education [high] |        5.52 | 2.75 | [ 0.12, 10.91] |  2.00 | 554 | 0.045 
#> 
#> # Within-Effects 
#> 
#> Parameter   | Coefficient |   SE |         95% CI |     t |  df |      p
#> ------------------------------------------------------------------------
#> phq4_within |       -3.72 | 0.41 | [-4.52, -2.92] | -9.10 | 554 | < .001
#> 
#> # Between-Effects 
#> 
#> Parameter    | Coefficient |   SE |         95% CI |      t |  df |      p
#> --------------------------------------------------------------------------
#> phq4_between |       -6.13 | 0.52 | [-7.14, -5.11] | -11.84 | 554 | < .001

For more complex models, within-effects will naturally change slightly and are no longer identical to simpler FE models. This is no “bias”, but rather the result of building more complex models: FE models lack information of variation in the group-effects or between-subject effects. Furthermore, FE models cannot include random slopes, which means that fixed effects regressions are neglecting “cross-cluster differences in the effects of lower-level controls (which) reduces the precision of estimated context effects, resulting in (…) low statistical power” (Heisig, Schaeffer, and Giesecke 2017).

Conclusion: Complex Random Effects Within-Between Models

Depending on the structure of the data, the best approach to analyzing panel data would be a so called “complex random effects within-between” model (Bell, Fairbrother, and Jones 2018):

yit = β0 + β1W (xit - ͞xi) + β2B ͞xi + β3 zi + υi0 + υi1 (xit - ͞xi) + εit

In R-code, the model is written down like this:

rewb <- lmer(
  QoL ~ time + phq4_within + phq4_between + education + 
    (1 + time | ID) + (1 + phq4_within | ID),
   data = qol_cancer
)

The benefit of this kind of model is that you have information on within-, between- and other time-constant (i.e. between) effects…

model_parameters(rewb)
#> Parameter        | Coefficient |   SE |         95% CI |     t |  df |      p
#> -----------------------------------------------------------------------------
#> (Intercept)      |       67.18 | 2.39 | [62.50, 71.86] | 28.13 | 551 | < .001
#> time             |        1.18 | 0.60 | [ 0.00,  2.37] |  1.95 | 551 | 0.051 
#> education [mid]  |        4.95 | 2.35 | [ 0.35,  9.55] |  2.11 | 551 | 0.035 
#> education [high] |        5.62 | 2.76 | [ 0.21, 11.02] |  2.04 | 551 | 0.042 
#> 
#> # Within-Effects 
#> 
#> Parameter   | Coefficient |   SE |         95% CI |     t |  df |      p
#> ------------------------------------------------------------------------
#> phq4_within |       -4.50 | 0.58 | [-5.63, -3.37] | -7.78 | 551 | < .001
#> 
#> # Between-Effects 
#> 
#> Parameter    | Coefficient |   SE |         95% CI |      t |  df |      p
#> --------------------------------------------------------------------------
#> phq4_between |       -6.11 | 0.52 | [-7.13, -5.10] | -11.81 | 551 | < .001

… but you can also model the variation of (group) effects across time (and probably space), and you can even include more higher-level units (e.g. nested design or cross-classified design with more than two levels):

random_parameters(rewb)
#> # Random Effects
#> 
#> Within-Group Variance              119.47 (10.93)
#> Between-Group Variance
#>   Random Intercept (ID)             107.5 (10.37)
#>   Random Intercept (ID.1)           25.76  (5.08)
#>   Random Slope (ID.time)             0.49   (0.7)
#>   Random Slope (ID.1.phq4_within)   14.37  (3.79)
#> Correlations
#>   ID.time                           -0.99
#>   ID.phq4_within                     0.44
#> N (groups per factor)
#>   ID                                  188
#> Observations                          564

References

Bafumi, Joseph, and Andrew Gelman. 2006. “Fitting Multilevel Models When Predictors and Group Effects Correlate.” In. Philadelphia, PA.

Bell, Andrew, Malcolm Fairbrother, and Kelvyn Jones. 2018. “Fixed and Random Effects Models: Making an Informed Choice.” Quality & Quantity, August. https://doi.org/10.1007/s11135-018-0802-x.

Bell, Andrew, and Kelvyn Jones. 2015. “Explaining Fixed Effects: Random Effects Modeling of Time-Series Cross-Sectional and Panel Data.” Political Science Research and Methods 3 (1): 133–53. https://doi.org/10.1017/psrm.2014.7.

Gelman, Andrew, and Jennifer Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Analytical Methods for Social Research. Cambridge; New York: Cambridge University Press.

Heisig, Jan Paul, Merlin Schaeffer, and Johannes Giesecke. 2017. “The Costs of Simplicity: Why Multilevel Models May Benefit from Accounting for Cross-Cluster Differences in the Effects of Controls.” American Sociological Review 82 (4): 796–827. https://doi.org/10.1177/0003122417717901.

Mundlak, Yair. 1978. “On the Pooling of Time Series and Cross Section Data.” Econometrica 46 (1): 69.

Shor, Boris, Joseph Bafumi, Luke Keele, and David Park. 2007. “A Bayesian Multilevel Modeling Approach to Time-Series Cross-Sectional Data.” Political Analysis 15 (2): 165–81. https://doi.org/10.1093/pan/mpm006.