Statistical Modelling 13 (5&6) (2013), 459–480

Regression tree-based diagnostics for linear multilevel models

Jeffrey S Simonoff
Leonard N. Stern School of Business,
New York University
USA
e-mail: jsimonoff@stern.nyu.edu

Abstract:

Longitudinal and clustered data, in which multiple observations are recorded for each individual, require special models that reflect their hierarchical structure. The most commonly used such model is the linear multilevel model, which combines a linear model for the population-level fixed effects, a linear model for normally distributed individual-level random effects, and normally distributed observation-level errors with constant variance. It has the advantage of simplicity of interpretation, but if the assumptions of the model do not hold, inferences drawn from it can be misleading. In this paper, we discuss the use of regression trees designed for multilevel data to construct goodness-of-fit tests for this model; these tests can be used to detect nonlinearity of the fixed effects or heteroscedasticity of the errors. Simulations show that the resultant tests are slightly conservative as 0.05 level tests and have good power to identify explainable model violations (that is, ones that are related to available covariate information in the data). Application of the tests is illustrated on two real datasets.
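
For reference, the linear multilevel model described above can be written in one common textbook form. The notation is assumed here for illustration and is not taken from the paper: y_{ij} is observation j on individual i, x_{ij} and z_{ij} are covariate vectors for the fixed and random effects, and D is the random-effects covariance matrix.

```latex
% Linear multilevel (mixed-effects) model; all symbols are illustrative assumptions
\begin{align*}
  y_{ij} &= x_{ij}^{\top}\beta + z_{ij}^{\top} b_i + \varepsilon_{ij},
    \qquad i = 1,\dots,n,\quad j = 1,\dots,n_i, \\
  b_i &\sim N(0, D), \qquad \varepsilon_{ij} \sim N(0, \sigma^{2}),
    \qquad b_i \text{ and } \varepsilon_{ij} \text{ mutually independent.}
\end{align*}
```

In this notation, nonlinearity of the fixed effects means that the population-level mean is not of the form x_{ij}^T \beta, and heteroscedasticity means that the variance of \varepsilon_{ij} is not a constant \sigma^2.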

Keywords:

clustered data; goodness-of-fit test; heteroscedasticity; longitudinal data; nonlinearity