Statistical Modelling 23 (1) (2023), 53–80
Mixed effect modelling and variable selection for quantile regression
Haim Bar,
Department of Statistics,
University of Connecticut,
Storrs,
CT,
USA.
James G Booth,
Department of Statistics and Data Science,
Cornell University,
Ithaca,
NY,
USA.
e-mail: jim.booth@cornell.edu
Martin T Wells,
Department of Statistics and Data Science,
Cornell University,
Ithaca,
NY,
USA.
Abstract:
It is known that the estimating equations for quantile regression (QR) can be
solved using an EM algorithm in which the M-step is computed via
weighted least squares, with weights computed at the E-step as the
expectation of independent generalized inverse-Gaussian variables.
This fact is exploited here to extend QR to allow for random
effects in the linear predictor. Convergence of the algorithm in this setting is established by showing that it is a generalized alternating
minimization (GAM) procedure. Another modification of the EM algorithm
also allows us to adapt a recently proposed method for variable
selection in mean regression models to the QR setting.
Simulations show the resulting method significantly outperforms
variable selection in QR models using the lasso penalty. Applications to real data include
a frailty QR analysis of hospital stays, and variable selection for age at
onset of lung cancer and for riboflavin production rate using
high-dimensional gene expression arrays for prediction.
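
The EM iteration mentioned at the start of the abstract can be illustrated for plain QR with no random effects. The following is a minimal sketch, not the authors' code: it assumes the standard asymmetric-Laplace scale-mixture representation, under which the E-step weight for each observation is proportional to the inverse absolute residual, and the M-step is a weighted least squares solve with a shift term depending on the quantile level. The function name `qr_em`, the guard `eps`, the stopping rule, and the simulated data are all illustrative assumptions.

```python
import numpy as np


def qr_em(X, y, tau, n_iter=500, tol=1e-8, eps=1e-6):
    """Illustrative EM-style reweighted least squares for quantile regression.

    Hypothetical helper, not taken from the paper. At a fixed point the update
    satisfies sum_i x_i * (tau - 1[r_i < 0]) = 0, up to the eps guard.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # least-squares starting value
    shift = (1.0 - 2.0 * tau) * X.sum(axis=0)    # asymmetry term for level tau
    for _ in range(n_iter):
        r = y - X @ beta
        # E-step: weights proportional to the expected inverse latent scale,
        # which reduces to 1 / |residual| (guarded away from zero by eps).
        w = 1.0 / np.maximum(np.abs(r), eps)
        # M-step: weighted least squares with the shifted right-hand side.
        A = X.T @ (w[:, None] * X)
        b = X.T @ (w * y) - shift
        beta_new = np.linalg.solve(A, b)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


# Simulated example (assumed data): linear model with homoscedastic noise,
# so the slope is the same at every quantile level.
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0.0, 10.0, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(size=n)
tau = 0.75
beta_hat = qr_em(X, y, tau)
frac_below = np.mean(y - X @ beta_hat < 0)  # should be close to tau
```

A quick sanity check for any QR fit is that the share of observations falling below the fitted line, `frac_below` here, is close to `tau`. The guard `eps` keeps the weights finite when a residual is nearly zero; its value is a tuning choice in this sketch.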
Keywords:
Expectation-maximization (EM) algorithm,
Generalized alternating minimization (GAM) algorithm,
high-dimensional estimation,
mixture model,
mixed effects regression,
model diagnostics,
variable selection.
Downloads:
Code and data in zipped archive.