Statistical Modelling 15 (1) (2015), 24–50

Variable selection in joint modelling of the mean and variance for hierarchical data

Christiana Charalambous
School of Mathematics,
The University of Manchester,
Manchester,
UK
e-mail: christiana.charalambous@manchester.ac.uk

Jianxin Pan
School of Mathematics,
The University of Manchester,
Manchester,
UK


Mark Tranmer
School of Social Sciences,
The University of Manchester,
Manchester,
UK


Abstract:

We propose to extend the use of penalized likelihood variable selection to hierarchical generalized linear models (HGLMs) for jointly modelling the mean and variance structures. We assume a two-level hierarchical data structure, with subjects nested within groups. A generalized linear mixed model (GLMM) is fitted for the mean, with a structured dispersion in the form of a generalized linear model (GLM) for the between-group variation. For variable selection, we use the smoothly clipped absolute deviation (SCAD) penalty, which simultaneously shrinks the coefficients of redundant variables to 0 and estimates the coefficients of the remaining important covariates. We conduct simulation studies and a real data analysis for the joint mean–variance models, to assess the performance of the proposed procedure against a comparable procedure that omits variable selection. The results indicate that our method can successfully identify the zero/non-zero components in our models and can also significantly improve the efficiency of the resulting penalized estimates.
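
As an illustration of the penalty named above, the following is a minimal sketch of the SCAD function in its standard piecewise form (Fan and Li, 2001). The abstract does not state the formula or the tuning constants, so the default a = 3.7 (the conventional choice) and the function names scad_penalty and scad_derivative are assumptions made for this example, not the authors' implementation.

```python
import math


def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """Standard SCAD penalty p_lam(|theta|); a = 3.7 is an assumed default."""
    if lam <= 0 or a <= 2:
        raise ValueError("SCAD requires lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Lasso-like linear region near zero: this is what shrinks
        # small coefficients exactly to 0.
        return lam * t
    if t <= a * lam:
        # Quadratic transition region between the linear and flat parts.
        return -(t * t - 2.0 * a * lam * t + lam * lam) / (2.0 * (a - 1.0))
    # Flat region: large coefficients receive a constant penalty,
    # so they are not shrunk further.
    return (a + 1.0) * lam * lam / 2.0


def scad_derivative(theta: float, lam: float, a: float = 3.7) -> float:
    """Derivative of the SCAD penalty with respect to |theta|, for theta != 0."""
    t = abs(theta)
    if t <= lam:
        return lam
    if t <= a * lam:
        return (a * lam - t) / (a - 1.0)
    return 0.0


if __name__ == "__main__":
    for value in (0.0, 0.5, 1.0, 2.0, 3.7, 10.0):
        print(value, scad_penalty(value, lam=1.0), scad_derivative(value, lam=1.0))
```

The derivative is zero for |theta| > a * lam, which is why SCAD leaves large coefficients nearly unpenalized while still setting small ones to 0.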

Keywords:

Generalized linear mixed models; H-likelihood; Mean-covariance modelling; Multilevel data; Smoothly clipped absolute deviation penalty.

Downloads:

Example data in zipped archive