Statistical Modelling 6 (2006), 23–42

Estimation in generalised linear mixed models with binary outcomes by simulated maximum likelihood

Edmond S.W. Ng
Centre for Multilevel Modelling,
Graduate School of Education,
University of Bristol,
35 Berkeley Square,
Bristol BS8 1JA
UK
Email: edmondngsw@yahoo.com

James R. Carpenter
Medical Statistics Unit,
London School of Hygiene and Tropical Medicine,
University of London

Harvey Goldstein and Jon Rasbash
Centre for Multilevel Modelling,
Graduate School of Education,
University of Bristol

Abstract:

Fitting multilevel models to discrete outcome data is problematic because the discrete distribution of the response variable implies an analytically intractable log-likelihood function. Among a number of approximate methods proposed, second-order penalised quasi-likelihood (PQL) is commonly used and is one of the most accurate. Unfortunately, even the second-order PQL approximation has been shown to produce estimates biased toward zero in certain circumstances. This bias can be marked, especially when the data are sparse. One option to reduce this bias is to use Monte-Carlo simulation. A bootstrap bias correction method proposed by Kuk has been implemented in MLwiN. However, a similar technique based on the Robbins–Monro (RM) algorithm is potentially more efficient. An alternative is to use simulated maximum likelihood (SML), either alone or to refine estimates identified by other methods. In this article, we first compare bias correction using the RM algorithm, Kuk's method and SML. We find that SML performs as efficiently as the other two methods and also yields standard errors of the bias-corrected parameter estimates and an estimate of the log-likelihood at the maximum, with which nested models can be compared. Secondly, using simulated and real data examples, we compare SML, second-order Laplace approximation (as implemented in HLM), Markov Chain Monte-Carlo (MCMC) (in MLwiN) and numerical integration using adaptive quadrature methods (in Stata's GLLAMM and in SAS's proc NLMIXED). We find that when the data are sparse, the second-order Laplace approximation produces markedly lower parameter estimates, whereas the MCMC method produces estimates that are noticeably higher than those from the SML and quadrature methods. Although proc NLMIXED is much faster than GLLAMM, it is not designed to fit models of more than two levels. SML produces parameter estimates and log-likelihoods very similar to those from quadrature methods. Further, our SML approach extends to handle other link functions, discrete data distributions, non-normal random effects and higher-level models.
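
To illustrate the idea behind simulated maximum likelihood, the following is a minimal sketch, not the authors' Matlab implementation. It assumes the simplest possible case: a two-level logistic model with only an intercept and a normally distributed random intercept. The intractable integral over the random effect is replaced by an average over simulated draws. The function name simulated_loglik and its arguments are hypothetical, chosen for this illustration only.

```python
import math
import random


def simulated_loglik(beta0, sigma, clusters, draws):
    """Monte Carlo estimate of the log-likelihood of a random-intercept
    logistic model with no covariates (an illustrative assumption).

    clusters: list of lists of 0/1 responses, one inner list per level-2 unit.
    draws:    standard-normal draws, kept fixed across parameter values so
              that the simulated likelihood is a smooth function of them.
    """
    total = 0.0
    for ys in clusters:
        n_success = sum(ys)
        n_failure = len(ys) - n_success
        acc = 0.0
        for z in draws:
            eta = beta0 + sigma * z  # linear predictor for this draw
            log_p = -math.log1p(math.exp(-eta))  # log P(y = 1 | u)
            log_q = -math.log1p(math.exp(eta))   # log P(y = 0 | u)
            # Conditional likelihood of the whole cluster given this draw
            acc += math.exp(n_success * log_p + n_failure * log_q)
        # Average over draws approximates the integral over the random effect
        total += math.log(acc / len(draws))
    return total


# Hypothetical toy data: two level-2 units with binary responses
rng = random.Random(0)
draws = [rng.gauss(0.0, 1.0) for _ in range(500)]
toy_clusters = [[1, 0], [1, 1, 0]]
value = simulated_loglik(0.0, 1.0, toy_clusters, draws)
```

Maximising this function over beta0 and sigma with any general-purpose optimiser would give the SML estimates for this toy model. Reusing the same draws at every evaluation is a common design choice, because it keeps the objective free of simulation noise between iterations. The sketch does not guard against overflow for very large values of the linear predictor.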

Keywords:

Bias correction; Kuk's method; Monte-Carlo integration; numerical integration; Robbins–Monro algorithm; simulated maximum likelihood
 

Downloads:

Data and Matlab code in zipped archive

