Statistical Modelling 6 (2006), 23–42
Estimation in generalised linear mixed models with binary outcomes
by simulated maximum likelihood
Edmond S.W. Ng
Centre for Multilevel Modelling,
Graduate School of Education,
University of Bristol,
35 Berkeley Square,
Bristol BS8 1JA
UK
Email: edmondngsw@yahoo.com
James R. Carpenter
Medical Statistics Unit,
London School of Hygiene and Tropical Medicine,
University of London
Harvey Goldstein and Jon Rasbash
Centre for Multilevel Modelling,
Graduate School of Education,
University of Bristol
Abstract:
Fitting multilevel models to discrete outcome data is problematic
because the discrete distribution of the response variable implies
an analytically intractable log-likelihood function. Among a number
of approximate methods proposed, second-order penalised quasi-likelihood
(PQL) is commonly used and is one of the most accurate. Unfortunately,
even the second-order PQL approximation has been shown to produce
estimates biased toward zero in certain circumstances. This bias
can be marked, especially when the data are sparse. One option to
reduce this bias is to use Monte Carlo simulation. A bootstrap bias
correction method proposed by Kuk has been implemented in MLwiN.
However, a similar technique based on the Robbins–Monro (RM)
algorithm is potentially more efficient. An alternative is to use
simulated maximum likelihood (SML), either alone or to refine
estimates identified by other methods. In this article, we first
compare bias correction using the RM algorithm, Kuk's method and
SML. We find that SML performs as efficiently as the other two
methods and also yields standard errors of the bias-corrected
parameter estimates and an estimate of the log-likelihood at
the maximum, with which nested models can be compared. Secondly,
using simulated and real data examples, we compare SML, second-order
Laplace approximation (as implemented in HLM), Markov chain
Monte Carlo (MCMC, in MLwiN) and numerical integration using
adaptive quadrature methods (in Stata's GLLAMM and in SAS's
proc NLMIXED). We find that when the data are sparse, the
second-order Laplace approximation produces markedly lower
parameter estimates, whereas the MCMC method produces
estimates that are noticeably higher than those from the
SML and quadrature methods. Although proc NLMIXED is much
faster than GLLAMM, it is not designed to fit models of more
than two levels. SML produces parameter estimates and
log-likelihoods very similar to those from quadrature methods.
Further, our SML approach extends to handle other link functions,
discrete data distributions, non-normal random effects and
higher-level models.
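The SML idea described above can be illustrated for a two-level random-intercept logistic model: the intractable cluster-specific likelihood integral over the random effect is replaced by a Monte Carlo average over a fixed set of standard-normal draws, and the resulting simulated log-likelihood is maximized numerically. The sketch below is a minimal Python illustration on hypothetical simulated data, not the authors' Matlab implementation; all variable names and settings (sample sizes, number of draws K) are assumptions for the example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(42)

# Hypothetical two-level binary data: 50 clusters of 10 observations,
# logit p_ij = b0 + b1 * x_ij + u_j, with u_j ~ N(0, sigma^2).
n_clusters, n_per = 50, 10
b0_true, b1_true, sigma_true = -0.5, 1.0, 0.8
x = rng.normal(size=(n_clusters, n_per))
u = sigma_true * rng.normal(size=n_clusters)
y = rng.binomial(1, expit(b0_true + b1_true * x + u[:, None]))

# Fixed standard-normal draws: using common random numbers across
# parameter values keeps the simulated log-likelihood smooth in theta.
K = 500
z = rng.normal(size=K)

def neg_sim_loglik(theta):
    b0, b1, log_sigma = theta
    sigma = np.exp(log_sigma)        # optimize on the log scale
    ll = 0.0
    for j in range(n_clusters):
        # K x n_per linear predictors, one row per simulated random effect
        eta = b0 + b1 * x[j] + sigma * z[:, None]
        # Bernoulli log-likelihood conditional on each simulated u
        logp = y[j] * np.log(expit(eta)) + (1 - y[j]) * np.log(expit(-eta))
        # Monte Carlo estimate of the integrated cluster likelihood
        ll += np.log(np.mean(np.exp(logp.sum(axis=1))))
    return -ll

res = minimize(neg_sim_loglik, x0=np.zeros(3), method="Nelder-Mead")
b0_hat, b1_hat, sigma_hat = res.x[0], res.x[1], np.exp(res.x[2])
```

Because the draws are held fixed, `-res.fun` is an estimate of the maximized log-likelihood and can be used to compare nested models, as the abstract notes; in practice the paper's approach refines this basic scheme (e.g. via importance sampling around estimates from other methods).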
Keywords:
Bias correction; Kuk's method; Monte Carlo integration;
numerical integration; Robbins-Monro algorithm;
simulated maximum likelihood
Downloads:
Data and Matlab code in zipped archive