Luyts, Molenberghs, Verbeke, Matthijs, Ribeiro Jr, Demétrio, Hinde (2019)

Statistical Modelling 19 (5) (2019), 569–589

A Weibull-count approach for handling under- and overdispersed longitudinal/clustered data structures

Martial Luyts,
Interuniversity Institute for Biostatistics and Statistical Bioinformatics,
KU Leuven and Universiteit Hasselt,
Leuven,
Belgium.
e-mail: martial.luyts@kuleuven.be

Geert Molenberghs,
Interuniversity Institute for Biostatistics and Statistical Bioinformatics,
KU Leuven and Universiteit Hasselt,
Leuven,
Belgium.

Geert Verbeke,
Interuniversity Institute for Biostatistics and Statistical Bioinformatics,
KU Leuven and Universiteit Hasselt,
Leuven,
Belgium.

Koen Matthijs,
Family and Population Studies,
KU Leuven, Leuven,
Belgium.

Eduardo E Ribeiro Jr,
ESALQ, Piracicaba,
University of São Paulo,
São Paulo,
Brazil.

Clarice GB Demétrio,
ESALQ, Piracicaba,
University of São Paulo,
São Paulo,
Brazil.

John Hinde,
School of Mathematics, Statistics and Applied Mathematics,
NUI Galway,
Galway,
Ireland.

Abstract:

A Weibull-model-based approach is examined to handle under- and overdispersed discrete data in a hierarchical framework. This methodology was first introduced by Nakagawa and Osaki (1975, IEEE Transactions on Reliability, 24, 300–301), and later examined for under- and overdispersion by Klakattawi et al. (2018, Entropy, 20, 142) in the univariate case. Extensions to hierarchical settings with under- and overdispersion have not been explored, even though they can be obtained in a simple manner. Such extensions are of particular interest when analysing clustered/longitudinal data structures, where the underlying correlation structure is often more complex than in cross-sectional studies. In this article, a random-effects extension of the Weibull-count model is proposed and applied to two motivating case studies, originating from the clinical and sociological research fields. A goodness-of-fit evaluation of the model is provided through a comparison with some well-known count models, that is, the negative binomial, Conway–Maxwell–Poisson and double Poisson models. Empirical results show that the proposed extension fits the data flexibly, in particular heavy-tailed, zero-inflated, overdispersed and correlated count data. Discrete left-skewed time-to-event data structures are also flexibly modelled using the approach, with the ability to derive direct interpretations on the median scale, provided the complementary log–log link is used. Finally, a large simulated dataset is created to examine other characteristics such as computational ease and orthogonality properties of the model, with the conclusion that the approach behaves best for highly overdispersed cases.
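As an illustration of how a single Weibull-type count distribution can cover both under- and overdispersion, the following minimal sketch uses the type I discrete Weibull distribution of Nakagawa and Osaki, with probability mass function P(Y = y) = q^(y^beta) - q^((y+1)^beta) for y = 0, 1, 2, ..., where 0 < q < 1 and beta > 0. It covers only the univariate distribution, not the random-effects extension proposed in the article. The function names, the truncation rule and the parameter values are assumptions made for this example, and the parametrization used in the article itself may differ.

```python
def dw_pmf(y, q, beta):
    """Type I discrete Weibull pmf: P(Y = y) = q**(y**beta) - q**((y+1)**beta)."""
    return q ** (y ** beta) - q ** ((y + 1) ** beta)


def dw_mean_var(q, beta, tol=1e-12, max_terms=1_000_000):
    """Approximate mean and variance by summing the pmf over y = 0, 1, 2, ...

    The sum is truncated (an assumption of this sketch) once the survival
    probability P(Y > y), weighted by (y + 1)**2, drops below `tol`.
    """
    mean = 0.0
    second_moment = 0.0
    for y in range(max_terms):
        p = dw_pmf(y, q, beta)
        mean += y * p
        second_moment += y * y * p
        survival = q ** ((y + 1) ** beta)  # P(Y > y)
        if (y + 1) ** 2 * survival < tol:
            break
    return mean, second_moment - mean ** 2


def dw_dispersion_index(q, beta):
    """Variance-to-mean ratio: > 1 is overdispersed, < 1 is underdispersed."""
    mean, var = dw_mean_var(q, beta)
    return var / mean


def dw_median(q, beta):
    """Smallest y with P(Y <= y) = 1 - q**((y+1)**beta) >= 0.5."""
    y = 0
    while 1.0 - q ** ((y + 1) ** beta) < 0.5:
        y += 1
    return y


if __name__ == "__main__":
    # Hypothetical parameter values chosen only to show both regimes.
    for q, beta in [(0.5, 1.0), (0.9, 3.0)]:
        idx = dw_dispersion_index(q, beta)
        print(f"q={q}, beta={beta}: dispersion index={idx:.3f}, "
              f"median={dw_median(q, beta)}")
```

With beta = 1 the distribution reduces to the geometric distribution, whose dispersion index 1/(1 - q) always exceeds 1, while larger values of beta can push the index below 1. Because the distribution function has a closed form, the median is available directly, which is what makes median-scale interpretation convenient.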

Keywords:

longitudinal profiles; clustering; dispersion; random effects; Weibull-count approach.

Downloads:

Example data and code in zipped archive.