Statistical Modelling 21 (6) (2021), 520–545

Poisson–Tweedie mixed-effects model: A flexible approach for the analysis of longitudinal RNA-seq data

Mirko Signorelli,
Department of Biomedical Data Sciences,
Leiden University Medical Center,
The Netherlands.
e-mail: m.signorelli@lumc.nl

Pietro Spitali,
Department of Human Genetics,
Leiden University Medical Center,
The Netherlands.

Roula Tsonaka,
Department of Biomedical Data Sciences,
Leiden University Medical Center,
The Netherlands.

Abstract:

We present a new modelling approach for longitudinal overdispersed counts that is motivated by the increasing availability of longitudinal RNA-sequencing experiments. The distribution of RNA-seq counts typically exhibits overdispersion, zero-inflation and heavy tails; moreover, in longitudinal designs repeated measurements from the same subject are usually (positively) correlated. We propose a generalized linear mixed model based on the Poisson–Tweedie distribution that can flexibly handle each of the aforementioned features of longitudinal overdispersed counts. We develop a computational approach to accurately evaluate the likelihood of the proposed model and to perform maximum likelihood estimation. Our approach is implemented in the R package ptmixed, which can be freely downloaded from CRAN. We assess the performance of ptmixed on simulated data, and we present an application to a dataset with longitudinal RNA-sequencing measurements from healthy and dystrophic mice. The applicability of the Poisson–Tweedie mixed-effects model is not restricted to longitudinal RNA-seq data: it extends to any scenario where non-independent measurements of a discrete overdispersed response variable are available.

Keywords:

generalized linear mixed model, heavy tails, high-throughput sequencing, longitudinal count data, overdispersion, zero-inflation

Downloads:

Supplementary material (PDF); data in zipped archive.