Statistical Modelling 18 (1) (2018), 3–23

Stochastic variable selection strategies for zero-inflated models

Eva Cantoni
Research Center for Statistics and Geneva School of Economics and Management,
University of Geneva,
Geneva,
Switzerland
e-mail: Eva.Cantoni@unige.ch

Marie Auda
Research Center for Statistics and Geneva School of Economics and Management,
University of Geneva,
Geneva,
Switzerland


Abstract:

When count data exhibit excess zeros, that is, more zero counts than a simpler parametric distribution can model, the zero-inflated Poisson (ZIP) or zero-inflated negative binomial (ZINB) models are often used. Variable selection for these models is even more challenging than for other regression situations because the availability of p covariates implies 4^p possible models. We adapt to zero-inflated models a variable selection approach that avoids screening all possible models. This approach is based on a stochastic search through the space of all possible models, which generates a chain of interesting models. As an additional novelty, we propose three ways of extracting information from this rich chain and we compare them in two simulation studies, where we also contrast our approach with regularization (penalized) techniques available in the literature. The analysis of a typical dataset that motivated our research is also presented, before concluding with some recommendations.
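
The size of the model space can be illustrated with a small enumeration. The sketch below is not taken from the paper's code; it assumes that each covariate may enter the count component, the zero-inflation component, both, or neither, which gives four states per covariate and hence 4^p candidate models. The function name and the covariate labels are hypothetical.

```python
from itertools import product


def enumerate_zi_models(covariates):
    """Yield (count_part, zero_part) covariate subsets of a zero-inflated model.

    Assumption: every covariate is independently in or out of each of the
    two model components, so it has four possible states.
    """
    # (in count component, in zero-inflation component)
    states = [(False, False), (True, False), (False, True), (True, True)]
    for assignment in product(states, repeat=len(covariates)):
        count_part = tuple(
            name for name, (in_count, _) in zip(covariates, assignment) if in_count
        )
        zero_part = tuple(
            name for name, (_, in_zero) in zip(covariates, assignment) if in_zero
        )
        yield count_part, zero_part


# Hypothetical covariate labels, p = 3.
models = list(enumerate_zi_models(["x1", "x2", "x3"]))
print(len(models))  # 4**3 = 64
```

With only p = 10 covariates the same enumeration would already produce 4^10 = 1,048,576 models, which is why exhaustive screening quickly becomes impractical and a stochastic search is attractive.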

Keywords:

excess zeros; ZI model; Hurdle model; variable selection; stochastic search.

Downloads:

Example data and code in zipped archive.