Statistical Modelling 12 (2012), 93–115

Variational Bayesian inference and complexity control for stochastic block models

Pierre Latouche
Laboratoire Statistique et Génome,
Tour Evry 2,
523 place des terrasses de l’Agora
F–91000 Evry
France
Email: pierre.latouche@genopole.cnrs.fr

E Birmelé and C Ambroise
Laboratoire Statistique et Génome,
UMR CNRS 8071, UEVE

Abstract:

It is now widely accepted that knowledge can be acquired from networks by clustering their vertices according to their connection profiles. Many methods have been proposed, and in this paper we concentrate on the Stochastic Block Model (SBM). The clustering of vertices and the estimation of SBM parameters have been the subject of previous work, and numerous inference strategies, such as variational expectation maximization (EM) and classification EM, have been proposed. However, SBM still suffers from a lack of criteria to estimate the number of components in the mixture. To our knowledge, only one model-based criterion, the Integrated Complete-data Likelihood (ICL), has been derived for SBM in the literature. It relies on an asymptotic approximation of the integrated complete-data likelihood, and recent studies have shown that it tends to be too conservative in the case of small networks. To tackle this issue, we propose a new criterion that we call Integrated Likelihood Variational Bayes (ILvb), based on a non-asymptotic approximation of the marginal likelihood. We describe how the criterion can be computed through a variational Bayes EM algorithm.

Keywords:

random graphs; stochastic block models; community detection; variational EM; variational Bayes EM; integrated complete-data likelihood; integrated observed-data likelihood

Downloads:

Code in the R package 'mixer'.