Statistical Modelling 20 (1) (2020), 9–29

Model-based clustering for populations of networks

Mirko Signorelli,
Department of Biomedical Data Sciences,
Leiden University Medical Center,
Leiden,
The Netherlands.
e-mail: m.signorelli@lumc.nl

Ernst C. Wit,
Institute of Computational Science,
Università della Svizzera italiana,
Lugano,
Switzerland.

and

Bernoulli Institute for Mathematics, Computer Science and Artificial Intelligence,
University of Groningen,
Groningen,
The Netherlands.


Abstract:

Until recently, data on populations of networks were rarely available. However, with the advancement of automatic monitoring devices and the growing social and scientific interest in networks, such data have become more widely available. From sociological experiments involving cognitive social structures to fMRI scans revealing large-scale brain networks of groups of patients, there is a growing awareness that we urgently need tools to analyse populations of networks and particularly to model the variation between networks due to covariates. We propose a model-based clustering method based on mixtures of generalized linear (mixed) models that can be employed to describe the joint distribution of a population of networks in a parsimonious manner and to identify subpopulations of networks that share certain topological properties of interest (degree distribution, community structure, effect of covariates on the presence of an edge, etc.). Maximum likelihood estimation for the proposed model can be efficiently carried out with an implementation of the EM algorithm. We assess the performance of this method on simulated data and conclude with an example application on advice networks in a small business.

Keywords:

cognitive social structure; EM algorithm; Graph; mixture of generalized linear models; model-based clustering; network modelling; population of networks.

Downloads:

Full article can be found here. Example data and code in zipped archive.