Statistical Modelling 19 (2) (2019), 109–139

Merging the components of a finite mixture using posterior probabilities

Marc Comas-Cufí
Department of Computer Science,
Applied Mathematics and Statistics,
Polytechnic School,
University of Girona,
Spain.
e-mail: josepantoni.martin@udg.edu

Josep A Martín-Fernández
Department of Computer Science,
Applied Mathematics and Statistics,
Polytechnic School,
University of Girona,
Spain.


Glòria Mateu-Figueras
Department of Computer Science,
Applied Mathematics and Statistics,
Polytechnic School,
University of Girona,
Spain.


Abstract:

Methods in parametric cluster analysis commonly assume that data can be modelled by a finite mixture of distributions. However, associating each mixture component with one cluster is frequently misleading: different mixture components can overlap, so the associated clusters overlap too, suggesting a single cluster. A number of approaches have already been proposed to construct the clusters by merging components using the posterior probabilities. This article presents a generic approach for building a hierarchy of mixture components that integrates and generalizes some techniques proposed earlier in the literature. Using this proposal, two new techniques based on the log-ratio of posterior probabilities are introduced. Moreover, two new methods are presented for deciding the final number of clusters. Simulated and real datasets are used to illustrate this methodology.

Keywords:

Hierarchical clustering; log-ratio; merging components; mixture model; model-based clustering; simplex.

Downloads:

Example data and code in zipped archive.