Statistical Modelling 19 (2) (2019), 109–139
Merging the components of a finite mixture using posterior probabilities
Marc Comas-Cufí
Department of Computer Science,
Applied Mathematics and Statistics,
Polytechnic School,
University of Girona,
Spain.
e-mail: josepantoni.martin@udg.edu
Josep A Martín-Fernández
Department of Computer Science,
Applied Mathematics and Statistics,
Polytechnic School,
University of Girona,
Spain.
Glòria Mateu-Figueras
Department of Computer Science,
Applied Mathematics and Statistics,
Polytechnic School,
University of Girona,
Spain.
Abstract:
Methods in parametric cluster analysis commonly assume that data can be modelled by a finite mixture of distributions. However, associating each mixture component with one cluster is frequently misleading: different mixture components can overlap, so the associated clusters can overlap too, suggesting a single cluster. A number of approaches have already been proposed to construct the clusters by merging components using the posterior probabilities. This article presents a generic approach for building a hierarchy of mixture components that integrates and generalizes some techniques proposed earlier in the literature. Using this proposal, two new techniques based on the log-ratio of posterior probabilities are introduced. Moreover, two new methods are presented for deciding the final number of clusters. Simulated and real datasets are used to illustrate this methodology.
Keywords:
Hierarchical clustering; log-ratio; merging components; mixture model; model-based clustering; simplex.
Downloads:
Example data and code in zipped archive.