Statistical Modelling 14 (3) (2014), 229–255

A confusion index for measuring separation and clustering

Nicholas T Longford
SNTL and UPF,
Barcelona,
Spain
e-mail: sntlnick@sntl.co.uk

Jitka Bartošová
Vysoká Škola Ekonomická v Praze,
Jindřichův Hradec,
Czech Republic


Abstract:

An index for characterizing the separation of two distributions is introduced. It is applied to assessing whether mixture components are clusters. A related property of being a satellite and a partial ordering of the components are defined. A sequence of clustering structures is defined for a finite mixture with a continuum of thresholds that qualify a cluster. The approach is suitable for outcomes with arbitrary univariate or multivariate distributions and their mixtures. The properties of the index are explored by simulations and on examples.

Keywords:

clusters; confusion; mixture; satellite; separation

Downloads:

Example data in zipped archive