Statistical Modelling 18 (1) (2018), 50–72

Robust mixtures of factor analysis models using the restricted multivariate skew-t distribution

Tsung-I Lin
Institute of Statistics,
National Chung Hsing University,
Taichung 402,
Taiwan
e-mail: tilin@nchu.edu.tw

and

Department of Public Health,
China Medical University,
Taichung 404,
Taiwan


Wan-Lun Wang
Department of Statistics,
Feng Chia University,
Taichung 407,
Taiwan


Geoffrey J. McLachlan
Department of Mathematics,
University of Queensland,
St Lucia, 4072,
Australia


Sharon X. Lee
Department of Mathematics,
University of Queensland,
St Lucia, 4072,
Australia


Abstract:

This article introduces a robust extension of the mixture of factor analysis models based on the restricted multivariate skew-t (rMST) distribution, called the mixtures of skew-t factor analysis (MSTFA) model. This model can be viewed as a powerful tool for model-based clustering of high-dimensional data in which the observations in each cluster exhibit non-normal features such as heavy-tailed noise and extreme skewness. Because missing values frequently arise from incomplete data collection, a computationally feasible EM-type algorithm is developed to carry out maximum likelihood estimation and to produce a single imputation of any missing values under a missing at random mechanism. The numbers of factors and mixture components are determined via penalized likelihood criteria. The utility of the proposed methodology is illustrated by analysing both simulated and real datasets. In the numerical results, the proposed approach performs favourably compared to existing approaches.

Keywords:

model-based clustering; factor analysis; heavy tails; missing values; rMST distribution; robustness.

Downloads:

Example data and code in zipped archive.