Quotation Malsiner-Walli, Gertraud, Frühwirth-Schnatter, Sylvia, Grün, Bettina. 2016. Model-based clustering based on sparse finite Gaussian mixtures. Statistics and Computing 26 (1), 303-324.




In the framework of Bayesian model-based clustering based on a finite mixture of Gaussian distributions, we present a joint approach to estimate the number of mixture components and identify cluster-relevant variables simultaneously as well as to obtain an identified model. Our approach consists in specifying sparse hierarchical priors on the mixture weights and component means. In a deliberately overfitting mixture model the sparse prior on the weights empties superfluous components during MCMC. A straightforward estimator for the true number of components is given by the most frequent number of non-empty components visited during MCMC sampling. Specifying a shrinkage prior, namely the normal gamma prior, on the component means leads to improved parameter estimates as well as identification of cluster-relevant variables. After estimating the mixture model using MCMC methods based on data augmentation and Gibbs sampling, an identified model is obtained by relabeling the MCMC output in the point process representation of the draws. This is performed using K-centroids cluster analysis based on the Mahalanobis distance. We evaluate our proposed strategy in a simulation setup with artificial data and by applying it to benchmark data sets.
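The estimator for the number of components described in the abstract (the most frequent number of non-empty components visited during MCMC sampling) can be illustrated with a minimal sketch. It assumes the sampler has already produced one vector of component allocations per iteration. The function name `estimate_num_components` and the hand-made `toy_draws` are hypothetical and only for illustration; they are not taken from the paper or its software, and the toy values are not real MCMC output.

```python
from collections import Counter


def estimate_num_components(allocation_draws):
    """Return the most frequent number of non-empty components.

    allocation_draws: iterable of sequences; each sequence holds the
    component label assigned to every observation in one MCMC iteration
    (hypothetical input format, assumed for this sketch).
    """
    # A component is non-empty in a draw if at least one observation
    # is allocated to it, so count the distinct labels per draw.
    nonempty_counts = Counter(len(set(draw)) for draw in allocation_draws)
    # The estimator is the mode of these counts over all draws.
    k_hat = nonempty_counts.most_common(1)[0][0]
    return k_hat, nonempty_counts


# Hand-made toy allocations for 6 observations over 5 iterations
# (illustrative only, not output of an actual sampler).
toy_draws = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 1, 2],
    [0, 0, 1, 1, 1, 1],
    [0, 3, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2],
]

k_hat, nonempty_counts = estimate_num_components(toy_draws)
print(k_hat, dict(nonempty_counts))
```

In practice the draws would come from the Gibbs sampler of the overfitting mixture after burn-in; the relabeling step via K-centroids clustering mentioned in the abstract is not covered by this sketch.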



Publication's profile

Status of publication Published
Affiliation WU
Type of publication Journal article
Journal Statistics and Computing
Citation Index SCI
WU-Journal-Rating new FIN-A, VW-D
Language English
Title Model-based clustering based on sparse finite Gaussian mixtures
Volume 26
Number 1
Year 2016
Page from 303
Page to 324
URL http://link.springer.com/article/10.1007%2Fs11222-014-9500-2
DOI http://dx.doi.org/10.1007/s11222-014-9500-2


Malsiner-Walli, Gertraud
Frühwirth-Schnatter, Sylvia
Grün, Bettina
Institute for Statistics and Mathematics IN