Mixtures of generalized hyperbolic distributions and mixtures of skew-t distributions for model-based clustering with incomplete data
- Wei, Yuhong, Tang, Yang, McNicholas, Paul D.
- Computational statistics & data analysis 2019 v.130 pp. 18-41
- algorithms, data collection, models
- Robust clustering from incomplete data is an important topic because, in many practical situations, real datasets are heavy-tailed, asymmetric, and/or have arbitrary patterns of missing observations. Flexible methods and algorithms for model-based clustering are presented via mixtures of generalized hyperbolic distributions and their limiting case, mixtures of multivariate skew-t distributions. An analytically feasible EM algorithm is formulated for parameter estimation and imputation of missing values for mixture models under a missing at random (MAR) mechanism. The proposed methodologies are investigated through a simulation study with varying proportions of synthetic missing values and illustrated using a real dataset. The results are compared with those obtained from the traditional mixture of generalized hyperbolic distributions counterparts after filling in the missing data by mean imputation.
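
For readers unfamiliar with the baseline named in the abstract, mean imputation can be sketched as below. This is a minimal illustration only, not the authors' code: the function name `mean_impute`, the NaN encoding of missing entries, and the toy matrix are assumptions made for the example. It shows only the fill-in step, after which a complete-data mixture model would be fitted.

```python
import numpy as np

def mean_impute(X):
    """Replace each NaN with the mean of the observed values in its column.

    Hypothetical helper for illustration; assumes missing entries are
    encoded as NaN and every column has at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    col_means = np.nanmean(X, axis=0)        # per-variable mean over observed entries
    missing = np.isnan(X)                    # boolean mask of missing cells
    X_filled = X.copy()                      # leave the input untouched
    # For each missing cell, look up the mean of the column it sits in.
    X_filled[missing] = np.take(col_means, np.nonzero(missing)[1])
    return X_filled

# Toy data: 3 observations, 2 variables, two missing cells.
X = np.array([[1.0, 2.0],
              [np.nan, 4.0],
              [3.0, np.nan]])
X_filled = mean_impute(X)
print(X_filled)
```

Because every missing cell in a variable receives the same value, this baseline ignores both the cluster structure and the dependence between variables, which is the contrast with the EM-based imputation described in the abstract.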