PubAg

Addressing overfitting and underfitting in Gaussian model-based clustering

Author:
Andrews, Jeffrey L.
Source:
Computational statistics & data analysis 2018 v.127 pp. 160-171
ISSN:
0167-9473
Subject:
algorithms, cluster analysis, models
Abstract:
The expectation–maximization (EM) algorithm is a common approach for parameter estimation in the context of cluster analysis using finite mixture models. This approach suffers from the well-known issue of convergence to local maxima, but also the less obvious problem of overfitting. These combined, and competing, concerns are illustrated through simulation and then addressed by introducing an algorithm that augments the traditional EM with the nonparametric bootstrap. Further simulations and applications to real data lend support for the usage of this bootstrap augmented EM-style algorithm to avoid both overfitting and local maxima.
Agid:
5972336
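
Illustrative sketch (not part of the catalog record):
The abstract names two ingredients, the EM algorithm for finite Gaussian mixtures and the nonparametric bootstrap, but does not say how the paper combines them. The following is only a minimal sketch of each ingredient on its own, assuming a one-dimensional mixture of two Gaussians. The function names (`em_two_gaussians`, `bootstrap_sample`), the variance floor, and the iteration count are hypothetical choices made for illustration and are not taken from the paper.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_loglik(data, weights, means, variances):
    """Log-likelihood of the data under a two-component Gaussian mixture."""
    total = 0.0
    for x in data:
        dens = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(dens, 1e-300))  # guard against log(0)
    return total


def em_two_gaussians(data, rng, n_iter=200, var_floor=1e-6):
    """Plain EM from one random start (assumed setup: 1-D, two components)."""
    n = len(data)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    means = rng.sample(data, 2)                  # random start: two data points
    variances = [max(grand_var, var_floor)] * 2  # start from the pooled variance
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            s = sum(dens) or 1e-300
            resp.append([d / s for d in dens])
        # M-step: weighted updates of proportions, means, and variances.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            # The floor is an ad hoc guard against a component collapsing
            # onto a single point, one symptom of overfitting.
            variances[k] = max(var_k, var_floor)
    return {
        "weights": weights,
        "means": means,
        "variances": variances,
        "loglik": mixture_loglik(data, weights, means, variances),
    }


def bootstrap_sample(data, rng):
    """Nonparametric bootstrap: resample n points with replacement."""
    return [rng.choice(data) for _ in data]


# Usage: simulated data from two well-separated components.
rng = random.Random(0)
data = [rng.gauss(-2.0, 1.0) for _ in range(100)] + [rng.gauss(3.0, 1.0) for _ in range(100)]

# Several random starts are a common hedge against local maxima.
best = max((em_two_gaussians(data, rng) for _ in range(5)), key=lambda f: f["loglik"])
print("best-of-5 means:", sorted(best["means"]))

# Fit on one bootstrap resample, then score that fit on the original data.
boot_fit = em_two_gaussians(bootstrap_sample(data, rng), rng)
print("bootstrap fit scored on original data:",
      mixture_loglik(data, boot_fit["weights"], boot_fit["means"], boot_fit["variances"]))
```

Picking the highest-likelihood fit across restarts targets local maxima but can favor overfitted solutions, which is the tension the abstract describes. Scoring a bootstrap-resample fit on the original data is shown here only as one plausible way to examine that tension, not as the paper's procedure.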