Estimating the mean and variance of a high-dimensional normal distribution using a mixture prior

Sinha, Shyamalendu; Hart, Jeffrey D.
Computational statistics & data analysis 2019 v.138 pp. 201-221
algorithms, models, normal distribution, shrinkage, variance, variance covariance matrix
A framework is provided for estimating the mean and variance of a high-dimensional normal density. The main setting considered is a fixed number of vectors following a high-dimensional normal distribution with unknown mean and diagonal covariance matrix. The diagonal covariance matrix can be known or unknown. If the covariance matrix is unknown, the sample size can be as small as 2. The proposed estimator is based on the idea that the unobserved mean/variance pairs across dimensions are drawn from an unknown bivariate distribution, which is modeled as a mixture of normal-inverse gammas. The mixture of normal-inverse gamma distributions provides advantages over more traditional empirical Bayes methods, which are based on a normal–normal model. When fitting a mixture model, the algorithm is essentially clustering the unobserved mean and variance pairs into different groups, with each group having a different normal-inverse gamma distribution. The proposed estimator of each mean is the posterior mean of shrinkage estimates, each of which shrinks a sample mean towards a different component of the mixture distribution. The proposed estimator of variance has an analogous interpretation in terms of sample variances and components of the mixture distribution. If the diagonal covariance matrix is known, then the sample size can be as small as 1, and the pairs of known variances and unknown means across dimensions are treated as random observations coming from a flexible mixture of normal-inverse gamma distributions.
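
The shrinkage idea in the abstract can be illustrated for a single dimension with a minimal sketch. It assumes the mixture weights and component parameters are already known, whereas the paper estimates them from the data, and it uses one common parametrization of the normal-inverse gamma prior (variance ~ Inverse-Gamma(a, b), mean | variance ~ N(m, variance / k)), which may differ from the authors' notation. The function names `nig_posterior` and `mixture_estimates` are hypothetical.

```python
import math


def nig_posterior(xbar, ss, n, m, k, a, b):
    """Conjugate update for one normal-inverse gamma component.

    xbar: sample mean, ss: sum of squared deviations from xbar, n: sample size.
    Prior (assumed parametrization): sigma^2 ~ InvGamma(a, b),
    mu | sigma^2 ~ N(m, sigma^2 / k).
    Returns the posterior parameters and the log marginal likelihood.
    """
    k_n = k + n
    m_n = (k * m + n * xbar) / k_n          # sample mean shrunk towards m
    a_n = a + n / 2.0
    b_n = b + 0.5 * ss + 0.5 * k * n * (xbar - m) ** 2 / k_n
    log_ml = (-0.5 * n * math.log(2.0 * math.pi)
              + 0.5 * (math.log(k) - math.log(k_n))
              + math.lgamma(a_n) - math.lgamma(a)
              + a * math.log(b) - a_n * math.log(b_n))
    return m_n, a_n, b_n, log_ml


def mixture_estimates(x, components):
    """Posterior mean of (mu, sigma^2) under a mixture of NIG priors.

    components: list of (weight, m, k, a, b) tuples; weights sum to 1.
    Requires len(x) >= 2 so that each posterior variance mean is finite.
    """
    n = len(x)
    xbar = sum(x) / n
    ss = sum((xi - xbar) ** 2 for xi in x)

    posts, log_w = [], []
    for w, m, k, a, b in components:
        m_n, a_n, b_n, log_ml = nig_posterior(xbar, ss, n, m, k, a, b)
        posts.append((m_n, b_n / (a_n - 1.0)))  # per-component posterior means
        log_w.append(math.log(w) + log_ml)

    # Normalize posterior component weights in a numerically stable way.
    top = max(log_w)
    unnorm = [math.exp(v - top) for v in log_w]
    total = sum(unnorm)
    weights = [u / total for u in unnorm]

    mean_est = sum(w * p[0] for w, p in zip(weights, posts))
    var_est = sum(w * p[1] for w, p in zip(weights, posts))
    return mean_est, var_est, weights


# Two observations in one dimension, two hypothetical prior components.
x = [1.9, 2.1]
components = [(0.5, 0.0, 1.0, 2.0, 1.0), (0.5, 2.0, 1.0, 2.0, 1.0)]
mean_est, var_est, weights = mixture_estimates(x, components)
```

Each component yields its own shrinkage estimate of the mean and variance, and the final estimates average these using the posterior probability that the dimension belongs to each component. In this example the second component, centered near the data, receives the larger posterior weight.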