Missing covariate data in generalized linear mixed models with distribution-free random effects

Liu, Li; Xiang, Liming
Computational statistics & data analysis 2019 v.134 pp. 1-16
data collection, forest health, monitoring
We consider generalized linear mixed models in which random effects are free of parametric distributions and missing-at-random data are present in some covariates. To overcome the problem of missing data, we propose two novel methods relying on auxiliary variables: a penalized conditional likelihood method when covariates are independent of random effects, and a two-step procedure, consisting of a pairwise likelihood for estimating fixed effects in the first step and a penalized conditional likelihood for estimating random effects in the second step, when covariates can be related to random effects. Our methods allow a nonparametric structure for the missing covariate data and do not rely on distribution assumptions for random effects, which are not observed in the data, thus providing great flexibility in capturing a broad range of missingness mechanisms and behaviors of random effects. We show that the proposed estimators enjoy desirable theoretical properties by relaxing the conditions for a finite number of clusters or finite cluster size imposed in the literature. The finite sample performance of the estimators is assessed through extensive simulations. We illustrate the application of the methods using a longitudinal data set on forest health monitoring.