Semi-parametric copula sample selection models for count responses

Marra, Giampiero; Wyszynski, Karol
Computational statistics & data analysis 2016 v.104 pp. 110-129
algorithms, equations, military veterans, models, observational studies, surveys, United States
In observational studies, a response of interest (as well as some individual-level characteristics) may be observed only for a non-randomly selected sample of the population. In this situation, standard models such as linear and probit regressions will yield biased and inconsistent parameter estimates. Selection models can address this issue. They mainly consist of two regressions: a binary selection equation, which determines whether a statistical unit enters the sample, and an outcome equation, which models the response. While sample selection models for continuous and binary outcomes have been widely studied in the literature, count responses have received less attention. This paper introduces sample selection models for count data that allow for the use of potentially any discrete distribution, non-Gaussian dependencies between the selection and outcome equations, and flexible covariate effects. The estimation algorithm is based on the penalized likelihood estimation framework. The method is illustrated in a simulation study and with data from a United States Veterans Administration Survey.
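The selection problem described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator or code. It assumes a Poisson outcome with mean 3, a probit-type selection rule, and a Gaussian copula with correlation 0.7 linking the two equations (the paper also covers non-Gaussian dependence). The function names `poisson_quantile` and `simulate` and all parameter values are hypothetical, chosen only to show that the observed (selected) counts are not representative of the population.

```python
import math
import random
from statistics import NormalDist, mean


def poisson_quantile(u, mu):
    """Return the smallest k with P(Y <= k) >= u for Y ~ Poisson(mu)."""
    k = 0
    pmf = math.exp(-mu)
    cdf = pmf
    # The cap on k guards against u being numerically equal to 1.
    while cdf < u and k < 1000:
        k += 1
        pmf *= mu / k
        cdf += pmf
    return k


def simulate(n=20000, rho=0.7, mu=3.0, seed=1):
    """Simulate counts subject to non-random selection (illustrative values).

    Returns (population mean, mean among selected units).
    """
    rng = random.Random(seed)
    std_normal_cdf = NormalDist().cdf
    all_counts = []
    selected_counts = []
    for _ in range(n):
        # Correlated latent normals: e1 drives selection, e2 drives the outcome.
        e1 = rng.gauss(0.0, 1.0)
        e2 = rho * e1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Selection equation: the unit is observed when the latent index is positive.
        selected = e1 > 0.0
        # Outcome equation: map the latent normal to a Poisson count
        # through its uniform score (Gaussian copula construction).
        y = poisson_quantile(std_normal_cdf(e2), mu)
        all_counts.append(y)
        if selected:
            selected_counts.append(y)
    return mean(all_counts), mean(selected_counts)


if __name__ == "__main__":
    population_mean, selected_mean = simulate()
    print("population mean:", round(population_mean, 3))
    print("selected-sample mean:", round(selected_mean, 3))
```

Under these assumptions the mean count among selected units is noticeably larger than the population mean, because units with a high selection propensity also tend to have high counts. A model fitted only to the selected units, ignoring the selection equation, would inherit this bias.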