A Bayesian group sparse multi-task regression model for imaging genetics

Greenlaw, Keelin, Szefer, Elena, Graham, Jinko, Lesperance, Mary, Nathoo, Farouk S.
Bioinformatics 2017 v.33 no.16 pp. 2513-2522
Alzheimer disease, Bayesian theory, algorithms, bioinformatics, brain, computer software, data collection, genes, genetic variation, genomics, genotyping, image analysis, models, regression analysis, single nucleotide polymorphism, statistical inference, value added
Recent advances in technology for brain imaging and high-throughput genotyping have motivated studies examining the influence of genetic variation on brain structure. Wang et al. have developed an approach for the analysis of imaging genomic studies using penalized multi-task regression with regularization based on a novel group l2,1-norm penalty, which encourages structured sparsity at both the gene level and the SNP level. While incorporating a number of useful features, the proposed method only furnishes a point estimate of the regression coefficients; techniques for conducting statistical inference are not provided. A new Bayesian method is proposed here to overcome this limitation. We develop a Bayesian hierarchical modeling formulation in which the posterior mode corresponds to the estimator proposed by Wang et al., together with an approach that allows for full posterior inference, including the construction of interval estimates for the regression parameters. We show that the proposed hierarchical model can be expressed as a three-level Gaussian scale mixture, and this representation facilitates the use of a Gibbs sampling algorithm for posterior simulation. Simulation studies demonstrate that the interval estimates obtained using our approach achieve adequate coverage probabilities that outperform those obtained from the nonparametric bootstrap. Our proposed methodology is applied to the analysis of neuroimaging and genetic data collected as part of the Alzheimer’s Disease Neuroimaging Initiative (ADNI), and this analysis of the ADNI cohort clearly demonstrates the added value of incorporating interval estimation beyond point estimation alone when relating SNPs to brain imaging endophenotypes. Software and sample data are available as an R package ‘bgsmtr’ that can be downloaded from The Comprehensive R Archive Network (CRAN). Supplementary data are available at Bioinformatics online.
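
To make the structured-sparsity idea in the abstract concrete, the following is a minimal sketch of a penalty that combines a gene-level (group) term with a SNP-level (row-wise l2,1) term. It assumes the commonly used form in which each gene's block of coefficients is penalized by its Frobenius norm and each SNP's row by its Euclidean norm. The function name, the `groups` argument, and the weights `gamma1` and `gamma2` are hypothetical and chosen for illustration; this is not the interface of the bgsmtr package, and the exact weighting used by Wang et al. or in the paper may differ.

```python
import numpy as np

def group_sparse_penalty(W, groups, gamma1=1.0, gamma2=1.0):
    """Illustrative gene-level plus SNP-level penalty (hypothetical helper).

    W      : array of shape (d, c), d SNPs by c imaging phenotypes.
    groups : list of row-index lists, one list per gene (assumed layout).
    gamma1 : weight on the gene-level term (assumed name).
    gamma2 : weight on the SNP-level term (assumed name).
    """
    W = np.asarray(W, dtype=float)
    # Gene-level term: Frobenius norm of each gene's block of rows,
    # which pushes entire genes toward zero together.
    group_term = sum(np.linalg.norm(W[np.asarray(idx), :]) for idx in groups)
    # SNP-level term: Euclidean norm of each row, which pushes
    # individual SNPs toward zero across all phenotypes.
    row_term = np.linalg.norm(W, axis=1).sum()
    return gamma1 * group_term + gamma2 * row_term

# Three SNPs, two phenotypes; SNPs 0 and 1 belong to one gene, SNP 2 to another.
W_example = [[3.0, 4.0], [0.0, 0.0], [0.0, 0.0]]
penalty = group_sparse_penalty(W_example, groups=[[0, 1], [2]])
```

In this toy case only the first SNP is nonzero, so the gene-level term and the SNP-level term each contribute 5, and the second gene contributes nothing to either term.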