Increased prediction accuracy using a genomic feature model including prior information on quantitative trait locus regions in purebred Danish Duroc pigs

Sarup, Pernille, Jensen, Just, Ostersen, Tage, Henryon, Mark, Sørensen, Peter
BMC Genetics 2016 v.17 no.1 pp. 11
Duroc, animal breeding, average daily gain, boars, data collection, databases, feed conversion, genetic markers, genetic traits, genetic variance, genetic variation, genotype, heritability, lean meat, phenotype, prediction, purebreds, quantitative trait loci, single nucleotide polymorphism, statistical models, variance
BACKGROUND: In animal breeding, genetic variance for complex traits is often estimated using linear mixed models that incorporate information from single nucleotide polymorphism (SNP) markers through a realized genomic relationship matrix. In such models, individual genetic markers are weighted equally and genomic variation is treated as a “black box.” This approach is useful for selecting animals with high genetic potential, but it neither generates nor uses knowledge of the biological mechanisms underlying trait variation. Here we propose a linear mixed-model approach that can evaluate the collective effects of sets of SNPs and thereby open the “black box.” The described genomic feature best linear unbiased prediction (GFBLUP) model has two components that are defined by genomic features.

RESULTS: We analysed data on average daily gain, feed efficiency, and lean meat percentage from 3,085 Duroc boars, along with genotypes from a 60 K SNP chip. In addition, information on known quantitative trait loci (QTL) from the animal QTL database was integrated into the GFBLUP model as a genomic feature. Our results showed that the most significant QTL categories were indeed biologically meaningful. Additionally, for high-heritability traits, prediction accuracy was improved by incorporating biological knowledge into the prediction models. A simulation study using the real genotypes and simulated phenotypes demonstrated challenges in detecting causal variants for low- to medium-heritability traits.

CONCLUSIONS: The GFBLUP model showed increased predictive ability when the genomic feature included enough causal variants to explain over 10 % of the genomic variance, and when dilution by non-causal markers was minimal. In the observed data set, predictive ability was increased by including prior QTL information obtained outside the training data set, but only for the trait with the highest heritability.
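
The two-component idea in the abstract can be illustrated with a minimal sketch. It assumes that the two GFBLUP components correspond to two genomic relationship matrices, one built from SNPs inside the genomic feature (for example, known QTL regions) and one from the remaining SNPs, and that each matrix is computed with the common VanRaden-style scaling. The abstract confirms neither detail. The function names, the toy genotypes, and the variance values are hypothetical and only show the bookkeeping, not the authors' implementation.

```python
import numpy as np

def vanraden_grm(M):
    """Genomic relationship matrix from a genotype matrix M
    (animals x SNPs, coded 0/1/2), using VanRaden-style scaling.
    Assumption: the abstract does not state which scaling was used."""
    p = M.mean(axis=0) / 2.0              # allele frequency per SNP
    Z = M - 2.0 * p                       # centre each SNP column
    denom = 2.0 * np.sum(p * (1.0 - p))   # expected heterozygosity scaling
    return Z @ Z.T / denom

def split_grms(M, feature_idx):
    """Return (G_feature, G_rest): one relationship matrix for SNPs in
    the genomic feature and one for all remaining SNPs."""
    mask = np.zeros(M.shape[1], dtype=bool)
    mask[feature_idx] = True
    return vanraden_grm(M[:, mask]), vanraden_grm(M[:, ~mask])

def feature_variance_share(var_feature, var_rest):
    """Proportion of genomic variance attributed to the feature."""
    return var_feature / (var_feature + var_rest)

# Toy data (hypothetical): 20 animals, 200 SNPs, first 30 SNPs in the feature.
rng = np.random.default_rng(0)
M = rng.binomial(2, 0.3, size=(20, 200)).astype(float)
feature_idx = np.arange(30)
G_f, G_r = split_grms(M, feature_idx)

# Hypothetical variance component estimates, e.g. from a REML fit.
share = feature_variance_share(0.15, 0.85)
```

In a model of this shape, the phenotype would be fitted with two random genetic effects whose covariances are proportional to `G_f` and `G_r`. The ratio computed by `feature_variance_share` is the quantity the conclusions compare against the 10 % threshold.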