Alternative haplotype construction methods for genomic evaluation

Jónás, Dávid, Ducrocq, Vincent, Fouilloux, Marie-Noëlle, Croiseau, Pascal
Journal of dairy science 2016 v.99 no.6 pp. 4537-4546
alleles, breeding value, bulls, cattle production, dairy cattle, genetic markers, genomics, haplotypes, linkage disequilibrium, prediction, quantitative trait loci, single nucleotide polymorphism, France
Most genomic evaluation methods today use biallelic single nucleotide polymorphisms (SNP) as genomic markers to trace quantitative trait loci (QTL). However, SNP can be combined into short, multiallelic haplotypes that can improve genomic prediction because of higher linkage disequilibrium between the haplotypes and the linked QTL. The aim of this study was to develop a method to identify haplotypes that can be expected to be superior in genomic evaluation compared with either SNP or other haplotypes of the same size. We first identified the SNP (termed QTL-SNP) from the bovine 50K SNP chip that had the largest effects on the analyzed trait. It was assumed that these SNP were not the causative mutations and merely indicated the approximate location of the QTL. Haplotypes of 3, 4, or 5 SNP were selected from short genomic windows surrounding these markers to capture the effect of the QTL. The two methods described in this paper aim at selecting the optimal haplotype for genomic evaluation. Both assume that the effect of an allele can be predicted accurately if the allele has a high frequency. These methods were tested in a classical validation study using a dairy cattle population of 2,235 bulls with genotypes from the bovine 50K SNP chip and daughter yield deviations (DYD) on 5 dairy cattle production traits. Combining the SNP into haplotypes was beneficial with all tested haplotypes, leading to an average increase of 2% in the correlations between DYD and genomic breeding value estimates compared with the analysis in which the same SNP were used individually. Compared with haplotypes built by merging the QTL-SNP with its flanking SNP, the haplotypes selected with the proposed criteria carried fewer under- and over-represented alleles: the proportions of alleles with frequencies <1% or >40% decreased, on average, by 17.4 and 43.4%, respectively.
The correlations between DYD and genomic breeding value estimates increased by 0.7 to 0.9 percentage points when the haplotypes were selected using either of the proposed methods compared with using the haplotypes built from the QTL-SNP and its flanking markers. We showed that the efficiency of genomic prediction can be improved at no extra cost, simply by selecting the proper markers or combinations of markers for genomic prediction. One of the presented approaches was implemented in the new genomic evaluation procedure applied to dairy cattle in France in April 2015.
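
To make the frequency argument in the abstract concrete, the short Python sketch below counts haplotype alleles in a small window and reports the share of alleles that are rare (<1%) or very common (>40%), the two thresholds quoted above. It is only an illustration: the abstract does not state the two selection criteria, so none is implemented here, and the function names, the 0/1 coding of phased SNP alleles, and the toy data are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def haplotype_frequencies(phased, snp_idx):
    """Frequency of each haplotype allele formed by the SNP columns in snp_idx.

    `phased` is a list of phased chromosomes, each a tuple of 0/1 SNP alleles
    (hypothetical coding chosen for this sketch).
    """
    counts = Counter(tuple(chrom[i] for i in snp_idx) for chrom in phased)
    total = len(phased)
    return {allele: n / total for allele, n in counts.items()}


def extreme_allele_share(freqs, low=0.01, high=0.40):
    """Proportion of haplotype alleles with frequency < low or > high."""
    n = len(freqs)
    rare = sum(1 for f in freqs.values() if f < low) / n
    common = sum(1 for f in freqs.values() if f > high) / n
    return rare, common


def candidate_haplotypes(window, size):
    """All SNP combinations of the given size inside a window of SNP indices."""
    return list(combinations(window, size))


# Toy data (made up): 8 phased chromosomes, 5 SNP; SNP 2 plays the QTL-SNP.
phased = [
    (0, 0, 1, 0, 1),
    (0, 0, 1, 0, 1),
    (0, 1, 1, 0, 0),
    (1, 1, 0, 1, 0),
    (1, 0, 0, 1, 1),
    (0, 1, 0, 0, 1),
    (1, 1, 1, 0, 0),
    (1, 0, 1, 1, 0),
]

# Baseline haplotype: the QTL-SNP merged with its two flanking SNP.
flanking = haplotype_frequencies(phased, (1, 2, 3))
print(flanking)
print(extreme_allele_share(flanking))

# Every 3-SNP haplotype that could be built from the 5-SNP window.
for snps in candidate_haplotypes(range(5), 3):
    freqs = haplotype_frequencies(phased, snps)
    print(snps, len(freqs), extreme_allele_share(freqs))
```

A selection rule would then rank the candidate haplotypes by some function of these allele frequencies; which function the authors used is described in the full paper, not in this abstract.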