
Implementation of Bayesian methods to identify SNP and haplotype regions with transmission ratio distortion across the whole genome: TRDscan v.1.0

Id-Lahoucine, S., Cánovas, A., Jaton, C., Miglior, F., Fonseca, P.A.S., Sargolzaei, M., Miller, S., Schenkel, F.S., Medrano, J.F., Casellas, J.
Journal of dairy science 2019 v.102 no.4 pp. 3175-3188
Bayesian theory, Holstein, Mendelian inheritance, alleles, computer software, data collection, fetus, germ cells, haplotypes, heterozygosity, livestock, loci, models, parents, progeny, single nucleotide polymorphism
Realized deviations from the expected Mendelian inheritance of alleles from heterozygous parents, known as transmission ratio distortion (TRD), have been reported in a broad range of organisms. Various biological mechanisms affecting gametes, embryos, fetuses, or even postnatal offspring can produce patterns of TRD. However, knowledge about its prevalence and potential causes in livestock species is still scarce. Specific Bayesian models have recently been developed for the analysis of TRD at biallelic loci; these models accommodate a wide range of population structures, enabling TRD investigation in livestock populations. The parameterization of these models is flexible and allows the study of overall (parent-unspecific) TRD as well as sire- and dam-specific TRD. This research aimed at deriving Bayesian models for fitting TRD on the basis of haplotypes, testing the models for both haplotype- and SNP-based methods in simulated data and actual Holstein genotypes, and developing specific software for TRD analyses. Results obtained on simulated data sets showed that the statistical power of the analysis increased with the sample size of trios (n), the proportion of heterozygous parents, and the magnitude of the TRD. On the other hand, the statistical power to detect TRD decreased with the number of alleles at each locus. Bayesian analyses showed a strong Pearson correlation coefficient (≥0.97) between simulated and estimated TRD that reached the significance level of Bayes factor ≥10 for both single-marker and haplotype analyses when n ≥ 25. Moreover, the mean absolute error of the estimates decreased as the sample size increased and increased with the number of alleles at each locus. Using real data (55,732 genotypes of Holstein trios), SNP- and haplotype-based distortions were detected with overall TRD, sire-TRD, or dam-TRD, showing different magnitudes of TRD and statistical relevance.
Additionally, the haplotype-based method showed a greater ability to capture TRD than individual SNPs. To discard possible random TRD in real data, an approximate empirical null distribution of TRD was developed. The program TRDscan v.1.0 was written in Fortran 2008 and provides a powerful statistical tool to scan for TRD regions across the whole genome. The program is freely available at
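
To make the Bayes factor ≥10 criterion mentioned in the abstract concrete, the following is a minimal sketch of a single-locus, overall-TRD calculation. It is not the TRDscan implementation and not necessarily the authors' model: it assumes a simple binomial likelihood for the number of times one allele is transmitted by heterozygous parents, a flat prior on the transmission probability p = 0.5 + α, and a point null hypothesis α = 0. The function name `trd_summary` and its arguments are hypothetical.

```python
from math import lgamma, log


def log_binom(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def trd_summary(k, n):
    """Summarize distortion at one biallelic locus (illustrative only).

    k: times allele A was transmitted by heterozygous parents
    n: total informative transmissions from heterozygous parents
    Returns (alpha_hat, log10_bf), where alpha_hat is the posterior mean
    of alpha = p - 0.5 under a flat prior on p (an assumption), and
    log10_bf is log10 of the Bayes factor for TRD against alpha = 0.
    """
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")

    # Flat prior on p gives a Beta(k + 1, n - k + 1) posterior,
    # whose mean is (k + 1) / (n + 2).
    alpha_hat = (k + 1) / (n + 2) - 0.5

    # Marginal likelihood with p integrated over a flat prior: 1 / (n + 1).
    log_m1 = -log(n + 1)
    # Likelihood under Mendelian inheritance (p = 0.5).
    log_m0 = log_binom(n, k) + n * log(0.5)

    log10_bf = (log_m1 - log_m0) / log(10)
    return alpha_hat, log10_bf


if __name__ == "__main__":
    for k, n in [(50, 100), (65, 100), (80, 100)]:
        alpha_hat, log10_bf = trd_summary(k, n)
        flag = "BF >= 10" if log10_bf >= 1 else "BF < 10"
        print(f"k={k} n={n} alpha={alpha_hat:+.3f} log10(BF)={log10_bf:.2f} {flag}")
```

Working on the log10 scale avoids overflow for large n; a value of 1 or more corresponds to a Bayes factor of at least 10. Under these assumptions the evidence grows with both the number of informative transmissions and the size of the deviation from 0.5, which is consistent with the power trends reported for the simulated data.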