Utilization of information from gene networks towards a better understanding of functional similarities between complex traits: a dairy cattle model

Frąszczak, Magdalena, Suchocki, Tomasz, Szyda, Joanna
Journal of applied genetics 2016 v.57 no.1 pp. 129-133
biochemical pathways, dairy cattle, data collection, databases, genes, genetic correlation, marker-assisted selection, milk fat yield, milk protein yield, single nucleotide polymorphism, somatic cells, statistical models
Our study quantified functional similarities between complex traits recorded in dairy cattle: milk yield, fat yield, protein yield, somatic cell score and stature. Similarities were calculated from the gene sets forming gene networks and from the sets of gene ontology terms underlying the genes estimated as significant for the analysed traits. Gene networks were obtained with the Bisogenet and Gene Set Linkage Analysis (GSLA) software. The highest similarity was observed between milk yield and fat yield. A very low degree of similarity was found between protein yield and stature when gene sets were used as the similarity criterion, and between protein yield and fat yield when sets of gene ontology terms were used. Pearson correlation coefficients between gene effect estimates, representing additive polygenic similarities, were highest for protein yield and milk yield and lowest for protein yield and somatic cell score. With the 50 K Illumina SNP chip data from the national genomic selection data set alone, only the most significant gene-trait associations can be retrieved; enhancing these data with the functional information contained in interaction data stored in public databases and with metabolic pathway information facilitates a better characterization of the functional background of the traits and, furthermore, trait comparison. The most interesting result of our study was that the functional similarity observed between protein yield and the milk and fat yields contradicted the moderate genetic correlations estimated earlier for the same population with a multivariate mixed model. The discrepancy indicates that the infinitesimal model assumed in that study reflects an averaged correlation due to polygenes but fails to reveal the functional background underlying the traits, a background that arises from the cumulative composition of many genes involved in metabolic pathways and appears to differ between the protein-fat yield and protein-milk yield pairs.
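The two kinds of comparison described in the abstract, overlap between per-trait gene (or gene ontology term) sets and Pearson correlation between gene effect estimates, can be illustrated with a minimal sketch. The abstract does not state which set-similarity measure was used, so the Jaccard index below is an assumption chosen only for illustration. The function names, trait labels and gene identifiers (`G1`, `G2`, ...) are hypothetical placeholders, not data or results from the study.

```python
from itertools import combinations
from math import sqrt


def jaccard(set_a, set_b):
    """Jaccard index |A & B| / |A | B| (assumed measure, not from the study)."""
    a, b = set(set_a), set(set_b)
    union = a | b
    if not union:
        return 0.0  # two empty sets: define similarity as 0
    return len(a & b) / len(union)


def pearson(x, y):
    """Pearson correlation between two equally long lists of effect estimates."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def pairwise_similarity(sets_by_trait):
    """Jaccard index for every unordered pair of traits."""
    return {
        (t1, t2): jaccard(sets_by_trait[t1], sets_by_trait[t2])
        for t1, t2 in combinations(sorted(sets_by_trait), 2)
    }


# Hypothetical placeholder gene sets, one per trait.
toy_sets = {
    "fat_yield": {"G1", "G2", "G3"},
    "milk_yield": {"G2", "G3", "G4"},
    "stature": {"G5"},
}
toy_similarity = pairwise_similarity(toy_sets)
```

The same `jaccard` function applies unchanged to sets of gene ontology terms, while `pearson` takes the per-gene effect estimates of two traits listed in the same gene order.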