Main content area

Genome wide search to identify reference genes candidates for gene expression analysis in Gossypium hirsutum

Smitha, P. K., Vishnupriyan, K., Kar, Ananya S., Anil Kumar, M., Bathula, Christopher, Chandrashekara, K. N., Dhar, Sujan K., Das, Manjula
BMC plant biology 2019 v.19 no.1 pp. 405
Gossypium hirsutum, algorithms, cotton, crops, data collection, forage, gene expression, genes, leaves, oils, pests, plant tissues, quantitative polymerase chain reaction, sequence analysis, transgenic plants
BACKGROUND: Cotton is one of the most important commercial crops as the source of natural fiber, oil and fodder. To protect it from harmful pest populations number of newer transgenic lines have been developed. For quick expression checks in successful agriculture qPCR (quantitative polymerase chain reaction) have become extremely popular. The selection of appropriate reference genes plays a critical role in the outcome of such experiments as the method quantifies expression of the target gene in comparison with the reference. Traditionally most commonly used reference genes are the “house-keeping genes”, involved in basic cellular processes. However, expression levels of such genes often vary in response to experimental conditions, forcing the researchers to validate the reference genes for every experimental platform. This study presents a data science driven unbiased genome-wide search for the selection of reference genes by assessing variation of > 50,000 genes in a publicly available RNA-seq dataset of cotton species Gossypium hirsutum. RESULT: Five genes (TMN5, TBL6, UTR5B, AT1g65240 and CYP76B6) identified by data-science driven analysis, along with two commonly used reference genes found in literature (PP2A1 and UBQ14) were taken through qPCR in a set of 33 experimental samples consisting of different tissues (leaves, square, stem and root), different stages of leaf (young and mature) and square development (small, medium and large) in both transgenic and non-transgenic plants. Expression stability of the genes was evaluated using four algorithms - geNorm, BestKeeper, NormFinder and RefFinder. CONCLUSION: Based on the results we recommend the usage of TMN5 and TBL6 as the optimal candidate reference genes in qPCR experiments with normal and transgenic cotton plant tissues. AT1g65240 and PP2A1 can also be used if expression study includes squares. This study, for the first time successfully displays a data science driven genome-wide search method followed by experimental validation as a method of choice for selection of stable reference genes over the selection based on function alone.