Identification and characterization of transcript polymorphisms in soybean lines varying in oil composition and content
- BMC Genomics 2014 v.15 pp. 1-17
- Glycine max, amino acid sequences, breeding, fatty acid composition, genes, lipid content, loci, messenger RNA, oleic acid, seeds, single nucleotide polymorphism, soybean oil, soybeans, stearic acid, transcriptome
- Genetic and genomic diversity underlying variation in seed oil composition and content among soybean varieties can largely be captured by differences in the transcript sequences and/or transcript accumulation of oil-production-related genes in seeds. To identify these variations, we sequenced the seed transcriptomes of nine soybean lines varying in oil composition and/or total oil content. Our results showed that 50,485 distinct transcripts from 32,885 genes were expressed in seeds. A total of 8,037 transcript expression polymorphisms and 50,485 transcript sequence polymorphisms (48,792 SNPs and 1,693 small indels) were identified among the lines. The effects of the transcript polymorphisms on the encoded protein sequences and their functions were predicted. We provide independent evidence that the lack of FAD2-1A gene activity and a non-synonymous SNP in the coding sequence of FAB2C caused the elevated oleic acid and stearic acid levels in soybean lines M23 and FAM94-41, respectively. We also observed elevated transcript accumulation of genes clustered at the rhg1 locus in soybean line Jack, suggesting a role for these genes in the soybean cyst nematode (SCN) resistance of Jack. The collection of transcript polymorphisms, together with their predicted functional effects, should be a valuable asset for discovering further genes and gene variants important to oil quality and for developing highly effective markers for soybean oil quality breeding programs.
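
The abstract distinguishes SNPs by their predicted effect on the protein sequence, for example the non-synonymous SNP in FAB2C. As a minimal illustration of that distinction only, and not the pipeline used in the study, the sketch below classifies a single-base substitution in a coding sequence using the standard genetic code. The function name `classify_snp` and the toy sequence are hypothetical and are not taken from the paper.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds, pos, alt):
    """Classify a single-base substitution in a coding sequence.

    cds: coding sequence read in frame from its first base (hypothetical input)
    pos: 0-based position of the substituted base within cds
    alt: the alternative base at that position
    Returns (effect, reference_amino_acid, alternative_amino_acid).
    """
    cds = cds.upper()
    alt = alt.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    if not 0 <= pos < usable:
        raise ValueError("position is outside the complete codons of the CDS")
    if alt not in BASES:
        raise ValueError("alt must be one of A, C, G, T")

    start = pos - pos % 3             # first base of the affected codon
    offset = pos % 3                  # position of the SNP inside that codon
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]

    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        effect = "synonymous"
    elif alt_aa == "*":
        effect = "stop-gained"
    else:
        effect = "non-synonymous"
    return effect, ref_aa, alt_aa


# Toy coding sequence (hypothetical): ATG GCT GAA encodes Met-Ala-Glu.
toy_cds = "ATGGCTGAA"
print(classify_snp(toy_cds, 5, "C"))  # GCT -> GCC
print(classify_snp(toy_cds, 6, "A"))  # GAA -> AAA
```

A real analysis would also need transcript models, strand handling, and indel-aware annotation, all of which this sketch leaves out.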