Diversity and population structure of northern switchgrass as revealed through exome capture sequencing
- Joseph Evans, Emily Crisovan, Kerrie Barry, Chris Daum, Jerry Jenkins, Govindarajan Kunde‐Ramamoorthy, Aruna Nandety, Chew Yee Ngan, Brieanne Vaillancourt, Chia‐Lin Wei, Jeremy Schmutz, Shawn M. Kaeppler, Michael D. Casler, Carol Robin Buell
- The Plant Journal 2015 v.84 no.4 pp. 800-815
- Panicum virgatum, lowlands, heading, biofuels, conserved sequences, ecotypes, feedstocks, gene dosage, genes, genetic distance, grasses, highlands, phenotype, phenotypic variation, population structure, single nucleotide polymorphism, species diversity, tetraploidy, North America
- Panicum virgatum L. (switchgrass) is a polyploid, perennial grass species that is native to North America, and is being developed as a future biofuel feedstock crop. Switchgrass is present primarily in two ecotypes: a northern upland ecotype, composed of tetraploid and octoploid accessions, and a southern lowland ecotype, composed of primarily tetraploid accessions. We employed high‐coverage exome capture sequencing (~2.4 Tb) to genotype 537 individuals from 45 upland and 21 lowland populations. From these data, we identified ~27 million single‐nucleotide polymorphisms (SNPs), of which 1 590 653 high‐confidence SNPs were used in downstream analyses of diversity within and between the populations. From the 66 populations, we identified five primary population groups within the upland and lowland ecotypes, a result that was further supported through genetic distance analysis. We identified conserved, ecotype‐restricted, non‐synonymous SNPs that are predicted to affect the protein function of CONSTANS (CO) and EARLY HEADING DATE 1 (EHD1), key genes involved in flowering, which may contribute to the phenotypic differences between the two ecotypes. We also identified, relative to the near‐reference Kanlow population, 17 228 genes present in more copies than in the reference genome (up‐CNVs), 112 630 genes present in fewer copies than in the reference genome (down‐CNVs) and 14 430 presence/absence variants (PAVs), affecting a total of 9979 genes, including two upland‐specific CNV clusters. In total, 45 719 genes were affected by an SNP, CNV, or PAV across the panel, providing a firm foundation to identify functional variation associated with phenotypic traits of interest for biofuel feedstock production.
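The three structural-variant categories defined in the abstract (up-CNV, down-CNV, PAV) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `classify_gene`, the integer copy-number inputs, the rule that a PAV means zero copies in the sample for a gene present in the reference, and the toy gene IDs are all assumptions made here for illustration only.

```python
from collections import Counter


def classify_gene(sample_copies: int, reference_copies: int) -> str:
    """Label one gene by comparing sample copy number to the reference.

    Assumption: copy numbers are already estimated as integers; how a real
    study derives them from read depth is not covered by this sketch.
    """
    if reference_copies > 0 and sample_copies == 0:
        return "PAV"        # present in the reference, absent in the sample
    if sample_copies > reference_copies:
        return "up-CNV"     # more copies than in the reference genome
    if sample_copies < reference_copies:
        return "down-CNV"   # fewer copies than in the reference genome
    return "unchanged"


# Hypothetical gene IDs mapped to (sample copies, reference copies).
toy_calls = {
    "geneA": (3, 2),
    "geneB": (1, 2),
    "geneC": (0, 1),
    "geneD": (2, 2),
}

labels = {gene: classify_gene(s, r) for gene, (s, r) in toy_calls.items()}
summary = Counter(labels.values())  # number of genes per category
```

Here `labels` maps each toy gene to one category and `summary` counts genes per category, which mirrors how the abstract reports separate totals for up-CNVs, down-CNVs, and PAVs.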