A resource of single-nucleotide polymorphisms for rainbow trout generated by restriction-site associated DNA sequencing of doubled haploids

Yniv Palti, Guangtu Gao, Michael R. Miller, Roger L. Vallejo, Paul A. Wheeler, Edwige Quillet, Jianbo Yao, Gary H. Thorgaard, Mohamed Salem, Caird E. Rexroad III
Molecular Ecology Resources 2014 v.14 no.3 pp. 588-596
DNA, Oncorhynchus mykiss, data collection, doubled haploids, gene frequency, genome, genotype, haploidy, sequence analysis, single nucleotide polymorphism
Salmonid genomes are considered to be in a pseudo-tetraploid state as a result of an evolutionarily recent genome duplication event. This situation complicates single nucleotide polymorphism (SNP) discovery in rainbow trout, as many putative SNPs are actually paralogous sequence variants (PSVs) and not simple allelic variants. To minimize false discovery of PSVs, we used a panel of 19 homozygous doubled haploid (DH) lines that represent a wide geographic range of rainbow trout populations. In the first phase of the study, we analyzed SbfI restriction-site associated DNA (RAD) sequence data from all 19 lines and selected 11 lines for an extended SNP discovery. In the second phase, we conducted extended SNP discovery using PstI RAD sequence data from the selected 11 lines. The dataset is composed of 145,168 high-quality putative SNPs that were genotyped in at least 9 of the 11 lines, of which 71,446 (49%) had minor allele frequencies (MAF) of at least 18% (i.e. the minor allele was present in at least 2 of the 11 lines). Approximately 14% of the RAD SNPs in this dataset are from expressed or coding rainbow trout sequences. In the support files for this resource, we annotated the positions of the SNPs in the working draft of the rainbow trout reference genome, provided the genotypes of each sample in the discovery panel, and identified SNPs that are likely to be in coding sequences. Our comparison of the current dataset with previous SNP discovery datasets revealed that 99% of our SNPs are novel.
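
The two thresholds quoted in the abstract (genotyped in at least 9 of the 11 lines, MAF of at least 18%) can be illustrated with a short Python sketch. This is not the authors' pipeline. It assumes each homozygous DH line contributes a single allele call per SNP, that missing calls are recorded as None, and that SNPs are biallelic. The function names and the example calls are hypothetical.

```python
from collections import Counter

def minor_allele_frequency(calls):
    """Return the MAF for one SNP across DH lines.

    calls: one allele per line (e.g. "A"), or None when the line was not
    genotyped. Assumes a biallelic SNP (an illustrative simplification).
    """
    observed = [c for c in calls if c is not None]
    counts = Counter(observed)
    if len(counts) < 2:
        return 0.0  # monomorphic or no data: no minor allele
    return min(counts.values()) / len(observed)

def passes_filters(calls, min_genotyped=9, min_maf=0.18):
    """Apply the two thresholds quoted in the abstract (hypothetical helper)."""
    genotyped = sum(c is not None for c in calls)
    return genotyped >= min_genotyped and minor_allele_frequency(calls) >= min_maf

# Minor allele seen in 2 of 11 genotyped lines: 2/11 is about 0.18.
two_of_eleven = ["A"] * 9 + ["G"] * 2
# Minor allele seen in only 1 of 11 lines: 1/11 is about 0.09.
one_of_eleven = ["A"] * 10 + ["G"]

print(round(minor_allele_frequency(two_of_eleven), 3), passes_filters(two_of_eleven))
print(round(minor_allele_frequency(one_of_eleven), 3), passes_filters(one_of_eleven))
```

Under these assumptions, with 9 to 11 genotyped lines a single line carrying the minor allele gives a MAF of at most 1/9 (about 11%), while two such lines give at least 2/11 (about 18%), which is why the 18% cutoff corresponds to "at least 2 of the 11 lines".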