Main content area

Effect of cleavage enzyme, search algorithm and decoy database on mass spectrometric identification of wheat gluten proteins

Vensel, William H., Dupont, Frances M., Sloane, Stacia, Altenbach, Susan B.
Phytochemistry 2011 v.72 no.10 pp. 1154-1161
algorithms, amino acid sequences, chymotrypsin, computer software, cultivars, databases, energy, expressed sequence tags, gliadin, glutamine, glutenins, mass spectrometry, peptides, proline, trypsin, wheat, wheat gluten
While tandem mass spectrometry (MS/MS) is routinely used to identify proteins from complex mixtures, certain types of proteins present unique challenges for MS/MS analyses. The major wheat gluten proteins, gliadins and glutenins, are particularly difficult to distinguish by MS/MS. Each of these groups contains many individual proteins with similar sequences that include repetitive motifs rich in proline and glutamine. These proteins have few cleavable tryptic sites, often resulting in only one or two tryptic peptides that may not provide sufficient information for identification. Additionally, there are less than 14,000 complete protein sequences from wheat in the current NCBInr release. In this paper, MS/MS methods were optimized for the identification of the wheat gluten proteins. Chymotrypsin and thermolysin as well as trypsin were used to digest the proteins and the collision energy was adjusted to improve fragmentation of chymotryptic and thermolytic peptides. Specialized databases were constructed that included protein sequences derived from contigs from several assemblies of wheat expressed sequence tags (ESTs), including contigs assembled from ESTs of the cultivar under study. Two different search algorithms were used to interrogate the database and the results were analyzed and displayed using a commercially available software package (Scaffold). We examined the effect of protein database content and size on the false discovery rate. We found that as database size increased above 30,000 sequences there was a decrease in the number of proteins identified. Also, the type of decoy database influenced the number of proteins identified. Using three enzymes, two search algorithms and a specialized database allowed us to greatly increase the number of detected peptides and distinguish proteins within each gluten protein group.