PubAg

Main content area

pSumo-CD: predicting sumoylation sites in proteins with covariance discriminant algorithm by incorporating sequence-coupled effects into general PseAAC

Author:
Jia, Jianhua, Zhang, Liuxia, Liu, Zi, Xiao, Xuan, Chou, Kuo-Chen
Source:
Bioinformatics 2016 v.32 no.20 pp. 3133-3141
ISSN:
1460-2059
Subject:
Parkinson disease, algorithms, amino acids, bioinformatics, chemical bonding, covariance, drugs, equations, genes, prediction, proteins, sumoylation
Abstract:
Motivation: Sumoylation is a post-translational modification (PTM) process, in which small ubiquitin-related modifier (SUMO) is attaching by covalent bonds to substrate protein. It is critical to many different biological processes such as replicating genome, expressing gene, localizing and stabilizing proteins; unfortunately, it is also involved with many major disorders including Alzheimer’s and Parkinson’s diseases. Therefore, for both basic research and drug development, it is important to identify the sumoylation sites in proteins. Results: To address such a problem, we developed a predictor called pSumo-CD by incorporating the sequence-coupled information into the general pseudo-amino acid composition (PseAAC) and introducing the covariance discriminant (CD) algorithm, in which a bias-adjustment term, which has the function to automatically adjust the errors caused by the bias due to the imbalance of training data, had been incorporated. Rigorous cross-validations indicated that the new predictor remarkably outperformed the existing state-of-the-art prediction method for the same purpose. Availability and implementation: For the convenience of most experimental scientists, a user-friendly web-server for pSumo-CD has been established at http://www.jci-bioinfo.cn/pSumo-CD, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved. Contact: jjia@gordonlifescience.org, xxiao@gordonlifescience.org or kcchou@gordonlifescience.org Supplementary information: Supplementary data are available at Bioinformatics online.
Agid:
6247474