
Large-scale prediction of drug–target interactions using protein sequences and drug topological structures

Cao, Dong-Sheng, Liu, Shao, Xu, Qing-Song, Lu, Hong-Mei, Huang, Jian-Hua, Hu, Qian-Nan, Liang, Yi-Zeng
Analytica chimica acta 2012 v.752 pp. 1-10
G-protein coupled receptors, amino acid sequences, data collection, drugs, enzymes, humans, ion channels, models, physicochemical properties, prediction, structure-activity relationships, support vector machines, topology
The identification of interactions between drugs and target proteins plays a key role in the process of genomic drug discovery. It is both time-consuming and costly to determine drug–target interactions by experiments alone. Therefore, there is an urgent need to develop new in silico prediction approaches capable of identifying these potential drug–target interactions in a timely manner. In this article, we aim to extend current structure–activity relationship (SAR) methodology to fulfill such requirements. In some sense, a drug–target interaction can be regarded as an event or property triggered by many influencing factors from drugs and target proteins. Thus, each interaction pair can be represented theoretically by these factors, which are based on the structural and physicochemical properties of both the drug and the protein. To realize this, drug molecules are encoded with MACCS substructure fingerprints representing the presence of certain functional groups or fragments, and proteins are encoded with some biochemical and physicochemical properties. Four classes of drug–target interaction networks in humans, involving enzymes, ion channels, G-protein-coupled receptors (GPCRs) and nuclear receptors, are independently used for establishing predictive models with support vector machines (SVMs). The SVM models gave prediction accuracies of 90.31%, 88.91%, 84.68% and 83.74% for the four datasets, respectively. In conclusion, the results demonstrate the ability of our proposed method to predict drug–target interactions, and show a general compatibility between the new scheme and current SAR methodology. They open the way to a host of new investigations on the diversity analysis and prediction of drug–target interactions.
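
The pair representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 8-bit fingerprint is a hypothetical stand-in for a MACCS fingerprint (the public MACCS key set is usually described as having 166 keys), amino acid composition is assumed here as one simple example of a sequence-derived protein descriptor, and the names `amino_acid_composition` and `encode_pair` are invented for illustration. The sketch only shows how drug bits and protein descriptors might be concatenated into one feature vector per drug–target pair.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every protein
# maps to a descriptor vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in the sequence.

    Assumption: composition stands in for the richer biochemical and
    physicochemical protein descriptors mentioned in the abstract.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def encode_pair(drug_bits, protein_sequence):
    """Concatenate drug fingerprint bits with protein descriptors."""
    if any(bit not in (0, 1) for bit in drug_bits):
        raise ValueError("drug fingerprint must contain only 0/1 bits")
    drug_part = [float(bit) for bit in drug_bits]
    return drug_part + amino_acid_composition(protein_sequence)


# Hypothetical toy inputs, not taken from the paper's datasets.
toy_fingerprint = [1, 0, 0, 1, 0, 1, 0, 0]
toy_sequence = "MKTAYIAKQR"

pair_vector = encode_pair(toy_fingerprint, toy_sequence)
print(len(pair_vector))
```

Under these assumptions each pair becomes a fixed-length numeric vector, so known interacting and non-interacting pairs could be passed as labeled rows to a binary classifier such as an SVM.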