Journal
JOURNAL OF COMPUTATIONAL CHEMISTRY
Volume 30, Issue 9, Pages 1414-1423
Publisher
WILEY
DOI: 10.1002/jcc.21163
Keywords
gray level co-occurrence matrix; cellular automaton image; pseudo amino acid composition; covariant-discriminant algorithm; evolutionary pharmacology; G-protein-coupled receptor
Funding
- National Natural Science Foundation of China [60661003]
- Province National Natural Science Foundation of Jiangxi [0611060]
- The plan for training youth scientists (stars of Jing-Gang) of Jiangxi Province
Given an uncharacterized protein sequence, how can we identify whether it is a G-protein-coupled receptor (GPCR)? If it is, which functional family does it belong to? These questions matter because GPCRs are among the most frequent targets of therapeutic drugs, and the answers are very useful for comparative and evolutionary pharmacology, a technique often used in drug development. Here, we present a web-server predictor called GPCR-CA, where CA stands for Cellular Automaton (Wolfram, S. Nature 1984, 311, 419): CA images are used to reveal pattern features hidden in long and complicated protein sequences. The gray-level co-occurrence matrix factors extracted from the CA images are then used to represent the protein samples through their pseudo amino acid composition (Chou, K.C. Proteins 2001, 43, 246). GPCR-CA is a two-layer predictor: the first-layer prediction engine identifies a query protein as GPCR or non-GPCR; if it is a GPCR, the process continues automatically with the second-layer prediction engine, which identifies its type among the following six functional classes: (a) rhodopsin-like, (b) secretin-like, (c) metabotropic/glutamate/pheromone, (d) fungal pheromone, (e) cAMP receptor, and (f) frizzled/smoothened family. The overall success rates of the predictor for the first and second layers are over 91% and 83%, respectively, as obtained through rigorous jackknife cross-validation tests on a newly constructed stringent benchmark dataset in which no protein has >= 40% pairwise sequence identity to any other in the same subset. GPCR-CA is freely accessible at http://218.65.61.89:8080/bioinfo/GPCR-CA, where one can get the two-layer results for a query protein sequence within about 20 seconds. (C) 2008 Wiley Periodicals, Inc. J Comput Chem 30: 1414-1423, 2009
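The feature pipeline named in the abstract (sequence, then cellular automaton image, then gray-level co-occurrence matrix factors) can be illustrated with a minimal Python sketch. It is not the GPCR-CA implementation: the abstract does not state the residue coding, the CA rule, the number of evolution steps, or which co-occurrence factors are used. The 5-bit residue code, the rule number 90, the step count, the horizontal pixel offset, and the two factors (contrast and energy) are therefore all assumptions chosen only for illustration, as are the function names.

```python
# Illustrative sketch only; every parameter below is an assumption,
# not a value taken from the GPCR-CA paper.

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def encode(seq):
    """Hypothetical coding: each residue becomes the 5-bit binary
    representation of its index in ALPHABET."""
    bits = []
    for aa in seq:
        bits.extend(int(b) for b in format(ALPHABET.index(aa), "05b"))
    return bits


def ca_step(row, rule):
    """One update of an elementary (two-state, radius-1) cellular
    automaton with periodic boundaries, using Wolfram rule numbering."""
    n = len(row)
    out = []
    for i in range(n):
        # row[i - 1] wraps to the last cell when i == 0
        neighborhood = (row[i - 1] << 2) | (row[i] << 1) | row[(i + 1) % n]
        out.append((rule >> neighborhood) & 1)
    return out


def ca_image(seq, rule=90, steps=8):
    """Stack the encoded sequence and its successive CA updates into a
    binary image (list of rows). Rule and step count are placeholders."""
    rows = [encode(seq)]
    for _ in range(steps):
        rows.append(ca_step(rows[-1], rule))
    return rows


def glcm(image, dr=0, dc=1, levels=2):
    """Normalized gray-level co-occurrence matrix for pixel pairs
    separated by the offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    height, width = len(image), len(image[0])
    for r in range(height):
        for c in range(width):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < height and 0 <= c2 < width:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Two common co-occurrence factors: contrast and energy."""
    levels = range(len(p))
    contrast = sum((i - j) ** 2 * p[i][j] for i in levels for j in levels)
    energy = sum(p[i][j] ** 2 for i in levels for j in levels)
    return {"contrast": contrast, "energy": energy}


if __name__ == "__main__":
    image = ca_image("MKTAYIAKQR")  # arbitrary toy sequence
    print(glcm_features(glcm(image)))
```

In a predictor of the kind described, such factors would be appended to the amino acid composition to form the pseudo amino acid composition vector that is passed to the classifier; that step is omitted here because the abstract gives no details of it.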