Article

Support vector machines for predicting rRNA-, RNA-, and DNA-binding proteins from amino acid sequence

Journal

BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS
Volume 1648, Issue 1-2, Pages 127-133

Publisher

ELSEVIER
DOI: 10.1016/S1570-9639(03)00112-2

Keywords

classification; feature vector; function; functional genomic; machine learning; prediction; pseudo-amino acid composition; support vector machine; SVM


Classification of gene function remains one of the most important and demanding tasks in the post-genome era. Most current predictive computer methods rely on comparing features that are essentially linear to the protein sequence. However, features of a protein that are nonlinear to the sequence may also be predictive of its function. Machine learning methods, such as support vector machines (SVMs), are particularly suitable for exploiting such features. In this work we introduce SVMs and the pseudo-amino acid composition, a collection of nonlinear features extractable from protein sequence, to the field of protein function prediction. We have developed prototype SVMs for binary classification of rRNA-, RNA-, and DNA-binding proteins. Using a protein's amino acid composition and limited-range correlations of hydrophobicity and solvent-accessible surface area as input, each of the SVMs predicts whether the protein belongs to one of the three classes. In self-consistency and cross-validation tests, which measure the success of learning and prediction, respectively, the rRNA-binding SVM has consistently achieved >95% accuracy. The RNA- and DNA-binding SVMs demonstrate more diverse accuracy, ranging from ~76% to ~97%. Analysis of the test results suggests directions for improving the SVMs. © 2003 Elsevier Science B.V. All rights reserved.
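
As a rough illustration of the kind of input the abstract describes, the sketch below builds a feature vector from a sequence's amino acid composition plus a few limited-range correlation terms for one physicochemical property. It is not the authors' implementation: the Kyte-Doolittle hydropathy scale, the squared-difference correlation formula, and the names `pseudo_aa_composition`, `max_lag`, and `weight` are all assumptions chosen for illustration, and the solvent-accessible surface area terms mentioned in the abstract are omitted. The resulting vectors could then be passed to any SVM library for binary classification.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values. ASSUMPTION: the paper's hydrophobicity
# scale is not given in the abstract; this one is used only for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(scale):
    """Rescale a property table to zero mean and unit variance."""
    values = [scale[a] for a in AMINO_ACIDS]
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return {a: (scale[a] - mean) / sd for a in AMINO_ACIDS}


def pseudo_aa_composition(sequence, scale, max_lag=3, weight=0.05):
    """Return 20 composition terms followed by max_lag correlation terms.

    ASSUMPTION: the lag-k correlation is the mean squared difference of the
    standardized property between residues k positions apart, which is one
    common formulation and may differ from the paper's definition.
    """
    # Keep only the 20 standard residues; anything else is skipped.
    residues = [r for r in sequence.upper() if r in scale]
    n = len(residues)
    if n <= max_lag:
        raise ValueError("sequence is too short for the requested max_lag")

    counts = Counter(residues)
    freqs = [counts[a] / n for a in AMINO_ACIDS]

    h = standardize(scale)
    thetas = []
    for k in range(1, max_lag + 1):
        diffs = [(h[residues[i]] - h[residues[i + k]]) ** 2 for i in range(n - k)]
        thetas.append(sum(diffs) / (n - k))

    # Normalize so that all components of the vector sum to 1.
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


if __name__ == "__main__":
    # Arbitrary made-up sequence, used only to show the call.
    features = pseudo_aa_composition("MKRRIKELTAGWSVNDPLHQ", KYTE_DOOLITTLE)
    print(len(features))
```

With the default `max_lag=3`, each sequence maps to a vector of length 23 whose components sum to 1. The correlation terms depend on residue order, which is what distinguishes this representation from plain amino acid composition.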
