Graphlet Kernels for Prediction of Functional Residues in Protein Structures

JOURNAL OF COMPUTATIONAL BIOLOGY (2010)

Journal

JOURNAL OF COMPUTATIONAL BIOLOGY

Volume 17, Issue 1, Pages 55-72

Publisher

MARY ANN LIEBERT, INC

DOI: 10.1089/cmb.2009.0029

Keywords

algorithms; graphs; kernel methods; machine learning; protein structure; protein function

Funding

NIH [1R21CA113711]
NSF [IIS-0447773, DBI-0321756, DBI-0644017]
NATIONAL CANCER INSTITUTE [R21CA113711] Funding Source: NIH RePORTER

Abstract

We introduce a novel graph-based kernel method for annotating functional residues in protein structures. A structure is first modeled as a protein contact graph, where nodes correspond to residues and edges connect spatially neighboring residues. Each vertex in the graph is then represented as a vector of counts of labeled non-isomorphic subgraphs (graphlets), centered on the vertex of interest. A similarity measure between two vertices is expressed as the inner product of their respective count vectors and is used in a supervised learning framework to classify protein residues. We evaluated our method on two function prediction problems: identification of catalytic residues in proteins, which is a well-studied problem suitable for benchmarking, and a much less explored problem of predicting phosphorylation sites in protein structures. The performance of the graphlet kernel approach was then compared against two alternative methods, a sequence-based predictor and our implementation of the FEATURE framework. On both tasks, the graphlet kernel performed favorably; however, the margin of difference was considerably higher on the problem of phosphorylation site prediction. While there is data that phosphorylation sites are preferentially positioned in intrinsically disordered regions, we provide evidence that for the sites that are located in structured regions, neither the surface accessibility alone nor the averaged measures calculated from the residue microenvironments utilized by FEATURE were sufficient to achieve high accuracy. The key benefit of the graphlet representation is its ability to capture neighborhood similarities in protein structures via enumerating the patterns of local connectivity in the corresponding labeled graphs.
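The pipeline described in the abstract (contact graph, per-vertex graphlet count vectors, inner-product similarity) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one representative coordinate per residue, a hypothetical distance cutoff of 6.0 Å, one-letter residue codes as vertex labels, and graphlets of at most three vertices. The function names `contact_graph`, `graphlet_counts`, and `graphlet_kernel` are illustrative, and the kernel is left unnormalized.

```python
from collections import Counter
from itertools import combinations
import math


def contact_graph(coords, cutoff=6.0):
    """Build an adjacency dict: residues i and j are linked when their
    representative points lie within `cutoff` (an assumed threshold)."""
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) <= cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def graphlet_counts(adj, labels, v):
    """Count labeled graphlets with 1 to 3 vertices that contain vertex v.
    Each key records the graphlet shape, the position of v, and the labels."""
    counts = Counter()
    lv = labels[v]
    counts[("node", lv)] += 1
    # Two-vertex graphlets: v plus one neighbor.
    for a in adj[v]:
        counts[("edge", lv, labels[a])] += 1
    # Three-vertex graphlets where v touches both other vertices:
    # a triangle if the neighbors are linked, else a path centered on v.
    for a, b in combinations(sorted(adj[v]), 2):
        pair = tuple(sorted((labels[a], labels[b])))
        shape = "triangle" if b in adj[a] else "path_center"
        counts[(shape, lv) + pair] += 1
    # Three-vertex paths v - a - b where b is not a neighbor of v.
    for a in adj[v]:
        for b in adj[a]:
            if b != v and b not in adj[v]:
                counts[("path_end", lv, labels[a], labels[b])] += 1
    return counts


def graphlet_kernel(c1, c2):
    """Inner product of two sparse count vectors."""
    return sum(c1[k] * c2[k] for k in c1.keys() & c2.keys())


# Toy example: four residues on a line, 5 units apart, so only
# consecutive residues are in contact under the assumed cutoff.
coords = [(0, 0, 0), (5, 0, 0), (10, 0, 0), (15, 0, 0)]
labels = "GSGS"
adj = contact_graph(coords)
c1 = graphlet_counts(adj, labels, 1)
c3 = graphlet_counts(adj, labels, 3)
print(graphlet_kernel(c1, c1), graphlet_kernel(c1, c3))
```

In a supervised setting, a kernel matrix built from such pairwise values could be passed to any kernel-based classifier. The choice of cutoff, graphlet size, and label alphabet here is purely illustrative.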
