Towards Improved Assessment of Functional Similarity in Large-Scale Screens: A Study on Indel Length

JOURNAL OF COMPUTATIONAL BIOLOGY (2010)

Journal

JOURNAL OF COMPUTATIONAL BIOLOGY

Volume 17, Issue 1, Pages 1-20

Publisher

MARY ANN LIEBERT, INC

DOI: 10.1089/cmb.2009.0031

Keywords

algorithms; biochemical networks; computational molecular biology; gene expression; HMM; machine learning; RNA; secondary structure

Funding

Pacific Institute for the Mathematical Sciences

Abstract

Although insertions and deletions are a common type of evolutionary sequence variation, their origins and functional consequences are not yet comprehensively understood. Most alignment algorithms and programs only roughly reflect the evolutionary processes that result in gaps, which therefore typically require further evaluation. Interestingly, it is widely believed that gaps are the predominant form of sequence variation resulting in structural and functional changes. Thus, when assessing the functional similarity of proteins based on computational alignments, it is desirable to distinguish gaps that reflect true point mutations from gaps that are alignment artifacts. Here we introduce pair hidden Markov model-based solutions to rapidly assess the statistical significance of gaps in alignments resulting from classical Needleman-Wunsch-like alignment procedures that implement affine gap penalty scoring schemes. Surprisingly, although it has a natural formulation, the resulting Markov chain problem had no known efficient solution thus far. In this article, we present the first efficient algorithm to solve it. We demonstrate that, when comparing paralogous protein pairs (from Escherichia coli) of equal alignment identity and similarity, alignments that contain gaps of significant length are significantly less similar in terms of functionality, as measured by Gene Ontology (GO) term similarity. This demonstrates for the first time, in a formally sound manner, that insertions and deletions cause more severe functional changes between proteins than substitutions. Our method can be reliably employed to quickly filter alignment outputs for protein pairs that are more likely to be functionally similar and/or divergent, and it establishes a sound and useful add-on for large-scale alignment studies.
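To make the idea of a "gap of significant length" concrete, the following is a minimal sketch and not the algorithm presented in the article, whose details the abstract does not give. It assumes the textbook pair hidden Markov model that corresponds to affine gap penalties, in which a gap state returns to itself with a fixed extension probability, so the length of a single gap is geometrically distributed. The function names `gap_lengths` and `gap_tail_probability` and the value of `extend_prob` are hypothetical and chosen only for illustration.

```python
def gap_lengths(aligned_a: str, aligned_b: str) -> list[int]:
    """Return the lengths of maximal gap runs in a pairwise alignment.

    Both strings are rows of the same alignment, with '-' marking a gap.
    A run ends when the gap switches rows or an aligned column appears.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    lengths = []
    run = 0
    prev = None  # row holding the current gap run, or None
    for x, y in zip(aligned_a, aligned_b):
        cur = "a" if x == "-" else "b" if y == "-" else None
        if cur is not None and cur == prev:
            run += 1  # the current gap run continues
        else:
            if run:
                lengths.append(run)  # close the previous run
            run = 1 if cur is not None else 0
        prev = cur
    if run:
        lengths.append(run)
    return lengths


def gap_tail_probability(length: int, extend_prob: float) -> float:
    """P(gap length >= length) under a geometric gap-length model.

    Assumption: once a gap is opened, it is extended with probability
    extend_prob at each step, independently of everything else.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if not 0.0 <= extend_prob < 1.0:
        raise ValueError("extend_prob must lie in [0, 1)")
    return extend_prob ** (length - 1)


if __name__ == "__main__":
    # Hypothetical alignment and extension probability, for illustration only.
    for k in gap_lengths("AC--GT-A", "ACTTG-TA"):
        print(k, gap_tail_probability(k, extend_prob=0.5))
```

This per-gap tail probability ignores the number of gaps and the length of the alignment, so it is only a first-order intuition for why long gaps are rare under an affine model. A sound significance test over a whole alignment has to account for those factors as well.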