
The whole alignment and nothing but the alignment: the problem of spurious alignment flanks

NUCLEIC ACIDS RESEARCH (2008)

Journal

NUCLEIC ACIDS RESEARCH

Volume 36, Issue 18, Pages 5863-5871

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/nar/gkn579

Category

Biochemistry & Molecular Biology

Funding

National Library of Medicine
National Institutes of Health

Abstract

Pairwise sequence alignment is a ubiquitous tool for inferring the evolution and function of DNA, RNA and protein sequences. It is therefore essential to identify alignments arising by chance alone, i.e. spurious alignments. On one hand, if an entire alignment is spurious, statistical techniques for identifying and eliminating it are well known. On the other hand, if only a part of the alignment is spurious, elimination is much more problematic. In practice, even the sizes and frequencies of spurious subalignments remain unknown. This article shows that some common scoring schemes tend to overextend alignments and generate spurious alignment flanks up to hundreds of base pairs/amino acids in length. In the UCSC genome database, for example, spurious flanks probably comprise more than 18% of the human-fugu genome alignment. To evaluate the possibility that chance alone generated a particular flank on a particular pairwise alignment, we provide a simple 'overalignment' P-value. The overalignment P-value can identify spurious alignment flanks, thereby eliminating potentially misleading inferences about evolution and function. Moreover, by explicitly demonstrating the tradeoff between over- and under-alignment, our methods guide the rational choice of scoring schemes for various alignment tasks.
