4.7 Article

genBlastA: Enabling BLAST to identify homologous gene sequences

Journal

GENOME RESEARCH
Volume 19, Issue 1, Pages 143-149

Publisher

COLD SPRING HARBOR LAB PRESS, PUBLICATIONS DEPT
DOI: 10.1101/gr.082081.108

Funding

  1. Natural Sciences and Engineering Research Council of Canada (NSERC)
  2. SFU Community Trust
  3. Michael Smith Foundation for Health Research (MSFHR) Scholar

BLAST is a widely used local similarity search tool for identifying homologous sequences. When a gene sequence (either protein or nucleotide) is used as a query to search for homologous sequences in a genome, the search results, reported as a list of high-scoring pairs (HSPs), are fragments of candidate genes rather than full-length candidate genes. Relevant HSPs (signal), which represent candidate genes in the target genome sequences, are buried within a report that also contains hundreds to thousands of random HSPs (noise). Consequently, BLAST results are often overwhelming and confusing, even to experienced users. To use BLAST effectively, a program is needed that extracts from the full HSP report the relevant HSPs representing candidate homologous genes. To this end, we have designed a graph-based algorithm, genBlastA, which automatically filters HSPs into well-defined groups, each representing a candidate gene in the target genome. The novelty of genBlastA is an edge length metric that reflects a set of biologically motivated requirements, so that each shortest path corresponds to an HSP group representing a homologous gene. We have demonstrated that this novel algorithm is both efficient and accurate for identifying homologous sequences, and that it outperforms existing approaches with similar functionality.
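To make the idea of grouping HSPs along a shortest path concrete, here is a minimal Python sketch. It is not genBlastA's actual metric, which the abstract does not specify. It assumes a hypothetical edge length equal to the number of query residues left uncovered between two co-linear HSPs, forbids edges whose target gap exceeds a hypothetical `max_gap`, and ignores strand and overlapping HSPs. The names `HSP`, `edge_length`, and `best_hsp_group` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HSP:
    q_start: int  # query start (1-based, inclusive)
    q_end: int    # query end (inclusive)
    t_start: int  # target (genome) start
    t_end: int    # target (genome) end


def edge_length(a, b, max_gap):
    """Hypothetical edge length from HSP a to HSP b.

    Returns None when b cannot follow a (not co-linear, or too far apart
    on the target); otherwise the number of query residues skipped.
    """
    if b.q_start <= a.q_end or b.t_start <= a.t_end:
        return None
    if b.t_start - a.t_end - 1 > max_gap:
        return None
    return b.q_start - a.q_end - 1


def best_hsp_group(hsps, query_len, max_gap=10_000):
    """Shortest source-to-sink path through a DAG of HSPs.

    Path length = query residues not covered by the chosen HSPs, so the
    shortest path is the co-linear group that covers the query best.
    """
    if not hsps:
        return [], query_len
    # Sorting by target start gives a topological order of the DAG.
    order = sorted(hsps, key=lambda h: h.t_start)
    # Source edge: query residues missing before the first HSP.
    dist = [h.q_start - 1 for h in order]
    prev = [None] * len(order)
    for i, b in enumerate(order):
        for j in range(i):
            w = edge_length(order[j], b, max_gap)
            if w is not None and dist[j] + w < dist[i]:
                dist[i] = dist[j] + w
                prev[i] = j
    # Sink edge: query residues missing after the last HSP.
    total = [dist[i] + query_len - order[i].q_end for i in range(len(order))]
    end = min(range(len(order)), key=total.__getitem__)
    path, i = [], end
    while i is not None:
        path.append(order[i])
        i = prev[i]
    return path[::-1], total[end]


# Two HSPs from one candidate gene plus one distant random hit.
hsps = [
    HSP(1, 50, 1_000, 1_149),
    HSP(51, 100, 2_000, 2_149),
    HSP(20, 40, 500_000, 500_062),
]
group, uncovered = best_hsp_group(hsps, query_len=100)
print([(h.q_start, h.q_end) for h in group], uncovered)
```

In this toy input the two neighbouring HSPs chain into one group that covers the whole query, while the distant hit is left out. A realistic metric would also need to weigh alignment scores, strand, and overlapping HSPs, and would extract several groups per genome rather than one.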
