Article

Sequence-based heuristics for faster annotation of non-coding RNA families

Journal

BIOINFORMATICS
Volume 22, Issue 1, Pages 35-39

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/bti743

Funding

  1. NHGRI NIH HHS [HG 00035, R01 HG 02602] Funding Source: Medline
  2. NATIONAL HUMAN GENOME RESEARCH INSTITUTE [R01HG002602] Funding Source: NIH RePORTER

Motivation: Non-coding RNAs (ncRNAs) are functional RNA molecules that do not code for proteins. Covariance Models (CMs) are a useful statistical tool for finding new members of an ncRNA gene family in a large genome database, using both sequence and, importantly, RNA secondary structure information. Unfortunately, CM searches are extremely slow. Previously, we created rigorous filters, which provably sacrifice none of a CM's accuracy while making searches significantly faster for virtually all ncRNA families. However, searches with these rigorous filters are still slower than they could be with heuristic filters.

Results: In this paper we introduce profile HMM-based heuristic filters. We show that their accuracy is usually superior to that of heuristics based on BLAST. Moreover, we compared our heuristics with those used in tRNAscan-SE, whose heuristics incorporate a significant amount of work specific to tRNAs, whereas our heuristics are generic to any ncRNA. Performance was roughly comparable, so we expect that our heuristics provide a high-quality solution that, unlike family-specific solutions, can scale to hundreds of ncRNA families.
