Article

An automated, high-throughput sequence read classification pipeline for preliminary genome characterization

Journal

ANALYTICAL BIOCHEMISTRY
Volume 373, Issue 1, Pages 78-87

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.ab.2007.08.008

Keywords

DNA; sequence analysis; transposon; genome; bioinformatics; computational analysis; genomics; comparative

In the absence of a complete genome sequence, considerable insight into genome structure can be gained from survey sequencing of genomic DNA. To facilitate high-throughput characterization of genome structure based on shotgun sequence reads, we have developed an automated sequence read classification pipeline (SRCP). The SRCP uses a battery of novel and standard sequence analysis algorithms along with a sophisticated decision tree to place reads into best fit functional/descriptive categories. Once primed with genomic sequence data, the SRCP also permits estimation of gene/repeat enrichment afforded by reduced-representation sequencing techniques. To our knowledge, the SRCP is the only tool that has been designed to provide a description of a genome or a genome component based on sample sequence reads. In an initial test of the SRCP using sequence data from Sorghum bicolor, it was shown to provide results similar in quality to results generated by manual classification. Although the SRCP is not a replacement for manual sequence characterization, it can provide a rapid, high-quality overview of genome sequence content and facilitate subsequent annotation. The SRCP presumably can be adapted for analysis of any eukaryotic genome. (c) 2007 Elsevier Inc. All rights reserved.
