Article

SSAHA: A fast search method for large DNA databases

GENOME RESEARCH (2001)

Journal

GENOME RESEARCH

Volume 11, Issue 10, Pages 1725-1729

Publisher

COLD SPRING HARBOR LAB PRESS, PUBLICATIONS DEPT

DOI: 10.1101/gr.194201

Categories

Biochemistry & Molecular Biology; Biotechnology & Applied Microbiology; Genetics & Heredity

Abstract

We describe an algorithm, SSAHA (Sequence Search and Alignment by Hashing Algorithm), for performing fast searches on databases containing multiple gigabases of DNA. Sequences in the database are preprocessed by breaking them into consecutive k-tuples of k contiguous bases and then using a hash table to store the position of each occurrence of each k-tuple. Searching for a query sequence in the database is done by obtaining from the hash table the hits for each k-tuple in the query sequence and then performing a sort on the results. We discuss the effect of the tuple length k on the search speed, memory usage, and sensitivity of the algorithm and present the results of computational experiments which show that SSAHA can be three to four orders of magnitude faster than BLAST or FASTA, while requiring less memory than suffix tree methods. The SSAHA algorithm is used for high-throughput single nucleotide polymorphism (SNP) detection and very large scale sequence assembly. Also, it provides Web-based sequence search facilities for Ensembl projects.
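The indexing and search steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it uses a plain Python dictionary as the hash table, and the function names `build_index` and `search`, the `min_hits` threshold, and the shape of the returned tuples are all hypothetical choices made for illustration. It also assumes that database k-tuples are taken at non-overlapping offsets (0, k, 2k, ...) while query k-tuples are taken at every position, and that hits are grouped by the difference between database offset and query position after sorting.

```python
from collections import defaultdict
from itertools import groupby


def build_index(sequences, k):
    """Map each non-overlapping k-tuple to the (sequence id, offset) pairs where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in enumerate(sequences):
        # Assumption: consecutive, non-overlapping tuples at offsets 0, k, 2k, ...
        for offset in range(0, len(seq) - k + 1, k):
            index[seq[offset:offset + k]].append((seq_id, offset))
    return index


def search(index, query, k, min_hits=2):
    """Return (seq_id, shift, start, end) for runs of hits that share one shift.

    min_hits is a hypothetical threshold, not a parameter named in the source.
    """
    hits = []
    # Assumption: every overlapping k-tuple of the query is looked up.
    for t in range(len(query) - k + 1):
        for seq_id, offset in index.get(query[t:t + k], ()):
            # shift = database offset minus query position
            hits.append((seq_id, offset - t, offset))

    # Sorting brings hits with the same sequence and shift next to each other.
    hits.sort()

    matches = []
    for (seq_id, shift), run in groupby(hits, key=lambda h: (h[0], h[1])):
        offsets = [h[2] for h in run]
        if len(offsets) >= min_hits:
            matches.append((seq_id, shift, offsets[0], offsets[-1] + k))
    return matches


db = ["TTACGTACGGAT"]
index = build_index(db, k=2)
print(search(index, "ACGTAC", k=2))  # [(0, 2, 2, 8)]
```

In this toy example the query matches the database sequence at positions 2 to 8 (`db[0][2:8]` is `"ACGTAC"`). The sketch does not attempt the memory-efficient table layout or the speed that the abstract reports; it only shows the lookup-then-sort idea.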
