Journal
BIOINFORMATICS
Volume 23, Issue 21, Pages 2949-2951
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btm479
Funding
- Intramural NIH HHS Funding Source: Medline
Motivation: The blastp and tblastn modules of BLAST are widely used methods for searching protein queries against protein and nucleotide databases, respectively. One heuristic used in BLAST is to consider only database sequences that contain a high-scoring match of length at most 5 to the query. We implemented the capability to use words of length 6 or 7. We demonstrate an improved trade-off between running time and retrieval accuracy, controlled by the score threshold used for short word matches. For example, the running time can be reduced by 20-30% while achieving ROC (receiver operator characteristic) scores similar to those obtained with current default parameters.
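The seeding heuristic described in the abstract can be illustrated with a minimal Python sketch. It is not the BLAST implementation: it assumes a toy scoring scheme (+5 for a match, -1 for a mismatch) in place of a real substitution matrix such as BLOSUM62, and it compares every query word against every subject word by brute force, where a production tool would use a precomputed lookup table. The names `word_score` and `find_seeds` are hypothetical and chosen only for this illustration.

```python
def word_score(word_a, word_b):
    """Score two equal-length words with a toy scheme (assumption, not BLOSUM62)."""
    return sum(5 if a == b else -1 for a, b in zip(word_a, word_b))


def find_seeds(query, subject, word_len, threshold):
    """Return (query_pos, subject_pos) pairs whose words score >= threshold.

    A subject sequence with no seeds would be skipped by the heuristic;
    one with seeds would go on to the more expensive extension stage.
    """
    seeds = []
    for i in range(len(query) - word_len + 1):
        query_word = query[i:i + word_len]
        for j in range(len(subject) - word_len + 1):
            subject_word = subject[j:j + word_len]
            if word_score(query_word, subject_word) >= threshold:
                seeds.append((i, j))
    return seeds


if __name__ == "__main__":
    query = "MKVLAAGIV"
    subject = "TTMKVLSAGQQ"
    # Short words with a permissive threshold produce many seeds.
    print(find_seeds(query, subject, word_len=3, threshold=9))
    # Longer words produce fewer seeds, so fewer extensions are attempted.
    print(find_seeds(query, subject, word_len=4, threshold=20))
```

In this sketch, raising `word_len` or `threshold` reduces the number of seeds, which lowers the work done downstream at the risk of missing weaker matches. That is the running time versus retrieval accuracy trade-off the abstract refers to.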