Article

ROCker: accurate detection and quantification of target genes in short-read metagenomic data sets by modeling sliding-window bitscores

Journal

NUCLEIC ACIDS RESEARCH
Volume 45, Issue 3, Pages -

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gkw900

Funding

  1. U.S. Department of Energy, Office of Biological and Environmental Research, Genomic Science Program [DE-SC0006662]
  2. US National Science Foundation [1241046, 1356288]
  3. Chilean Fulbright-Conicyt doctoral scholarship [L.H.O.]
  4. National Science Foundation, Directorate for Biological Sciences, Division of Environmental Biology [1241046]
  5. National Science Foundation, Directorate for Biological Sciences, Division of Biological Infrastructure [1356288]

Functional annotation of metagenomic and metatranscriptomic data sets relies on similarity searches based on e-value thresholds, resulting in an unknown number of false positive and false negative matches. To overcome these limitations, we introduce ROCker, aimed at identifying position-specific, most-discriminant thresholds in sliding windows along the sequence of a target protein, accounting for non-discriminative domains shared by unrelated proteins. ROCker employs the receiver operating characteristic (ROC) curve to minimize the false discovery rate (FDR) and calculate the best thresholds based on how simulated shotgun metagenomic reads of known composition map onto well-curated reference protein sequences, and thus differs from HMM profiles and related methods. We showcase ROCker using ammonia monooxygenase (amoA) and nitrous oxide reductase (nosZ) genes, which mediate the oxidation of ammonia and the reduction of the potent greenhouse gas N2O to inert N2, respectively. ROCker typically showed 60-fold lower FDR when compared to the common practice of using fixed e-values. Previously uncounted 'atypical' nosZ genes were found to be two times more abundant, on average, than their typical counterparts in most soil metagenomes, and the abundance of bacterial amoA was quantified against the highly related particulate methane monooxygenase (pmoA). Therefore, ROCker can reliably detect and quantify target genes in short-read metagenomes.
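
The idea of position-specific thresholds can be illustrated with a minimal Python sketch. It is not ROCker's actual implementation: the names Hit, best_threshold, window_thresholds and passes are hypothetical, the window and step sizes are arbitrary, and the cutoff is chosen by maximizing Youden's J (true positive rate minus false positive rate) on the ROC curve, which is assumed here as one plausible reading of "most-discriminant". Each simulated read is assumed to carry its alignment midpoint on the reference protein, its bitscore, and a label saying whether it truly came from the target gene.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Hit:
    """One simulated read aligned to the reference protein (hypothetical record)."""
    midpoint: int      # alignment midpoint on the reference, 1-based
    bitscore: float    # bitscore of the alignment
    is_target: bool    # ground truth, known because the reads are simulated


def best_threshold(hits: List[Hit]) -> Optional[float]:
    """Return the bitscore cutoff that maximizes Youden's J = TPR - FPR.

    Returns None when the window lacks either target or non-target hits,
    because no ROC curve can be drawn in that case.
    """
    pos = sum(h.is_target for h in hits)
    neg = len(hits) - pos
    if pos == 0 or neg == 0:
        return None
    best_j, best_t = -1.0, None
    # Every observed bitscore is a candidate cutoff (one point on the ROC curve).
    for t in sorted({h.bitscore for h in hits}):
        tp = sum(h.is_target and h.bitscore >= t for h in hits)
        fp = sum((not h.is_target) and h.bitscore >= t for h in hits)
        j = tp / pos - fp / neg
        if j > best_j:
            best_j, best_t = j, t
    return best_t


def window_thresholds(
    hits: List[Hit], protein_len: int, window: int = 20, step: int = 10
) -> List[Tuple[int, int, Optional[float]]]:
    """Slide a window along the protein and compute one cutoff per window."""
    out = []
    for start in range(1, protein_len + 1, step):
        end = min(start + window - 1, protein_len)
        in_window = [h for h in hits if start <= h.midpoint <= end]
        out.append((start, end, best_threshold(in_window)))
    return out


def passes(
    midpoint: int,
    bitscore: float,
    thresholds: List[Tuple[int, int, Optional[float]]],
) -> bool:
    """Accept a read if it meets the cutoff of the first window covering it."""
    for start, end, t in thresholds:
        if start <= midpoint <= end and t is not None:
            return bitscore >= t
    return False  # no window with a usable cutoff covers this position
```

In this sketch, a region shared with unrelated proteins ends up with a higher cutoff than a region unique to the target, so the same bitscore can be accepted at one position and rejected at another. A single fixed e-value cannot express that distinction.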
