Journal
PEERJ
Volume 5, Issue -, Pages -
Publisher
PEERJ INC
DOI: 10.7717/peerj.3812
Keywords
Time series; Microbiota; Clustering; Marker gene; Visualization
Funding
- Natural Sciences and Engineering Research Council of Canada (NSERC)
- Canada Research Chairs program
- NSERC Discovery Grants program
- United States National Science Foundation (NSF) Microbial Observatories program [MCB-0702395]
- Long Term Ecological Research program [NTL-LTER DEB-1440297]
- INSPIRE award [DEB-1344254]
- National Institute of Food and Agriculture
- US Department of Agriculture [1002996]
- Division Of Environmental Biology
- Direct For Biological Sciences [1344254] Funding Source: National Science Foundation
- NIFA [1002996, 811025] Funding Source: Federal RePORTER
Taxonomic markers such as the 16S ribosomal RNA gene are widely used in microbial community analysis. A common first step in marker-gene analysis is grouping genes into clusters to reduce data sets to a more manageable size and potentially mitigate the effects of sequencing error. Instead of clustering based on sequence identity, marker-gene data sets collected over time can be clustered based on temporal correlation to reveal ecologically meaningful associations. We present Ananke, a free and open-source algorithm and software package that complements existing sequence-identity-based clustering approaches by clustering marker-gene data based on time-series profiles and provides interactive visualization of clusters, including highlighting of internal OTU inconsistencies. Ananke is able to cluster distinct temporal patterns from simulations of multiple ecological patterns, such as periodic seasonal dynamics and organism appearances/disappearances. We apply our algorithm to two longitudinal marker gene data sets: faecal communities from the human gut of an individual sampled over one year, and communities from a freshwater lake sampled over eleven years. Within the gut, the segregation of the bacterial community around a food-poisoning event was immediately clear. In the freshwater lake, we found that high sequence identity between marker genes does not guarantee similar temporal dynamics, and Ananke time-series clusters revealed patterns obscured by clustering based on sequence identity or taxonomy.
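The idea of clustering by temporal correlation rather than sequence identity can be illustrated with a minimal sketch. This is not Ananke's actual algorithm or API, which the abstract does not specify; it assumes Pearson correlation as the similarity measure and simple single-linkage grouping, and the function name `cluster_by_temporal_correlation` and the threshold `min_corr` are hypothetical names chosen for illustration.

```python
import numpy as np


def cluster_by_temporal_correlation(profiles, min_corr=0.9):
    """Group time-series profiles (rows = sequences, columns = time points).

    Illustrative assumption: two sequences share a cluster when a chain of
    pairwise Pearson correlations >= min_corr connects them (single linkage).
    Returns one integer cluster label per row.
    """
    profiles = np.asarray(profiles, dtype=float)
    corr = np.corrcoef(profiles)  # pairwise Pearson correlation between rows
    n = profiles.shape[0]
    parent = list(range(n))  # union-find forest, one node per sequence

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            # NaN correlations (e.g. constant profiles) compare as False,
            # so such sequences stay in their own cluster.
            if corr[i, j] >= min_corr:
                parent[find(i)] = find(j)

    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Simulated monthly samples over two years: two sequences with a seasonal
# cycle and two that appear halfway through the series.
t = np.arange(24)
seasonal = np.sin(2 * np.pi * t / 12)
appearance = np.where(t >= 12, 1.0, 0.0)
profiles = [10 + 5 * seasonal, 20 + 10 * seasonal, 8 * appearance, 4 * appearance]
labels = cluster_by_temporal_correlation(profiles)
print(labels)
```

In this toy input the two seasonal profiles end up in one cluster and the two appearance profiles in another, regardless of how similar the underlying sequences might be, which is the kind of grouping that clustering on sequence identity alone cannot provide.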