Article

Efficient statistical significance approximation for local similarity analysis of high-throughput time series data

Journal

BIOINFORMATICS
Volume 29, Issue 2, Pages 230-237

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/bts668

Keywords

-

Funding

  1. US NSF [DMS-1043075, OCE 1136818]
  2. National Natural Science Foundation of China [60928007, 60805010]
  3. Direct For Mathematical & Physical Scien
  4. Division Of Mathematical Sciences [1043075] Funding Source: National Science Foundation
  5. Division Of Ocean Sciences
  6. Directorate For Geosciences [1136818] Funding Source: National Science Foundation

Motivation: Local similarity analysis of biological time series data helps elucidate the varying dynamics of biological systems. However, its application to large-scale high-throughput data is limited by the slow permutation procedures used to evaluate statistical significance. Results: We developed a theoretical approach to approximate the statistical significance of local similarity analysis based on the approximate tail distribution of the maximum partial sum of independent identically distributed (i.i.d.) random variables. Simulations show that the derived formula approximates the tail distribution reasonably well (starting at time points >10 with no delay and >20 with delay) and provides P-values comparable with those from permutations. The new approach enables efficient calculation of statistical significance for pairwise local similarity analysis, making possible all-to-all local association studies that would otherwise be prohibitive. As a demonstration, local similarity analysis of human microbiome time series shows that core operational taxonomic units (OTUs) are highly synergetic and some of the associations are body-site specific across samples.
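
To make the cost of the permutation baseline concrete, the following is a minimal Python sketch of a local similarity score with a bounded delay, plus a permutation P-value. It is not the authors' implementation and it does not reproduce the paper's approximation formula. The function names `local_similarity` and `permutation_p_value`, the parameters `max_delay`, `n_perm` and `seed`, and the add-one correction in the P-value are all assumptions made for illustration. The sketch also assumes both series have already been normalized to zero mean and unit variance.

```python
import random


def local_similarity(x, y, max_delay=0):
    """Maximum local similarity score of x and y, scaled by series length.

    Assumption: x and y are already normalized. The score is the largest
    partial sum of products x[i] * y[j] over contiguous aligned segments
    with |i - j| <= max_delay, taken over positive and negative association.
    """
    n = len(x)
    if n != len(y):
        raise ValueError("series must have the same length")
    # pos[i][j] / neg[i][j]: best partial sum ending at the pair (i, j)
    # for positive / negative association (1-based, row and column 0 are 0).
    pos = [[0.0] * (n + 1) for _ in range(n + 1)]
    neg = [[0.0] * (n + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - max_delay), min(n, i + max_delay) + 1):
            prod = x[i - 1] * y[j - 1]
            # Extend the segment along the diagonal, or restart at zero.
            pos[i][j] = max(0.0, pos[i - 1][j - 1] + prod)
            neg[i][j] = max(0.0, neg[i - 1][j - 1] - prod)
            best = max(best, pos[i][j], neg[i][j])
    return best / n


def permutation_p_value(x, y, max_delay=0, n_perm=200, seed=0):
    """Permutation P-value: share of shuffles of x scoring >= the observed score."""
    rng = random.Random(seed)
    observed = local_similarity(x, y, max_delay)
    shuffled = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if local_similarity(shuffled, y, max_delay) >= observed:
            hits += 1
    # Add-one correction (an illustrative choice) keeps the estimate above zero.
    return (hits + 1) / (n_perm + 1)
```

Each permutation repeats the full dynamic program, so an all-to-all study over many series pays that cost once per pair per permutation. This is the expense that a closed-form tail approximation avoids.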
