Article, Proceedings Paper

A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data

Journal

BIOINFORMATICS
Volume 30, Issue 12, Pages 78-86

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btu284

Funding

  1. National Science Foundation [CCF-1053753]
  2. National Institutes of Health [R01HG5690]
  3. Career Award at the Scientific Interface from the Burroughs Wellcome Fund
  4. Alfred P. Sloan Research Fellowship
  5. Natural Sciences and Engineering Research Council of Canada (NSERC)
  6. National Science Foundation, Directorate for Computer & Information Science & Engineering, Division of Computing and Communication Foundations [1053753]

Motivation: High-throughput sequencing of tumor samples has shown that most tumors exhibit extensive intra-tumor heterogeneity, with multiple subpopulations of tumor cells containing different somatic mutations. Recent studies have quantified this intra-tumor heterogeneity by clustering mutations into subpopulations according to the observed counts of DNA sequencing reads containing the variant allele. However, these clustering approaches do not consider that the population frequencies of different tumor subpopulations are correlated by their shared ancestry in the same population of cells.

Results: We introduce the binary tree partition (BTP), a novel combinatorial formulation of the problem of constructing the subpopulations of tumor cells from the variant allele frequencies of somatic mutations. We show that finding a BTP is an NP-complete problem; derive an approximation algorithm for an optimization version of the problem; and present a recursive algorithm to find a BTP with errors in the input. We show that the resulting algorithm outperforms existing clustering approaches on simulated and real sequencing data.
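
To make the quantities in the abstract concrete, the following is a minimal illustrative sketch, not the authors' algorithm. It computes a variant allele frequency (VAF) as the fraction of reads carrying the variant allele. It then checks one plausible tree constraint: that each internal node's frequency equals the sum of its two children's frequencies, within a tolerance for noisy input. The abstract does not give the formal BTP definition, so this sum condition, the function names, the `tree` and `freq` data layout, and the tolerance value are all assumptions made for illustration.

```python
def variant_allele_frequency(variant_reads, total_reads):
    """Fraction of reads at a locus that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= variant_reads <= total_reads:
        raise ValueError("variant_reads must lie between 0 and total_reads")
    return variant_reads / total_reads


def satisfies_sum_condition(tree, freq, tol=1e-9):
    """Check an assumed constraint: every internal node's frequency equals
    the sum of its two children's frequencies, up to an absolute tolerance.

    tree: dict mapping each internal node to a (left, right) pair of children
    freq: dict mapping every node to its frequency
    """
    return all(
        abs(freq[parent] - (freq[left] + freq[right])) <= tol
        for parent, (left, right) in tree.items()
    )


# Hypothetical example: five frequencies arranged on a small binary tree.
example_tree = {"root": ("a", "b"), "a": ("c", "d")}
example_freq = {"root": 0.5, "a": 0.3, "b": 0.2, "c": 0.2, "d": 0.1}
consistent = satisfies_sum_condition(example_tree, example_freq)
```

Verifying a given tree in this way is straightforward. The hard part, according to the abstract, is finding such a tree from the frequencies alone, which the authors show to be NP-complete.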
