Journal
BIOINFORMATICS
Volume 30, Issue 15, Pages 2121-2129Publisher
OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btu174
Keywords
-
Categories
Funding
- National Institute of Health [R01HG006870]
Ask authors/readers for more resources
Motivation: Next-generation sequencing (NGS) has revolutionized the study of cancer genomes. However, the reads obtained from NGS of tumor samples often consist of a mixture of normal and tumor cells, which themselves can be of multiple clonal types. A prominent problem in the analysis of cancer genome sequencing data is deconvolving the mixture to identify the reads associated with tumor cells or a particular subclone of tumor cells. Solving the problem is, however, challenging because of the so-called 'identifiability problem', where different combinations of tumor purity and ploidy often explain the sequencing data equally well. Results: We propose a new model to resolve the identifiability problem by integrating two types of sequencing information-somatic copy number alterations and loss of heterozygosity-within a unified probabilistic framework. We derive algorithms to solve our model, and implement them in a software package called PyLOH. We benchmark the performance of PyLOH using both simulated data and 12 breast cancer sequencing datasets and show that PyLOH outperforms existing methods in disambiguating the identifiability problem and estimating tumor purity.
Authors
I am an author on this paper
Click your name to claim this paper and add it to your profile.
Reviews
Recommended
No Data Available