☆ 4.6 Article Proceedings Paper

Reconstructing cancer genomes from paired-end sequencing data

BMC BIOINFORMATICS (2012)

期刊

BMC BIOINFORMATICS

卷 13, 期 -, 页码 -

出版社

BMC

DOI: 10.1186/1471-2105-13-S6-S10

关键词

类别

Biochemical Research Methods Biotechnology & Applied Microbiology Mathematical & Computational Biology

资金

Division of Computing and Communication Foundations
Direct For Computer & Info Scie & Enginr [1053753] Funding Source: National Science Foundation
NHGRI NIH HHS [R01 HG005690] Funding Source: Medline

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

摘要

Background: A cancer genome is derived from the germline genome through a series of somatic mutations. Somatic structural variants - including duplications, deletions, inversions, translocations, and other rearrangements - result in a cancer genome that is a scrambling of intervals, or blocks of the germline genome sequence. We present an efficient algorithm for reconstructing the block organization of a cancer genome from paired-end DNA sequencing data. Results: By aligning paired reads from a cancer genome - and a matched germline genome, if available - to the human reference genome, we derive: (i) a partition of the reference genome into intervals; (ii) adjacencies between these intervals in the cancer genome; (iii) an estimated copy number for each interval. We formulate the Copy Number and Adjacency Genome Reconstruction Problem of determining the cancer genome as a sequence of the derived intervals that is consistent with the measured adjacencies and copy numbers. We design an efficient algorithm, called Paired-end Reconstruction of Genome Organization (PREGO), to solve this problem by reducing it to an optimization problem on an interval-adjacency graph constructed from the data. The solution to the optimization problem results in an Eulerian graph, containing an alternating Eulerian tour that corresponds to a cancer genome that is consistent with the sequencing data. We apply our algorithm to five ovarian cancer genomes that were sequenced as part of The Cancer Genome Atlas. We identify numerous rearrangements, or structural variants, in these genomes, analyze reciprocal vs. non-reciprocal rearrangements, and identify rearrangements consistent with known mechanisms of duplication such as tandem duplications and breakage/fusion/bridge (B/F/B) cycles. Conclusions: We demonstrate that PREGO efficiently identifies complex and biologically relevant rearrangements in cancer genome sequencing data. An implementation of the PREGO algorithm is available at http://compbio.cs.brown.edu/software/.

Reconstructing cancer genomes from paired-end sequencing data

期刊

BMC BIOINFORMATICS

出版社

BMC

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

Reconstructing cancer genomes from paired-end sequencing data

期刊

BMC BIOINFORMATICS

出版社

BMC

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文