4.6 Article

Comprehensive analysis of cancer breakpoints reveals signatures of genetic and epigenetic contribution to cancer genome rearrangements

Journal

PLOS COMPUTATIONAL BIOLOGY
Volume 17, Issue 3, Pages -

Publisher

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pcbi.1008749

Keywords

-

Funding

  1. Centre of Fundamental Research of the National Research University Higher School of Economics

The study analyzed a large number of cancer breakpoints and identified that transcription and formation of non-B DNA structures are the major processes contributing to cancer genome fragility, with epigenetic factors also playing a role. Each cancer type has its own characteristics in breakpoint distribution, and predictive models for cancer breakpoint formation can be improved using machine learning approaches.
Author summary

We analysed more than half a million breakpoints from all major cancer types and quantified the contributions of genetic and epigenetic factors to cancer breakpoint mutagenesis. The results suggest that transcription and formation of non-B DNA structures are the two major processes responsible for cancer genome fragility. Epigenetic factors, such as chromatin organization in TADs, open/closed regions and histone marks, are less informative but still contribute. Despite the common trends, each cancer type has its own peculiarities. Breakpoint hotspots in brain can be predicted by the distribution of non-B DNA structures, those in liver by transcription factor binding sites, and those in blood by non-B DNA structures and promoter regions. The cancer breakpoint landscape can be viewed as hotspots plus individual breakpoints scattered all over the genome. Hotspots have distinct genomic and epigenomic signatures whose relative contributions vary across cancer types. Individual cancer breakpoints are a mixture of random noise and breakpoints with a recognizable mutation signature. Quantifying the contribution of different factors to cancer breakpoint mutagenesis for individual cancer genomes will enhance our understanding of individual mechanisms of cancer genome rearrangement.

Understanding mechanisms of cancer breakpoint mutagenesis is a difficult task, and predictive models of cancer breakpoint formation have so far failed to achieve even moderate predictive power. Here we take advantage of a machine learning approach that can gather important features from big data and quantify the contribution of different factors. We performed a comprehensive analysis of almost 630,000 cancer breakpoints and quantified the contribution of genomic and epigenomic features: non-B DNA structures, chromatin organization, transcription factor binding sites and epigenetic markers.
The results showed that transcription and formation of non-B DNA structures are the two major processes responsible for cancer genome fragility. Epigenetic factors, such as chromatin organization in TADs, open/closed regions, DNA methylation and histone marks, are less informative but still contribute. Within the feature groups, G-quadruplexes and repeats, together with the transcription factors CTCF, GABPA, RXRA, SP1, MAX and NR2F2, generally show a relatively high contribution. Overall, the cancer breakpoint landscape can be represented by well-predicted hotspots and poorly predicted individual breakpoints scattered across genomes. We demonstrated that hotspot mutagenesis is shaped by genomic and epigenomic factors, and that individual cancer breakpoints are not all random noise: some carry a definite mutation signature. We also found that some features act on breakpoint mutagenesis over long ranges. By combining omics data and cancer-specific individual feature importance, and by adding distant features to local ones, predictive models for cancer breakpoint formation achieved 70-90% ROC AUC for different cancer types; however, precision remained low at 2% and recall did not exceed 50%. On the one hand, the power of the models strongly correlates with the size of the available cancer breakpoint and epigenomic data; on the other hand, finding strong determinants of cancer breakpoint formation remains a challenge. The strength of the predictive signals of each group, and of each feature inside a group, can be converted into cancer-specific breakpoint mutation signatures. Overall, our results add to the understanding of cancer genome rearrangement processes.
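
The combination of a 70-90% ROC AUC with roughly 2% precision reported above is typical of heavily imbalanced classification, where genomic windows containing a breakpoint are rare. The sketch below is illustrative only and is not the authors' code: it computes ROC AUC by pairwise ranking and derives precision from prevalence, recall and false positive rate. The function names, the 0.1% prevalence and the 2.45% false positive rate are hypothetical values chosen for the arithmetic; none of them come from the paper.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outranks a
    random negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_from_rates(prevalence: float, recall: float, fpr: float) -> float:
    """Precision implied by class prevalence, recall (true positive rate)
    and false positive rate."""
    true_pos = prevalence * recall
    false_pos = (1.0 - prevalence) * fpr
    if true_pos + false_pos == 0:
        raise ValueError("no predicted positives")
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical setting: 0.1% of windows hold a breakpoint, the model
    # recovers half of them and flags 2.45% of breakpoint-free windows.
    print(precision_from_rates(prevalence=0.001, recall=0.5, fpr=0.0245))
    # Toy ranking example with two positives and two negatives.
    print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]))
```

Because ROC AUC depends only on how positives rank relative to negatives, it is insensitive to class prevalence, whereas precision falls as positives become rarer. Under the hypothetical numbers above, even a small false positive rate yields far more false positives than true positives.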
