4.7 Article

Overcoming selection bias in synthetic lethality prediction

Journal

BIOINFORMATICS
Volume 38, Issue 18, Pages 4360-4368

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btac523

Funding

  1. Holland Proton Therapy Center [2019020]
  2. United States National Institutes of Health [U54EY032442]

Synthetic lethality (SL) occurs when simultaneous loss of function of two genes leads to cell death. Identifying SL relationships is challenging due to the vast number of candidate pairs. This study introduces a selection bias-resilient synthetic lethality (SBSL) prediction method that shows higher predictive performance, better generalizability, and robustness to selection bias compared to current SL prediction methods.

Motivation: Synthetic lethality (SL) between two genes occurs when simultaneous loss of function leads to cell death. This holds great promise for developing anti-cancer therapeutics that target synthetic lethal pairs of endogenously disrupted genes. Identifying novel SL relationships through exhaustive experimental screens is challenging, due to the vast number of candidate pairs. Computational SL prediction is therefore sought to identify promising SL gene pairs for further experimentation. However, current SL prediction methods lack consideration for generalizability in the presence of selection bias in SL data.

Results: We show that SL data exhibit considerable gene selection bias. Our experiments designed to assess the robustness of SL prediction reveal that models driven by the topology of known SL interactions (e.g. graph, matrix factorization) are especially sensitive to selection bias. We introduce selection bias-resilient synthetic lethality (SBSL) prediction using regularized logistic regression or random forests. Each gene pair is described by 27 molecular features derived from cancer cell line, cancer patient tissue and healthy donor tissue samples. SBSL models are built and tested using approximately 8000 experimentally derived SL pairs across breast, colon, lung and ovarian cancers. Compared to other SL prediction methods, SBSL showed higher predictive performance, better generalizability and robustness to selection bias. Gene dependency, quantifying the essentiality of a gene for cell survival, contributed most to SBSL predictions. Random forests were superior to linear models in the absence of dependency features, highlighting the relevance of mutual exclusivity of somatic mutations, co-expression in healthy tissue and differential expression in tumour samples.
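To make the idea of gene selection bias concrete, the following is a minimal sketch of one common way to probe generalizability: a gene-disjoint train/test split, in which no gene in a test pair appears in any training pair. This is an illustration under stated assumptions and not necessarily the evaluation protocol used in the paper. The function name `gene_disjoint_split`, its parameters, and the placeholder gene identifiers `G0` to `G9` are all hypothetical.

```python
import random


def gene_disjoint_split(pairs, test_fraction=0.2, seed=0):
    """Split gene pairs so that test pairs share no gene with training pairs.

    Hypothetical helper for illustration only. Pairs that mix a held-out
    gene with a training gene are dropped from both sets.
    """
    genes = sorted({g for pair in pairs for g in pair})
    rng = random.Random(seed)
    rng.shuffle(genes)
    n_test = max(1, int(len(genes) * test_fraction))
    held_out = set(genes[:n_test])

    train, test = [], []
    for a, b in pairs:
        a_out, b_out = a in held_out, b in held_out
        if a_out and b_out:
            test.append((a, b))      # both genes unseen during training
        elif not a_out and not b_out:
            train.append((a, b))     # both genes available for training
        # mixed pairs are discarded to keep the two sets gene-disjoint
    return train, test, held_out


# Toy example with placeholder gene identifiers (not real SL data).
genes = [f"G{i}" for i in range(10)]
pairs = [(a, b) for i, a in enumerate(genes) for b in genes[i + 1:]]
train, test, held_out = gene_disjoint_split(pairs, test_fraction=0.4, seed=1)
print(len(train), len(test), sorted(held_out))
```

A model that relies mainly on the topology of known SL interactions has no information about genes it never saw in training, so a split of this kind is one way to expose the sensitivity the abstract describes. Feature-based models, such as the regularized logistic regression or random forests named above, can still score unseen genes because each pair is described by molecular features.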
