4.8 Article

Deep learning-assisted genome-wide characterization of massively parallel reporter assays

Journal

NUCLEIC ACIDS RESEARCH
Volume 50, Issue 20, Pages 11442-11454

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gkac990

Funding

  1. NIH/NIA [AG066206]

Researchers have developed a deep learning model called MpraNet to identify potential MPRA targets. The model can efficiently distinguish MPRA positives from the background genome, and predict potential MPRA functional variants across the genome. Additionally, the study found that MPRA positives are not uniformly distributed in the genome and proposed the model as a screen for filtering MPRA experiment candidates.
Massively parallel reporter assay (MPRA) is a high-throughput method that enables the study of the regulatory activities of tens of thousands of DNA oligonucleotides in a single experiment. While MPRA experiments have grown in popularity, their small sample sizes compared to the scale of the human genome limit our understanding of the regulatory effects they detect. To address this, we develop a deep learning model, MpraNet, to distinguish potential MPRA targets from the background genome. When applied to the lymphoblastoid cell line, this model achieves high discriminative performance (AUROC = 0.85) in differentiating MPRA positives from a set of control variants that mimic the background genome. We observe that existing functional scores represent very distinct functional effects, and most of them fail to characterize the regulatory effect that MPRA detects. Using MpraNet, we predict potential MPRA functional variants across the genome and identify the distributions of MPRA effect relative to other characteristics of genetic variation, including allele frequency, alternative functional annotations specified by FAVOR, and phenome-wide associations. We also observe that the predicted MPRA positives are not uniformly distributed across the genome; instead, they are clumped together in active regions comprising 9.95% of the genome and inactive regions comprising 89.07% of the genome. Furthermore, we propose our model as a screen to filter MPRA experiment candidates at genome-wide scale, enabling future experiments to be more cost-efficient by increasing precision relative to that observed in previous MPRAs.
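
For readers unfamiliar with the AUROC figure quoted in the abstract, the following is a minimal sketch of how such a value can be computed for any classifier that scores variants. It is not MpraNet and does not use the paper's data: the function name `auroc`, the labels (1 = MPRA positive, 0 = background control) and the scores are all made up for illustration. AUROC is taken here in its standard sense, the probability that a randomly chosen positive receives a higher score than a randomly chosen control, with ties counted as one half.

```python
def auroc(labels, scores):
    """Pairwise (Mann-Whitney) AUROC: the fraction of positive/negative
    pairs in which the positive has the higher score; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT from the paper:
# 1 = MPRA positive, 0 = control variant mimicking the background genome.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]

toy_auroc = auroc(labels, scores)
print(f"toy AUROC = {toy_auroc:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.85 should be read. The quadratic pairwise loop is chosen for clarity; at realistic sample sizes a rank-based formula would be preferable.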