4.7 Article

Enhancing droplet-based single-nucleus RNA-seq resolution using the semi-supervised machine learning classifier DIEM

Journal

SCIENTIFIC REPORTS
Volume 10, Issue 1, Pages -

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41598-020-67513-5

Keywords

-

Funding

  1. National Institutes of Health (NIH) [HL-095056, HL-28481, U01 DK105561]
  2. HHMI Gilliam Fellowship
  3. NIH [F31HL142180, T32HG002536, DK41301, 1R56MD013312, 1R01MH115979, 5R25GM112625, 5UL1TR001881]
  4. National Science Foundation [1705197]
  5. NIH/NHGRI [HG010505-02]
  6. National Science Foundation Graduate Research Fellowship Program [DGE-1650604]
  7. AHA [19PRE34430112]
  8. Academy of Finland [272376, 266286, 314383, 315035]
  9. Finnish Medical Foundation
  10. Finnish Diabetes Research Foundation
  11. Novo Nordisk Foundation
  12. Sigrid Juselius Foundation
  13. Helsinki University Hospital Research Funds
  14. University of Helsinki
  15. DDRC Pilot and Feasibility of the National Institutes of Health [DKP3041301]
  16. National Center for Advancing Translational Sciences at UCLA, CTSI [ULTR001881]
  17. Gyllenberg Foundation

Single-nucleus RNA sequencing (snRNA-seq) measures gene expression in individual nuclei instead of cells, allowing for unbiased cell type characterization in solid tissues. We observe that snRNA-seq is commonly subject to contamination by high amounts of ambient RNA, which, if overlooked, can bias downstream analyses, for example by producing spurious cell types. We present a novel approach to quantify contamination and filter droplets in snRNA-seq experiments, called Debris Identification using Expectation Maximization (DIEM). Our likelihood-based approach models the gene expression distributions of debris and cell types, which are estimated using expectation maximization (EM). We evaluated DIEM using three snRNA-seq data sets: (1) human differentiating preadipocytes in vitro, (2) fresh mouse brain tissue, and (3) human frozen adipose tissue (AT) from six individuals. All three data sets showed evidence of extranuclear RNA contamination, and we observed that existing methods failed to account for contaminated droplets and produced spurious cell types. When compared to filtering using these state-of-the-art methods, DIEM better removed droplets containing high levels of extranuclear RNA and led to higher quality clusters. Although DIEM was designed for snRNA-seq, our clustering strategy also successfully filtered single-cell RNA-seq data. To conclude, our novel method DIEM removes debris-contaminated droplets from single-cell-based data quickly and effectively, leading to cleaner downstream analysis. Our code is freely available for use at https://github.com/marcalva/diem.
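
To make the idea of a likelihood-based, semi-supervised EM filter more concrete, the following is a minimal Python sketch. It is not DIEM's implementation (the released package is at the URL above), and it rests on several assumptions that do not come from the abstract: each group is modeled as a single multinomial over genes, there are only two groups (debris and one pooled nucleus group, whereas the abstract describes debris plus multiple cell types), droplets below a hypothetical count cutoff are clamped to the debris label, and a posterior above 0.5 is treated as debris. The function names `em_debris_posterior` and `simulate_droplets` and all parameter values are illustrative.

```python
import numpy as np


def em_debris_posterior(counts, fixed_debris, n_iter=50, pseudocount=1e-2):
    """Return each droplet's posterior probability of being debris.

    counts       : (n_droplets, n_genes) array of UMI counts.
    fixed_debris : boolean mask of droplets assumed to be debris
                   (the semi-supervised labels; an assumption of this sketch).
    """
    n_droplets = counts.shape[0]
    # Responsibilities: column 0 = debris, column 1 = nucleus.
    resp = np.zeros((n_droplets, 2))
    resp[fixed_debris, 0] = 1.0
    resp[~fixed_debris, 1] = 1.0  # unlabeled droplets start as nuclei

    for _ in range(n_iter):
        # M-step: mixing weights and per-group multinomial gene profiles.
        weights = (resp.sum(axis=0) + 1e-8) / (n_droplets + 2e-8)
        profiles = resp.T @ counts + pseudocount
        profiles /= profiles.sum(axis=1, keepdims=True)

        # E-step: multinomial log-likelihood (up to a per-droplet constant)
        # plus log prior, normalized in a numerically stable way.
        log_post = counts @ np.log(profiles).T + np.log(weights)
        log_post -= log_post.max(axis=1, keepdims=True)
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)

        # Semi-supervised step: labeled droplets keep their debris label.
        resp[fixed_debris] = [1.0, 0.0]

    return resp[:, 0]


def simulate_droplets(seed=0, n_genes=50):
    """Toy data: low-count debris, high-count debris, and nuclei."""
    rng = np.random.default_rng(seed)
    half = n_genes // 2
    debris_p = np.array([4.0] * half + [1.0] * (n_genes - half))
    nucleus_p = debris_p[::-1].copy()
    debris_p /= debris_p.sum()
    nucleus_p /= nucleus_p.sum()

    low_debris = rng.multinomial(50, debris_p, size=200)
    high_debris = rng.multinomial(500, debris_p, size=50)
    nuclei = rng.multinomial(500, nucleus_p, size=50)

    counts = np.vstack([low_debris, high_debris, nuclei])
    is_nucleus = np.zeros(counts.shape[0], dtype=bool)
    is_nucleus[-50:] = True
    return counts, is_nucleus


if __name__ == "__main__":
    counts, is_nucleus = simulate_droplets()
    fixed = counts.sum(axis=1) < 100  # hypothetical low-count cutoff
    post = em_debris_posterior(counts, fixed)
    keep = post < 0.5  # hypothetical decision threshold
    print("droplets kept:", int(keep.sum()))
    print("nuclei kept:", int((keep & is_nucleus).sum()))
```

In this toy setting, high-count droplets whose expression profile resembles the low-count debris are assigned to the debris group even though a simple total-count cutoff would retain them, which is the kind of contamination the abstract says count-based filtering can miss.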
