Article

Recovery of missing single-cell RNA-sequencing data with optimized transcriptomic references

Journal

NATURE METHODS
Volume 20, Issue 10, Pages 1506-+

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41592-023-02003-w

This study identifies the reasons for missing gene expression data in single-cell RNA sequencing and proposes a method to optimize the reference transcriptome. By recovering false intergenic reads, implementing a hybrid pre-mRNA mapping strategy, and resolving gene overlaps, missing gene expression data can be restored. The findings have important implications for improving cellular profiling resolution and discovering missing cell types and marker genes.
Single-cell RNA-sequencing (scRNA-seq) is an indispensable tool for characterizing cellular diversity and generating hypotheses throughout biology. Droplet-based scRNA-seq datasets often lack expression data for genes that can be detected with other methods. Here we show that the observed sensitivity deficits stem from three sources: (1) poor annotation of 3' gene ends; (2) issues with intronic read incorporation; and (3) gene overlap-derived read loss. We show that missing gene expression data can be recovered by optimizing the reference transcriptome for scRNA-seq through recovering false intergenic reads, implementing a hybrid pre-mRNA mapping strategy and resolving gene overlaps. We demonstrate, with a diverse collection of mouse and human tissue data, that reference optimization can substantially improve cellular profiling resolution and reveal missing cell types and marker genes. Our findings argue that transcriptomic references need to be optimized for scRNA-seq analysis and warrant a reanalysis of previously published datasets and cell atlases.
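
To make the third source of read loss concrete, here is a minimal sketch of how same-strand gene overlaps might be flagged in an annotation before deciding how to resolve them. It is an illustration only and is not the authors' implementation: the `Gene` record, the `same_strand_overlaps` helper, and the gene names and coordinates are all hypothetical. It relies on the assumption that counting pipelines commonly discard reads that are compatible with more than one gene on the same strand.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    name: str
    chrom: str
    strand: str
    start: int
    end: int


def same_strand_overlaps(genes: List[Gene]) -> List[Tuple[str, str]]:
    """Return name pairs of genes that overlap on the same chromosome and strand."""
    pairs = []
    for a, b in combinations(genes, 2):
        same_locus = a.chrom == b.chrom and a.strand == b.strand
        # Two closed intervals overlap when each starts before the other ends.
        if same_locus and a.start <= b.end and b.start <= a.end:
            pairs.append((a.name, b.name))
    return pairs


# Made-up annotation entries, for illustration only.
genes = [
    Gene("GeneA", "chr1", "+", 100, 500),
    Gene("GeneB", "chr1", "+", 450, 900),   # overlaps GeneA on the same strand
    Gene("GeneC", "chr1", "-", 480, 700),   # opposite strand, not flagged
    Gene("GeneD", "chr2", "+", 100, 500),   # different chromosome, not flagged
]
overlaps = same_strand_overlaps(genes)
print(overlaps)
```

The pairwise comparison is quadratic in the number of genes, which is acceptable for a sketch; a genome-wide annotation would more likely be sorted by coordinate and swept once per chromosome and strand.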