Article

High-throughput single-cell RNA-seq data imputation and characterization with surrogate-assisted automated deep learning

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 23, Issue 1

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbab368

Funding

  1. National Natural Science Foundation of China [62076109, 32000464]
  2. Natural Science Foundation of Jilin Province [20190103006JH]
  3. Research Grants Council of the Hong Kong Special Administrative Region [CityU 11200218]
  4. Health and Medical Research Fund
  5. Food and Health Bureau, The Government of the Hong Kong Special Administrative Region [07181426]
  6. Hong Kong Institute for Data Science (HKIDS) at City University of Hong Kong
  7. City University of Hong Kong [CityU 11202219, CityU 11203520]
  8. Shenzhen Research Institute, City University of Hong Kong

The article introduces a method called SEDIM, which automatically designs deep neural network architectures for imputing gene expression levels. It improves computational efficiency by constructing an offline surrogate model. Experimental results show that SEDIM significantly improves imputation and clustering performance, and also performs well in other contexts and platforms.
Single-cell RNA sequencing (scRNA-seq) technologies have been developed extensively to probe gene expression profiles at single-cell resolution. Deep imputation methods have been proposed to address the associated computational challenges (e.g. the sparsity of gene expression in single-cell data). In particular, the neural architectures of these deep imputation models have proven critical to performance. However, such architectures are difficult to design and tune for those without extensive knowledge of deep neural networks and scRNA-seq. Therefore, the Surrogate-assisted Evolutionary Deep Imputation Model (SEDIM) is proposed to automatically design the architectures of deep neural networks for imputing gene expression levels in scRNA-seq data without any manual tuning. Moreover, SEDIM constructs an offline surrogate model, which improves the computational efficiency of the architecture search. Comprehensive studies show that SEDIM significantly improves imputation and clustering performance compared with other benchmark methods. In addition, we extensively explore the performance of SEDIM in other contexts and platforms, including mass cytometry and metabolic profiling. Marker gene detection, gene ontology enrichment and pathological analysis are conducted to provide novel insights into cell-type identification and the underlying mechanisms. The source code is available at https://github.com/li-shaochuan/SEDIM.
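
The general idea of a surrogate-assisted evolutionary architecture search can be illustrated with a minimal sketch. This is not the SEDIM implementation: the search space, the toy scoring formula, the k-nearest-neighbour surrogate and every function name below are assumptions chosen only to show how an offline surrogate can stand in for costly network training during the search.

```python
import random

# Hypothetical search space: (hidden layers, units per layer, dropout rate).
LAYERS = [1, 2, 3, 4]
UNITS = [32, 64, 128, 256]
DROPOUT = [0.0, 0.1, 0.2, 0.3]


def random_arch(rng):
    """Draw one architecture uniformly from the search space."""
    return (rng.choice(LAYERS), rng.choice(UNITS), rng.choice(DROPOUT))


def expensive_score(arch):
    """Toy stand-in for training an imputation network and measuring its error.
    Lower is better; the formula is invented purely for illustration."""
    layers, units, dropout = arch
    return abs(layers - 2) + abs(units - 128) / 128 + abs(dropout - 0.1) * 5


def build_surrogate(archive, k=3):
    """Offline surrogate: average score of the k nearest pre-evaluated architectures."""
    def distance(a, b):
        # Each gene difference is scaled by the width of its range.
        return (abs(a[0] - b[0]) / 3
                + abs(a[1] - b[1]) / 224
                + abs(a[2] - b[2]) / 0.3)

    def predict(arch):
        nearest = sorted(archive, key=lambda item: distance(item[0], arch))[:k]
        return sum(score for _, score in nearest) / len(nearest)

    return predict


def mutate(arch, rng):
    """Resample one randomly chosen gene."""
    genes = list(arch)
    i = rng.randrange(3)
    genes[i] = rng.choice([LAYERS, UNITS, DROPOUT][i])
    return tuple(genes)


def evolve(predict, rng, pop_size=20, generations=15):
    """Evolutionary loop that ranks candidates with the surrogate only."""
    population = [random_arch(rng) for _ in range(pop_size)]
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(pop_size)]
        population = sorted(population + children, key=predict)[:pop_size]
    return population[0]


rng = random.Random(0)
# Offline phase: evaluate a fixed sample of architectures once with the costly score.
archive = [(a, expensive_score(a)) for a in (random_arch(rng) for _ in range(40))]
surrogate = build_surrogate(archive)
# Search phase: no further expensive evaluations are needed until the final check.
best = evolve(surrogate, rng)
print(best, expensive_score(best))
```

In this sketch the costly evaluation runs only while the archive is built and once more for the final candidate; every comparison inside the evolutionary loop uses the surrogate's prediction instead.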