Article

A Personalized Low-Rank Subspace Clustering Method Based on Locality and Similarity Constraints for scRNA-seq Data Analysis

Journal

IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS
Volume 27, Issue 5, Pages 2575-2584

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JBHI.2023.3247723

Keywords

Kernel; Clustering methods; Data models; Data mining; Robustness; Clustering algorithms; Sequential analysis; Clustering; local structure constraint; low-rank representation; scRNA-seq; similarity constraint

Single-cell RNA sequencing (scRNA-seq) technology provides expression profiles of individual cells, driving biological research into a new stage. However, the high-dimensional, sparse, and noisy nature of scRNA-seq data poses challenges for single-cell clustering. In response, we propose a personalized low-rank subspace clustering method (PLRLS) that learns more accurate subspace structures from both global and local perspectives.
Single-cell RNA sequencing (scRNA-seq) technology provides expression profiles of single cells, propelling biological research into a new chapter. Clustering individual cells based on their transcriptomes is a critical objective of scRNA-seq data analysis. However, the high-dimensional, sparse, and noisy nature of scRNA-seq data poses a challenge to single-cell clustering, so a clustering method tailored to the characteristics of scRNA-seq data is needed. Owing to its powerful subspace learning capability and robustness to noise, subspace segmentation based on low-rank representation (LRR) is widely used in clustering research and achieves satisfactory results. In view of this, we propose a personalized low-rank subspace clustering method, PLRLS, that learns more accurate subspace structures from both global and local perspectives. Specifically, we first introduce a local structure constraint to capture the local structure information of the data, which also helps the method achieve better inter-cluster separability and intra-cluster compactness. Then, to retain the important similarity information that the LRR model ignores, we use the fractional function to extract similarity information between cells and introduce this information into the LRR framework as a similarity constraint. The fractional function is an efficient similarity measure designed for scRNA-seq data, with both theoretical and practical implications. Finally, based on the LRR matrix learned by PLRLS, we perform downstream analyses on real scRNA-seq datasets, including spectral clustering, visualization, and marker gene identification. Comparative experiments show that the proposed method achieves superior clustering accuracy and robustness.
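
The general pipeline the abstract builds on (learn a low-rank representation matrix, then run spectral clustering on it) can be illustrated with a minimal sketch. This is not PLRLS: the abstract does not define the local structure constraint or the fractional-function similarity constraint, so both are omitted here. The sketch instead assumes the plain noiseless LRR problem, min ||Z||_* subject to X = XZ, whose minimizer is known to be Z = V V^T with V taken from the skinny SVD of X. The function names `lrr_noiseless` and `spectral_labels`, the genes-by-cells layout of X, and the simple farthest-first k-means are all illustrative assumptions.

```python
import numpy as np


def lrr_noiseless(X, tol=1e-8):
    """Closed-form solution of min ||Z||_* s.t. X = XZ (noiseless LRR).

    X is assumed to be genes x cells, so Z is cells x cells.
    NOTE: this is plain LRR, not PLRLS; no locality or similarity terms.
    """
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    r = int(np.sum(s > tol * s[0]))  # numerical rank of X
    V = Vt[:r].T                     # right singular vectors, cells x r
    return V @ V.T


def spectral_labels(Z, k, n_iter=50):
    """Spectral clustering on the affinity built from a representation matrix Z."""
    W = (np.abs(Z) + np.abs(Z.T)) / 2.0  # symmetric, non-negative affinity
    np.fill_diagonal(W, 0.0)
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]  # D^-1/2 W D^-1/2
    _, vecs = np.linalg.eigh(S)          # eigenvalues in ascending order
    emb = vecs[:, -k:]                   # top-k eigenvectors as the embedding
    emb = emb / np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)

    # Deterministic farthest-first initialisation, then Lloyd's k-means.
    centers = [emb[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((emb - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(emb[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = emb[labels == j].mean(axis=0)
    return labels


if __name__ == "__main__":
    # Synthetic stand-in for an expression matrix: two groups of 15 "cells",
    # each lying in its own 2-dimensional subspace of a 20-"gene" space.
    rng = np.random.default_rng(0)
    X = np.hstack([
        rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15)),
        rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15)),
    ])
    print(spectral_labels(lrr_noiseless(X), k=2))
```

On real scRNA-seq data the noiseless closed form would not be appropriate, since the data are sparse and noisy; a noise-tolerant LRR solver, together with the constraints described in the paper, would take the place of `lrr_noiseless`.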
