4.7 Article

Visualization, benchmarking and characterization of nested single-cell heterogeneity as dynamic forest mixtures

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 23, Issue 2, Pages -

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbac017

Keywords

forest mixtures; multimodality; minimum spanning tree; nested models; single-cell trajectory analysis; cell differentiation

Funding

  1. National Institute of Environmental Health Sciences [1ZIAES103350-01/02]

A major topic of debate in developmental biology is whether development is continuous, discontinuous or a mixture of both. This study presents a data-driven framework that addresses the limitations of current models for visualizing and characterizing complex relationships during biological processes. The results indicate that gene expression during normal development exhibits non-uniformly distributed profiles, mostly right-skewed and multimodal.
A major topic of debate in developmental biology centers on whether development is continuous, discontinuous, or a mixture of both. Pseudo-time trajectory models, which are optimal for visualizing cellular progression, represent cell transitions as continuous state manifolds; they do not explicitly model real-time, complex, heterogeneous systems and are difficult to benchmark against temporal models. We present a data-driven framework that addresses these limitations, taking temporal single-cell data collected at discrete time points as input and producing a mixture of dependent minimum spanning trees (MSTs) as output, denoted dynamic spanning forest mixtures (DSFMix). DSFMix uses decision-tree models to select genes that account for variations in multimodality, skewness and time. The selected genes are then used to build the forest using tree agglomerative hierarchical clustering and dynamic branch cutting. We first motivate the use of forest-based algorithms over single-tree approaches for visualizing and characterizing developmental processes. We next benchmark DSFMix against pseudo-time and temporal approaches in terms of feature selection, time correlation, and network similarity. Finally, we demonstrate how DSFMix can be used to visualize, compare and characterize complex relationships during biological processes such as epithelial-mesenchymal transition, spermatogenesis, stem cell pluripotency, early transcriptional response to hormones and immune response to coronavirus disease. Our results indicate that the expression of genes during normal development exhibits a high proportion of non-uniformly distributed profiles that are mostly right-skewed and multimodal, the latter being a characteristic of major steady states during development. Our study also identifies and validates gene signatures driving complex dynamic processes during somatic or germline differentiation.
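
To make two of the ingredients named in the abstract concrete (skewness of an expression profile, and one minimum spanning tree per discrete time point), here is a minimal illustrative sketch. It is not the authors' DSFMix implementation. The function names `skewness`, `mst_edges` and `forest_by_time` are hypothetical, the trees are built independently using plain Euclidean distance (an assumption), and the decision-tree gene selection, the dependence between trees, the hierarchical clustering and the dynamic branch cutting described above are all omitted.

```python
import math
from collections import defaultdict


def skewness(values):
    """Moment-based skewness of a list of numbers (positive means right-skewed)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph over `points`.

    Returns a list of (i, j, distance) edges, where i and j index into `points`.
    """
    n = len(points)
    if n < 2:
        return []
    # best[j] = (distance from vertex j to the current tree, nearest tree vertex)
    best = {j: (math.dist(points[0], points[j]), 0) for j in range(1, n)}
    edges = []
    while best:
        # Attach the vertex that is currently closest to the tree.
        j = min(best, key=lambda k: best[k][0])
        d, i = best.pop(j)
        edges.append((i, j, d))
        # The newly added vertex may offer shorter links to the remaining ones.
        for k in best:
            dk = math.dist(points[j], points[k])
            if dk < best[k][0]:
                best[k] = (dk, j)
    return edges


def forest_by_time(cells, times):
    """Group expression profiles by time point and build one MST per group.

    Edge indices are local to each time point's group, in input order.
    """
    groups = defaultdict(list)
    for profile, t in zip(cells, times):
        groups[t].append(profile)
    return {t: mst_edges(group) for t, group in sorted(groups.items())}
```

For example, `forest_by_time(cells, times)` with `cells` given as equal-length tuples of expression values and `times` as the matching collection time labels returns a dictionary mapping each time point to its tree's edge list. Calling `skewness` on a single gene's values across cells gives a quick check of whether that gene's profile is right-skewed.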