4.8 Article

Improved biomarker discovery through a plot twist in transcriptomic data analysis

Journal

BMC BIOLOGY
Volume 20, Issue 1, Pages -

Publisher

BMC
DOI: 10.1186/s12915-022-01398-w

Keywords

Gene expression analysis; Gene networks; Weighted gene co-expression network analysis (WGCNA); Sex determination and differentiation; Gonadal development; Biomarker discovery

Funding

  1. Spanish Ministry of Science (SMS) predoctoral scholarship [BES-2017-079744]
  2. SMS [PID2019-108888RB-I00]
  3. Spanish government through the 'Severo Ochoa Centre of Excellence' accreditation [CEX2019-000928-S]

The study proposes a new approach to transcriptomic data analysis in which WGCNA is first applied to the entire dataset and the result is then filtered by DEGs. This approach outperformed the traditional one of filtering by DEGs first and then applying WGCNA: network model fit and node connectivity measures improved, and GO terms provided a more nuanced representation of biological processes.
Background Transcriptomic analysis is crucial for understanding the functional elements of the genome, with the classic method consisting of screening transcriptomics datasets for differentially expressed genes (DEGs). Additionally, since 2005, weighted gene co-expression network analysis (WGCNA) has emerged as a powerful method to explore relationships between genes. However, an approach combining both methods, i.e., filtering the transcriptome dataset by DEGs or other criteria, followed by WGCNA (DEGs + WGCNA), has become common. This is of concern because such an approach can affect the underlying architecture of the network under analysis and lead to wrong conclusions. Here, we explore a plot twist to transcriptome data analysis: applying WGCNA to exploit entire datasets without affecting the topology of the network, followed by DEG analysis with its strength and relative simplicity (WGCNA + DEGs). We tested WGCNA + DEGs against DEGs + WGCNA on publicly available transcriptomics data from one of the most transcriptomically complex tissues and delicate processes: vertebrate gonads undergoing sex differentiation. We further validate the general applicability of our approach through analysis of datasets from three distinct model systems: European sea bass, mouse, and human.
Results In all cases, WGCNA + DEGs clearly outperformed DEGs + WGCNA. First, network model fit, node connectivity measures, and other network statistics improved. The gene lists filtered by each method differed, the number of modules associated with the trait of interest and of key genes retained increased, and GO terms of biological processes provided a more nuanced representation of the biological question under consideration. Lastly, WGCNA + DEGs facilitated biomarker discovery.
Conclusions We propose that building a co-expression network from an entire dataset, and only thereafter filtering by DEGs, should be the method to use in transcriptomic studies, regardless of biological system, species, or question being considered.
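
The difference between the two orderings can be illustrated with a toy sketch. This is not the authors' pipeline and does not use the WGCNA package: it assumes an unsigned soft-threshold adjacency (|Pearson correlation| raised to a power `beta`), a hypothetical DEG filter that ranks genes by a Welch t statistic, and synthetic data. Module detection is omitted, and all function names are made up for illustration. The sketch only shows that connectivity computed on the whole dataset keeps links to non-DEG genes, which a network built on a pre-filtered matrix cannot see.

```python
import numpy as np

def soft_adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency; expr has shape (samples, genes)."""
    corr = np.corrcoef(expr, rowvar=False)
    adj = np.abs(corr) ** beta
    np.fill_diagonal(adj, 0.0)  # ignore self-connections
    return adj

def select_degs(expr, groups, top_n):
    """Hypothetical DEG filter: keep the top_n genes by |Welch t| between two groups."""
    a, b = expr[groups == 0], expr[groups == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    t = (a.mean(axis=0) - b.mean(axis=0)) / se
    return np.sort(np.argsort(-np.abs(t))[:top_n])

def degs_then_network(expr, groups, top_n, beta=6):
    """DEGs + WGCNA ordering: filter first, build the network on the subset."""
    keep = select_degs(expr, groups, top_n)
    return keep, soft_adjacency(expr[:, keep], beta).sum(axis=1)

def network_then_degs(expr, groups, top_n, beta=6):
    """WGCNA + DEGs ordering: build the network on all genes, then filter."""
    connectivity = soft_adjacency(expr, beta).sum(axis=1)
    keep = select_degs(expr, groups, top_n)
    return keep, connectivity[keep]

# Synthetic data (assumption): 20 samples, 200 genes, 10 genes shifted in group 1.
rng = np.random.default_rng(0)
groups = np.repeat([0, 1], 10)
expr = rng.normal(size=(20, 200))
expr[groups == 1, :10] += 2.0

keep_sub, k_sub = degs_then_network(expr, groups, top_n=20)
keep_full, k_full = network_then_degs(expr, groups, top_n=20)
print("mean connectivity, DEGs first:   ", k_sub.mean())
print("mean connectivity, network first:", k_full.mean())
```

In this toy both orderings select the same genes, because the filter is applied to individual genes rather than to modules; only the connectivity values differ. The paper's comparison covers much more (model fit, module-trait associations, GO terms), none of which is reproduced here.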