☆ 4.7 Article

Constrained Standardization of Count Data from Massive Parallel Sequencing

JOURNAL OF MOLECULAR BIOLOGY (2021)

Journal

JOURNAL OF MOLECULAR BIOLOGY

Volume 433, Issue 11, Pages -

Publisher

ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD

DOI: 10.1016/j.jmb.2021.166966

Keywords

normalization; RNA-seq; transcriptomics; proteomics; multi-omics

Category

Biochemistry & Molecular Biology

Summary

This study investigates whether the normalization method CONSTANd is suitable for transcriptomics count data and proposes an adjustment that permits the joint analysis of differently sized experiments. The results show that CONSTANd processes large data sets efficiently, mitigates systematic bias, and quickly reveals the underlying biological structure.

Abstract

In high-throughput omics disciplines like transcriptomics, researchers face a need to assess the quality of an experiment prior to an in-depth statistical analysis. To efficiently analyze such voluminous collections of data, researchers need triage methods that are both quick and easy to use. One such normalization method for relative quantitation, CONSTANd, was recently introduced for isobarically labeled mass spectra in proteomics. It transforms the data matrix of abundances through an iterative, convergent process enforcing three constraints: (I) all column sums are identical; (II) each row sum is fixed (across matrices); and (III) each row sum is identical to all other row sums. In this study, we investigate whether CONSTANd is suitable for count data from massively parallel sequencing by qualitatively comparing its results to those of DESeq2. Further, we propose an adjustment of the method so that it may be applied to identically balanced but differently sized experiments for joint analysis. We find that CONSTANd can process large data sets at well over 1 million count records per second whilst mitigating unwanted systematic bias, thus quickly uncovering the underlying biological structure when combined with a PCA plot or hierarchical clustering. Moreover, it allows joint analysis of data sets obtained from different batches, with different protocols, and from different labs, without exploiting information from the experimental setup other than the delineation of samples into identically processed sets (IPSs). CONSTANd's simplicity and applicability to proteomics as well as transcriptomics data make it an interesting candidate for integration in multi-omics workflows. © 2021 Elsevier Ltd. All rights reserved.
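
The iterative process described in the abstract can be illustrated with a minimal sketch of alternating row and column rescaling (often called matrix raking or iterative proportional fitting). This is not the authors' implementation. The function name `rake_matrix`, the target values (every row sum equal to the number of columns, every column sum equal to the number of rows), the stopping rule, and the requirement of strictly positive row and column sums are all assumptions made for illustration, and the sketch handles a single matrix only.

```python
def rake_matrix(counts, max_iter=500, tol=1e-9):
    """Alternately rescale rows and columns of a count matrix.

    Hypothetical targets, chosen for illustration only: every row sum
    becomes n_cols and every column sum becomes n_rows.
    Assumes every row and every column has a strictly positive sum.
    """
    rows = [[float(x) for x in row] for row in counts]  # work on a copy
    n_rows, n_cols = len(rows), len(rows[0])
    row_target = float(n_cols)
    col_target = float(n_rows)

    for _ in range(max_iter):
        # Step 1: rescale each row so that it sums to row_target.
        for row in rows:
            total = sum(row)
            if total <= 0:
                raise ValueError("every row needs a positive sum")
            scale = row_target / total
            for j in range(n_cols):
                row[j] *= scale

        # Step 2: rescale each column so that it sums to col_target.
        col_sums = [sum(row[j] for row in rows) for j in range(n_cols)]
        if min(col_sums) <= 0:
            raise ValueError("every column needs a positive sum")
        for row in rows:
            for j in range(n_cols):
                row[j] *= col_target / col_sums[j]

        # Column sums are now exact; stop once the row sums agree as well.
        if max(abs(sum(row) - row_target) for row in rows) < tol:
            break

    return rows


# Usage: rows are features (e.g. genes), columns are samples.
counts = [[10, 20, 30], [5, 50, 45], [100, 10, 40], [8, 8, 8]]
normalized = rake_matrix(counts)
```

In this sketch, multiplying one sample column by a constant (for instance, a sample sequenced more deeply) leaves the converged result essentially unchanged, which is one way to picture how constraints on column sums can absorb a sample-wide scaling bias.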