4.6 Review

Overview of data preprocessing for machine learning applications in human microbiome research

Journal

FRONTIERS IN MICROBIOLOGY
Volume 14, Issue -, Pages -

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fmicb.2023.1250909

Keywords

human microbiome; data preprocessing; machine learning; compositionality; normalization; metagenomics data

This mini review examines the preprocessing and transformation methods used in recent human microbiome studies, highlighting the limited adoption of statistical transformation methods specifically targeting microbiome sequencing data characteristics. Instead, relative and normalization-based transformations are commonly used without considering the specific attributes of microbiome data. The lack of information on preprocessing and transformations applied to the data raises concerns about reproducibility, comparability, and reliability of results.
Although metagenomic sequencing is now the preferred technique for studying microbiome-host interactions, analyzing and interpreting microbiome sequencing data remains challenging, primarily because of the statistical properties of the data (e.g., sparsity, over-dispersion, compositionality, inter-variable dependency). This mini review explores the preprocessing and transformation methods applied in recent human microbiome studies to address these challenges. Our results indicate limited adoption of transformation methods that target the statistical characteristics of microbiome sequencing data. Instead, relative and normalization-based transformations that do not account for these characteristics are prevalent. In many publications, information on the preprocessing and transformations applied to the data before analysis was incomplete or missing, which raises concerns about reproducibility and comparability and casts doubt on the results. We hope this mini review will provide researchers and newcomers to the field of human microbiome research with an up-to-date point of reference for data transformation tools and help them choose the most suitable transformation method for their research questions, objectives, and data characteristics.
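To make the contrast between a relative transformation and one aimed at compositionality concrete, here is a minimal sketch in Python. The abstract does not name a specific compositional method; the centered log-ratio (CLR) is used here only as a common illustration, and the function names, the pseudocount of 0.5, and the toy counts are all assumptions rather than anything taken from the review.

```python
import math

def relative_abundance(counts):
    """Total-sum scaling: each taxon count divided by the sample's total count.

    Assumes the sample has at least one non-zero count.
    """
    total = sum(counts)
    return [c / total for c in counts]

def clr(counts, pseudocount=0.5):
    """Centered log-ratio: log of each value minus the mean log of the sample.

    A pseudocount (an assumed value here) is added first because
    microbiome count tables are sparse and log(0) is undefined.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical read counts for three taxa in one sample.
sample = [10, 0, 90]
rel = relative_abundance(sample)   # proportions that sum to 1
clr_values = clr(sample)           # log-ratios that sum to 0
```

Relative abundances remain constrained to sum to 1, so a change in one taxon shifts all the others, whereas CLR values are expressed relative to the sample's geometric mean. How zeros are handled (the pseudocount here) is itself a preprocessing choice that should be reported.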
