METAbolomics data Balancing with Over-sampling Algorithms (META-BOA): an online resource for addressing class imbalance

BIOINFORMATICS (2022)

Journal

BIOINFORMATICS

Volume 38, Issue 23, Pages 5326-5327

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/bioinformatics/btac649

Funding

Natural Sciences and Engineering Research Council of Canada (NSERC) [RGPIN-2019-06796]
NSERC CREATE Matrix Metabolomics Training grant
National Research Council AI for Design Challenge Program [AI-4D-102-3]
NSERC Discovery Grant
NSERC CREATE Matrix Metabolomics Scholarship

Automated Summary

This study presents META-BOA, an online application that addresses class imbalance and assists with data visualization and sample classification. The tool offers four class-balancing methods and generates a new, balanced dataset so that users can observe the effects of augmentation.

Abstract

Motivation: Class imbalance, or unequal sample sizes between classes, is an increasing concern in machine learning for metabolomic and lipidomic data mining because it can result in overfitting to the over-represented class. Numerous methods have been developed for handling class imbalance, but they are not readily accessible to users with limited computational experience. Moreover, there is no resource that enables users to easily evaluate the effect of different over-sampling algorithms.

Results: METAbolomics data Balancing with Over-sampling Algorithms (META-BOA) is a web-based application that enables users to select among four methods for class balancing, followed by data visualization and sample classification to observe the augmentation effects. META-BOA outputs a newly balanced dataset, generating additional samples in the minority class according to the user's choice of Synthetic Minority Over-sampling Technique (SMOTE), Borderline-SMOTE (BSMOTE), Adaptive Synthetic (ADASYN) or Random Over-Sampling Examples (ROSE). To present the effect of over-sampling on the data, META-BOA further displays both principal component analysis and t-distributed stochastic neighbor embedding visualizations of the data pre- and post-over-sampling. Random forest classification is used to compare sample classification in the original and balanced datasets, enabling users to select the most appropriate method for their further analyses.
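To make the over-sampling step concrete, the following is a minimal sketch of the basic SMOTE idea in plain Python: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its k nearest minority-class neighbours. This is an illustration only and not META-BOA's implementation; the function names `smote` and `balance`, the parameters `k` and `seed`, and the toy feature values are all assumptions introduced here.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(majority, minority, k=5, seed=0):
    """Over-sample the minority class until it matches the majority class size."""
    n_new = max(0, len(majority) - len(minority))
    return minority + smote(minority, n_new, k=k, seed=seed)


# Hypothetical toy data: 6 majority vs 3 minority samples, two features each.
majority = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1], [1.1, 2.2], [1.3, 2.0], [1.0, 1.8]]
minority = [[3.0, 0.5], [3.2, 0.7], [2.9, 0.4]]
balanced_minority = balance(majority, minority, k=2)
print(len(majority), len(balanced_minority))
```

After balancing, both classes in this toy example contain six samples, and every synthetic point lies within the range spanned by the original minority samples. BSMOTE and ADASYN differ mainly in which minority samples are chosen as the base for interpolation, and ROSE draws new samples from a smoothed distribution around existing ones instead of interpolating.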
