4.7 Article

A theoretical distribution analysis of synthetic minority oversampling technique (SMOTE) for imbalanced learning

Related references

Note: Only some of the references are listed.
Article Computer Science, Artificial Intelligence

DeepSMOTE: Fusing Deep Learning and SMOTE for Imbalanced Data

Damien Dablain et al.

Summary: In this study, we propose a novel oversampling algorithm called DeepSMOTE for deep learning models, which generates high-quality artificial images to increase the number of samples for minority classes and balance the training set.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Article Computer Science, Artificial Intelligence

Correlation-based Oversampling aided Cost Sensitive Ensemble learning technique for Treatment of Class Imbalance

Debashree Devi et al.

Summary: Class imbalance significantly degrades conventional learning models. SMOTE balances the data but can generate redundant samples, while ensemble learning improves predictive ability without accounting for the imbalance. The proposed CorrOV-CSEn method combines correlation-based oversampling with the AdaBoost ensemble learning model, and experimental results show its effectiveness in addressing these issues.

JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE (2022)

Article Computer Science, Artificial Intelligence

Simulated annealing based undersampling (SAUS): a hybrid multi-objective optimization method to tackle class imbalance

Venkata Krishnaveni Chennuru et al.

Summary: The paper introduces a Simulated Annealing-based Under Sampling (SAUS) method to address the problem of learning from imbalanced datasets. By balancing the Error Rate cost function, the method balances the Sensitivity and Specificity measures in each iteration of the solution, thereby tackling the class imbalance issue.

APPLIED INTELLIGENCE (2022)

Article Medical Informatics

Solving the class imbalance problem using ensemble algorithm: application of screening for aortic dissection

Lijue Liu et al.

Summary: This study addresses the problem of class imbalance in medical data and validates an approach for early screening of aortic dissection (AD). By integrating feature selection, undersampling, cost-sensitive learning, and bagging methods, an effective classification model is developed.

BMC MEDICAL INFORMATICS AND DECISION MAKING (2022)

Article Computer Science, Artificial Intelligence

LDAS: Local density-based adaptive sampling for imbalanced data classification

Yuanting Yan et al.

Summary: The study proposes a local density-based adaptive sampling method (LDAS) to address class imbalance. LDAS assigns a local density to each minority class example, removes overlapping majority examples, and weights each minority example by how closely it approaches the decision boundary. This approach generates synthetic examples in the safe area and the border area simultaneously.

EXPERT SYSTEMS WITH APPLICATIONS (2022)

Article Computer Science, Artificial Intelligence

Two density-based sampling approaches for imbalanced and overlapping data

Sima Mayabadi et al.

Summary: An imbalanced dataset poses challenges to learning algorithms because the majority class dominates. This paper proposes two density-based algorithms that use undersampling and oversampling techniques to eliminate overlap and noise, achieving a balanced and normalized class distribution. These algorithms outperform other popular algorithms on various evaluation criteria.

KNOWLEDGE-BASED SYSTEMS (2022)

Proceedings Paper Computer Science, Artificial Intelligence

A Preliminary Study of SMOTE on Imbalanced Big Datasets When Dealing with Sparse and Dense High Dimensionality

A. Bolivar et al.

Summary: The interest in using machine learning with big datasets has led to a need to adapt classic strategies to the new paradigm defined by volume, speed, and variety. Data quality is critical in building classifiers, so new data preprocessing techniques have been developed or adapted. The class imbalance problem, where the class of interest has fewer examples than the majority class, is one of the significant challenges. SMOTE is a well-recognized technique for addressing this problem by generating instances of the minority class. However, recent studies have shown that SMOTE may not be suitable for high-dimensional datasets. This article aims to analyze the behavior of SMOTE-BD on imbalanced big datasets with both sparse and dense dimensionalities.

PATTERN RECOGNITION, MCPR 2022 (2022)

Article Computer Science, Artificial Intelligence

CDSMOTE: class decomposition and synthetic minority class oversampling technique for imbalanced-data classification

Eyad Elyan et al.

Summary: Class-imbalanced datasets are common in various domains, and using class decomposition and oversampling methods can effectively reduce the dominance of majority class instances. Experimental results demonstrate the effectiveness and superiority of the proposed hybrid approach in addressing class imbalance.

NEURAL COMPUTING & APPLICATIONS (2021)

Article Engineering, Multidisciplinary

Early and accurate prediction of diabetics based on FCBF feature selection and SMOTE

Amit Kishor et al.

Summary: This study introduces a machine learning-based healthcare model for accurate and early detection of diabetes. The experimental results suggest that using a few relevant features can enhance the accuracy of the model. The Random Forest classifier achieves the highest accuracy, sensitivity, and specificity.

INTERNATIONAL JOURNAL OF SYSTEM ASSURANCE ENGINEERING AND MANAGEMENT (2021)

Article Computer Science, Artificial Intelligence

Fuzzy least squares projection twin support vector machines for class imbalance learning

M. A. Ganaie et al.

Summary: This paper introduces a novel fuzzy least squares projection twin support vector machine for class imbalance learning, which outperforms baseline models in experiments.

APPLIED SOFT COMPUTING (2021)

Article Computer Science, Artificial Intelligence

RB-CCR: Radial-Based Combined Cleaning and Resampling algorithm for imbalanced data classification

Michal Koziarski et al.

Summary: In this paper, a Radial-Based Combined Cleaning and Resampling (RB-CCR) approach is proposed for improving classification performance on imbalanced binary data, achieving a better precision-recall trade-off and generally outperforming state-of-the-art resampling methods in terms of AUC and G-mean.

MACHINE LEARNING (2021)

Article Computer Science, Artificial Intelligence

A hybrid method with dynamic weighted entropy for handling the problem of class imbalance with overlap in credit card fraud detection

Zhenchuan Li et al.

Summary: This paper proposes a novel hybrid method to handle the problem of class imbalance with overlap in electronic fraud transaction detection. The method involves training an anomaly detection model on minority samples, excluding outliers and majority samples, then using a non-linear classifier to distinguish the remaining overlapping subset. The proposed method significantly outperforms state-of-the-art ones based on extensive experiments.

EXPERT SYSTEMS WITH APPLICATIONS (2021)

Article Computer Science, Artificial Intelligence

No Free Lunch in imbalanced learning

Nuno Moniz et al.

Summary: The No Free Lunch theorems have sparked debates on the impact of data preprocessing methods in the field of imbalanced domain learning. The study concludes that, in the context of imbalanced domain learning, resampling strategies have an equivalent impact on predictive model performance.

KNOWLEDGE-BASED SYSTEMS (2021)

Review Computer Science, Information Systems

Review of Classification Methods on Unbalanced Data Sets

Le Wang et al.

Summary: This paper explores the classification of unbalanced data sets, analyzing various methods from data sampling, algorithm, feature, cost-sensitive function, and deep learning perspectives, comparing the advantages and disadvantages of different techniques, and outlining future research directions.

IEEE ACCESS (2021)

Article Computer Science, Artificial Intelligence

LoRAS: an oversampling approach for imbalanced datasets

Saptarshi Bej et al.

Summary: This article introduces LoRAS, a method that overcomes the limitations of the SMOTE oversampling technique, and shows through experiments that LoRAS yields better machine learning models on imbalanced datasets, improving F1-Score and balanced accuracy. Compared to most SMOTE extensions, LoRAS achieves better results in generating classification models.

MACHINE LEARNING (2021)

Article Computer Science, Information Systems

Neighbourhood-based undersampling approach for handling imbalanced and overlapped data

Pattaramon Vuttipittayamongkol et al.

INFORMATION SCIENCES (2020)

Article Computer Science, Information Systems

Data imbalance in classification: Experimental evaluation

Fadi Thabtah et al.

INFORMATION SCIENCES (2020)

Article Computer Science, Information Systems

A Comprehensive Analysis of Synthetic Minority Oversampling Technique (SMOTE) for handling class imbalance

Dina Elreedy et al.

INFORMATION SCIENCES (2019)

Article Computer Science, Artificial Intelligence

SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary

Alberto Fernandez et al.

JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH (2018)

Article Computer Science, Artificial Intelligence

A Systematic Study of Online Class Imbalance Learning With Concept Drift

Shuo Wang et al.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2018)

Article Computer Science, Artificial Intelligence

Adaptive Learning-Based k-Nearest Neighbor Classifiers With Resilience to Class Imbalance

Sankha Subhra Mullick et al.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2018)

Article Computer Science, Artificial Intelligence

A systematic study of the class imbalance problem in convolutional neural networks

Mateusz Buda et al.

NEURAL NETWORKS (2018)

Review Computer Science, Artificial Intelligence

Learning from class-imbalanced data: Review of methods and applications

Guo Haixiang et al.

EXPERT SYSTEMS WITH APPLICATIONS (2017)

Article Computer Science, Artificial Intelligence

RWO-Sampling: A random walk over-sampling approach to imbalanced data classification

Huaxiang Zhang et al.

INFORMATION FUSION (2014)

Article Computer Science, Artificial Intelligence

PDFOS: PDF estimation based over-sampling for imbalanced two-class problems

Ming Gao et al.

NEUROCOMPUTING (2014)

Article Neurosciences

Analysis of sampling techniques for imbalanced data: An n=648 ADNI study

Rashmi Dubey et al.

NEUROIMAGE (2014)

Article Computer Science, Artificial Intelligence

The quest for the optimal class distribution: an approach for enhancing the effectiveness of learning via resampling methods for imbalanced data sets

Inaki Albisua et al.

PROGRESS IN ARTIFICIAL INTELLIGENCE (2013)

Article Computer Science, Artificial Intelligence

Addressing data complexity for imbalanced data sets: analysis of SMOTE-based oversampling and evolutionary undersampling

Julian Luengo et al.

SOFT COMPUTING (2011)

Article Computer Science, Artificial Intelligence

Evolutionary Undersampling for Classification with Imbalanced Datasets: Proposals and Taxonomy

Salvador Garcia et al.

EVOLUTIONARY COMPUTATION (2009)

Article Computer Science, Artificial Intelligence

Strategies for learning in class imbalance problems

R Barandela et al.

PATTERN RECOGNITION (2003)

Article Computer Science, Artificial Intelligence

Learning when training data are costly: The effect of class distribution on tree induction

GM Weiss et al.

JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH (2003)

Article Computer Science, Artificial Intelligence

Density estimation and random variate generation using multilayer networks

M Magdon-Ismail et al.

IEEE TRANSACTIONS ON NEURAL NETWORKS (2002)