An approach for classification of highly imbalanced data using weighting and undersampling

AMINO ACIDS (2010)

Journal

AMINO ACIDS

Volume 39, Issue 5, Pages 1385-1391

Publisher

SPRINGER WIEN

DOI: 10.1007/s00726-010-0595-2

Keywords

Imbalanced datasets; SVM; Undersampling technique

Funding

Agency for Science, Technology, and Research, Singapore (A*Star) [052 101 0020]

Abstract

Real-world datasets are often imbalanced. Several approaches exist for handling such data, including weighting, sub-sampling, and data modeling. Learning in the presence of class imbalance presents a great challenge to machine learning. Techniques such as support-vector machines perform excellently on balanced data but may fail when applied to imbalanced datasets. In this paper, we propose a new undersampling technique for selecting instances from the majority class. The performance of this approach was evaluated on several real, imbalanced biological datasets, in which the ratio of negative to positive samples varies from ~9:1 to ~100:1. Useful classifiers have both high sensitivity and high specificity. Our results demonstrate that the proposed selection technique improves sensitivity compared with a weighted support-vector machine and with results available in the literature for the same datasets.
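
For readers unfamiliar with the terms in the abstract, the sketch below illustrates the generic ideas of class weighting, majority-class undersampling, and sensitivity/specificity on a toy 9:1 label set. It is not the selection technique proposed in the paper, which the abstract does not describe: the undersampling here is plain random selection, the weights are simple inverse class frequencies, and all function names are hypothetical.

```python
import random


def undersample(samples, labels, ratio=1.0, seed=0):
    """Keep every positive (label 1) and a random subset of negatives (label 0).

    Assumption: plain random selection, with at most ratio * n_positive
    negatives kept. This is a stand-in, not the paper's selection method.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(neg), int(round(ratio * len(pos))))
    kept = sorted(pos + rng.sample(neg, n_keep))
    return [samples[i] for i in kept], [labels[i] for i in kept]


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    n = len(labels)
    counts = {c: labels.count(c) for c in set(labels)}
    return {c: n / (len(counts) * k) for c, k in counts.items()}


def sensitivity_specificity(y_true, y_pred):
    """Return (TP / (TP + FN), TN / (TN + FP)); NaN if a class is absent."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy data with a 9:1 negative-to-positive ratio (invented for illustration).
labels = [1] * 10 + [0] * 90
samples = list(range(100))

balanced_samples, balanced_labels = undersample(samples, labels, ratio=1.0)
weights = class_weights(labels)

# A classifier that always predicts "negative" is 90% accurate on this data,
# yet it never finds a positive, which is why sensitivity is reported.
always_negative = [0] * len(labels)
sens, spec = sensitivity_specificity(labels, always_negative)
print(len(balanced_labels), weights, sens, spec)
```

In practice the weights would be passed to a classifier that supports per-class penalties, and the undersampled set would be used for training only, with sensitivity and specificity measured on untouched held-out data.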
