Cost-sensitive linguistic fuzzy rule based classification systems under the MapReduce framework for imbalanced big data

FUZZY SETS AND SYSTEMS (2015)

Journal

FUZZY SETS AND SYSTEMS

Volume 258, Issue -, Pages 5-38

Publisher

ELSEVIER

DOI: 10.1016/j.fss.2014.01.015

Keywords

Fuzzy rule based classification systems; Big data; MapReduce; Hadoop; Imbalanced datasets; Cost-sensitive learning

Funding

Spanish Ministry of Science and Technology [TIN2011-28488]
Andalusian Research Plans [P11-TIC-7765, P10-TIC-6858]
Spanish Ministry of Education

Abstract

Classification with big data has become one of the latest trends in learning from available information. Data growth in recent years has sharply increased interest in effectively acquiring knowledge to analyze and predict trends. The variety and veracity associated with big data introduce a degree of uncertainty that has to be handled in addition to the volume and velocity requirements. Such data often also exhibits the problem of classification with imbalanced datasets: a class distribution in which the most important concepts to be learned are represented by a negligible number of examples relative to the number of examples from the other classes. To deal adequately with imbalanced big data, we propose the Chi-FRBCS-BigDataCS algorithm, a fuzzy rule based classification system that is able to deal with the uncertainty introduced in large volumes of data without disregarding learning in the underrepresented class. The method uses the MapReduce framework to distribute the computational operations of the fuzzy model and includes cost-sensitive learning techniques in its design to address the imbalance present in the data. The good performance of this approach is supported by an experimental analysis carried out over twenty-four imbalanced big data case studies. The results show that the proposal is able to handle these problems, obtaining competitive results in both the classification performance of the model and the time needed for the computation. (C) 2014 Elsevier B.V. All rights reserved.
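
The abstract does not spell out the algorithm, so the following is only a minimal single-machine sketch of the general idea it describes: a Chi-style fuzzy rule learner whose rule generation is split into a map step (one rule set per data partition) and a reduce step (merging the rule sets), with per-class misclassification costs folded into the rule weights. All function names, the three triangular labels on [0, 1], the cost values, and the weighting formula are assumptions made for illustration; they are not taken from the paper and may differ from Chi-FRBCS-BigDataCS.

```python
from functools import reduce

def triangular_labels(n_labels):
    """Centers of evenly spaced triangular fuzzy sets on [0, 1] (assumed layout)."""
    return [i / (n_labels - 1) for i in range(n_labels)]

def membership(x, center, width):
    """Triangular membership degree of x in the fuzzy set at `center`."""
    return max(0.0, 1.0 - abs(x - center) / width)

def best_label(x, centers):
    """Index of the fuzzy label with the highest membership for x."""
    width = centers[1] - centers[0]
    return max(range(len(centers)), key=lambda i: membership(x, centers[i], width))

def map_partition(examples, centers, costs):
    """Map step (sketch): one candidate rule per example in this partition.

    Returns {antecedent: {class: cost-weighted sum of matching degrees}}.
    Simplification: only the examples that generate an antecedent contribute to it.
    """
    width = centers[1] - centers[0]
    rules = {}
    for features, label in examples:
        antecedent = tuple(best_label(x, centers) for x in features)
        degree = 1.0
        for x, idx in zip(features, antecedent):
            degree *= membership(x, centers[idx], width)  # product T-norm
        by_class = rules.setdefault(antecedent, {})
        by_class[label] = by_class.get(label, 0.0) + degree * costs[label]
    return rules

def reduce_rules(a, b):
    """Reduce step (sketch): merge two partial rule sets by summing class weights."""
    merged = {antecedent: dict(by_class) for antecedent, by_class in a.items()}
    for antecedent, by_class in b.items():
        target = merged.setdefault(antecedent, {})
        for label, weight in by_class.items():
            target[label] = target.get(label, 0.0) + weight
    return merged

def finalize(merged):
    """Pick each rule's consequent and a penalized, cost-weighted rule weight."""
    rule_base = {}
    for antecedent, by_class in merged.items():
        total = sum(by_class.values())
        label = max(by_class, key=by_class.get)
        weight = (2 * by_class[label] - total) / total  # (own - others) / total
        if weight > 0:
            rule_base[antecedent] = (label, weight)
    return rule_base

# Toy data: two partitions of (features, class) pairs; "pos" is the minority class.
partitions = [
    [((0.1, 0.2), "neg"), ((0.9, 0.8), "pos")],
    [((0.0, 0.1), "neg"), ((0.15, 0.1), "neg"),
     ((0.85, 0.9), "neg"), ((1.0, 1.0), "pos")],
]
centers = triangular_labels(3)
costs = {"neg": 1.0, "pos": 3.0}  # hypothetical costs favoring the minority class

partial_rule_sets = (map_partition(p, centers, costs) for p in partitions)
rule_base = finalize(reduce(reduce_rules, partial_rule_sets))
print(rule_base)
```

In an actual MapReduce deployment the map and reduce functions would run as distributed tasks over data blocks rather than as plain function calls; the sketch only shows how partition-level rule sets can be combined and how a higher cost on the minority class raises its influence on conflicting rules.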