Article

Intelligent ensembling of auto-ML system outputs for solving classification problems

Journal

INFORMATION SCIENCES
Volume 609, Pages 766-780

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2022.07.061

Keywords

Ensemble methods; Auto-ML; Grammatical evolution; Supervised learning

Funding

  1. University of Alicante
  2. Ministry of Science and Innovation of the Spanish Government
  3. Fondo Europeo de Desarrollo Regional (FEDER)
  4. Generalitat Valenciana (Conselleria d'Educacio, Investigacio, Cultura i Esport) [CIPROM/2021/21, PID2021-122263OB-C22, RTI2018-094653-B-C21/C22, PID2021-123956OB-I00]
  5. University of Havana


This paper presents a two-phase optimization system that utilizes Auto-ML tools to solve classification problems and generate more robust classifiers. The experimental results show that ensembling a subset of already tested models can build a better solution, and ensuring diversity using the double-fault measure produces better results.
Automatic Machine Learning (Auto-ML) tools enable the automatic solution of real-world problems through machine learning techniques. These tools tend to be more time-consuming than standard machine learning libraries; therefore, fully exploiting all the available resources is a valuable feature. This paper presents a two-phase optimization system for solving classification problems. The system is designed to produce more robust classifiers by exploiting the different architectures that are generated while solving classification problems with Auto-ML tools, particularly AutoGOAL. In the first phase, the system follows a probabilistic strategy to find the best combination of algorithms and hyperparameters to generate a collection of base models according to certain diversity criteria; in the second, it follows similar Auto-ML strategies to ensemble those models. The HAHA 2019 challenge corpus and the Adult dataset were used to evaluate the system. The experimental results show that: i) a better solution can be built by ensembling a subset of the already tested models; ii) the performance of ensemble methods depends on the collection of base models used; and iii) ensuring diversity using the double-fault measure produces better results than the disagreement measure. The source code is available online for the research community. (c) 2022 Elsevier Inc. All rights reserved.
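The abstract contrasts two pairwise diversity measures for selecting base models: the disagreement measure (fraction of samples on which exactly one of the two classifiers is correct) and the double-fault measure (fraction of samples both classifiers misclassify). The following is a minimal sketch of these two standard measures; the function name and example labels are illustrative, not taken from the paper's code.

```python
import numpy as np

def pairwise_diversity(y_true, pred_a, pred_b):
    """Return (disagreement, double_fault) for two classifiers.

    disagreement: proportion of samples where exactly one classifier is correct.
    double_fault: proportion of samples where both classifiers are wrong.
    """
    y_true = np.asarray(y_true)
    a_correct = np.asarray(pred_a) == y_true
    b_correct = np.asarray(pred_b) == y_true
    n = len(y_true)
    disagreement = np.sum(a_correct != b_correct) / n   # one right, one wrong
    double_fault = np.sum(~a_correct & ~b_correct) / n  # both wrong
    return disagreement, double_fault

# Illustrative toy labels (not from the paper's datasets):
y_true = [0, 1, 1, 0]
pred_a = [0, 1, 0, 1]
pred_b = [0, 0, 0, 1]
dis, df = pairwise_diversity(y_true, pred_a, pred_b)
print(dis, df)  # 0.25 0.5
```

A lower double-fault value indicates the pair rarely fails on the same samples, which is the property the paper's first phase reportedly favors when assembling the collection of base models.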

Authors

