
AutoWeka4MCPS-AVATAR: Accelerating automated machine learning pipeline composition and optimisation

EXPERT SYSTEMS WITH APPLICATIONS (2021)

Journal

EXPERT SYSTEMS WITH APPLICATIONS

Volume 185, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.eswa.2021.115643

Keywords

Automated machine learning; Pipeline composition and optimisation; Machine learning pipeline evaluation; AutoML; Configuration space reduction

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic; Operations Research & Management Science

Funding

University of Technology Sydney (UTS), Australia

Summary

Automated machine learning (ML) pipeline composition and optimisation aim to automate the process of finding the most promising ML pipelines within allocated resources. However, existing methods evaluate pipelines by executing them, which often takes a significant amount of time and prevents the exploration of complex pipelines. To address this issue, a surrogate model named AVATAR is proposed to evaluate the validity of ML pipelines without executing them, so that invalid pipelines are rejected quickly and more pipelines can be evaluated within the same time budget. Integrating AVATAR into sequential model-based algorithm configuration (SMAC) yields better solutions than using SMAC alone.

Abstract

Automated machine learning (ML) pipeline composition and optimisation aim to automate the process of finding the most promising ML pipelines within allocated resources (i.e., time, CPU and memory). Existing methods, such as the Bayesian-based and genetic-based optimisation implemented in Auto-WEKA, Auto-sklearn and TPOT, evaluate pipelines by executing them. As a result, pipeline composition and optimisation with these methods frequently require a tremendous amount of time, which prevents them from exploring complex pipelines to find better predictive models. To further explore this research challenge, we have conducted experiments showing that many of the generated pipelines are invalid in the first place, and attempting to execute them is a waste of time and resources. To address this issue, we propose a novel method to evaluate the validity of ML pipelines, without executing them, using a surrogate model (AVATAR). AVATAR generates a knowledge base by automatically learning the capabilities and effects of ML algorithms on dataset characteristics. This knowledge base is used for a simplified mapping from an original ML pipeline to a surrogate model, which is a Petri net based pipeline. Instead of executing the original ML pipeline to evaluate its validity, AVATAR evaluates its surrogate model, which is constructed from the capabilities and effects of the ML pipeline components and simplified input/output mappings. Evaluating this surrogate model is less resource-intensive than executing the original pipeline. As a result, AVATAR enables pipeline composition and optimisation methods to evaluate more pipelines by quickly rejecting invalid ones. We integrate AVATAR into sequential model-based algorithm configuration (SMAC). Our experiments show that when SMAC employs AVATAR, it finds better solutions than on its own. This is because, with AVATAR, SMAC can evaluate more pipelines within the same time budget and allocated resources.
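To make the idea of checking validity without execution concrete, the following is a minimal Python sketch and not the AVATAR implementation. It assumes a hand-written knowledge base in which each component lists the dataset characteristics it cannot handle (its capabilities) and the characteristics it removes or adds (its effects). The component names, the characteristic labels and the `is_valid` helper are all hypothetical, and the loop is a simple token-passing simplification rather than a full Petri net.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Hypothetical knowledge-base entry for one pipeline component."""
    name: str
    cannot_handle: frozenset        # characteristics that make the component fail
    removes: frozenset = frozenset()  # characteristics absent from its output
    adds: frozenset = frozenset()     # characteristics introduced in its output


# Illustrative entries only; a real knowledge base would be learned from data.
IMPUTER = Component(
    "imputer", cannot_handle=frozenset(),
    removes=frozenset({"missing_values"}))
ENCODER = Component(
    "one_hot_encoder", cannot_handle=frozenset({"missing_values"}),
    removes=frozenset({"nominal_attributes"}),
    adds=frozenset({"numeric_attributes"}))
CLASSIFIER = Component(
    "linear_classifier",
    cannot_handle=frozenset({"missing_values", "nominal_attributes"}))


def is_valid(pipeline, characteristics):
    """Propagate dataset characteristics through the pipeline without running it.

    Returns (True, "ok") if every component can handle its input,
    otherwise (False, reason) for the first component that cannot.
    """
    state = set(characteristics)
    for component in pipeline:
        blocked = state & component.cannot_handle
        if blocked:
            return False, f"{component.name} cannot handle {sorted(blocked)}"
        # Apply the component's effects to obtain the next characteristics.
        state = (state - component.removes) | component.adds
    return True, "ok"


dataset = {"missing_values", "nominal_attributes"}
print(is_valid([IMPUTER, ENCODER, CLASSIFIER], dataset))
print(is_valid([ENCODER, CLASSIFIER], dataset))
```

In this sketch the first pipeline passes the check, while the second is rejected at the encoder because the missing values are never imputed. An optimiser could use such a check to skip a candidate before spending any of its time budget on executing it.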
