Automatic feature subset selection for decision tree-based ensemble methods in the prediction of bioactivity

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS (2010)

Journal

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS

Volume 103, Issue 2, Pages 129-136

Publisher

ELSEVIER

DOI: 10.1016/j.chemolab.2010.06.008

Keywords

Feature selection; Bagging; Boosting; Random Forest (RF); Classification and Regression Tree (CART); Ensemble learning

Funding

National Nature Foundation Committee of P.R. China [20875104, 10771217]
Ministry of Science and Technology of China [2007DFA40680]

Abstract

In structure-activity relationship (SAR) studies, a learning algorithm usually faces the problem of selecting a compact subset of descriptors related to the property of interest while ignoring the rest. This paper presents a new method of molecular descriptor selection that couples three commonly used decision tree (DT)-based ensemble methods with a backward elimination strategy (BES). The proposed method eliminates descriptor redundancy automatically and searches for a more compact descriptor subset tailored to DT-based ensemble methods. Six real SAR datasets related to different categorical bioactivities of compounds are used to evaluate the proposed method. The results indicate that DT-based ensemble methods coupled with BES, especially the boosting tree model, yield better classification performance for compounds related to ADMET. © 2010 Elsevier B.V. All rights reserved.
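The general idea of coupling a tree ensemble with backward elimination can be illustrated with a minimal sketch. The abstract does not state the paper's ranking criterion, stopping rule, or tree settings, so everything below is an assumption made for illustration: bagged decision stumps (depth-1 trees) stand in for full CART ensembles, descriptor importance is taken as the number of stumps that split on a descriptor, and the retained subset is the smallest one that matches the best validation accuracy. The function names and the toy binary descriptors are hypothetical.

```python
import random

def fit_stump(X, y, features):
    """Depth-1 tree: the (feature, threshold, polarity) with the fewest training errors."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            for pol in (0, 1):
                err = sum((int(row[f] > t) ^ pol) != label
                          for row, label in zip(X, y))
                if best is None or err < best[0]:
                    best = (err, f, t, pol)
    return best[1:]

def fit_bagged_stumps(X, y, features, n_trees=25, seed=0):
    """Bagging: fit each stump on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], features))
    return stumps

def predict(stumps, row):
    """Majority vote over the stumps (labels are 0 or 1)."""
    votes = sum(int(row[f] > t) ^ pol for f, t, pol in stumps)
    return int(2 * votes > len(stumps))

def accuracy(stumps, X, y):
    return sum(predict(stumps, row) == label for row, label in zip(X, y)) / len(y)

def backward_elimination(X_tr, y_tr, X_va, y_va, n_features):
    """Assumed BES loop: refit, score, then drop the least-used descriptor."""
    active = list(range(n_features))
    best_subset, best_acc = None, -1.0
    while active:
        stumps = fit_bagged_stumps(X_tr, y_tr, active)
        acc = accuracy(stumps, X_va, y_va)
        if acc >= best_acc:  # ties go to the smaller, later subset
            best_subset, best_acc = list(active), acc
        # Importance proxy (an assumption): how many stumps split on each descriptor.
        counts = {f: 0 for f in active}
        for f, _, _ in stumps:
            counts[f] += 1
        active.remove(min(active, key=lambda f: counts[f]))
    return best_subset, best_acc

def make_toy_data(n=60, seed=1):
    """Hypothetical data: descriptor 0 equals the label, descriptors 1 and 2 are noise bits."""
    rng = random.Random(seed)
    X = [[i % 2, rng.randint(0, 1), rng.randint(0, 1)] for i in range(n)]
    y = [row[0] for row in X]
    return X, y

X, y = make_toy_data()
subset, acc = backward_elimination(X[:40], y[:40], X[40:], y[40:], n_features=3)
print(subset, acc)
```

On this toy data the loop should discard the two noise descriptors and retain only descriptor 0. A boosting or Random Forest variant would replace `fit_bagged_stumps` and the importance proxy while leaving the elimination loop unchanged.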