Review

Variable Selection Methods in QSAR: An Overview

Journal

CURRENT TOPICS IN MEDICINAL CHEMISTRY
Volume 8, Issue 18, Pages 1606-1627

Publisher

BENTHAM SCIENCE PUBL LTD
DOI: 10.2174/156802608786786552

Keywords

Variable selection; PLS; GA; MUSEUM; GOLPE; NN; QSAR

Funding

  1. Xunta de Galicia [PGIDIT07PXIB314220PR, PGIDT04BTF301031PR]
  2. Cuban Higher Education Ministry
  3. Ministerio de Educacion y Ciencia [HP2006-0124]

Variable selection is a procedure for choosing the most important features so as to obtain as much information as possible from a reduced number of features. The selection stage is crucial: a quantitative structure-activity relationship (QSAR) model (regression or discriminant) built afterwards will perform poorly if features of little significance are selected. In the modern era of drug design, combinatorial chemistry and high-throughput screening have generated an unprecedented amount of experimental information. In addition, many molecular descriptors have been defined in the last two decades. All this information can be analyzed by QSAR techniques using adequate statistical procedures. These techniques and procedures should be fast, automated, and applicable to large data sets of structurally diverse compounds. Given the large number of variable selection techniques now available, identifying the best one is a very difficult task. The aim of this review is to summarize some of the present knowledge about variable selection methods applied to well-known statistical techniques such as linear regression, PLS, kNN, and artificial neural networks, in order to disseminate the advances in this important stage of QSAR model building.
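
To make the idea of variable selection concrete, the following is a minimal sketch of one very simple approach: a univariate filter that ranks descriptors by the absolute Pearson correlation with the activity and keeps the top k. It is an illustration only and is not taken from the review; the function names (pearson, select_top_k), the descriptor names (d0 to d5), and the synthetic data are all assumptions made for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no information
    return sxy / math.sqrt(sxx * syy)


def select_top_k(descriptors, activity, k):
    """Return the names of the k descriptors with the largest |correlation| to activity."""
    scores = {name: abs(pearson(col, activity)) for name, col in descriptors.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Synthetic data (hypothetical): six random descriptors, of which only
# d0 and d1 actually contribute to the simulated activity.
rng = random.Random(0)
n = 200
descriptors = {f"d{j}": [rng.gauss(0, 1) for _ in range(n)] for j in range(6)}
activity = [2.0 * a - 1.5 * b + rng.gauss(0, 0.1)
            for a, b in zip(descriptors["d0"], descriptors["d1"])]

selected = select_top_k(descriptors, activity, k=2)
print(selected)
```

A filter of this kind scores each descriptor independently, so it cannot detect descriptors that are only informative in combination; wrapper approaches such as genetic algorithms search over subsets instead, at a higher computational cost.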
