Article; Proceedings Paper

Innovative genetic algorithms for chemoinformatics

In this paper, we report on the development of a genetic algorithm (GA) for pattern recognition analysis of multivariate chemical data. The GA identifies feature subsets that optimize the separation of the classes in a plot of the two or three largest principal components of the data. Because principal components maximize variance, the bulk of the information encoded by the selected features is about differences between classes in the data set. The principal component (PC) plot functions as an embedded information filter. Sets of features are selected based on their principal component plots, with a good principal component plot generated by features whose variance or information is primarily about differences between classes in the data set. This restricts the GA's search to feature subsets of this type, significantly reducing the size of the search space. In addition, the pattern recognition GA focuses on those classes and/or samples that are difficult to classify by boosting their weights over successive generations, using a perceptron to learn the class and sample weights. Samples that consistently classify correctly are not as heavily weighted in the analysis as samples that are difficult to classify. The pattern recognition GA integrates aspects of artificial intelligence and evolutionary computation to yield a smart one-pass procedure for feature selection. The efficacy and efficiency of the pattern recognition GA are demonstrated via problems from chemical communication and environmental analysis. (C) 2002 Elsevier Science B.V. All rights reserved.
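The two ideas in the abstract, scoring a feature subset by class separation in its PC plot and boosting the weights of hard-to-classify samples, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the scoring rule or the weight update, so the k-nearest-neighbor hit fraction, the additive update, and all names here (`pc_scores`, `hit_fractions`, `fitness`, `boost_weights`, `k`, `rate`) are assumptions made for illustration. The GA search loop itself (selection, crossover, mutation) is omitted.

```python
import numpy as np

def pc_scores(X, n_components=2):
    """Project mean-centered data onto its largest principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def hit_fractions(X, y, mask, k=3):
    """Fraction of each sample's k nearest PC-plot neighbors sharing its class.

    Assumed scoring rule: the abstract only says that class separation
    in the PC plot is optimized, not how it is measured.
    """
    scores = pc_scores(X[:, mask], n_components=2)
    d = np.linalg.norm(scores[:, None, :] - scores[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a sample is not its own neighbor
    nn = np.argsort(d, axis=1)[:, :k]
    return (y[nn] == y[:, None]).mean(axis=1)

def fitness(hits, weights):
    """Weighted sum of per-sample hit fractions (1.0 means perfect separation)."""
    return float(np.dot(weights, hits))

def boost_weights(weights, hits, rate=0.5):
    """Shift weight toward poorly classified samples.

    Assumed perceptron-style additive update, renormalized to sum to 1.
    """
    new = weights + rate * (1.0 - hits) / len(hits)
    return new / new.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = np.repeat([0, 1], 10)
    X = rng.normal(size=(20, 6))
    X[:, 0] += 20 * y  # features 0 and 1 carry the class difference
    X[:, 1] -= 20 * y
    w = np.full(len(y), 1.0 / len(y))
    for mask in ([True, True, False, False, False, False],
                 [False, False, True, True, True, True]):
        hits = hit_fractions(X, y, np.array(mask), k=3)
        print(mask, fitness(hits, w))
```

In a full GA, each chromosome would be one such boolean mask, `fitness` would rank the population, and `boost_weights` would be applied between generations so that persistently misclassified samples count for more in later scoring.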
