4.7 Article Proceedings Paper

QSAR and QSPR studies of a highly structured physicochemical domain


The relevance of nonlinear terms in deriving quantitative structure-activity relationship/quantitative structure-property relationship (QSAR/QSPR) models has rarely been considered so far. In this study, the impact of quadratic and interaction terms is taken into account. The first effect of including such highly structured terms is a significant extension of the parametric domain, which grows from the initial N parameters to N(N + 3)/2. This substantial enlargement beyond the conventional linear boundaries raises the computational cost, because the number of possible combinations of terms, and hence of theoretical QSAR/QSPR models, increases. To address this issue, novel genetic-algorithm-based software, MGZ (multigenetic zooming), was developed and used for both variable selection and model building. To speed up the search of this domain, MGZ relies on multiple independently evolving populations and on genetic storms. In addition, a novel fitness function was developed to score models on the basis of their internal predictive capability (assessed on the training set), their structural complexity, and the presence of nonlinear terms. The models were further validated by monitoring model redundancy and performing intensive randomization runs. The Selwood data set was used as a reference set to derive QSAR models. Furthermore, a QSPR study was conducted on a solubility data set covering a large array of organic compounds. The results reported in the present paper demonstrate that the approach succeeds in finding linear models that are at least as good as the models previously derived using standard statistical approaches, and in deriving new nonlinear models with good statistical figures.
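The growth of the parametric domain can be checked with a short sketch. Starting from N descriptors, adding N quadratic terms and N(N - 1)/2 pairwise interaction terms gives N + N + N(N - 1)/2 = N(N + 3)/2 candidate terms. The Python below is only an illustration of that count and of the resulting combinatorial growth; the function names (`expand_terms`, `expanded_count`, `n_candidate_models`) and the descriptor labels are hypothetical and are not taken from MGZ.

```python
from itertools import combinations
from math import comb


def expand_terms(names):
    """Build the expanded term list: linear, quadratic, and pairwise interaction terms.

    Illustrative helper only; not part of the MGZ software.
    """
    linear = list(names)
    quadratic = [f"{n}^2" for n in names]
    interactions = [f"{a}*{b}" for a, b in combinations(names, 2)]
    return linear + quadratic + interactions


def expanded_count(n):
    """Closed form N(N + 3)/2 for the size of the expanded domain."""
    # n and n + 3 always have opposite parity, so the product is even.
    return n * (n + 3) // 2


def n_candidate_models(n_terms, k):
    """Number of distinct models that use exactly k terms out of n_terms."""
    return comb(n_terms, k)


if __name__ == "__main__":
    terms = expand_terms(["x1", "x2", "x3"])  # hypothetical descriptor labels
    print(terms)
    print(len(terms), expanded_count(3))      # 9 9

    # Three-term models from 10 descriptors: linear-only vs expanded domain.
    print(n_candidate_models(10, 3))                   # 120
    print(n_candidate_models(expanded_count(10), 3))   # 43680
```

Even for this small case, the number of three-term models rises from 120 to 43680, which suggests why an exhaustive search quickly becomes impractical and a heuristic search such as a genetic algorithm is attractive.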
