4.4 Article

Regularized regression analysis of digitized molecular structures in organic reactions for quantification of steric effects

Journal

JOURNAL OF COMPUTATIONAL CHEMISTRY
Volume 38, Issue 21, Pages 1825-1833

Publisher

WILEY
DOI: 10.1002/jcc.24791

Keywords

CoMFA; QSPR; LASSO; Elastic Net; organic reactions; steric effects

Funding

  1. Grants-in-Aid for Scientific Research [15H03810] Funding Source: KAKEN

Abstract

In organic chemistry, Comparative Molecular Field Analysis (CoMFA) can be defined as a regression analysis between reaction outcomes and molecular fields, wherein we can extract and visualize important structural information from the coefficients of the constructed regression models. In CoMFA, partial least-squares (PLS) regression, which determines all coefficients in the model, is used for fitting the regression models. However, in organic reactions, steric effects are observed only near the reactive site, indicating that a large number of regression coefficients in the CoMFA of organic reactions should be assigned as 0. The regularized regression method, LASSO/Elastic Net, allows us to fit the regression model while assigning 0 values to unimportant coefficients. Although LASSO/Elastic Net should be suitable for CoMFA, there is no example of its use for organic reaction analysis. Herein, we examine the performance of LASSO/Elastic Net for the quantification of steric effects in CoMFA. We employ digitized molecular structures (the indicator field) as molecular fields that represent steric effects. LASSO/Elastic Net regressions provide highly interpretable models that include less noise than those from PLS regression. © 2017 Wiley Periodicals, Inc.
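The sparsity argument in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the data are synthetic, the grid size, the noise level, the penalty `lam`, and the choice of cells 3 and 7 as the "reactive site" are all assumptions made for illustration, and the helper names (`soft_threshold`, `lasso_cd`) are hypothetical. The sketch fits a LASSO model by plain coordinate descent to a binary indicator field and reports which grid cells keep a nonzero coefficient.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward 0 by t, clipping at 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - b0 - Xb||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    # Centre columns and response so the intercept is handled separately.
    col_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_mean[j] for j in range(p)] for row in X]
    resid = [v - y_mean for v in y]  # residual for the starting point b = 0
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant column carries no information
            # Covariance of column j with the residual that excludes term j.
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * Xc[i][j]
                beta[j] = new_bj
    intercept = y_mean - sum(col_mean[j] * beta[j] for j in range(p))
    return intercept, beta


# Synthetic indicator field (assumption): 80 molecules on 30 grid cells,
# where 1.0 means the cell is occupied by the molecule and 0.0 means it is not.
rng = random.Random(0)
n_mol, n_cells = 80, 30
X = [[float(rng.random() < 0.5) for _ in range(n_cells)] for _ in range(n_mol)]

# Only two cells "near the reactive site" influence the outcome (assumption).
true_beta = [0.0] * n_cells
true_beta[3], true_beta[7] = -2.0, 1.5
y = [sum(x * b for x, b in zip(row, true_beta)) + rng.gauss(0.0, 0.1) for row in X]

intercept, beta = lasso_cd(X, y, lam=0.1)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("cells with nonzero coefficients:", selected)
print("their coefficients:", [round(beta[j], 3) for j in selected])
```

Coefficients that fall inside the threshold are set to exactly zero, so the surviving cells can be read directly as the region the model considers sterically relevant. The retained coefficients are shrunk toward zero relative to the values used to generate the data, which is the usual price of the L1 penalty. An Elastic Net variant would add an L2 term to the same objective.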
