Article

pKa Prediction from Quantum Chemical Topology Descriptors

Journal

JOURNAL OF CHEMICAL INFORMATION AND MODELING
Volume 49, Issue 8, Pages 1914-1924

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/ci900172h

Funding

  1. EPSRC
  2. GlaxoSmithKline (GSK), Stevenage, Great Britain

Knowing the pKa of a compound gives insight into properties relevant to many industries, in particular the pharmaceutical industry during drug development. In light of this, we have used the theory of Quantum Chemical Topology (QCT) to provide ab initio descriptors that accurately predict pKa values for 228 carboxylic acids. This Quantum Topological Molecular Similarity (QTMS) study compared five increasingly expensive levels of theory and concluded that HF/6-31G(d) and B3LYP/6-311+G(2d,p) provided an accurate representation of the compounds studied. We created global and subset models for the carboxylic acids using Partial Least Squares (PLS), Support Vector Machines (SVM), and Radial Basis Function Neural Networks (RBFNN). The models were extensively validated using 4-, 7-, and 10-fold cross-validation, with the validation sets selected by systematic and random sampling. HF/6-31G(d) in conjunction with SVM provided the best statistics when taking into account the large increase in CPU time required to optimize the geometries at the B3LYP/6-311+G(2d,p) level. The SVM models provided an average q² of 0.886 and an RMSE of 0.293 for all the carboxylic acids, a q² of 0.825 and an RMSE of 0.378 for the ortho-substituted acids, a q² of 0.923 and an RMSE of 0.112 for the para- and meta-substituted acids, and a q² of 0.906 and an RMSE of 0.268 for the aliphatic acids. Our method compares favorably to the pKa prediction software of ACD/Laboratories, VCCLAB, SPARC, and ChemAxon, based on the RMSE calculated by the leave-one-out method.
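The validation statistics quoted above can be illustrated with a minimal sketch of k-fold cross-validation using systematic sampling. It makes several assumptions that are not taken from the paper: q² is computed with the common definition 1 - PRESS/SS_total, fold membership is assigned by taking every k-th compound, and a one-descriptor least-squares line stands in for the PLS, SVM, and RBFNN models. All function names are hypothetical and the data in the usage lines are synthetic.

```python
import math


def systematic_folds(n, k):
    """Assign sample i to fold i % k (an assumed form of systematic sampling)."""
    return [[i for i in range(n) if i % k == f] for f in range(k)]


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; a stand-in for PLS/SVM/RBFNN."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cross_validate(xs, ys, k):
    """Return (q2, rmse) from k-fold cross-validated predictions."""
    n = len(ys)
    preds = [0.0] * n
    for fold in systematic_folds(n, k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        intercept, slope = fit_line(train_x, train_y)
        # Predict only the compounds that were left out of this fit.
        for i in fold:
            preds[i] = intercept + slope * xs[i]
    press = sum((y - p) ** 2 for y, p in zip(ys, preds))
    mean_y = sum(ys) / n
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total, math.sqrt(press / n)


# Synthetic descriptor/response pairs, for illustration only.
descriptor = [float(i) for i in range(20)]
response = [2.0 * x + 1.0 for x in descriptor]
for k in (4, 7, 10):
    q2, rmse = cross_validate(descriptor, response, k)
    print(k, q2, rmse)
```

Setting k equal to the number of compounds turns the same routine into the leave-one-out procedure mentioned at the end of the abstract.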
