Article

Single-Point Extrapolation to the Complete Basis Set Limit through Deep Learning

Journal

JOURNAL OF CHEMICAL THEORY AND COMPUTATION
Volume 19, Issue 14, Pages 4474-4483

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.jctc.2c01298

In this study, a graph neural network model is developed and trained to correct the basis set incompleteness error (BSIE) between a small and a large basis set at the RHF and B3LYP levels of theory. The results show that an ML model fitted to correct the BSIE generalizes better to systems not seen during training than one fitted to the total potential. Acceptable performance is achieved only when the training data sufficiently resemble the systems one wants to make predictions on.
Machine learning (ML) offers an attractive method for making predictions about molecular systems while circumventing the need to run expensive electronic structure calculations. Once trained on ab initio data, the promise of ML is to deliver accurate predictions of molecular properties that were previously computationally infeasible. In this work, we develop and train a graph neural network model to correct the basis set incompleteness error (BSIE) between a small and large basis set at the RHF and B3LYP levels of theory. Our results show that, when compared to fitting to the total potential, an ML model fitted to correct the BSIE is better at generalizing to systems not seen during training. We test this ability by training on single molecules while evaluating on molecular complexes. We also show that ensemble models yield better behaved potentials in situations where the training data is insufficient. However, even when only fitting to the BSIE, acceptable performance is only achieved when the training data sufficiently resemble the systems one wants to make predictions on. The test error of the final model trained to predict the difference between the cc-pVDZ and cc-pV5Z potential is 0.184 kcal/mol for the B3LYP density functional, and the ensemble model accurately reproduces the large basis set interaction energy curves on the S66x8 dataset.
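The correction scheme described in the abstract can be pictured as adding a learned BSIE term to a cheap small-basis energy, with an ensemble averaging the predictions of several models. The following is a minimal sketch of that idea only, not the authors' implementation. The names `Model`, `ensemble_correction`, and `corrected_energy` are hypothetical, the models are stand-in callables rather than graph neural networks, and the sign convention (correction = large-basis energy minus small-basis energy) is an assumption.

```python
from statistics import mean
from typing import Callable, Sequence

# Hypothetical type: a trained model maps a molecular geometry (left opaque
# here) to a predicted BSIE correction in kcal/mol.
Model = Callable[[object], float]


def ensemble_correction(models: Sequence[Model], geometry: object) -> float:
    """Average the BSIE corrections predicted by several models."""
    if not models:
        raise ValueError("need at least one model")
    return mean(m(geometry) for m in models)


def corrected_energy(e_small: float, models: Sequence[Model], geometry: object) -> float:
    """Estimate the large-basis energy as the small-basis energy plus the
    learned correction (assumed convention: correction = E_large - E_small)."""
    return e_small + ensemble_correction(models, geometry)
```

In this picture the model only has to learn the comparatively small difference between the two basis sets, while the small-basis calculation supplies the bulk of the energy. Averaging over several models is one simple way to form an ensemble; the source does not say how the ensemble predictions are combined.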
