Journal
INTERNATIONAL JOURNAL OF QUANTUM CHEMISTRY
Volume 115, Issue 16, Pages 1084-1093Publisher
WILEY
DOI: 10.1002/qua.24912
Keywords
machine learning; representation; descriptor; quantum chemistry; molecules
Categories
Funding
- Office of Science of the U.S. DOE [DE-AC02-06CH11357]
- LDRD
- Swiss National Science foundation [PP00P2_138932]
Ask authors/readers for more resources
We introduce a fingerprint representation of molecules based on a Fourier series of atomic radial distribution functions. This fingerprint is unique (except for chirality), continuous, and differentiable with respect to atomic coordinates and nuclear charges. It is invariant with respect to translation, rotation, and nuclear permutation, and requires no preconceived knowledge about chemical bonding, topology, or electronic orbitals. As such, it meets many important criteria for a good molecular representation, suggesting its usefulness for machine learning models of molecular properties trained across chemical compound space. To assess the performance of this new descriptor, we have trained machine learning models of molecular enthalpies of atomization for training sets with up to 10 k organic molecules, drawn at random from a published set of 134 k organic molecules with an average atomization enthalpy of over 1770 kcal/mol. We validate the descriptor on all remaining molecules of the 134 k set. For a training set of 10 k molecules, the fingerprint descriptor achieves a mean absolute error of 8.0 kcal/mol. This is slightly worse than the performance attained using the Coulomb matrix, another popular alternative, reaching 6.2 kcal/mol for the same training and test sets. (c) 2015 Wiley Periodicals, Inc.
Authors
I am an author on this paper
Click your name to claim this paper and add it to your profile.
Reviews
Recommended
No Data Available