PubChemQC PM6: Data Sets of 221 Million Molecules with Optimized Molecular Geometries and Electronic Properties

Journal

Journal of Chemical Information and Modeling
Volume 60, Issue 12, Pages 5891-5899

Publisher

American Chemical Society
DOI: 10.1021/acs.jcim.0c00740

Funding

  1. Japan Society for the Promotion of Science (JSPS KAKENHI), Grants-in-Aid for Scientific Research [18H03206]

We report optimized molecular geometries and electronic properties calculated with the PM6 method for 94.0% of the 91.6 million molecules cataloged in PubChem Compounds, as retrieved on August 29, 2016. In addition to the neutral states, we calculated cationic, anionic, and spin-flipped electronic states for 56.2%, 49.7%, and 41.3% of the molecules, respectively. The grand total of PM6 calculations thus amounted to 221 million. We compared the resulting molecular geometries with B3LYP/6-31G* optimized geometries for 2.6 million molecules; the root-mean-square deviations in bond length and bond angle were approximately 0.016 Å and 1.7°, respectively. A linear regression of the HOMO energy levels from the B3LYP calculations against those from the PM6 calculations gave E_B3LYP(HOMO) = 0.876 E_PM6(HOMO) + 1.975 eV, with a coefficient of determination of 0.803. Likewise, for the LUMO energy levels we found E_B3LYP(LUMO) = 1.069 E_PM6(LUMO) - 0.420 eV, with a coefficient of determination of 0.842. We also generated four subdata sets, each composed of molecules with molecular weights less than 500: subdata set i contained C, H, N, and O; ii contained C, H, N, O, P, and S; iii contained C, H, N, O, P, S, F, and Cl; and iv contained C, H, N, O, P, S, F, Cl, Na, K, Mg, and Ca. The data sets are available at http://pubchemqc.riken.jp/pm6_datasets.html under a Creative Commons Attribution 4.0 International license.
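
The two regression fits and the subdata-set definitions in the abstract can be illustrated with a short Python sketch. The function names are hypothetical and are not part of the published data sets or any PubChemQC tooling. The sketch also assumes that "contained" means a molecule consists only of the listed elements, and that energies are supplied in eV. Because the four element lists are nested, the helper reports only the most restrictive subdata set that a molecule fits.

```python
from typing import Iterable, Optional

# Regression coefficients quoted in the abstract (energies in eV).
HOMO_SLOPE, HOMO_INTERCEPT = 0.876, 1.975
LUMO_SLOPE, LUMO_INTERCEPT = 1.069, -0.420

# Molecular-weight cutoff shared by all four subdata sets.
MAX_MOLECULAR_WEIGHT = 500.0

# Allowed elements per subdata set, ordered from most to least restrictive.
SUBDATA_SETS = [
    ("i", frozenset({"C", "H", "N", "O"})),
    ("ii", frozenset({"C", "H", "N", "O", "P", "S"})),
    ("iii", frozenset({"C", "H", "N", "O", "P", "S", "F", "Cl"})),
    ("iv", frozenset({"C", "H", "N", "O", "P", "S", "F", "Cl",
                      "Na", "K", "Mg", "Ca"})),
]


def estimate_b3lyp_homo(e_pm6_homo_ev: float) -> float:
    """Map a PM6 HOMO energy to a B3LYP estimate using the reported fit."""
    return HOMO_SLOPE * e_pm6_homo_ev + HOMO_INTERCEPT


def estimate_b3lyp_lumo(e_pm6_lumo_ev: float) -> float:
    """Map a PM6 LUMO energy to a B3LYP estimate using the reported fit."""
    return LUMO_SLOPE * e_pm6_lumo_ev + LUMO_INTERCEPT


def smallest_subdata_set(elements: Iterable[str],
                         molecular_weight: float) -> Optional[str]:
    """Return the label of the most restrictive subdata set that admits the
    molecule, or None if it fits none of them.

    Assumption: membership requires every element of the molecule to be in
    the allowed list and the molecular weight to be strictly below 500.
    """
    if molecular_weight >= MAX_MOLECULAR_WEIGHT:
        return None
    present = set(elements)
    for label, allowed in SUBDATA_SETS:
        if present <= allowed:
            return label
    return None


if __name__ == "__main__":
    # Hypothetical PM6 orbital energies, chosen only for illustration.
    print(round(estimate_b3lyp_homo(-9.0), 3))
    print(round(estimate_b3lyp_lumo(0.5), 3))
    # Ethanol contains only C, H, and O, so it falls in subdata set i.
    print(smallest_subdata_set({"C", "H", "O"}, 46.07))
```

Since the coefficients of determination are 0.803 and 0.842, values produced this way are rough estimates of the B3LYP orbital energies rather than substitutes for an actual B3LYP calculation.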
