Article, Data Paper

QMugs, quantum mechanical properties of drug-like molecules

Journal

SCIENTIFIC DATA
Volume 9, Issue 1

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41597-022-01390-7

Funding

  1. ETH RETHINK initiative
  2. Swiss National Science Foundation [205321_182176]
  3. Boehringer Ingelheim Pharma GmbH & Co. KG
  4. Swiss Chemical Industry

The QMugs dataset, with over 665k molecules of biological and pharmacological relevance, provides quantum mechanical properties and facilitates development of models for drug discovery and machine learning.
Machine learning approaches in drug discovery, as well as in other areas of the chemical sciences, benefit from curated datasets of physical molecular properties. However, there is currently a lack of data collections featuring large bioactive molecules alongside first-principles quantum chemical information. The open-access QMugs (Quantum-Mechanical Properties of Drug-like Molecules) dataset fills this void. The QMugs collection comprises quantum mechanical properties of more than 665k biologically and pharmacologically relevant molecules extracted from the ChEMBL database, totaling ~2M conformers. QMugs contains optimized molecular geometries and thermodynamic data obtained via the semi-empirical method GFN2-xTB. Atomic and molecular properties are provided at both the GFN2-xTB and the density-functional (DFT, ωB97X-D/def2-SVP) levels of theory. QMugs features molecules of significantly larger size than previously reported collections and comprises their respective quantum mechanical wave functions, including DFT density and orbital matrices. This dataset is intended to facilitate the development of models that learn from molecular data on different levels of theory while also providing insight into the corresponding relationships between molecular structure and biological activity.
