Article

One class classification as a practical approach for accelerating π-π co-crystal discovery

Journal

CHEMICAL SCIENCE
Volume 12, Issue 5, Pages 1702-1719

Publisher

ROYAL SOC CHEMISTRY
DOI: 10.1039/d0sc04263c

Funding

  1. Engineering and Physical Sciences Research Council (EPSRC) [EP/S026339/1]
  2. EPSRC [EP/R018472/1]
  3. Leverhulme Trust
  4. Leverhulme Research Centre for Functional Materials Design [RC-2015-036]
  5. Cambridge Crystallographic Data Centre
  6. EPSRC [EP/R018472/1, EP/S026339/1] Funding Source: UKRI

The study introduces a one-class classification method to address data imbalance in materials design and applies it successfully to the discovery of new co-crystals.
Machine learning models have brought major changes to the decision-making process in materials design. One concern for data-driven approaches is the lack of negative data from unsuccessful synthetic attempts, which might generate inherently imbalanced datasets. We propose one-class classification as an effective tool for tackling this limitation in materials design problems. One-class classification is the concept of learning from a single well-defined class, without counterexamples. An extensive study of different one-class classification algorithms is performed to identify the most appropriate workflow for guiding the discovery of emerging materials that belong to a relatively small class, namely the weakly bound polyaromatic hydrocarbon co-crystals. The two-step approach presented in this study first trains the model on all the known molecular combinations that form this class of co-crystals, extracted from the Cambridge Structural Database (1722 molecular combinations), and then scores possible yet unknown pairs from the ZINC15 database (21 736 possible molecular combinations). By focusing on the highest-ranking pairs, which are predicted to have a higher probability of forming co-crystals, materials discovery can be accelerated by reducing the vast molecular space and directing the synthetic efforts of chemists. In addition, interpretability techniques are used to seek a more detailed understanding of the molecular properties that drive co-crystallization. The applicability of the methodology is demonstrated with the discovery of two novel co-crystals, namely pyrene-6H-benzo[c]chromen-6-one (1) and pyrene-9,10-dicyanoanthracene (2).
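
The two-step idea described above (fit on known positives only, then score and rank unlabeled candidates) can be illustrated with a minimal sketch. This is not the workflow selected in the paper: the abstract does not name the chosen algorithm, so a simple k-nearest-neighbour distance score stands in as a generic one-class scorer. The two-dimensional descriptor vectors, the pair names, and the helper functions `knn_score` and `rank_candidates` are all hypothetical and exist only for illustration.

```python
import math
import random


def knn_score(candidate, known, k=3):
    """Score a candidate by its closeness to the known (positive) class.

    Returns the negative mean Euclidean distance to the k nearest known
    examples, so a higher score means "more like the known class".
    """
    dists = sorted(math.dist(candidate, x) for x in known)
    return -sum(dists[:k]) / k


def rank_candidates(candidates, known, k=3):
    """Return (name, score) pairs sorted from most to least class-like."""
    scored = [(name, knn_score(vec, known, k)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Step 1: the "training set" holds positives only. These are synthetic
# stand-ins for descriptors of known co-crystal pairs (assumption).
rng = random.Random(0)
known_pairs = [(rng.gauss(1.0, 0.1), rng.gauss(2.0, 0.1)) for _ in range(50)]

# Step 2: score unlabeled candidate pairs (hypothetical names and values).
candidate_pairs = {
    "pair_A": (1.0, 2.0),   # sits inside the known cluster
    "pair_B": (1.5, 2.5),   # near the cluster
    "pair_C": (4.0, -1.0),  # far from the cluster
}

ranking = rank_candidates(candidate_pairs, known_pairs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

No negative examples are needed at any point: the score only measures how closely a candidate resembles the known class, and the top of the ranking is where experimental effort would be directed first.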
