PI1M: A Benchmark Database for Polymer Informatics

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2020)

Journal

JOURNAL OF CHEMICAL INFORMATION AND MODELING

Volume 60, Issue 10, Pages 4684-4690

Publisher

AMER CHEMICAL SOC

DOI: 10.1021/acs.jcim.0c00726

Funding

NSF [TG-CTS100078]

Abstract

Large-scale open-source data are the cornerstone of data-driven research, but they are not readily available for polymers. In this work, we build a benchmark database, called PI1M (referring to ~1 million polymers for polymer informatics), to provide data resources that can be used for machine learning research in polymer informatics. A generative model is trained on ~12 000 polymers manually collected from PolyInfo, the largest existing polymer database, and the model is then used to generate ~1 million polymers. A new representation for polymers, polymer embedding (PE), is introduced and then used to perform several polymer informatics regression tasks for density, glass transition temperature, melting temperature, and dielectric constant. By comparing the PE trained on the PolyInfo data with that trained on the PI1M data, we conclude that the PI1M database covers a similar chemical space to PolyInfo but significantly populates regions where PolyInfo data are sparse. We believe that PI1M will serve as a good benchmark database for future research in polymer informatics.
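
To make the idea of a vector representation of polymers more concrete, the following is a minimal sketch of turning polymer repeat-unit strings into vectors and comparing them. It is not the paper's polymer embedding, whose construction the abstract does not describe; a plain bag-of-tokens count vector is swapped in as a stand-in. The example strings (SMILES-like repeat units with `*` marking the attachment points), the tokenizer pattern, and the function names `tokenize`, `embed`, and `cosine` are all assumptions made for illustration.

```python
import re
from collections import Counter
from math import sqrt

# Simplified SMILES tokenizer (assumption: covers bracket atoms, Br/Cl,
# two-digit ring closures, single-letter atoms, digits, and common symbols).
TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[*=#\-+()/\\.@:~$]")


def tokenize(psmiles):
    """Split a repeat-unit string into tokens; fail if any character is skipped."""
    tokens = TOKEN_RE.findall(psmiles)
    if "".join(tokens) != psmiles:
        raise ValueError(f"unrecognized characters in {psmiles!r}")
    return tokens


def embed(psmiles):
    """Stand-in 'embedding': token counts (NOT the learned PE from the paper)."""
    return Counter(tokenize(psmiles))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical example inputs: common repeat units, '*' = attachment point.
polyethylene = embed("*CC*")
polypropylene = embed("*CC(C)*")
polystyrene = embed("*CC(c1ccccc1)*")

# Under this crude representation, polyethylene sits closer to
# polypropylene than to polystyrene.
print(cosine(polyethylene, polypropylene) > cosine(polyethylene, polystyrene))
```

A learned embedding would replace `embed` with a model trained on the polymer strings, and the resulting vectors would then feed a regressor for properties such as density or glass transition temperature; the count vector here only illustrates the string-to-vector step.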
