
Chemical transformer compression for accelerating both training and inference of molecular modeling

Journal

Publisher

IOP Publishing Ltd
DOI: 10.1088/2632-2153/ac99ba

Keywords

model compression; transformer; molecular modeling

Funding

  1. European Research Council [ERC2017-StG-757733]

This study proposes DeLiCaTe, a new lightweight chemical transformer model built by shrinking transformer models in molecular science with cross-layer parameter sharing and knowledge distillation. DeLiCaTe achieves comparable performance in QSAR and VS while using far fewer parameters and layers.
Transformer models have achieved excellent performance in molecular science applications including quantitative structure-activity relationship (QSAR) modeling and virtual screening (VS). Compared with other types of models, however, they are large and require voluminous training data, which imposes high hardware demands and long training and inference times. In this work, cross-layer parameter sharing (CLPS) and knowledge distillation (KD) are used to reduce the size of transformers in molecular science. Models compressed with either method not only match the QSAR predictive performance of the original BERT model but are also more parameter-efficient. Furthermore, by integrating CLPS and KD into a two-stage chemical network, we introduce a new deep lite chemical transformer model, DeLiCaTe. DeLiCaTe trains and infers about 4x faster, owing to a 10-fold reduction in parameters and a 3-fold reduction in layers. Meanwhile, the integrated model achieves comparable performance in QSAR and VS because it captures both general-domain (basic structure) and task-specific (specific property prediction) knowledge. Moreover, we anticipate that this model compression strategy provides a pathway to the creation of effective generative transformer models for organic drug and material design.
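The two compression ideas in the abstract can be sketched in miniature. This is a hypothetical illustration, not the paper's released code: the distillation objective shown (hard-label cross-entropy mixed with a temperature-softened KL term) is the standard KD formulation, and all function names, layer counts, and parameter sizes are illustrative assumptions.

```python
import math

# Hypothetical sketch of the two compression ideas behind DeLiCaTe:
# cross-layer parameter sharing (CLPS) and knowledge distillation (KD).
# Names, sizes, and hyperparameters are illustrative, not from the paper.

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; T > 1 softens the distribution."""
    scaled = [z / T for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    s = sum(exps)
    return [e / s for e in exps]

def kd_loss(student_logits, teacher_logits, label, alpha=0.5, T=2.0):
    """Standard distillation objective: hard-label cross-entropy on the
    student, mixed with KL divergence between the softened teacher and
    student distributions (scaled by T^2, as is conventional)."""
    p_student = softmax(student_logits)
    hard = -math.log(p_student[label] + 1e-12)
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    soft = (T ** 2) * sum(
        pt * (math.log(pt + 1e-12) - math.log(ps + 1e-12))
        for pt, ps in zip(p_t, p_s)
    )
    return alpha * hard + (1 - alpha) * soft

def param_count(n_layers, params_per_layer, shared):
    """CLPS keeps one set of layer weights and reuses it at every depth,
    so the parameter count no longer grows with the number of layers."""
    return params_per_layer if shared else n_layers * params_per_layer

# A 12-layer stack with independent layers vs. one shared layer reused 12x.
unshared = param_count(12, 7_000_000, shared=False)
shared = param_count(12, 7_000_000, shared=True)
print(unshared // shared)  # → 12: reduction factor from sharing alone
```

The point of the sketch: CLPS cuts parameters structurally (one weight set reused at every depth), while KD lets the resulting small student recover the teacher's task performance by training against its softened outputs rather than hard labels alone.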
