4.6 Article

Deep learning-based automatic action extraction from structured chemical synthesis procedures

Journal

PEERJ COMPUTER SCIENCE
Volume 9, Issue -, Pages -

Publisher

PEERJ INC
DOI: 10.7717/peerj-cs.1511

Keywords

Deep learning; Synthesis procedures; Data mining; Data science; Artificial intelligence; Machine learning; Organic chemistry; Text classification; Text generation; Natural language processing


This article proposes a methodology that uses machine learning algorithms to extract actions from structured chemical synthesis procedures, thereby bridging the gap between chemistry and natural language processing. The proposed pipeline combines ML algorithms and scripts to extract relevant data from USPTO and EPO patents and to transform experimental procedures into structured actions. The pipeline comprises two primary tasks: classifying patent paragraphs to select those that describe chemical procedures, and converting chemical procedure sentences into a structured, simplified format. We employ artificial neural networks: long short-term memory (LSTM) networks, bidirectional LSTMs, Transformers, and a fine-tuned T5 model. Our results show that the bidirectional LSTM classifier achieved the highest accuracy, 0.939, in the first task, while the Transformer model attained the highest BLEU score, 0.951, in the second task. The developed pipeline enables the creation of a dataset of chemical reactions and their procedures in a structured format. Such a dataset makes the information in synthesis procedures easier for researchers to access and use, and it facilitates the application of AI-based approaches to streamline synthetic pathways, predict reaction outcomes, and optimize experimental conditions.
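The BLEU score reported for the second task measures n-gram overlap between a generated action sequence and a reference sequence. As a rough illustration only, the following is a minimal sketch of single-reference, sentence-level BLEU without smoothing. It is not the authors' evaluation code: the function names, the whitespace tokenization, and the example action strings are all assumptions made for this sketch, and the paper may have used a corpus-level or smoothed variant.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed single-reference BLEU on whitespace tokens (illustrative)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            # Without smoothing, any zero precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) >= len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


# Hypothetical action strings; the real output format is defined in the paper.
reference = "ADD water ; STIR for 2 hours"
exact = sentence_bleu("ADD water ; STIR for 2 hours", reference)
partial = sentence_bleu("ADD water ; STIR for 1 hour", reference)
unrelated = sentence_bleu("FILTER the solid and dry it", reference)
```

In this sketch an exact match scores 1.0, a sequence sharing no tokens with the reference scores 0.0, and a partially correct sequence falls in between, which is the sense in which a score of 0.951 indicates output close to the reference.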
