Article

Development of a baseline model for MAX/MXene synthesis recipes extraction via pre-trained model with domain knowledge

Journal

JOURNAL OF MATERIALS RESEARCH AND TECHNOLOGY-JMR&T
Volume 22, Pages 2262-2274

Publisher

ELSEVIER
DOI: 10.1016/j.jmrt.2022.12.076

Keywords

Natural language processing; Text mining; MAX phases; MXenes; Materials synthesis

MAX/MXenes, with a unique combination of metallic and ceramic properties, have garnered significant attention. This study proposes a baseline model utilizing natural language processing to extract synthesis conditions for MAX/MXenes from literature. The developed model serves as an auxiliary tool for future research and also provides a pre-trained model for extracting synthesis routes of MAX/MXenes.
Owing to their unique combination of metallic- and ceramic-like properties, MAX phases have attracted considerable attention. Selectively etching the A-site atoms can yield MXenes with unique two-dimensional structures. Thanks to their extraordinary properties, MXenes have moved to the forefront of research areas including electronics, photonics and catalysis. Developing novel synthesis strategies is therefore a key issue for the further development of MAX/MXenes. Distilling insights from the scientific literature could accelerate the exploration of novel synthesis recipes; however, manually extracting scattered information from thousands of journal articles is laborious. In this study, we present an annotated corpus that incorporates domain knowledge about MAX/MXene synthesis processes, derived from the experimental sections of 110 papers on MAX/MXene research. Based on this corpus, we propose a baseline model, comprising named entity recognition (NER) and relation extraction (RE) parts, that distills information about MAX/MXene synthesis conditions from the literature using pre-trained natural language processing (NLP) models. We also demonstrate the efficacy of the proposed pipeline, which owes to the joint effort of domain knowledge (about MAX/MXene) and machine learning: the entity recognition model with optimized settings detects entities with an F1 score of 0.8452, and the relation extraction model achieves an F1 score of 0.8476. It is hoped that the current work will provide an auxiliary tool for the future research and development of novel MAX/MXenes. In addition, the developed model could serve as a pre-trained model for MAX/MXene synthesis route extraction in future data augmentation.
(c) 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
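
For readers unfamiliar with how an F1 score such as those quoted in the abstract is typically obtained, the following is a minimal sketch of entity-level scoring. It assumes exact-match scoring over (start, end, label) spans, which the abstract does not confirm; the label names and the toy spans are hypothetical and are not taken from the paper's corpus.

```python
from typing import Set, Tuple

# An entity is modeled as (start token index, end token index, label).
# This representation is an assumption made for illustration only.
Entity = Tuple[int, int, str]


def precision_recall_f1(gold: Set[Entity], pred: Set[Entity]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 over two sets of entities."""
    true_positives = len(gold & pred)  # spans that match in position and label
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations for one sentence of a synthesis paragraph.
gold = {(0, 1, "MATERIAL"), (5, 6, "TEMPERATURE"), (8, 9, "TIME")}
pred = {(0, 1, "MATERIAL"), (5, 6, "TEMPERATURE"), (11, 12, "TIME")}

precision, recall, f1 = precision_recall_f1(gold, pred)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

The same function can be applied to relation extraction by replacing each span with a (head entity, tail entity, relation label) triple, again under the exact-match assumption.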
