Article

LIBAC: An Annotated Corpus for Automated Reading of the Lithium-Ion Battery Research Literature

CHEMISTRY OF MATERIALS (2023)

Journal

CHEMISTRY OF MATERIALS

Volume 35, Issue 5, Pages 1849-1857

Publisher

AMER CHEMICAL SOC

DOI: 10.1021/acs.chemmater.2c01356

Categories

Chemistry, Physical; Materials Science, Multidisciplinary

AI Summary

The research literature on lithium-ion batteries (LIBs) has grown rapidly. To make better use of this information, special tools such as named entity recognition (NER) are needed. We have created a high-quality annotated corpus for LIBs and used it to train and evaluate NER models, which reached F1-scores of 81% and 83%. This is a crucial step towards developing a large-scale information extraction system for the LIB research literature.

Abstract

The lithium-ion battery (LIB) research literature has grown very rapidly in recent years. It is an immense source of valuable knowledge and facts for the community, but part of that knowledge is buried in the literature. Making full use of the available information and automating its reading requires special tools. Named entity recognition (NER) is one such tool; it uses supervised machine learning for information extraction. Efficient NER, however, depends on a large, high-quality annotated corpus. Here, we report LIBAC, a semi-automatically annotated lithium-ion battery corpus covering 28 different LIB entities. It was used to train and evaluate Tok2vec- and Transformer-based models, which reached high general accuracies, with F1-scores of 81% and 83%, respectively. LIBAC was created from 6985 paragraphs randomly chosen from ca. 11,000 LIB research papers and contains 73,300 annotated spans (627,428 tokens). This is the prime stepping-stone needed to develop a large-scale information extraction system designed for the LIB research literature.
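The F1-scores quoted above combine precision and recall over predicted entity spans. The sketch below shows one common way to compute such a score, assuming exact-match scoring (a predicted span counts only if its boundaries and label both match a gold span). Whether LIBAC's evaluation uses exactly this convention is an assumption, and the function name `span_f1` and the entity labels are hypothetical illustrations, not taken from the corpus.

```python
from typing import Set, Tuple

# A span is (start_token, end_token, label); the labels used below are
# hypothetical examples, not the 28 entity types defined in LIBAC.
Span = Tuple[int, int, str]


def span_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact-match span scoring."""
    # A prediction is correct only if start, end, and label all match.
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: three gold spans, three predictions, one with a wrong label.
gold = {(0, 2, "CATHODE"), (5, 6, "ELECTROLYTE"), (9, 10, "CAPACITY")}
pred = {(0, 2, "CATHODE"), (5, 6, "ANODE"), (9, 10, "CAPACITY")}
precision, recall, f1 = span_f1(gold, pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, a model can only score well if it both finds most annotated spans and avoids spurious ones.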
