Article, Data Paper

Automated Construction of a Photocatalysis Dataset for Water-Splitting Applications

Journal

SCIENTIFIC DATA
Volume 10, Issue 1, Pages -

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41597-023-02511-6

This study presents an automatically generated dataset of 15,755 records extracted from 47,357 papers. Each record captures water-splitting activity in the presence of a photocatalyst, together with the chemical reaction conditions under which that activity was measured. The dataset achieves a precision of 71.2% and a recall of 36.3%, and extracting such a wide range of chemical reaction attributes required the development of novel techniques in knowledge extraction and interdependency resolution.
We present an automatically generated dataset of 15,755 records that were extracted from 47,357 papers. These records contain water-splitting activity in the presence of certain photocatalysts, along with additional information about the chemical reaction conditions under which this activity was recorded. These conditions include any co-catalysts and additives that were present during water splitting, the length of time for which the photocatalytic experiment was conducted, and the type of light source used, including its wavelength. Although such a wide range of chemical reaction attributes was extracted from text, the dataset afforded good precision (71.2%) and recall (36.3%). These figures of merit were calculated from a random sample of open-access papers from the corpus. Mining such a complex set of attributes required the development of novel techniques in knowledge extraction and interdependency resolution, leveraging inter- and intra-sentence relations, which are also described in this paper. We present a new version (version 2.2) of the chemistry-aware text-mining toolkit ChemDataExtractor, in which these new techniques are included.
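To make the record contents and the two figures of merit concrete, here is a minimal Python sketch. It is not the ChemDataExtractor API and not the paper's actual schema: the class `WaterSplittingRecord`, its field names, and the helper `precision_recall` are hypothetical, and the toy records are placeholders rather than data from the paper. It assumes an extracted record counts as correct only when it exactly matches a manually annotated reference record.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class WaterSplittingRecord:
    """Hypothetical record layout; field names are illustrative only."""
    photocatalyst: str
    activity: str                        # activity value with its unit, kept as text
    cocatalysts: Tuple[str, ...] = ()    # co-catalysts present during water splitting
    additives: Tuple[str, ...] = ()      # additives present during water splitting
    duration: Optional[str] = None       # length of the photocatalytic experiment
    light_source: Optional[str] = None   # type of light source
    wavelength: Optional[str] = None     # wavelength of the light source


def precision_recall(
    extracted: Set[WaterSplittingRecord],
    reference: Set[WaterSplittingRecord],
) -> Tuple[float, float]:
    """Return (precision, recall) using exact record matching (an assumption)."""
    true_positives = len(extracted & reference)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Toy placeholder records, not taken from the dataset.
reference = {
    WaterSplittingRecord(f"catalyst-{name}", "placeholder activity")
    for name in "ABCDEFGH"
}
extracted = {
    WaterSplittingRecord(f"catalyst-{name}", "placeholder activity")
    for name in "ABCX"  # three correct records and one spurious record
}

precision, recall = precision_recall(extracted, reference)
print(f"precision={precision:.3f}, recall={recall:.3f}")
```

In this toy case three of the four extracted records are correct (precision 0.75) while only three of the eight reference records are recovered (recall 0.375), which illustrates how a dataset can be fairly precise while still missing many records present in the source papers.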
