Review

Challenges and Advances in Information Extraction from Scientific Literature: a Review

Journal

JOM
Volume 73, Issue 11, Pages 3383-3400

Publisher

SPRINGER
DOI: 10.1007/s11837-021-04902-9

Keywords

Information extraction; Text mining; Scientific data

Funding

  1. US Department of Commerce, National Institute of Standards and Technology as part of the Center for Hierarchical Materials Design (CHiMaD) [70NANB19H005]
  2. US Department of Energy, Office of Science, Advanced Scientific Computing Research [DE-AC02-06CH11357]
  3. Joint Center for Energy Storage Research (JCESR), an Energy Innovation Hub - US Department of Energy (DOE), Office of Science, Office of Basic Energy Sciences

Scientific articles have long been the primary means of disseminating scientific discoveries, but extracting information from these papers has been a tedious and time-consuming task. Significant progress has been made in automated information extraction techniques by the computer science community, yet applying these techniques to scientific literature still faces technical and logistical challenges.
Scientific articles have long been the primary means of disseminating scientific discoveries. Over the centuries, valuable data and potentially groundbreaking insights have been collected and buried deep in the mountain of publications. In materials engineering, such data are spread across technical handbooks, specification sheets, journal articles, and laboratory notebooks in myriad formats. Extracting information from papers on a large scale has been a tedious and time-consuming job to which few researchers have wanted to devote their limited time and effort, yet it is an activity that is essential for modern data-driven design practices. In recent years, however, the computer science community has made significant progress on techniques for automated information extraction from free text. Yet transformative application of these techniques to scientific literature remains elusive, owing not to a lack of interest or effort but to technical and logistical challenges. Using the challenges in the materials science literature as a driving motivation, we review the gaps between state-of-the-art information extraction methods and the practical application of such methods to scientific texts, and offer a comprehensive overview of work that can be undertaken to close these gaps.
