Article

Contextualised segment-wise citation function classification

Journal

SCIENTOMETRICS
Volume 128, Issue 9, Pages 5117-5158

Publisher

SPRINGER
DOI: 10.1007/s11192-023-04778-3

Keywords

Citation context analysis; Citation function classification; Deep learning; SciBERT; Ensemble

Much effort has been devoted to citation function classification, but notable challenges remain. Annotation difficulty has led to limited data size and inadequate representativeness of scientific domains. Current state-of-the-art deep learning-based methods do not exploit the full range of citation modelling options. To address these issues, this paper studies contextualised citation function classification and proposes new SciBERT-based models. In addition, overall and per-class performance analyses are conducted to assess whether citation function classification is robust enough for downstream applications.
Much effort has been devoted to citation function classification over the past decades, but noteworthy issues remain. Annotation difficulty has resulted in limited data size, especially for minority classes, and in inadequate representativeness of the underlying scientific domains. On the algorithmic side, state-of-the-art deep learning-based methods are flawed in that they generate a single feature vector for the whole citation context (or sentence) and fail to exploit the full range of citation modelling options. In response to these issues, this paper studied contextualised citation function classification. Specifically, a large new citation context dataset was created by merging and re-annotating six datasets from computational linguistics. A variety of strong SciBERT-based citation function classification models were proposed, and new state-of-the-art results were achieved. Through deeper performance analysis, the study addressed several research questions about effective ways of performing citation function classification. In particular, it justified the necessity of modelling in-text citations in context and confirmed the superiority of performing citation function classification at the citation (segment) level. Particular emphasis was placed on in-depth per-class performance analysis, to understand whether citation function classification is robust enough for various popular downstream applications and what further effort is required to meet such analytic needs. Finally, a naive ensemble classifier was proposed, which greatly improved citation function classification performance.
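
The abstract does not say how the naive ensemble combines its member classifiers. As a rough illustration only, the sketch below assumes plain majority voting over the labels predicted by several models for each citation segment. The function name `majority_vote`, the tie-breaking rule, and the label names are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label predictions by simple majority voting.

    `predictions` is a list with one entry per model; each entry is a list
    of predicted labels, one per citation segment. This is an assumed
    combination rule, not necessarily the one used in the paper.
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    n_items = len(predictions[0])
    if any(len(p) != n_items for p in predictions):
        raise ValueError("all models must predict the same number of items")

    ensemble = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # Tie-break (assumption): prefer the earliest-listed model whose
        # label is among the most frequent ones.
        winner = next(label for label in votes if counts[label] == top)
        ensemble.append(winner)
    return ensemble


# Hypothetical label names, for illustration only.
model_a = ["Background", "Uses", "Compares"]
model_b = ["Background", "Motivation", "Compares"]
model_c = ["Uses", "Uses", "Future"]
combined = majority_vote([model_a, model_b, model_c])
```

With this rule, listing the models from strongest to weakest makes the strongest model's prediction win whenever the vote is tied.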
