Article

Contextualised segment-wise citation function classification

SCIENTOMETRICS (2023)

Journal

SCIENTOMETRICS

Volume 128, Issue 9, Pages 5117-5158

Publisher

SPRINGER

DOI: 10.1007/s11192-023-04778-3

Keywords

Citation context analysis; Citation function classification; Deep learning; SciBERT; Ensemble

Categories

Computer Science, Interdisciplinary Applications; Information Science & Library Science

Summary

Much effort has been devoted to citation function classification, but notable challenges remain. Annotation difficulty has led to limited data size and inadequate representativeness of scientific domains. Current state-of-the-art deep learning-based methods fail to exploit the full range of citation modelling options. To address these issues, this paper focuses on contextualised citation function classification and proposes new SciBERT-based models. Additionally, a comprehensive performance analysis, including per-class analysis, is conducted to assess whether citation function classification is robust enough for downstream applications.

Abstract

Much effort has been devoted to citation function classification over the past decades, but noteworthy issues remain. Annotation difficulty has resulted in limited data size, especially for minority classes, and inadequate representativeness of the underlying scientific domains. On the algorithmic side, state-of-the-art deep learning-based methods are limited in that they generate a feature vector for the whole citation context (or sentence) and fail to exploit the full range of citation modelling options. Responding to these issues, this paper studied contextualised citation function classification. Specifically, a large new citation context dataset was created by merging and re-annotating six datasets from computational linguistics. A variety of strong SciBERT-based citation function classification models were proposed, and new state-of-the-art results were achieved. Through deeper performance analysis, the study addressed several research questions about effective ways of performing citation function classification. More specifically, it justified the necessity of modelling in-text citations in context and confirmed the superiority of performing citation function classification at the citation (segment) level. Particular emphasis was placed on in-depth per-class performance analysis to understand whether citation function classification is robust enough to suit various popular downstream applications and what further efforts are required to meet such analytic needs. Finally, a naive ensemble classifier was proposed, which greatly improved citation function classification performance.
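
The abstract does not say how the naive ensemble combines its member models. As a minimal sketch only, assuming the ensemble averages the class probabilities produced by several classifiers for one citation and picks the highest-scoring class, the idea might look like the following. The label names, the function names, and the probability values are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import List, Sequence

# Hypothetical label set for illustration; the paper's annotation scheme
# is not given in the abstract.
LABELS = ["background", "uses", "compares", "extends"]


def average_ensemble(prob_lists: Sequence[Sequence[float]]) -> List[float]:
    """Average the class-probability vectors of several models for one citation."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    return [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]


def predict_label(prob_lists: Sequence[Sequence[float]],
                  labels: Sequence[str] = LABELS) -> str:
    """Return the label whose averaged probability is highest."""
    avg = average_ensemble(prob_lists)
    best = max(range(len(avg)), key=avg.__getitem__)
    return labels[best]


# Made-up outputs of three hypothetical models for a single citation segment.
model_outputs = [
    [0.6, 0.2, 0.1, 0.1],
    [0.3, 0.5, 0.1, 0.1],
    [0.5, 0.3, 0.1, 0.1],
]

print(predict_label(model_outputs))
```

Averaging probabilities (soft voting) is only one simple option; majority voting over predicted labels would be another equally plausible reading of "naive ensemble".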
