Article

Contextualised segment-wise citation function classification

Journal

SCIENTOMETRICS
Volume 128, Issue 9, Pages 5117-5158

Publisher

SPRINGER
DOI: 10.1007/s11192-023-04778-3

Keywords

Citation context analysis; Citation function classification; Deep learning; SciBERT; Ensemble

Much effort has been devoted to citation function classification, but notable challenges remain. Annotation difficulty has led to limited data size and inadequate representativeness of scientific domains. Current state-of-the-art deep learning-based methods also fail to exploit the full range of citation modelling options. To address these issues, this paper focuses on contextualised citation function classification and proposes new SciBERT-based models. It also conducts an in-depth overall and per-class performance analysis to assess whether citation function classification is robust enough for downstream applications.
Much effort has been devoted over the past decades to citation function classification, but noteworthy issues remain. Annotation difficulty has resulted in limited data size, especially for minority classes, and in inadequate representativeness of the underlying scientific domains. On the algorithmic side, state-of-the-art deep learning-based methods are limited because they generate a single feature vector for the whole citation context (or sentence) and fail to exploit the full range of citation modelling options. In response to these issues, this paper studied contextualised citation function classification. Specifically, a large new citation context dataset was created by merging and re-annotating six datasets from computational linguistics. A variety of strong SciBERT-based citation function classification models were proposed, and new state-of-the-art results were achieved. Through deeper performance analysis, the study addressed several research questions about effective ways of performing citation function classification. In particular, it justified the necessity of modelling in-text citations in context and confirmed the superiority of performing citation function classification at the citation (segment) level. Particular emphasis was placed on in-depth per-class performance analysis, to understand whether citation function classification is robust enough to suit various popular downstream applications and what further efforts are required to meet such analytic needs. Finally, a naive ensemble classifier was proposed, which greatly improved citation function classification performance.
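
The abstract does not say how the naive ensemble combines its member models. As an illustration only, the sketch below assumes the simplest common reading, a per-citation majority vote over the labels predicted by several classifiers. The function name `majority_vote`, the tie-breaking rule, and the label names in the usage note are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-model label predictions by majority vote.

    predictions[m][i] is the label that model m assigned to citation i.
    ASSUMPTION: majority voting is one plausible "naive ensemble"; the
    paper's actual combination rule is not stated in the abstract.
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    n_items = len(predictions[0])
    if any(len(p) != n_items for p in predictions):
        raise ValueError("all models must predict the same number of items")

    ensemble: List[str] = []
    for votes in zip(*predictions):  # the votes for one citation
        counts = Counter(votes)
        top = max(counts.values())
        # Tie-break (an assumption): prefer the label from the model
        # listed earliest among those sharing the highest count.
        winner = next(label for label in votes if counts[label] == top)
        ensemble.append(winner)
    return ensemble
```

For example, with three hypothetical models predicting labels for two citations, `majority_vote([["background", "uses"], ["background", "compares"], ["uses", "compares"]])` selects the label that most models agree on for each citation.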
