Article

Citation Intent Classification Using Word Embedding

Journal

IEEE ACCESS
Volume 9, Pages 9982-9995

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2021.3050547

Keywords

Metadata; Citation analysis; Computational modeling; Licenses; Context modeling; Task analysis; Semantics; Citation intent; citation analysis; citation context; citation motivation; citation function classification; word embedding; scholarly dataset

Funding

  1. State Key Laboratory of Computer Architecture (ICT, CAS) Open Project [CARCHB202019]

Abstract

Citation analysis is an active area of research for various reasons. So far, statistical approaches have mainly been used for citation analysis, and these do not look into the internal context of the citations. Deeper analysis of citations may reveal interesting findings by utilizing deep neural network algorithms. The existing scholarly datasets are best suited for statistical approaches but lack citation context, intent, and section information. Furthermore, the datasets are too small to be used with deep learning approaches. For citation intent analysis, a dataset must have citation contexts labeled with different citation intent classes. Most of the datasets either do not have labeled context sentences, or the samples are too small to generalize. In this study, we critically investigated the available datasets for citation intent and proposed an automated technique to label citation contexts with citation intent. Furthermore, we annotated ten million citation contexts from the Citation Context Dataset (C2D) with citation intent using the proposed method. We applied Global Vectors (GloVe), InferSent, and Bidirectional Encoder Representations from Transformers (BERT) word embedding methods and compared their Precision, Recall, and F1 measures. BERT embeddings were found to perform significantly better, achieving an 89% Precision score. The labeled dataset, which is freely available for research purposes, will enhance the study of citation context analysis. Finally, it can be used as a benchmark dataset for identifying citation motivation and function from in-text citations.
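As a rough illustration of the pipeline the abstract describes (embedding a citation-context sentence and classifying its intent, then reporting Precision, Recall, and F1), the following Python sketch uses a pretrained BERT encoder with a simple logistic-regression classifier. It is not the authors' released code: the model name (bert-base-uncased), the intent labels, and the example sentences are illustrative assumptions, and the real training data would come from the labeled C2D-based corpus.

# Minimal sketch (assumed setup, not the paper's implementation):
# embed citation contexts with BERT, train a classifier over hypothetical
# intent labels, and report macro-averaged Precision/Recall/F1.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import train_test_split

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
bert.eval()

def embed(sentences):
    """Return one fixed-size vector per sentence (mean-pooled BERT token states)."""
    with torch.no_grad():
        batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
        hidden = bert(**batch).last_hidden_state          # (batch, seq_len, 768)
        mask = batch["attention_mask"].unsqueeze(-1)      # ignore padding tokens
        pooled = (hidden * mask).sum(1) / mask.sum(1)     # mean over real tokens
    return pooled.numpy()

# Hypothetical labeled citation contexts; real data would be the annotated C2D contexts.
contexts = [
    "We follow the method introduced by [12] to tokenize the corpus.",
    "Unlike [7], our model does not require labeled section headers.",
    "Several surveys discuss citation analysis in general [3], [5].",
    "Our results improve on the baseline reported in [9].",
] * 10
labels = ["use", "contrast", "background", "compare"] * 10

X = embed(contexts)
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.25, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
p, r, f1, _ = precision_recall_fscore_support(
    y_test, clf.predict(X_test), average="macro", zero_division=0
)
print(f"Precision={p:.2f}  Recall={r:.2f}  F1={f1:.2f}")

Mean pooling over token states is only one of several ways to obtain a sentence vector (the [CLS] token or a fine-tuned classification head are common alternatives); swapping the embed function for GloVe averages or InferSent sentence vectors would reproduce the comparison the abstract mentions.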

