Article

Important citation identification by exploiting the syntactic and contextual information of citations

Journal

SCIENTOMETRICS
Volume 125, Issue 3, Pages 2109-2129

Publisher

SPRINGER
DOI: 10.1007/s11192-020-03677-1

Keywords

Importance citation identification; Binary citation classification; Syntactic characteristics; Contextual characteristics

Funding

  1. National Natural Science Foundation of China [71473034, 717D1063]
  2. Heilongjiang Provincial Natural Science Foundation of China [LH2019G001]
  3. Postdoctoral Scientific Research Developmental Fund of Heilongjiang Province [LBH-Q16003]
  4. Heilongjiang Province Art Planning Project: Research on Discipline Theme Evolution Based on Multi-source Data Fusion [2019C027]

Not all citations are equally important. Researchers have proposed various models and techniques for identifying important citations, but the features used in these works are relatively limited, which constrains recognition performance. This paper proposes a machine learning framework that distinguishes important from non-important citations by examining the syntactic and contextual information of citations. Syntactic features capture the statistical characteristics of citation behavior, such as how often and where the cited article is cited within the citing article. Contextual features capture the semantic content of citations, such as citation intent and polarity. Three feature selection methods (the Pearson correlation coefficient, Relief-F, and the entropy weight method) were used to measure each feature's contribution to distinguishing the two kinds of citations, and the key features that best identify important citations were selected on this basis. Three classifiers (support vector machine, KNN, and random forest) were then used to test the classification performance of these key features. Experiments on two annotated benchmark datasets showed that the proposed framework achieves better classification performance than contemporary state-of-the-art approaches, indicating that the syntactic and contextual features of citations are of great value in identifying important citations.
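As a rough illustration of two of the feature-scoring steps named in the abstract, the sketch below computes a Pearson correlation between one feature and the importance label, and entropy-method weights across features. It is not the authors' implementation: the function names, the min-max scaling step, and the toy values are assumptions made for illustration only, and Relief-F and the classifiers are omitted.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between a feature column and the labels."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (dev_x * dev_y)


def entropy_weights(columns):
    """Entropy weight method: features whose values are spread less
    uniformly across samples (lower entropy) receive larger weights."""
    n = len(columns[0])
    k = 1.0 / math.log(n)
    divergences = []
    for col in columns:
        lo, hi = min(col), max(col)
        if hi == lo:
            divergences.append(0.0)  # constant feature: no information
            continue
        # Assumed preprocessing: min-max scale, then normalize to proportions.
        scaled = [(v - lo) / (hi - lo) for v in col]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        entropy = -k * sum(p * math.log(p) for p in probs if p > 0)
        divergences.append(1.0 - entropy)
    total_div = sum(divergences)
    if total_div == 0:
        return [1.0 / len(columns)] * len(columns)
    return [d / total_div for d in divergences]


# Hypothetical toy data: six citations, 1 = important, 0 = non-important.
labels = [0, 0, 0, 1, 1, 1]
cite_count = [1, 1, 2, 5, 6, 4]      # in-text mentions of the cited paper
constant_flag = [1, 1, 1, 1, 1, 1]   # an uninformative feature

r = pearson(cite_count, labels)
weights = entropy_weights([cite_count, constant_flag])
print(f"pearson(cite_count, label) = {r:.3f}")
print("entropy weights:", weights)
```

In this toy setup the constant feature receives zero weight, which is the kind of signal a screening step could use to discard uninformative features before training a classifier.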
