Article

Is it all bafflegab? - Linguistic and meta characteristics of research articles in prestigious economics journals

Journal

JOURNAL OF INFORMETRICS
Volume 16, Issue 2

Publisher

ELSEVIER
DOI: 10.1016/j.joi.2022.101284

Keywords

Research impact; SJR indicator; NLP; Readability; Gradient boosting; GLMLSS

This paper takes an alternative approach to studying the factors associated with scientific prestige: it examines how the linguistic and meta characteristics of academic papers relate to the rankings of the journals they appear in. The study uses text mining tools to extract features from a large corpus of economics journal articles and estimates beta regression models relating these features to journal rankings. The most informative predictors of scientific prestige are paper length, coreference chain span, a personal and moderately informal writing style, article density in sentences per page, international and institutional collaboration in research teams, and the references cited.
In competitive research environments, scholars have a natural interest in maximizing the prestige associated with their scientific work. To identify factors that might help them pursue this goal more effectively, the scientometric literature has tried to link linguistic and meta characteristics of academic papers to the associated degree of scientific prestige, conceptualized as cumulative citation counts. In this paper, we take an alternative approach that instead understands scientific prestige in terms of the rankings of the journals in which the articles appeared, as such rankings are routinely used as surrogate indicators of research quality. To determine the most important drivers of this kind of prestige, we use state-of-the-art text mining tools to extract 344 interpretable features from a large corpus of over 200,000 journal articles in economics. We then estimate beta regression models to investigate the relationship between these predictors and a cross-sectionally standardized version of SCImago Journal Rank (SJR) in multiple topically homogeneous clusters. In so doing, we also reinvestigate, in a methodologically novel way, the bafflegab theory, according to which more prestigious research papers tend to be less readable. Our results show that the consistently most informative predictors are associated with the length of the paper, the span of coreference chains in its full text, the deployment of a personal and moderately informal writing style, the density of the article in terms of sentences per page, international and institutional collaboration in research teams, and the references cited in the paper. Moreover, we identify various linguistic intricacies that matter in the association between readability and scientific prestige, which suggest that this relationship is more complicated than previously assumed.
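
The abstract does not say how SJR is standardized cross-sectionally. Beta regression needs a response strictly between 0 and 1, so one plausible reading, sketched here purely as an assumption and not as the authors' procedure, is a within-year percentile rank. The function name, the record layout and the (rank - 0.5) / n rule are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def standardize_within_year(records):
    """Map raw SJR values to scores in the open interval (0, 1), year by year.

    records: iterable of (journal, year, sjr) tuples (hypothetical layout).
    Returns a dict {(journal, year): score}.
    Assumption: percentile-style scaling; the paper may use another rule.
    """
    by_year = defaultdict(list)
    for journal, year, sjr in records:
        by_year[year].append((journal, sjr))

    scores = {}
    for year, rows in by_year.items():
        n = len(rows)
        values = [sjr for _, sjr in rows]
        for journal, sjr in rows:
            below = sum(1 for v in values if v < sjr)
            equal = sum(1 for v in values if v == sjr)
            # Mid-rank handles ties; subtracting 0.5 keeps the score
            # strictly above 0 and strictly below 1.
            mid_rank = below + (equal + 1) / 2
            scores[(journal, year)] = (mid_rank - 0.5) / n
    return scores


if __name__ == "__main__":
    demo = [("A", 2020, 1.0), ("B", 2020, 3.0), ("C", 2020, 2.0), ("A", 2021, 5.0)]
    for key, score in sorted(standardize_within_year(demo).items()):
        print(key, round(score, 3))
```

Because each year is ranked separately, a journal's score reflects its standing among its contemporaries rather than any drift in raw SJR levels over time, which is the usual motivation for a cross-sectional standardization.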
