Article

Global meta-analysis of evolution patterns for lake topics over centurial scale: A natural language understanding-based deep clustering approach with 130,000 studies

Journal

JOURNAL OF HYDROLOGY
Volume 614

Publisher

ELSEVIER
DOI: 10.1016/j.jhydrol.2022.128597

Keywords

Evolution pattern; Deep learning; Lake topics; Keywords clustering; Centurial scale

This study proposes a Natural Language Understanding-Based Deep Clustering (NLU-DC) approach for global meta-analysis of lake topics. Analyzing a large literature dataset, it finds that lake topics have become more abundant over the past century while remaining concentrated on central ones. Six evolution patterns are identified. In the past twenty years, few emerging topics have attracted significant attention, while the dependency between topics has received more attention than before.
As complicated microcosms, lakes have attracted exponentially increasing attention, resulting in more than 10^5 interdisciplinary academic publications. It is therefore challenging to explore the massive unstructured text of these publications to understand lake topics from global- and centurial-scale perspectives, and conventional bibliometrics are limited because they do not understand the content of the literature. This study proposes a novel approach for large-scale text clustering, Natural Language Understanding-Based Deep Clustering (NLU-DC), for global meta-analysis of evolution patterns of lake topics. The validated NLU-DC raised the share of usable keywords from 24% to 70%, correcting the statistical bias in traditional evidence synthesis. Its high performance derives from the integration of a deep learning model, cosine distance, DBSCAN clustering and changing hyperparameters, which makes the approach accurate and efficient for large text datasets. Using it, the study identified centurial-scale lake-related topics from a literature dataset covering >130,000 studies. The results showed that topics became increasingly abundant but remained stably concentrated on central ones. Six evolution patterns (fluctuating, emerging in 1970, emerging after 1970, trending-up, stable and trending-down) were identified with a generalized linear model (GLM). We found that, in the past twenty years, few emerging topics have attracted significant academic attention, and that the dependency between topics is receiving more attention than before. To sustain lake studies, it is essential to strengthen integrated studies on multi-pattern topics, in particular emerging and trending-down topics.
Our study verified that NLU-DC, which combines state-of-the-art deep learning models in natural language processing with an efficient clustering algorithm from machine learning, is a powerful method for global meta-analysis of water-related research fields and has great potential for application in all fields of academic study.
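
The clustering step named in the abstract (keyword embeddings compared by cosine distance and grouped with DBSCAN) can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional vectors below are invented stand-ins for the embeddings a deep language model would produce, the keywords are hypothetical, and the values of eps and min_samples are illustrative assumptions. The abstract does not say how the hyperparameters were varied, so the sketch uses a single fixed setting.

```python
import math


def cosine_distance(u, v):
    """Cosine distance: 1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dbscan(vectors, eps, min_samples):
    """Plain DBSCAN. Returns one label per vector: 0, 1, ... or -1 for noise."""
    n = len(vectors)
    # Each neighbourhood includes the point itself, as in the usual definition.
    neighbours = [
        [j for j in range(n) if cosine_distance(vectors[i], vectors[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue  # already assigned, do not expand again
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # core point: keep expanding
    return labels


# Hypothetical keywords with toy embeddings (assumed values, not real model output).
keywords = [
    "eutrophication",
    "nutrient enrichment",
    "algal bloom",
    "lake sediment",
    "sediment core",
    "remote sensing",
]
embeddings = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.95, 0.05, 0.1],
    [0.0, 1.0, 0.1],
    [0.1, 0.9, 0.0],
    [0.0, 0.0, 1.0],
]

# eps and min_samples are illustrative choices, not values from the study.
labels = dbscan(embeddings, eps=0.05, min_samples=2)
for keyword, label in zip(keywords, labels):
    print(label, keyword)
```

With these toy vectors the first three keywords fall into one cluster, the two sediment keywords into a second, and "remote sensing" is left as noise (label -1). In practice a library implementation such as scikit-learn's DBSCAN with a cosine metric would replace the hand-written loop.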
