Journal
TRANSACTIONS IN GIS
Volume 26, Issue 3, Pages 1256-1279
Publisher
WILEY
DOI: 10.1111/tgis.12902
Funding
- National Natural Science Foundation of China [42050101, U1711267, 41871311, 41871305, 42101467]
- Hubei Key Laboratory of Intelligent Geo-Information Processing [KLIGIP-2021A01]
- China Postdoctoral Science Foundation [2021M702991]
- Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan) [CUG2106116]
This article proposes a weakly supervised Chinese toponym recognition (ChineseTR) architecture, which automatically generates training datasets and utilizes a bidirectional recurrent neural network for toponym recognition. Experimental results show that ChineseTR achieves good performance in toponym recognition.
Toponym recognition extracts toponyms from natural language texts and is a fundamental task in ubiquitous geographic information applications. Existing toponym recognition methods with state-of-the-art performance mainly rely on supervised learning (i.e., deep-learning-based approaches), with parameters learned from massive labeled datasets that must be annotated manually. This is a great inconvenience when model training needs to fit texts from different domains, especially social media messages. To address this issue, this article proposes a weakly supervised Chinese toponym recognition (ChineseTR) architecture with two components: a training dataset creator that generates training datasets automatically from word collections and associated word frequencies drawn from various texts, and an extension recognizer that employs a basic bidirectional recurrent neural network with features designed specifically for toponym recognition. The results show that the proposed ChineseTR achieves a 0.76 F1 score in a corpus with a 0.718 out-of-vocabulary rate and a 0.903 in-vocabulary rate. All comparative experiments demonstrate that ChineseTR is an effective and scalable architecture for recognizing toponyms.