Article

Active Collection of Land Cover Sample Data from Geo-Tagged Web Texts

Journal

REMOTE SENSING
Volume 7, Issue 5, Pages 5805-5827

Publisher

MDPI
DOI: 10.3390/rs70505805

Funding

  1. Ministry of Science and Technology of China [2012BAK12B01]
  2. National Science Foundation of China [41231172, 41301412]

Sample data play an important role in land cover (LC) map validation. Traditionally, they are collected through field surveys or image interpretation, both of which are costly, labor-intensive, and time-consuming. In recent years, large volumes of geo-tagged text have appeared on the web, and these texts contain valuable information for LC map validation. However, this kind of textual data has seldom been analyzed or used to support LC map validation. This paper examines the potential of geo-tagged web texts as a new, cost-free source of sample data for LC map validation and proposes an active data collection approach. The approach uses a customized deep web crawler to search for geo-tagged web texts using land cover-related keywords and string-based rule matching. A data transformation based on buffer analysis then converts the collected web texts into LC sample data. Using three provinces and three municipalities directly under the Central Government in China as study areas, geo-tagged web texts were collected to validate the artificial surface class of China's 30-meter global land cover dataset (GlobeLand30-2010). A total of 6283 geo-tagged web texts were collected at a rate of 0.58 texts per second. The collected texts about built-up areas were transformed into sample data. A user's accuracy of 82.2% was achieved, which is close to that derived from formal expert validation. These preliminary results show that geo-tagged web texts are valuable ancillary data for LC map validation and that the proposed approach can improve the efficiency of sample data collection.
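
The keyword matching and buffer-based comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the keyword list, the 100 m buffer radius, the "artificial" class label, and all function names are hypothetical, and the agreement rate computed here is a simplified stand-in for a formal user's accuracy assessment.

```python
import math

# Hypothetical keyword list; the abstract does not give the actual keywords.
BUILT_UP_KEYWORDS = ("residential", "shopping mall", "factory", "school")


def matches_built_up(text):
    """Return True if the text contains any built-up keyword (case-insensitive)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in BUILT_UP_KEYWORDS)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def agrees_with_map(sample, map_cells, buffer_m):
    """True if any cell mapped as 'artificial' lies within buffer_m of the sample."""
    lat, lon = sample
    return any(
        cls == "artificial" and haversine_m(lat, lon, cell_lat, cell_lon) <= buffer_m
        for cell_lat, cell_lon, cls in map_cells
    )


def agreement_rate(texts, map_cells, buffer_m=100.0):
    """Share of keyword-matched texts whose buffer contains an 'artificial' cell.

    texts: iterable of (lat, lon, text); map_cells: iterable of (lat, lon, class).
    Returns None when no text matches the keywords.
    """
    samples = [(lat, lon) for lat, lon, text in texts if matches_built_up(text)]
    if not samples:
        return None
    hits = sum(agrees_with_map(sample, map_cells, buffer_m) for sample in samples)
    return hits / len(samples)
```

In practice the map cells would come from the land cover raster rather than a hand-built list, and the buffer radius would need to reflect both the positional uncertainty of the geo-tags and the 30-meter resolution of the map.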
