
Effectiveness of Semi-Supervised Learning and Multi-Source Data in Detailed Urban Landuse Mapping with a Few Labeled Samples

Journal

REMOTE SENSING
Volume 14, Issue 3

Publisher

MDPI
DOI: 10.3390/rs14030648

Keywords

urban landuse; small sample learning; semi-supervised classification; sampling strategy; multi-source geospatial data

Funding

  1. Natural Science Foundation of China (NSFC) General Program [41971386]
  2. Hong Kong Research Grant Council (RGC) General Research Fund [HKBU 12301820]
  3. Shenzhen Science and Technology Innovation Committee General Project [JCYJ20210324101406019]

Abstract

This study explores the effectiveness of a semi-supervised classification framework with multi-source data for detailed urban landuse classification with a few labeled samples. The results show that the classification accuracy of the semi-supervised method is generally on par with that of traditional supervised methods, and that fewer labeled samples are needed to achieve a comparable result. The study also found that classification accuracy with multi-source data is significantly higher than with any single data source.

Detailed urban landuse information plays a fundamental role in smart city management. A sufficient sample size has been identified as a crucial prerequisite for machine learning algorithms in urban landuse classification. However, it is often difficult to recognize and label landuse categories from remote sensing images alone, and the alternative, field investigation, is time-consuming and costly in both human resources and money. Previous studies on urban landuse classification have therefore often relied on a small number of labeled samples with a very uneven spatial distribution. This study aims to explore the effectiveness of a semi-supervised classification framework with multi-source data for detailed urban landuse classification with a few labeled samples. A disagreement-based semi-supervised learning approach, Co-Forest, was employed and compared with traditional supervised methods (e.g., random forest and XGBoost). The multi-source geospatial data used include optical and nighttime light remote sensing data and geospatial big data, which represent the physical and socio-economic features of landuse categories. Taking urban landuse classification in Shenzhen City as a case, the results show that the classification accuracy of the semi-supervised method is generally on par with that of traditional supervised methods, and that fewer labeled samples are needed to achieve a comparable result under different training set ratios. Given a small sample size, the accuracy tends to be stable when training samples account for no less than 5% of the total. Our results also indicate that classification accuracy with multi-source data is significantly higher than with any single data source. Among these data, map POI and high-resolution optical remote sensing data make the largest contributions to the classification, followed by mobile data and nighttime light remote sensing data.
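
To make the disagreement-based idea behind Co-Forest more concrete, the following is a minimal sketch of its pseudo-labeling step only, not the authors' implementation. For one ensemble member, the remaining members (the "concomitant ensemble") vote on each unlabeled sample, and the sample is pseudo-labeled for that member when their agreement exceeds a confidence threshold. The function names, the 0.75 threshold, and the toy landuse labels are illustrative assumptions; tree training, error-rate checks, and retraining are omitted.

```python
from collections import Counter


def concomitant_vote(member_predictions, exclude):
    """Majority label and agreement share among all members except `exclude`."""
    votes = [p for i, p in enumerate(member_predictions) if i != exclude]
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)


def select_pseudo_labels(predictions, member, threshold=0.75):
    """Pick unlabeled samples that the other members label confidently.

    predictions[s][i] is the label that member i predicts for unlabeled
    sample s. The threshold value is an assumption made for illustration.
    Returns {sample_index: pseudo_label} for the given member.
    """
    selected = {}
    for s, member_predictions in enumerate(predictions):
        label, confidence = concomitant_vote(member_predictions, member)
        if confidence > threshold:
            selected[s] = label
    return selected


# Toy predictions from a hypothetical 4-member ensemble on 3 unlabeled samples.
predictions = [
    ["residential", "residential", "residential", "commercial"],
    ["industrial", "commercial", "residential", "commercial"],
    ["commercial", "commercial", "commercial", "commercial"],
]

# Pseudo-labels that would be handed to member 3 for its next training round.
pseudo_labels_for_member_3 = select_pseudo_labels(predictions, member=3)
```

In this toy case, member 3 receives pseudo-labels only for the samples on which the other three members agree unanimously; the sample they disagree on stays unlabeled.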
