Article

Towards interpreting multi-temporal deep learning models in crop mapping

Journal

REMOTE SENSING OF ENVIRONMENT
Volume 264

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.rse.2021.112599

Keywords

Crop mapping; Interpretation; Feature importance; Multi-temporal classification; Long short-term memory; Attention; Deep learning; Corn and soybean

Funding

  1. National Natural Science Foundation of China [32071894]
  2. Zhejiang University

Multi-temporal deep learning methods show promising performance in large-scale crop mapping by transforming remote sensing data into high-dimensional features for crop identification. The study demonstrates the importance of complete time series input, key growth periods, and critical bands for crop discrimination. By analyzing input feature importance and hidden features, deep learning models extract refined information for effective crop classification.
Multi-temporal deep learning approaches have exhibited excellent classification performance in large-scale crop mapping. These approaches efficiently and automatically transform remote sensing time series into high-dimensional feature representations to identify crop types. The lack of interpretation, however, is regarded as a major drawback of these high-performance approaches. Interpreting deep learning approaches in multi-temporal crop mapping is critical for verifying their reliability. This study aims to quantify the impact of multi-temporal information in input time series on classification performance and develop a multi-perspective interpretation pipeline for deep learning models. The pipeline involves three interpretation approaches: evaluating input feature importance, analyzing hidden features, and monitoring temporal changes in the model's soft output. An experiment is conducted to classify corn and soybean in the U.S. Corn Belt in 2018. The study area consists of three sites, each encompassing millions of pixel-level samples at 30 m resolution. Landsat Analysis Ready Data are used as the input remote sensing time series, and the Cropland Data Layer is used as the ground reference. Attention-based Long Short-Term Memory (AtLSTM) and Transformer models are built as multi-temporal deep learning models and compared to Random Forest (RF). Complete time series input in the correct order achieves a higher overall accuracy (97.8%) than single-window or out-of-order inputs, indicating that multi-temporal information facilitates crop classification. An assessment of the input feature importance demonstrates that the AtLSTM, Transformer, and RF models all consider the period from weeks 11 to 20 (early July to late August) as a key growth period and the shortwave infrared band as the critical band for corn and soybean discrimination.
Hidden feature analysis suggests that the AtLSTM model accumulates useful information over the growth period, while the Transformer model extracts the temporal dependencies that contribute important information to high-level feature learning. The learned features contain more effective and refined information than the raw input features and thus are better suited for crop classification. The soft output analysis in the in-season classification scenario demonstrates that increasing the length of the input time series improves the model's confidence in the classification results. A further comparison of input feature importance across different sites and years demonstrates the applicability of the interpretation approach at larger spatiotemporal extents with heterogeneous landscapes and interannual variability. This study provides a multi-perspective evaluation to identify key features in multi-spectral and multi-temporal remote sensing data, and yields a practical approach to integrate agronomy knowledge in deep learning-based crop mapping.
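
The idea of evaluating input feature importance over time can be illustrated with permutation importance: shuffle one time window across samples and measure how much accuracy drops. The abstract does not state how the study computes importance, so the following is only a minimal sketch under stated assumptions. The data are synthetic, a nearest-centroid classifier stands in for the AtLSTM, Transformer, and RF models, and every name (`N_WEEKS`, `KEY_WEEKS`, `window_importance`, and so on) is hypothetical. The class signal is placed by construction in weeks 11 to 20 of one band, so the sketch shows the mechanics of the technique and does not reproduce any result of the paper.

```python
import random

N_WEEKS = 30               # assumed length of the weekly time series
N_BANDS = 3                # assumed band count; index 2 plays the role of SWIR
KEY_WEEKS = range(10, 20)  # zero-based indices of weeks 11 to 20


def make_sample(label, rng):
    """Synthetic pixel: Gaussian noise plus a class signal in the key weeks."""
    series = [[rng.gauss(0.0, 1.0) for _ in range(N_BANDS)] for _ in range(N_WEEKS)]
    if label == 1:
        for week in KEY_WEEKS:
            series[week][2] += 1.5
    return series


def fit_centroids(samples, labels):
    """Mean time series per class (a simple stand-in for a trained model)."""
    centroids = {}
    for cls in set(labels):
        members = [s for s, y in zip(samples, labels) if y == cls]
        centroids[cls] = [
            [sum(m[t][b] for m in members) / len(members) for b in range(N_BANDS)]
            for t in range(N_WEEKS)
        ]
    return centroids


def predict(centroids, sample):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(c):
        return sum(
            (sample[t][b] - c[t][b]) ** 2
            for t in range(N_WEEKS) for b in range(N_BANDS)
        )
    return min(centroids, key=lambda cls: sq_dist(centroids[cls]))


def accuracy(centroids, samples, labels):
    hits = sum(predict(centroids, s) == y for s, y in zip(samples, labels))
    return hits / len(samples)


def window_importance(centroids, samples, labels, rng, window=5):
    """Accuracy drop when one block of weeks is shuffled across samples."""
    baseline = accuracy(centroids, samples, labels)
    drops = []
    for start in range(0, N_WEEKS, window):
        order = list(range(len(samples)))
        rng.shuffle(order)
        # Each sample keeps its own series except for the permuted window.
        permuted = [
            s[:start] + samples[j][start:start + window] + s[start + window:]
            for s, j in zip(samples, order)
        ]
        drops.append(baseline - accuracy(centroids, permuted, labels))
    return drops


rng = random.Random(0)
labels = [i % 2 for i in range(400)]
samples = [make_sample(y, rng) for y in labels]
model = fit_centroids(samples[:200], labels[:200])
drops = window_importance(model, samples[200:], labels[200:], rng)
for i, drop in enumerate(drops):
    print(f"weeks {5 * i + 1:2d}-{5 * i + 5:2d}: accuracy drop {drop:+.3f}")
```

In this toy setting the two windows covering weeks 11 to 20 should show a much larger accuracy drop than the others, because that is where the synthetic class difference lives. Shuffling one band at a time instead of one window would give the analogous band-level importance.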
