Article

The Bangkok Urbanscapes Dataset for Semantic Urban Scene Understanding Using Enhanced Encoder-Decoder With Atrous Depthwise Separable A1 Convolutional Neural Networks

Journal

IEEE ACCESS
Volume 10, Pages 59327-59349

Publisher

IEEE (Institute of Electrical and Electronics Engineers), Inc.
DOI: 10.1109/ACCESS.2022.3176712

Keywords

Object recognition; urbanscapes dataset; deep convolutional neural networks; semantic image segmentation; Thailand

Funding

  1. Chulalongkorn University (CU) Graduate School Thesis Grant
  2. Science and Technology Research Partnership for Sustainable Development (SATREPS) Project of The Japan Science and Technology Agency (JST)
  3. Japan International Cooperation Agency (JICA), "Smart transport strategy for Thailand 4.0 realizing the better quality of life and low-carbon society" (Chair: Prof. Yoshitsugu Hayashi, Chubu University, Japan) [JPMJSA1704]
  4. Japan Society for the Promotion of Science (JSPS) [20K11873]
  5. Chubu University Grant
  6. Grants-in-Aid for Scientific Research [20K11873] Funding Source: KAKEN


This paper presents a semantic segmentation method for autonomous driving systems. It enhances the DeepLab-V3+ architecture, yielding DeepLab-V3-A1, by modifying the Xception backbone and varying the number of 1x1 convolution layers in the decoder. Experimental results show that the proposed method performs comparably to baseline methods across several evaluation metrics. The paper also contributes the Bangkok Urbanscapes dataset, aimed at improving autonomous driving systems in cities with distinctive traffic and driving conditions.
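The core building block named in the title, an atrous depthwise separable convolution, can be sketched as below: a per-channel dilated spatial filter followed by a 1x1 pointwise convolution that mixes channels. This is a minimal NumPy illustration of the general operation; the function name, shapes, and dilation rate are illustrative choices, not the authors' implementation.

```python
import numpy as np

def atrous_depthwise_separable_conv(x, depthwise_kernels, pointwise_weights, rate=2):
    """Atrous (dilated) depthwise separable convolution on one feature map.

    x: (H, W, C_in) input feature map
    depthwise_kernels: (k, k, C_in) one spatial kernel per input channel
    pointwise_weights: (C_in, C_out) 1x1 convolution mixing channels
    rate: dilation rate (gap inserted between kernel taps)
    """
    H, W, C_in = x.shape
    k = depthwise_kernels.shape[0]
    pad = rate * (k // 2)  # "same" padding for the dilated kernel
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    # Depthwise step: each channel is filtered independently with a dilated kernel.
    depthwise = np.zeros((H, W, C_in))
    for i in range(k):
        for j in range(k):
            di, dj = i * rate, j * rate
            depthwise += xp[di:di + H, dj:dj + W, :] * depthwise_kernels[i, j, :]
    # Pointwise step: a 1x1 convolution mixes channels at every spatial position.
    return depthwise @ pointwise_weights
```

Dilating the depthwise kernel enlarges the receptive field without extra parameters, while the pointwise 1x1 layer keeps cross-channel mixing cheap; this is the trade-off the Xception-style backbone exploits.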
Semantic segmentation is one of the most widely researched computer vision tasks at present. It plays an essential role in real-world applications, including autonomous driving systems. To further the study of self-driving cars in Thailand, this paper provides both a proposed method and a proposed dataset. For the method, we contribute DeepLab-V3-A1 with Xception, an extension of the DeepLab-V3+ architecture. DeepLab-V3-A1 with Xception is enhanced by varying the number of 1 x 1 convolution layers on the decoder side and by refining the image classification backbone with a modified Xception model. Experiments were conducted on four datasets: the proposed dataset and three public datasets, namely CamVid, Cityscapes, and IDD. The results show that DeepLab-V3-A1 with Xception performs comparably to the baseline methods on all corpora, as measured by mean IoU, F1 score, precision, and recall. In addition, DeepLab-V3-A1 with Xception achieves a mean IoU of 78.86% on the validation set of the Cityscapes dataset. For the dataset, we contribute the Bangkok Urbanscapes dataset, a dataset of urban scenes in Southeast Asia containing 701 input images paired with annotated labels. It covers various driving environments in Bangkok across eleven semantic classes (Road, Building, Tree, Car, Footpath, Motorcycle, Pole, Person, Trash, Crosswalk, and Misc). We hope that our architecture and dataset will help autonomous driving developers improve systems for cities with traffic and driving conditions similar to Bangkok and elsewhere in Thailand. Our implementation code and dataset are available at https://kaopanboonyuen.github.io/bkkurbanscapes.
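The mean IoU metric reported in the abstract can be sketched as follows: per-class intersection-over-union between predicted and ground-truth label maps, averaged over classes. This is a generic illustration of the metric, not the authors' evaluation code; classes absent from both prediction and target are skipped, one of several common conventions.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union between integer label arrays.

    pred, target: arrays of per-pixel class indices with identical shape
    num_classes: number of semantic classes (e.g. 11 for Bangkok Urbanscapes)
    """
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both maps; skip rather than score 0 or 1
        inter = np.logical_and(p, t).sum()
        ious.append(inter / union)
    return float(np.mean(ious))
```

F1, precision, and recall for segmentation are computed analogously from the same per-class true/false positive counts.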

Authors

