An Improved Deeplabv3+ Model for Semantic Segmentation of Urban Environments Targeting Autonomous Driving

Wang Wang; Hua He; Changsong Ma

doi:10.15837/ijccc.2023.6.5879

Authors

Wang Wang, Geely University of China, Chengdu, Sichuan, China
Hua He, Chongqing Technology and Business University, Chongqing, China / International College, Krirk University, Bangkok, Thailand
Changsong Ma, Geely University of China, Chengdu, Sichuan, China / International College, Krirk University, Bangkok, Thailand

Keywords:

Autonomous Driving; Semantic Segmentation; Urban Environments; Improved Deeplabv3+ Model

Abstract

This paper proposes an improved Deeplabv3+ model for semantic segmentation of urban scenes, targeting autonomous driving applications. A high-quality semantic segmentation dataset is constructed from 2,967 manually labeled aerial images captured at a height of 200 m with a 5-eye camera. The images contain five classes: buildings, vegetation, ground, lake, and playgrounds. The improved Deeplabv3+ network enriches high-level semantics by replacing max pooling with depthwise separable convolutions. Dilated convolutions extract multi-scale features to avoid overfitting. Experiments demonstrate that the model achieves an overall mean IoU of 0.87 on the test set, with IoU scores of 0.90, 0.92, and 0.94 on buildings, vegetation, and water, respectively. The model shows promising results for extracting semantic information from complex urban environments to support navigation for autonomous vehicles.
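
For readers unfamiliar with the metric reported in the abstract, the following is a minimal sketch of how per-class IoU and mean IoU are commonly computed from a pixel-level confusion matrix. It is not the authors' evaluation code: the function names, the toy labels, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration, and the paper's exact averaging procedure may differ.

```python
from typing import List, Optional, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows index the true class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_iou(cm: List[List[int]]) -> List[Optional[float]]:
    """IoU_c = TP_c / (TP_c + FP_c + FN_c); None if the class never occurs."""
    n = len(cm)
    ious: List[Optional[float]] = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true other
        denom = tp + fp + fn
        ious.append(tp / denom if denom else None)
    return ious


def mean_iou(ious: Sequence[Optional[float]]) -> float:
    """Average IoU over the classes that actually occur (an assumption)."""
    valid = [v for v in ious if v is not None]
    if not valid:
        raise ValueError("no class occurs in labels or predictions")
    return sum(valid) / len(valid)


# Hypothetical toy data: six pixels, three classes (flattened label maps).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(confusion_matrix(y_true, y_pred, num_classes=3))
miou = mean_iou(ious)
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean IoU is 0.5. In practice the label maps would be flattened from full-resolution segmentation masks rather than written by hand.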
