Classification algorithm of garbage images based on novel spatial attention mechanism and transfer learning

GAO Ming, CHEN Yuhan, ZHANG Zehui, FENG Yu, FAN Weiguo

Systems Engineering - Theory & Practice ›› 2021, Vol. 41 ›› Issue (2) : 498-512. DOI: 10.12011/SETP2020-1645

Abstract

As governments at all levels in China begin to mandate garbage classification, standardized and automated sorting across the collection and recycling chain requires a fine-grained image classification model suitable for cloud deployment, with high accuracy and low latency. This article leverages deep transfer learning to establish an end-to-end transfer learning network architecture, GANet (garbage neural network). To address the challenges of category confusion and background interference in garbage classification, it proposes a new pixel-level spatial attention mechanism, PSATT (pixel-level spatial attention). To overcome the challenges of many classes and sample imbalance, it proposes a label smoothing regularization loss function. To improve convergence speed, model stability, and generalization, it proposes a stepped OneCycle learning-rate schedule and gives a strategy that combines the Rectified Adam (RAdam) optimizer with stochastic weight averaging. Experiments used training data labeled according to the Shenzhen garbage classification standard and provided by the "Huawei Cloud Artificial Intelligence Competition · Garbage Classification Challenge Cup"; they verified the significant effect of GANet on the garbage classification problem, and the entry won the national second prize (2nd place). Moreover, the proposed PSATT mechanism outperforms the comparison methods, yielding improvements across different backbone network architectures and demonstrating good generality. The GANet architecture, PSATT mechanism, and training strategies proposed in this paper have both practical engineering value and academic merit.
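The label smoothing regularization loss mentioned in the abstract is a standard technique for multi-class problems: the one-hot target is mixed with a uniform distribution so the model is not pushed toward overconfident predictions on confusable classes. A minimal Python sketch of the generic idea follows; the smoothing factor `epsilon=0.1` is an assumed illustrative default, not the paper's reported hyperparameter.

```python
import math

def smooth_labels(true_class, num_classes, epsilon=0.1):
    """Label smoothing: mix the one-hot target with a uniform distribution.
    The true class gets 1 - epsilon + epsilon/K; every other class gets epsilon/K."""
    off = epsilon / num_classes
    on = 1.0 - epsilon + off
    return [on if k == true_class else off for k in range(num_classes)]

def cross_entropy(logits, target):
    """Cross-entropy between a target distribution and softmax(logits),
    computed with the max-subtraction trick for numerical stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    log_probs = [math.log(e / total) for e in exps]
    return -sum(t * lp for t, lp in zip(target, log_probs))

# Example: a 4-class problem where the model is very confident in class 2.
target = smooth_labels(true_class=2, num_classes=4, epsilon=0.1)
loss = cross_entropy([0.0, 0.0, 5.0, 0.0], target)
```

With smoothing enabled, overconfident logits on the true class no longer drive the loss toward zero, which is one reason the technique helps when many classes are visually similar, as in garbage sorting.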

Key words

attention mechanism / transfer learning / garbage classification / fine-grained image classification

Cite this article

GAO Ming, CHEN Yuhan, ZHANG Zehui, FENG Yu, FAN Weiguo. Classification algorithm of garbage images based on novel spatial attention mechanism and transfer learning. Systems Engineering - Theory & Practice, 2021, 41(2): 498-512. https://doi.org/10.12011/SETP2020-1645

Funding

National Natural Science Foundation of China (71831003, 71772033); Natural Science Foundation of Liaoning Province, China (Joint Funds for Key Scientific Innovation Bases, 2020-KF-11-11); Scientific Research Project of the Education Department of Liaoning Province, China (LN2019Q14)