Fast and accurate detection of kiwifruits in the natural environment using improved YOLOv4

Jinpeng Wang, Lei Xu, Song Mei, Haoruo Hu, Jialiang Zhou, Qing Chen

Abstract


Real-time detection of kiwifruits in natural environments is essential for automated kiwifruit harvesting. In this study, a lightweight convolutional neural network, YOLOv4-GS, was proposed for kiwifruit detection. The CSPDarknet-53 backbone of YOLOv4 was replaced with GhostNet to improve accuracy and reduce network computation. To improve the detection accuracy of small targets, the feature maps from network layers 151 and 154 were upsampled and fused, and the spatial pyramid pooling network was removed to reduce redundant computation. A total of 2766 kiwifruit images captured in different environments were used as the dataset for training and testing. The experimental results showed that the F1-score, average accuracy, and Intersection over Union (IoU) of YOLOv4-GS were 98.00%, 99.22%, and 88.92%, respectively. The average time taken to detect a 416×416 kiwifruit image was 11.95 ms, and the model weight file was 28.8 MB. With the GhostNet backbone, the average detection time was 31.44 ms shorter and the model weight 227.2 MB smaller than with CSPDarknet-53. YOLOv4-GS improved the detection accuracy by 8.39% over Faster R-CNN and 8.36% over SSD-300, and its detection speed was 11.3 times and 2.6 times that of Faster R-CNN and SSD-300, respectively. In the indoor and orchard picking experiments, YOLOv4-GS processed video at an average of 28.4 fps with a recognition accuracy above 90%. The average time spent on recognition and positioning was 6.09 s, accounting for about 29.03% of the total picking time. The overall results showed that the YOLOv4-GS proposed in this study can be applied to kiwifruit detection in natural environments because it improves detection speed without compromising detection accuracy.
Keywords: kiwifruits, fruit recognition, natural environments, YOLOv4
DOI: 10.25165/j.ijabe.20241705.7658

Citation: Wang J P, Xu L, Mei S, Hu H R, Zhou J L, Chen Q. Fast and accurate detection of kiwifruits in the natural environment using improved YOLOv4. Int J Agric & Biol Eng, 2024; 17(5): 222-230.
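
Since the backbone substitution is the core of YOLOv4-GS, a minimal sketch of the Ghost module that GhostNet (Han et al., 2020) is built from is given below: an ordinary convolution produces a small set of intrinsic feature maps, and a cheap depthwise convolution generates additional "ghost" maps from them, which is what reduces computation relative to CSPDarknet-53. The PyTorch code is an illustration of the published Ghost module idea, not the authors' implementation; the channel counts and kernel sizes are assumptions chosen for clarity.

# Minimal Ghost module sketch (illustrative, not the authors' code).
import torch
import torch.nn as nn


class GhostModule(nn.Module):
    def __init__(self, in_channels, out_channels, ratio=2, kernel_size=1, dw_size=3):
        super().__init__()
        init_channels = out_channels // ratio          # intrinsic maps from the ordinary conv
        ghost_channels = out_channels - init_channels  # cheap "ghost" maps

        # Ordinary convolution: relatively expensive, produces the intrinsic feature maps.
        self.primary_conv = nn.Sequential(
            nn.Conv2d(in_channels, init_channels, kernel_size,
                      padding=kernel_size // 2, bias=False),
            nn.BatchNorm2d(init_channels),
            nn.ReLU(inplace=True),
        )
        # Depthwise convolution: cheap linear operation that generates the
        # remaining ghost feature maps from the intrinsic ones.
        self.cheap_operation = nn.Sequential(
            nn.Conv2d(init_channels, ghost_channels, dw_size,
                      padding=dw_size // 2, groups=init_channels, bias=False),
            nn.BatchNorm2d(ghost_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        intrinsic = self.primary_conv(x)
        ghost = self.cheap_operation(intrinsic)
        return torch.cat([intrinsic, ghost], dim=1)


if __name__ == "__main__":
    # Quick shape check on a dummy 416x416 input, the image size used in the paper.
    module = GhostModule(in_channels=3, out_channels=16)
    out = module(torch.randn(1, 3, 416, 416))
    print(out.shape)  # torch.Size([1, 16, 416, 416])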
