YOLOv8np-RCW: A multi-task deep learning model for comprehensive visual information in tomato harvesting robot
DOI: https://doi.org/10.25165/j.ijabe.20251805.9719

Keywords: tomato bunch detection, maturity detection, keypoint detection, harvesting robots

Abstract
In greenhouse environments, using automated machines for tomato harvesting to reduce labor demand is a clear development trend. Accurate and effective visual recognition is essential for accomplishing harvesting tasks. However, most current studies obtain the required harvesting information in multiple steps with separate models, resulting in high computational cost, poor real-time performance, and limited recognition precision. In this study, an improved end-to-end model, YOLOv8np-RCW, based on YOLOv8n-pose is proposed to simultaneously detect tomato bunches, maturity, and keypoints using a decoupled-head structure. The model integrates a ResNet-enhanced RepVGG architecture to balance accuracy and speed, employs the CARAFE upsampling algorithm to enlarge the receptive field while remaining lightweight, and optimizes the loss function with WIoU loss to improve bounding box prediction, maturity detection, and keypoint extraction. Experimental results indicate that the mAP50 of the YOLOv8np-RCW model is 87.3% for bounding boxes and 86.8% for keypoints, 6.2% and 5.5% higher, respectively, than that of the YOLOv8n-pose model. Completing the tasks of bunch detection, maturity assessment, and keypoint localization requires only 9.8 ms. The Euclidean distance error of keypoint detection is less than 20 pixels. Based on this model, a method is proposed to quickly determine the orientation of tomato bunches using geometric cross-product and cross-multiplication calculations on the 2D keypoint information, providing guidance for the motion planning of the end-effector. In field experiments, the robot achieved a harvesting success rate of 68%, with an average time of 10.8366 seconds per tomato bunch.

Citation: Ai X Y, Zhang T X, Yuan T, Zheng X J, Xiong Z M, Yuan J C. YOLOv8np-RCW: A multi-task deep learning model for comprehensive visual information in tomato harvesting robot. Int J Agric & Biol Eng, 2025; 18(5): 246–258.
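The orientation step summarized above lends itself to a short illustration. The following Python sketch is a minimal, hypothetical reconstruction of how a 2D cross product over keypoint coordinates can yield a bunch's leaning side and angle; the keypoint names (kp_stem, kp_tip), their semantics, and the choice of a downward reference vector are assumptions for illustration, not the authors' implementation.

# Illustrative sketch only (not the authors' released code): estimating the
# in-image orientation of a tomato bunch from two detected 2D keypoints.
# Assumed keypoints (hypothetical): kp_stem = peduncle attachment point,
# kp_tip = lower tip of the bunch, as (x, y) pixels with y pointing downward.
import math
from typing import Tuple

Point = Tuple[float, float]

def bunch_orientation(kp_stem: Point, kp_tip: Point) -> Tuple[float, str]:
    """Return the bunch angle (degrees from the downward vertical) and a
    left/right label derived from the sign of a 2D cross product."""
    # Vector from the stem-side keypoint to the lower tip of the bunch.
    dx = kp_tip[0] - kp_stem[0]
    dy = kp_tip[1] - kp_stem[1]

    # Reference "hanging straight down" direction in image coordinates.
    ref = (0.0, 1.0)

    # z-component of the 2D cross product ref x v; its sign indicates which
    # side of the vertical the bunch leans toward.
    cross_z = ref[0] * dy - ref[1] * dx
    side = "left" if cross_z > 0 else "right" if cross_z < 0 else "vertical"

    # Angle between the reference direction and the bunch vector, computed
    # from the cross and dot products (0 deg means hanging straight down).
    dot = ref[0] * dx + ref[1] * dy
    angle = math.degrees(math.atan2(abs(cross_z), dot))
    return angle, side

# Example: a bunch whose tip sits 14 px left of and 125 px below the stem
# keypoint leans about 6.4 deg to the left of the vertical.
angle, side = bunch_orientation((412.0, 215.0), (398.0, 340.0))
print(f"{angle:.1f} deg, leaning {side}")

In the paper's pipeline, such an orientation estimate would feed the approach-direction planning of the end-effector; the published method may differ in keypoint definitions and in how the cross products are combined.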