Efficient and comprehensive visual solution for a smart apple harvesting robot in complex settings via multi-class instance segmentation
DOI:
https://doi.org/10.25165/ijabe.v18i4.9619
Keywords:
apple harvesting, instance segmentation, multi-weather condition, star operation, edge computing device
Abstract
To enable efficient and low-cost automated apple harvesting, this study presents a multi-class instance segmentation model, SCAL (Star-CAA-LADH), which requires only a single RGB sensor for image acquisition. From a single RGB image, the model accurately segments fruits, fruit-bearing branches, and main branches, providing comprehensive visual inputs for robotic harvesting. A Star-CAA module is proposed that integrates the Star operation with a Context-Anchored Attention (CAA) mechanism, enhancing directional sensitivity and multi-scale feature perception. The Backbone and Neck networks are equipped with hierarchically structured SCA-T/F modules to improve the fusion of high- and low-level features, resulting in more continuous masks and sharper boundaries. In the Head network, a Segment_LADH module is employed to optimize classification, bounding box regression, and mask generation, improving segmentation accuracy for small and adherent targets. To improve robustness under adverse weather and other degraded imaging conditions, a Chain-of-Thought Prompted Adaptive Enhancer (CPA) module is integrated. Experimental results demonstrate that SCAL achieves 94.9% AP_M and 95.1% mAP_M, outperforming YOLOv11s by 6.6% and 4.6%, respectively. Under multi-weather testing conditions, the CPA-SCAL variant consistently achieves higher accuracy than the other models compared. After INT8 quantization, the model size is reduced to 14.5 MB, with an inference speed of 47.2 frames per second (FPS) on the NVIDIA Jetson AGX Xavier. Experiments conducted in simulated orchard environments validate the effectiveness and generalization capability of the SCAL model, demonstrating its suitability as an efficient and comprehensive visual solution for intelligent harvesting in complex agricultural settings.
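The Star operation named in the abstract is generally understood as the element-wise product of two linear projections of the same input. The NumPy sketch below illustrates only that basic idea. It is not the authors' Star-CAA module, whose internal layout is not given here. The function name `star_operation`, the dense weight matrices, and the toy shapes are all assumptions chosen for illustration.

```python
import numpy as np


def star_operation(x, w1, b1, w2, b2):
    """Element-wise product of two linear projections of the same input.

    Hypothetical helper for illustration only: x has shape (n, c_in),
    w1 and w2 have shape (c_in, c_out), b1 and b2 have shape (c_out,).
    """
    branch_a = x @ w1 + b1  # first linear projection
    branch_b = x @ w2 + b2  # second linear projection
    return branch_a * branch_b  # the "star": element-wise multiplication


# Toy example with assumed sizes: 4 feature vectors, 8 -> 16 channels.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w1, w2 = rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
b1, b2 = np.zeros(16), np.zeros(16)

y = star_operation(x, w1, b1, w2, b2)
print(y.shape)
```

Because the two branches are multiplied rather than added, each output channel contains pairwise products of input channels, which is how the operation obtains nonlinear feature interactions from two inexpensive linear maps. Practical networks typically use convolutions and an activation on one branch instead of the plain dense layers assumed here.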
DOI: 10.25165/j.ijabe.20251804.9619
Citation: Wen S W, Ge Y H, Wang Y K, Wei N S, Zhou J G, Hu G R, et al. Efficient and comprehensive visual solution for a smart apple harvesting robot in complex settings via multi-class instance segmentation. Int J Agric & Biol Eng, 2025; 18(4): 200-215.
License
IJABE is an international, peer-reviewed, open access journal that adopts the following Creative Commons copyright notice.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).