Improved YOLOv8 network using multi-scale feature fusion for detecting small tea shoots in complex environments

Authors

  • Yatao Li 1. School of Mechanical Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China 2. State Key Laboratory of Tea Plant Germplasm Innovation and Resource Utilization, Anhui Agricultural University, Hefei 230036, China 3. Fujian Key Laboratory of Big Data Application and Intellectualization for Tea Industry (Wuyi University), Wuyishan 354300, Fujian, China
  • Liuhuan Tan 1. School of Mechanical Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China
  • Zhenghao Zhong 1. School of Mechanical Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China
  • Leiying He 1. School of Mechanical Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China 4. Key Laboratory of Transplanting Equipment and Technology of Zhejiang Province, Hangzhou 310018, China
  • Jianneng Chen 1. School of Mechanical Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China 4. Key Laboratory of Transplanting Equipment and Technology of Zhejiang Province, Hangzhou 310018, China
  • Chuanyu Wu 1. School of Mechanical Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China 4. Key Laboratory of Transplanting Equipment and Technology of Zhejiang Province, Hangzhou 310018, China
  • Zhengmin Wu 2. State Key Laboratory of Tea Plant Germplasm Innovation and Resource Utilization, Anhui Agricultural University, Hefei 230036, China

DOI:

https://doi.org/10.25165/ijabe.v18i5.9475

Keywords:

tea shoot segmentation, multi-scale fusion, attention mechanism, reparameterization technique, YOLOv8-seg

Abstract

Tea shoot segmentation is crucial for automating the plucking of high-quality tea. However, accurately segmenting tea shoots in unstructured, complex environments is challenging because the targets are small and similar in color to their background. To address these challenges and achieve accurate recognition of tea shoots in complex settings, an improved tea shoot segmentation network is proposed based on the You Only Look Once version 8 segmentation (YOLOv8-seg) model. First, to enhance the model’s ability to segment small targets, a feature fusion network was designed that incorporates the shallow, large-scale features extracted by the backbone network. The features extracted by the backbone at different scales were then fused to obtain both global and local features, enhancing the overall representational capability of the features. Furthermore, the Efficient Channel Attention mechanism was integrated into the feature fusion process and combined with a reparameterization technique to refine the fusion and improve its efficiency. Finally, Wise-IoU with a dynamic non-monotonic focusing mechanism was employed to assign varying gradient gains to anchor boxes of differing quality. Experimental results demonstrate that the improved network increases the box and mask AP50 by 4.33% and 4.55%, respectively, while maintaining a smaller parameter count and reduced computational demand. Compared with other classical segmentation models, the proposed model excels at tea shoot segmentation. Overall, the improvements proposed in this study effectively segment tea shoots in complex environments, offering significant theoretical and practical contributions to the automated plucking of high-quality tea.
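The Efficient Channel Attention mechanism mentioned in the abstract gates each feature channel using only local cross-channel interaction. This is not the authors’ code; it is a minimal numpy sketch of the ECA forward pass following Wang et al.’s formulation (referenced below), with the 1D convolution filter fixed for illustration — in the real module it is learned:

```python
import numpy as np

def eca(feature_map, gamma=2, b=1):
    """Sketch of an Efficient Channel Attention (ECA) forward pass.

    feature_map: array of shape (C, H, W).
    The 1D conv filter is a fixed placeholder here; ECA learns it.
    """
    c = feature_map.shape[0]
    # Adaptive kernel size k = |log2(C)/gamma + b/gamma|, rounded to odd
    t = int(abs((np.log2(c) + b) / gamma))
    k = t if t % 2 else t + 1
    # Squeeze: global average pooling over the spatial dimensions -> (C,)
    y = feature_map.mean(axis=(1, 2))
    # Local cross-channel interaction: size-k 1D conv with 'same' padding
    w = np.ones(k) / k                      # placeholder for the learned filter
    y_pad = np.pad(y, k // 2, mode="edge")
    conv = np.array([y_pad[i:i + k] @ w for i in range(c)])
    # Excite: sigmoid gate applied channel-wise
    gate = 1.0 / (1.0 + np.exp(-conv))
    return feature_map * gate[:, None, None]
```

Because the gate lies in (0, 1), the module rescales channels without changing the feature map’s shape, which is what makes it cheap to drop into a fusion path.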
DOI: 10.25165/j.ijabe.20251805.9475

Citation: Li Y T, Tan L H, Zhong Z H, He L Y, Chen J N, Wu C Y, et al. Improved YOLOv8 network using multi-scale feature fusion for detecting small tea shoots in complex environments. Int J Agric & Biol Eng, 2025; 18(5): 223–233.
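The Wise-IoU loss in the abstract assigns each anchor box a gradient gain that depends non-monotonically on its quality. The sketch below is not the paper’s implementation, only the focusing gain from Wise-IoU v3 written out in numpy; the default `alpha` and `delta` are hyperparameter values reported in the Wise-IoU paper, and the running mean is treated as a constant (the real implementation detaches it from the gradient):

```python
import numpy as np

def wiou_gradient_gain(l_iou, l_iou_mean, alpha=1.9, delta=3.0):
    """Non-monotonic focusing gain from Wise-IoU v3 (sketch).

    l_iou: IoU loss (1 - IoU) of an anchor box.
    l_iou_mean: running mean of the IoU loss over training,
    treated here as a plain constant.
    """
    beta = l_iou / l_iou_mean            # "outlier degree" of the box
    return beta / (delta * alpha ** (beta - delta))
```

The gain peaks for ordinary-quality boxes and decays for both very good (low `beta`) and very poor (high `beta`) ones, so neither easy examples nor low-quality outliers dominate the regression gradients.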

Author Biographies

Zhenghao Zhong, 1. School of Mechanical Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China

School of Mechanical Engineering, Zhejiang Sci-Tech University

Leiying He, 1. School of Mechanical Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China 4. Key Laboratory of Transplanting Equipment and Technology of Zhejiang Province, Hangzhou 310018, China

School of Mechanical Engineering, Zhejiang Sci-Tech University

Jianneng Chen, 1. School of Mechanical Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China 4. Key Laboratory of Transplanting Equipment and Technology of Zhejiang Province, Hangzhou 310018, China

School of Mechanical Engineering, Zhejiang Sci-Tech University

Chuanyu Wu, 1. School of Mechanical Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China 4. Key Laboratory of Transplanting Equipment and Technology of Zhejiang Province, Hangzhou 310018, China

School of Mechanical Engineering, Zhejiang Sci-Tech University

Zhengmin Wu, 2. State Key Laboratory of Tea Plant Germplasm Innovation and Resource Utilization, Anhui Agricultural University, Hefei 230036, China

State Key Laboratory of Tea Plant Germplasm Innovation and Resource Utilization, Anhui Agricultural University

References

Yu T J, Chen J N, Chen Z W, Li Y T, Tong J H, Du X Q. DMT: A model detecting multispecies of tea buds in multi-seasons. Int J Agric & Biol Eng, 2024; 17(1): 199–208.

Yang J W, Li X, Wang X, Fu L Y, Li S W. Vision-based localization method for picking points in tea-harvesting robots. Sensors, 2024; 24(21): 6777.

Zheng H, Fu T, Xue X L, Ye Y X, Yu G H. Research status and prospect of tea mechanized picking technology. Journal of Chinese Agricultural Mechanization, 2023; 44(9): 28–35. (in Chinese)

Zhao L L, Deng H B, Zhou Y C, Miao T, Zhao K, Yang J, et al. Instance segmentation model of maize seedling images based on automatic generated labels. Transactions of the Chinese Society of Agricultural Engineering, 2023; 39(11): 201–211. (in Chinese)

Zhang L, Zou L, Wu C Y, Jia J M, Chen J N. Method of famous tea sprout identification and segmentation based on improved watershed algorithm. Computers and Electronics in Agriculture, 2021; 184: 106108.

Fan P, Lang G D, Yan B, Lei X Y, Guo P J, Liu Z J, et al. A method of segmenting apples based on gray-centered RGB color space. Remote Sensing, 2021; 13(6): 1211.

Akbar J U M, Kamarulzaman S F, Muzahid A J F, Rahman M A, Uddin M. A comprehensive review on deep learning assisted computer vision techniques for smart greenhouse agriculture. IEEE Access, 2024; 12: 4485–4522.

Kang H W, Chen C. Fruit detection, segmentation and 3D visualisation of environments in apple orchards. Computers and Electronics in Agriculture, 2020; 171: 105302.

Liao J C, Babiker I, Xie W F, Li W, Cao L B. Dandelion segmentation with background transfer learning and RGB-attention module. Computers and Electronics in Agriculture, 2022; 202: 107355.

Xu W K, Zhao L G, Li J, Shang S Q, Ding X P, Wang T W. Detection and classification of tea buds based on deep learning. Computers and Electronics in Agriculture, 2022; 192: 106547.

Redmon J, Farhadi A. YOLOv3: An incremental improvement. arXiv: 1804.02767, 2018; In press. doi: 10.48550/arXiv.1804.02767.

Gui Z Y, Chen J N, Li Y, Chen Z W, Wu C Y, Dong C W. A lightweight tea bud detection model based on Yolov5. Computers and Electronics in Agriculture, 2023; 205: 107636.

Han K, Wang Y H, Tian Q, Guo J Y, Xu C J, Xu C. GhostNet: More features from cheap operations. arXiv: 1911.11907, 2019; In press. doi: 10.48550/arXiv.1911.11907.

Li Y T, He L Y, Jia J M, Lv J, Chen J N, Qiao X, et al. In-field tea shoot detection and 3D localization using an RGB-D camera. Computers and Electronics in Agriculture, 2021; 185: 106149.

Wu H Y, Wang Y S, Zhao P F, Qian M B. Small-target weed-detection model based on YOLO-V4 with improved backbone and neck structures. Precision Agriculture, 2023; 24(6): 2149–2170.

Xu H, Zhong S, Zhang T X, Zou X. Multiscale multilevel residual feature fusion for real-time infrared small target detection. IEEE Transactions on Geoscience and Remote Sensing, 2023; 61: 1–16.

Liu Q H, Zhang Y, Yang G P. Small unopened cotton boll counting by detection with MRF-YOLO in the wild. Computers and Electronics in Agriculture, 2023; 204: 107576.

Lin T Y, Dollár P, Girshick R, He K, Hariharan B, Belongie S. Feature pyramid networks for object detection. arXiv: 1612.03144, 2016; In press. doi: 10.48550/arXiv.1612.03144.

Liu S, Qi L, Qin H F, Shi J P, Jia J Y. Path aggregation network for instance segmentation. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA: IEEE, 2018; pp.8759–8768. doi: 10.48550/arXiv.1803.01534.

Tan M X, Pang R M, Le Q V. EfficientDet: Scalable and efficient object detection. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, WA, USA: IEEE, 2020; pp.10778–10787. doi: 10.1109/CVPR42600.2020.01079.

Zhang W Q, Huang Z L, Luo G Z, Chen T, Wang X G, Liu W Y, et al. TopFormer: Token pyramid transformer for mobile semantic segmentation. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA: IEEE, 2022; pp.12083–12093. doi: 10.1109/CVPR52688.2022.01177.

Wang Y W, Ren Y Q, Kang S Y, Yin C B, Shi Y, Men H. Identification of tea quality at different picking periods: A hyperspectral system coupled with a multibranch kernel attention network. Food Chemistry, 2024; 433: 137307.

Qian H M, Wang H L, Feng S, Yan S Y. FESSD: SSD target detection based on feature fusion and feature enhancement. J Real-Time Image Process, 2023; 20: 2.

Song M X, Liu C, Chen L Q, Liu L C, Ning J M, Yu C Y. Recognition of tea buds based on an improved YOLOv7 model. Int J Agric & Biol Eng, 2024; 17(6): 238–244.

Wang C-Y, Liao H-Y M, Wu Y-H, Chen P-Y, Hsieh J-W, Yeh I-H. CSPNet: A new backbone that can enhance learning capability of CNN. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA: IEEE, 2020; pp.1571–1580. doi: 10.48550/arXiv.1911.11929.

Zheng Z H, Wang P, Ren D W, Liu W, Ye R G, Hu Q H, et al. Enhancing geometric factors in model learning and inference for object detection and instance segmentation. IEEE Transactions on Cybernetics, 2022; 52(8): 8574–8586.

Ding X H, Zhang X Y, Ma N N, Han J G, Ding G G, Sun J. RepVGG: Making VGG-style ConvNets great again. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA: IEEE, 2021; pp.13728–13737. doi: 10.48550/arXiv.2101.03697.

Wang Q L, Wu B G, Zhu F P, Li P H, Zuo W M, Hu Q H. ECA-Net: Efficient channel attention for deep convolutional neural networks. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA: IEEE, 2020; pp.11531–11539. doi: 10.1109/CVPR42600.2020.01155.

Tong Z J, Chen Y H, Xu Z W, Yu R. Wise-IoU: Bounding box regression loss with dynamic focusing mechanism. arXiv: 2301.10051, 2023; In press. doi: 10.48550/arXiv.2301.10051.

Selvaraju R R, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: Visual explanations from deep networks via gradient-based localization. arXiv: 1610.02391, 2016; In press. doi: 10.48550/arXiv.1610.02391.

Published

2025-10-27

How to Cite

Li, Y., Tan, L., Zhong, Z., He, L., Chen, J., Wu, C., & Wu, Z. (2025). Improved YOLOv8 network using multi-scale feature fusion for detecting small tea shoots in complex environments. International Journal of Agricultural and Biological Engineering, 18(5), 223–233. https://doi.org/10.25165/ijabe.v18i5.9475

Section

Information Technology, Sensors and Control Systems