
Extracting the navigation lines of crop-free ridges using improved DeepLabV3+
Citation: YU Gaohong, WANG Yimiao, GAN Shuaihui, XU Huimin, CHEN Yijin, WANG Lei. Extracting the navigation lines of crop-free ridges using improved DeepLabV3+[J]. Transactions of the Chinese Society of Agricultural Engineering, 2024, 40(10): 168-175.
Authors: YU Gaohong  WANG Yimiao  GAN Shuaihui  XU Huimin  CHEN Yijin  WANG Lei
Affiliation: School of Mechanical Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China; Zhejiang Province Key Laboratory of Transplanting Equipment and Technology, Hangzhou 310018, China
Funding: National Key Research and Development Program of China (2022YFD2001800); National Natural Science Foundation of China (52305290, 52075497); Key Research and Development Program of Zhejiang Province (2021C02021); Shanghai Science and Technology Promotion of Agriculture Program (沪农科创字(2021)第4-1号)
Abstract: Machine vision navigation is an important part of smart agriculture, and detecting the navigation line on crop-free ridges is the key to guiding dryland transplanting. Crop-free ridges have similar color information and only small texture differences, so traditional image processing methods generalize poorly and are inaccurate, while semantic segmentation algorithms are slow and lack real-time performance. To address these problems, this study proposes a ridge segmentation model based on an improved DeepLabV3+. First, the traditional DeepLabV3+ network was lightened by replacing the Xception backbone with MobileNetV2 to raise the detection speed and improve real-time performance. Next, the convolutional block attention module (CBAM) was introduced so that the model handles ridge boundary information better. Navigation feature points were then obtained from the ridge boundary information; where a ridge is broken, these feature points deviate, so quartiles were used to screen out outliers among the navigation feature points, and the navigation line was fitted with the least squares method. Model evaluation shows that the improved model achieves a mean pixel accuracy of 96.27% and a mean intersection over union of 93.18% at an average detection rate of 84.21 frames/s, outperforming the PSPNet, U-Net, HRNet, Segformer, and DeepLabV3+ networks. Across different ridge environments, the maximum angle error is 1.2° and the maximum pixel error is 9, so navigation lines can be obtained effectively from different scenes. The results can serve as a reference for crop-free ridge navigation of agricultural robots.
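The abstract names quartile-based outlier screening followed by a least-squares line fit but gives no implementation details. The sketch below is a minimal illustration of that step, assuming the navigation feature points are (x, y) pixel coordinates taken along the detected ridge boundary and that the conventional 1.5×IQR rule is used for the quartile filter; both assumptions are ours, not the paper's.

```python
import numpy as np

def fit_navigation_line(points):
    """Fit a navigation line to ridge-boundary feature points.

    points: (N, 2) array of (x, y) pixel coordinates along the detected
    ridge boundary (hypothetical format; the paper does not specify it).
    Outliers caused by broken ridges are removed with a quartile (IQR)
    rule before a least-squares line fit.
    """
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]

    # Quartile-based outlier screening on the lateral (x) coordinate:
    # keep points within [Q1 - 1.5*IQR, Q3 + 1.5*IQR] (1.5 is assumed).
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    keep = (x >= q1 - 1.5 * iqr) & (x <= q3 + 1.5 * iqr)
    x, y = x[keep], y[keep]

    # Least-squares fit x = k*y + b, i.e. x as a function of the image
    # row y, so near-vertical navigation lines stay well conditioned.
    k, b = np.polyfit(y, x, deg=1)
    return k, b
```

Regressing x on y keeps near-vertical navigation lines numerically stable; the paper does not state which variable it regresses on.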

Keywords: machine vision  navigation  semantic segmentation  navigation path  least squares method  crop-free ridges
Received: 2024-01-29
Revised: 2024-03-16

Extracting the navigation lines of crop-free ridges using improved DeepLabV3+
YU Gaohong, WANG Yimiao, GAN Shuaihui, XU Huimin, CHEN Yijin, WANG Lei. Extracting the navigation lines of crop-free ridges using improved DeepLabV3+[J]. Transactions of the Chinese Society of Agricultural Engineering, 2024, 40(10): 168-175.
Authors:YU Gaohong  WANG Yimiao  GAN Shuaihui  XU Huimin  CHEN Yijin  WANG Lei
Institution:School of Mechanical Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China;Zhejiang Province Key Laboratory of Transplanting Equipment and Technology, Hangzhou 310018, China
Abstract: Vegetables occupy the largest planting area of any crop except grain, and machine vision navigation is a key indicator of mechanization, automation, and intelligence in modern agriculture, yet most vegetable transplanters are still driven manually. Before a transplanter can be guided automatically, the ridges must be detected, and because the ridges carry no crops before transplanting there is no seedling row to serve as a reference. Extracting navigation lines from crop-free ridges in complex scenes is therefore in high demand. Crop-free ridge rows have similar color information and only small texture differences, so traditional image processing cannot fully meet the needs of large-scale production. In this study, a ridge row segmentation model was proposed based on an improved DeepLabV3+, with the applicability, accuracy, detection speed, and real-time performance required of semantic segmentation. The traditional DeepLabV3+ network was first lightened by replacing the Xception backbone with MobileNetV2, which raised the detection speed and improved real-time performance. The Convolutional Block Attention Module (CBAM) was then incorporated so that the model attends to the important details of the ridge boundary and detects and classifies the target regions more accurately. Navigation feature points were obtained from the ridge boundary information. Where broken (seedlingless) ridges are present, these feature points deviate from their intended positions, so quartiles were used to filter the outliers: points deviating significantly from the rest were identified and removed. The least squares method was then used to fit the navigation line to the remaining feature points, yielding a reliable navigation reference that compensates for the deviations caused by broken ridges. Overall, the simplified DeepLabV3+ network with the MobileNetV2 backbone and the CBAM attention mechanism delivered high detection speed, real-time performance, and accurate navigation, even in challenging ridge-boundary scenarios. Images were collected at two locations to improve the applicability of the model to crop-free ridge environments, and the field data covered different soil qualities, lighting conditions, and broken ridges. The dataset consisted of 1350 training images and 150 validation images, expanded by data augmentation. The results show that the improved model achieved a mean pixel accuracy of 96.27%, a mean intersection over union of 93.18%, and an average detection rate of 84.21 frames per second. Compared with the model before improvement, the mean intersection over union, mean pixel accuracy, and frame rate increased by 1.78 percentage points, 0.83 percentage points, and 29.32 frames per second, respectively. Among the backbone networks compared, MobileNetV2 performed best and met the requirements of navigation recognition.
The mean pixel accuracy of the model was 0.83 and 3.28 percentage points higher than with the Xception and MobileNetV3 backbones, respectively. The improved model also outperformed PSPNet, U-Net, HRNet, Segformer, and DeepLabV3+ in mean pixel accuracy, mean intersection over union, and frame rate. With a maximum angle error of 1.2° and a maximum pixel distance error of 9 across the various ridge environments, the proposed method extracted navigation lines from different scenes far more effectively than the Hough transform and random sample consensus (RANSAC). These findings can serve as a reference for crop-free ridge navigation in agricultural robots and promote the development of intelligent agricultural equipment.
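The extended abstract attributes the sharper ridge-boundary handling to a CBAM block added to the lightened DeepLabV3+, without specifying where the block sits. Below is a minimal PyTorch sketch of the standard CBAM formulation (channel attention followed by spatial attention); the reduction ratio of 16, the 7×7 spatial kernel, and the placement after the MobileNetV2 backbone features are assumptions rather than details from the paper.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module (standard formulation).

    Channel attention followed by spatial attention. Where the block is
    inserted in the improved DeepLabV3+ (e.g. after the MobileNetV2
    backbone features) is an assumption; the abstract only states that
    CBAM is introduced to sharpen ridge boundary information.
    """

    def __init__(self, channels, reduction=16, spatial_kernel=7):
        super().__init__()
        # Channel attention: shared MLP over avg- and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        # Spatial attention: 7x7 conv over channel-wise avg and max maps.
        self.spatial = nn.Conv2d(2, 1, spatial_kernel,
                                 padding=spatial_kernel // 2, bias=False)

    def forward(self, x):
        # Channel attention: reweight channels by pooled global context.
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # Spatial attention: reweight locations by channel-pooled maps.
        avg_map = torch.mean(x, dim=1, keepdim=True)
        max_map, _ = torch.max(x, dim=1, keepdim=True)
        x = x * torch.sigmoid(self.spatial(torch.cat([avg_map, max_map], dim=1)))
        return x
```

In a DeepLabV3+-style model, such a block would typically wrap a feature map, for example CBAM(channels=320)(backbone_features) on MobileNetV2's last-stage output; the actual insertion point used by the authors is not stated in the abstract.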
Keywords: machine vision  navigation  semantic segmentation  navigation path  least squares method  crop-free ridges