
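Lightweight object detection method for Lingwu long jujube images based on an improved SSD
Cite this article: Wang Yutan, Xue Junrui. Lightweight object detection method for Lingwu long jujube images based on an improved SSD[J]. Transactions of the Chinese Society of Agricultural Engineering, 2021, 37(19): 173-182.
Authors: Wang Yutan  Xue Junrui
Affiliation: School of Mechanical Engineering, Ningxia University, Yinchuan 750021, China
Funding: National Natural Science Foundation of China (No. 31660239)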
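Abstract (translated from the Chinese): A conventional SSD (Single Shot MultiBox Detector) model that loads a pre-trained model cannot have its network structure changed, so it cannot be used when device memory is limited. To address this, the study proposes a lightweight object detection method for Lingwu long jujube images that reaches high detection accuracy without a pre-trained model. First, a Lingwu long jujube object detection dataset was built. Second, the proposed improved DenseNet was used as the backbone network, the first three extra layers of the SSD model were replaced with Inception modules, and a multi-level fusion structure was added, giving the improved SSD model. Then, comparative experiments verified the effectiveness of the improved DenseNet and the improved SSD model. Results on the Lingwu long jujube dataset show that, without loading a pre-trained model, the improved SSD model reaches a mean Average Precision (mAP) of 96.60%, a detection speed of 28.05 frames/s, and a parameter count of 1.99×10^6. Its mAP is 2.02 and 0.05 percentage points higher than that of the SSD model and the SSD model (pre-trained), respectively, and it has 11.14×10^6 fewer parameters than the SSD model, which meets the requirements of a lightweight network. Even without a pre-trained model, the improved SSD model performs the Lingwu long jujube detection task well, and the results may offer a new approach for other object detection tasks in which a pre-trained model cannot be loaded.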
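The reported figures imply baseline values that the abstract does not state directly. The short sketch below derives them by simple arithmetic; the variable names are illustrative, and the derived numbers are inferences from the quoted figures, not values reported by the authors.

```python
# Figures quoted in the abstract.
improved_params = 1.99e6        # parameters of the improved SSD model
param_gap = 11.14e6             # how many fewer parameters than the SSD model
improved_map = 96.60            # mAP in percent
gain_vs_ssd = 2.02              # percentage points over SSD
gain_vs_ssd_pretrained = 0.05   # percentage points over SSD (pre-trained)
fps = 28.05                     # detection speed in frames/s

# Derived values (inferred here, not reported directly in the abstract).
ssd_params = improved_params + param_gap
param_reduction = param_gap / ssd_params
ssd_map = improved_map - gain_vs_ssd
ssd_pretrained_map = improved_map - gain_vs_ssd_pretrained
latency_ms = 1000.0 / fps

print(f"SSD parameters:        {ssd_params / 1e6:.2f}e6")
print(f"Parameter reduction:   {param_reduction:.1%}")
print(f"SSD mAP:               {ssd_map:.2f}%")
print(f"SSD (pre-trained) mAP: {ssd_pretrained_map:.2f}%")
print(f"Time per frame:        {latency_ms:.2f} ms")
```

Under these figures the baseline SSD model would have about 13.13×10^6 parameters, so the improved model removes roughly 85% of them while processing a frame in about 35.65 ms.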

Keywords: image processing  object detection  Lingwu long jujube  pre-trained model  SSD model  DenseNet  Inception module
Received: 2021/8/10
Revised: 2021/9/14

Lightweight object detection method for Lingwu long jujube images based on improved SSD
Wang Yutan, Xue Junrui. Lightweight object detection method for Lingwu long jujube images based on improved SSD[J]. Transactions of the Chinese Society of Agricultural Engineering, 2021, 37(19): 173-182.
Authors: Wang Yutan  Xue Junrui
Institution: School of Mechanical Engineering, Ningxia University, Yinchuan 750021, China
Abstract: In the intelligent harvesting of Lingwu long jujubes, the complex working environment of picking robots limits picking speed and on-device memory. The visual recognition system therefore needs a lighter network structure together with high detection accuracy. At present, almost all object detection models load a pre-trained model because it provides good initialization and fast convergence. However, two challenges remain: 1) the network structure cannot be changed once a pre-trained model is loaded, so the model cannot be used when device memory is limited; 2) the ImageNet dataset may differ greatly from the dataset to be trained, which can degrade training. Taking the SSD model as the basic framework, this study proposes a lightweight object detection method for images of Lingwu long jujubes that achieves high accuracy without loading a pre-trained model. Firstly, data augmentation was performed on the 1 000 collected images to obtain 5 000 images. The augmentation operations were random cropping, random vertical or horizontal flipping, and random adjustment of brightness, contrast, and saturation. Secondly, the Lingwu long jujube dataset was established, comprising 3 500 training images and 1 500 test images. The image resolutions were 3 016×4 032, 4 068×3 456, and 2 448×3 264, and the images were acquired with three smartphone models: HUAWEI TRT-AL00A, Vivo Y79A, and Xiaomi 2014501. All images were scaled to 300×300 to meet the input size requirement of the SSD object detection model. The dataset follows the PASCAL VOC format.
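The augmentation and split described above can be sketched in plain Python on tiny in-memory images. This is only an illustration: the crop fraction, the 0.8 to 1.2 adjustment range, the choice to always apply one of the two flips, and the recipe of four augmented copies per original are assumptions, since the abstract gives only the operation names and the image counts.

```python
import random

# An image here is a list of rows; each pixel is an (r, g, b) tuple of 0-255 ints.

def clamp(v):
    """Round to the nearest integer and clip to the 8-bit range."""
    return max(0, min(255, int(round(v))))

def hflip(img):
    return [row[::-1] for row in img]

def vflip(img):
    return img[::-1]

def random_crop(img, rng, min_frac=0.8):
    """Crop a random window covering at least min_frac of each side (assumed value)."""
    h, w = len(img), len(img[0])
    ch = rng.randint(max(1, int(h * min_frac)), h)
    cw = rng.randint(max(1, int(w * min_frac)), w)
    top, left = rng.randint(0, h - ch), rng.randint(0, w - cw)
    return [row[left:left + cw] for row in img[top:top + ch]]

def brightness(img, f):
    return [[tuple(clamp(c * f) for c in px) for px in row] for row in img]

def contrast(img, f):
    """Scale each channel's distance from the image's mean gray level."""
    pixels = [px for row in img for px in row]
    mean = sum(sum(px) / 3 for px in pixels) / len(pixels)
    return [[tuple(clamp(mean + (c - mean) * f) for c in px) for px in row]
            for row in img]

def saturation(img, f):
    """Scale each channel's distance from its own pixel's gray level."""
    return [[tuple(clamp(sum(px) / 3 + (c - sum(px) / 3) * f) for c in px)
             for px in row] for row in img]

def augment(img, rng, lo=0.8, hi=1.2):
    """Apply crop, one flip, then brightness/contrast/saturation jitter."""
    img = random_crop(img, rng)
    img = rng.choice([hflip, vflip])(img)   # assumption: always one of the two flips
    for op in (brightness, contrast, saturation):
        img = op(img, rng.uniform(lo, hi))  # assumption: factors in [0.8, 1.2]
    return img

def expand(originals, copies, rng):
    """Keep every original and add `copies` augmented versions of each."""
    out = list(originals)
    for img in originals:
        out.extend(augment(img, rng) for _ in range(copies))
    return out

def split(items, n_train, rng):
    """Shuffle, then return (train, test)."""
    items = list(items)
    rng.shuffle(items)
    return items[:n_train], items[n_train:]
```

With 1 000 originals, `expand(originals, 4, rng)` yields 5 000 images and `split(images, 3500, rng)` gives a 3 500/1 500 division. The abstract does not say whether the split was made before or after augmentation, so that order is also an assumption.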
Labelling software was used to annotate the images, and the annotations were stored in the label folder in XML format. Next, an improved DenseNet was built from Convolutional Block Attention Modules and two dense blocks containing 6 and 8 convolution groups, respectively. With the improved DenseNet as the backbone network, the improved SSD model was obtained by replacing the first three additional layers of the SSD model with Inception modules and adding a multi-level fusion structure. Without loading a pre-trained model, the improved SSD model reached an mAP of 96.60%, a detection speed of 28.05 frames/s, and a parameter count of 1.99×10^6. Its mAP was 2.02 and 0.05 percentage points higher than that of the SSD model and the SSD model (pre-trained), respectively. The improved SSD model also had 11.14×10^6 fewer parameters than the SSD model, which meets the requirements of a lightweight network without loading a pre-trained model. These findings can provide visual technical support for the intelligent harvesting of Lingwu long jujubes, and possibly for detection tasks on medical and multispectral images.
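Because the annotations are stored as PASCAL VOC XML and the images are scaled to 300×300, the bounding boxes have to be scaled by the same factors. The sketch below shows one way to do that with the standard library. The sample annotation, the file name, the class name `jujube`, and the reading of 3 016×4 032 as width×height are all hypothetical; the authors' actual files are not shown in the abstract.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in PASCAL VOC layout (not taken from the authors' data).
SAMPLE_XML = """<annotation>
  <filename>jujube_0001.jpg</filename>
  <size><width>3016</width><height>4032</height><depth>3</depth></size>
  <object>
    <name>jujube</name>
    <bndbox><xmin>1508</xmin><ymin>2016</ymin><xmax>3016</xmax><ymax>4032</ymax></bndbox>
  </object>
</annotation>"""

def load_scaled_boxes(xml_text, target=300):
    """Parse VOC boxes and rescale them to a target x target input image."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = float(size.findtext("width"))
    height = float(size.findtext("height"))
    boxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(bb.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # Scale x by target/width and y by target/height; aspect ratio is not kept.
        boxes.append((
            obj.findtext("name"),
            xmin * target / width,
            ymin * target / height,
            xmax * target / width,
            ymax * target / height,
        ))
    return boxes

boxes = load_scaled_boxes(SAMPLE_XML)
```

In this sample the box covers the lower-right quarter of the original image, so after scaling it covers the lower-right quarter of the 300×300 input.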
Keywords: image processing  object detection  Lingwu long jujubes  pre-trained model  SSD model  DenseNet  Inception module