Image recognition of tree species based on multi feature fusion and CNN model

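Cite this article:	Liu Jiazheng, Wang Xuefeng, Wang Tian. Image recognition of tree species based on multi feature fusion and CNN model[J]. Journal of Beijing Forestry University, 2019, 41(11): 76-86. DOI: 10.13332/j.1000-1522.20180366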

Authors:	Liu Jiazheng (刘嘉政), Wang Xuefeng (王雪峰), Wang Tian (王甜)

Affiliation:	Research Institute of Resource Information Techniques, Chinese Academy of Forestry, Beijing 100091, China

Funding:	National Key Research and Development Program of China (2017YFC0504106)

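Abstract:	Objective: Tree species images show large intra-class variation and strong inter-class similarity, so traditional recognition methods based on a single hand-crafted feature rarely achieve satisfactory results. To address this problem, this paper proposes a deep learning method for tree species image recognition, based on convolutional neural networks (CNN), that fuses deep image features with hand-crafted features. Method: Images of six common tree species (Pinus sylvestris var. mongolica, Populus davidiana, Betula platyphylla, Larix gmelinii, Cedrus deodara and Pinus bungeana) were used as the study objects. First, the original image set was enlarged by cropping, horizontal flipping, rotation and similar operations, then split into a training set and a test set to build the image library for the experiment. Second, the model was designed as a three-branch parallel network taking RGB images, HSV images and LBP-HOG images as inputs, so that the tree species images were recognized from the perspectives of pixel values, color, texture and shape. On the one hand, a CNN deep learning model suited to the experiment was built, with the RGB images of the training samples and their corresponding HSV images as inputs to the first and second CNN branches for deep feature extraction. On the other hand, the training set was denoised with Gaussian filtering, and hand-crafted LBP-HOG features were extracted to represent texture and shape, serving as the input to the third CNN branch. The features from the three branches were then merged in the final fully connected layer as the basis for classification by the softmax classifier. Finally, to test the feasibility of the method, an SVM classifier, a BP neural network, and the existing deep learning models LeNet-5 and VGG-16 were trained with the same features and training set and evaluated on the test set to compare recognition performance. Result: The proposed multi-feature fusion CNN model reached a training accuracy of 96.13% and an average validation recognition accuracy of 91.70%. Among the single-branch CNN models, RGB input gave the highest recognition rate, 75.21%, followed by HSV features, with LBP-HOG features the lowest. With multi-feature fusion, the combination RGB + H channel + LBP gave the highest validation recognition accuracy, 93.50%, while the combination RGB + HSV + LBP + HOG lowered rather than raised the recognition rate, to 89.50%. Under the same feature or feature combination, SVM, the BP neural network, LeNet-5 and VGG-16 all achieved lower recognition rates than the proposed model. Conclusion: With RGB + H channel + LBP feature fusion, the three-branch parallel CNN model achieved the highest recognition rate on the six tree species in this paper. It overcame the low recognition rate obtained with a single feature, gave very good recognition results, and automatically identified the specific category from a large number of images of different tree species.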
Keywords:	tree species image recognition; feature fusion; deep learning; convolutional neural network
Received:	2018-11-12
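
The input preparation described in the abstract (flip and rotation augmentation, an HSV view of each RGB image, and an LBP texture map) can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, images are plain nested lists so that only the Python standard library is needed, the LBP variant is the basic 3x3 operator, and the Gaussian filtering and HOG steps are left out.

```python
import colorsys

def hflip(img):
    """Horizontal flip: reverse every row of an H x W image."""
    return [row[::-1] for row in img]

def rot90_cw(img):
    """Rotate an H x W image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def rgb_to_hsv_image(rgb):
    """Map (r, g, b) pixels in 0..255 to (h, s, v) floats in 0..1."""
    return [[colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
             for (r, g, b) in row] for row in rgb]

def to_gray(rgb):
    """Integer luminance with ITU-R BT.601 weights (assumed choice)."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
            for row in rgb]

# Neighbour offsets (dy, dx), clockwise from the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_image(gray):
    """Basic 3x3 LBP codes (0..255) for the interior pixels of a gray image."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            center = gray[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(OFFSETS):
                # Set the bit when the neighbour is at least as bright as the center.
                if gray[y + dy][x + dx] >= center:
                    code |= 1 << bit
            row.append(code)
        out.append(row)
    return out
```

Under these assumptions, the RGB image, its HSV counterpart (or only the H channel, which the abstract reports as the better choice) and the LBP map would each feed one branch of the network.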
Image recognition of tree species based on multi feature fusion and CNN model

Affiliation:	Research Institute of Resource Information Techniques, Chinese Academy of Forestry, Beijing 100091, China

Abstract:	Objective: Tree species images exhibit intra-class differences and inter-class similarities, which makes it difficult for traditional methods based on a single hand-crafted feature to achieve ideal recognition results. To solve this problem, a deep learning method for tree species image recognition based on convolutional neural networks (CNN) was proposed, which combines deep image features with hand-crafted features. Method: Six common tree species, namely Pinus sylvestris var. mongolica, Populus davidiana, Betula platyphylla, Larix gmelinii, Cedrus deodara and Pinus bungeana, were studied. Firstly, the original tree species image set was expanded by cropping, horizontal flipping, rotation and other operations, and was divided into a training set and a test set to establish the image database for the experiment. Secondly, the model was designed as three parallel network branches that took RGB images, HSV images and LBP-HOG images, respectively, as input, and recognized the tree species images from the perspectives of pixel value, color, texture and shape. On the one hand, a CNN deep learning model suitable for this experiment was constructed, and the RGB images and the corresponding HSV images in the training set were used as the inputs of the first and second CNN branches to extract deep features. On the other hand, the training set was denoised by Gaussian filtering, and hand-crafted LBP-HOG features were extracted to represent texture and shape as the input of the third CNN branch. Then, the features obtained by the three branches were merged in the last fully connected layer as the final classification basis of the softmax classifier. Finally, to verify the feasibility of the proposed method, an SVM classifier, a BP neural network, and the existing deep learning models LeNet-5 and VGG-16 were trained with the same features and training set, and the test set was used to compare the final recognition performance. Result: The training accuracy of the multi-feature fusion CNN model was 96.13%, and the average validation recognition accuracy was 91.70%. Among the single-branch CNN models, RGB images as training input gave the highest recognition rate, 75.21%, followed by HSV features, and LBP-HOG features were the worst. With multi-feature fusion, the combination RGB + H channel + LBP gave the highest validation recognition accuracy, 93.50%, whereas the combination RGB + HSV + LBP + HOG decreased rather than increased the recognition rate, to 89.50%. Under the same feature or feature combination, the recognition rates of SVM, the BP neural network, LeNet-5 and VGG-16 were all lower than that of the proposed model. Conclusion: With RGB + H channel + LBP feature fusion, the three-branch parallel CNN model achieved the highest recognition rate for the six tree species in this paper. It overcomes the low recognition rate obtained with a single feature, gives very good recognition results, and automatically identifies the specific category from a large number of images of different tree species.

Keywords:	tree species image recognition; feature fusion; deep learning; convolutional neural network
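
The fusion step, in which the features of the three branches are merged in the last fully connected layer and passed to a softmax classifier, might look roughly like the sketch below. It is an illustration only: the branch feature vectors stand in for CNN outputs, the weights would be learned in practice, and every function name here is hypothetical rather than taken from the paper.

```python
import math

# The six species named in the abstract, used as class labels.
CLASSES = [
    "Pinus sylvestris var. mongolica", "Populus davidiana",
    "Betula platyphylla", "Larix gmelinii",
    "Cedrus deodara", "Pinus bungeana",
]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dense(x, weights, bias):
    """Fully connected layer: one weight row and one bias per output unit."""
    if any(len(row) != len(x) for row in weights):
        raise ValueError("each weight row must match the fused feature length")
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def fuse_and_classify(f_rgb, f_hsv, f_lbp, weights, bias):
    """Concatenate the three branch features, then apply dense + softmax."""
    fused = list(f_rgb) + list(f_hsv) + list(f_lbp)
    return softmax(dense(fused, weights, bias))

def predict(f_rgb, f_hsv, f_lbp, weights, bias):
    """Return the class label with the highest fused probability."""
    probs = fuse_and_classify(f_rgb, f_hsv, f_lbp, weights, bias)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best]
```

Concatenation before the classifier lets a single set of output weights balance color, pixel and texture evidence, which is one plausible reading of why the fused model outperforms each single-branch model in the reported results.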
This article is indexed in CNKI and other databases.