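Classification of Anoectochilus roxburghii strains based on multi-feature extraction and Stacking ensemble learning
Cite this article:Xie Wenyong, Chai Qinqin, Gan Yonghui, Chen Shudi, Zhang Xun, Wang Wu. Classification of Anoectochilus roxburghii strains based on multi-feature extraction and Stacking ensemble learning[J]. Transactions of the Chinese Society of Agricultural Engineering, 2020, 36(14): 203-210.
Authors:Xie Wenyong  Chai Qinqin  Gan Yonghui  Chen Shudi  Zhang Xun  Wang Wu
Affiliations:College of Electrical Engineering and Automation, Fuzhou University, Fuzhou 350108; Fujian Provincial Key Laboratory of Medical Instrument and Pharmaceutical Technology, Fuzhou 350108; School of Food Engineering, Zhangzhou Institute of Technology, Zhangzhou 363000; College of Pharmacy, Fujian University of Traditional Chinese Medicine, Fuzhou 350122
Funding:National Natural Science Foundation of China (61773124); University-Industry Cooperation Project of the Fujian Provincial Department of Science and Technology (No. 2019Y4009); Fujian Food and Drug Administration Special Project for Improving the Quality Standard of Anoectochilus roxburghii ([3500]FJJF[DY]2018008).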
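Abstract:Traditional Chinese medicine identification, molecular identification, biotechnological identification, and spectral detection are subjective, time-consuming, and complex to operate. In addition, the whole-leaf morphology of Anoectochilus roxburghii offers little discrimination between strains, and a single classifier achieves limited identification accuracy. To address these problems, this study proposes a machine-vision-based method for extracting multiple features from a leaf sub-interval, together with a Stacking ensemble learning algorithm based on multi-model fusion, to classify Anoectochilus roxburghii strains. Leaf images of 6 strains were collected and, after image preprocessing, a total of 114 texture and color features were extracted from the leaf sub-interval. Based on these features, a stacked two-stage ensemble learning framework was built, with logistic regression, K-nearest neighbor, random forest, and gradient boosting decision tree (GBDT) as the base classifiers and GBDT as the meta-classifier. The experimental results show that the overall F-value of the Stacking ensemble model reached 93.91% and its classification accuracy reached 94.49%. Compared with the four single classifiers (logistic regression, K-nearest neighbor, random forest, and GBDT), these are higher by 4.40, 11.87, 11.01, and 12.94 percentage points for the F-value and by 5.36, 11.34, 6.93, and 12.13 percentage points for the accuracy, respectively. The method can therefore identify Anoectochilus roxburghii strains effectively, and it provides a reference for identifying plant leaves that are similar in shape and size and whose shape features are difficult to exploit.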
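The percentage-point gains quoted above can be cross-checked against the single-model accuracies reported in the English abstract (89.13%, 83.15%, 87.56%, and 82.36%). The short sketch below does that arithmetic with exact decimals. It also derives the single-model F-values implied by the quoted gains; those derived values are not stated anywhere in the source, and the variable names are illustrative only.

```python
from decimal import Decimal

# Figures quoted in the two abstracts, in percent.
stacking_accuracy = Decimal("94.49")
stacking_f_value = Decimal("93.91")
single_accuracy = {
    "LR": Decimal("89.13"),
    "KNN": Decimal("83.15"),
    "RF": Decimal("87.56"),
    "GBDT": Decimal("82.36"),
}
reported_accuracy_gain = {
    "LR": Decimal("5.36"),
    "KNN": Decimal("11.34"),
    "RF": Decimal("6.93"),
    "GBDT": Decimal("12.13"),
}
reported_f_value_gain = {
    "LR": Decimal("4.40"),
    "KNN": Decimal("11.87"),
    "RF": Decimal("11.01"),
    "GBDT": Decimal("12.94"),
}

# Accuracy gain recomputed from the individual accuracies.
computed_accuracy_gain = {
    name: stacking_accuracy - acc for name, acc in single_accuracy.items()
}

# Single-model F-values implied by the quoted gains (derived, not reported).
implied_f_value = {
    name: stacking_f_value - gain for name, gain in reported_f_value_gain.items()
}

for name in single_accuracy:
    print(name, computed_accuracy_gain[name], implied_f_value[name])
```

The recomputed accuracy gains agree with the quoted ones, so the two abstracts are numerically consistent on accuracy.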

Keywords:machine vision  models  Anoectochilus roxburghii  sub-interval segmentation  feature extraction  Stacking ensemble learning  plant leaves
Received:2020/4/3
Revised:2020/6/17

Strains classification of Anoectochilus roxburghii using multi-feature extraction and Stacking ensemble learning
Xie Wenyong,Chai Qinqin,Gan Yonghui,Chen Shudi,Zhang Xun,Wang Wu.Strains classification of Anoectochilus roxburghii using multi-feature extraction and Stacking ensemble learning[J].Transactions of the Chinese Society of Agricultural Engineering,2020,36(14):203-210.
Authors:Xie Wenyong  Chai Qinqin  Gan Yonghui  Chen Shudi  Zhang Xun  Wang Wu
Institution:1. College of Electrical Engineering and Automation, Fuzhou University, Fuzhou 350108, China; 2. Ministry of Education Key Laboratory of Medical Instrument and Pharmaceutical Technology, Fuzhou University, Fuzhou 350108, China; 3. School of Food Engineering, Zhangzhou Institute of Technology, Zhangzhou 363000, China; 4. College of Pharmacy, Fujian University of Traditional Chinese Medicine, Fuzhou 350122, China
Abstract:Anoectochilus roxburghii (A. roxburghii) is a rare medicinal herb that is mainly distributed in China. Because different strains differ markedly in medicinal value, identifying the strain of A. roxburghii is necessary to guide clinical medication. However, the leaves of different strains are so similar in morphology that they are difficult to tell apart with the naked eye. In this study, a sub-interval segmentation method built on leaf identification techniques was proposed to identify the different strains of A. roxburghii. Firstly, 6 strains of A. roxburghii were selected: Taiwan, Hongxia, Xiaoyuanye, Jianye, Yizhu, and Dayuanye. A total of 317 images with a resolution of 800×800 pixels were taken, and two filtering methods were used to remove noise. The maximum between-class variance (Otsu) algorithm was used for automatic threshold segmentation to obtain a binary image. In the binary image, the leaf contour was drawn and the mass center of the leaf was calculated. A square area of 150 pixels centered on the mass center was selected as the sub-interval of the leaf, so that every target image had the same position and size. Secondly, a combination of texture and color features was extracted from the target image. The texture features were derived from local binary patterns (LBP), the gray level co-occurrence matrix (GLCM), and Gabor filters, whereas the color features were composed of the first, second, and third moments. Merging these gave 114 features. Thirdly, stacking ensemble learning was adopted to improve on the accuracy of a traditional single classifier. The stacking framework consisted of base classifiers and a meta-classifier. Logistic regression (LR), K nearest neighbor (KNN), random forest (RF), and gradient boosting decision tree (GBDT) were used as the base classifiers, whereas GBDT was used as the meta-classifier for stacking.
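The sub-interval step and the color moments described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the noise filtering and contour drawing are skipped, the mass center is taken as the centroid of the thresholded mask, the leaf is assumed to be brighter than the background, "150 pixels" is read as the side length of the square, and the moments are computed per RGB channel. The function names `otsu_threshold`, `leaf_subinterval`, and `color_moments` are hypothetical.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the gray level that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of class 0 up to t
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean up to t
    denom = omega * (1.0 - omega)
    safe = np.where(denom > 0, denom, 1.0)   # avoid division by zero
    sigma_b = np.where(denom > 0, (mu[-1] * omega - mu) ** 2 / safe, 0.0)
    return int(np.argmax(sigma_b))


def leaf_subinterval(image, side=150):
    """Crop a side x side square centered on the centroid of the leaf mask."""
    gray = image.mean(axis=2).astype(np.uint8)
    mask = gray > otsu_threshold(gray)       # assumption: leaf is brighter
    ys, xs = np.nonzero(mask)
    cy, cx = int(round(ys.mean())), int(round(xs.mean()))
    height, width = gray.shape
    # Clamp the window so that it stays inside the image.
    top = max(0, min(cy - side // 2, height - side))
    left = max(0, min(cx - side // 2, width - side))
    return image[top:top + side, left:left + side]


def color_moments(region):
    """First, second, and third color moments for each channel (9 values)."""
    pixels = region.reshape(-1, region.shape[2]).astype(float)
    mean = pixels.mean(axis=0)
    std = pixels.std(axis=0)
    third = np.cbrt(((pixels - mean) ** 3).mean(axis=0))
    return np.concatenate([mean, std, third])


# Synthetic 800x800 image: dark background with a uniformly colored "leaf".
image = np.full((800, 800, 3), 20, dtype=np.uint8)
image[300:501, 350:551] = (60, 180, 90)

sub = leaf_subinterval(image)
features = color_moments(sub)
print(sub.shape, features)
```

In a real pipeline the LBP, GLCM, and Gabor texture features would be computed on the same cropped region and concatenated with these color moments.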
Finally, unlike conventional model training, cross-validation was used to divide the data set. The original data were normalized and randomly split, with 60% used for training and 40% for testing. The training set was then randomly divided into 5 subsets for training and testing each base classifier. The predictions of the base classifiers were used as the input vectors of the GBDT meta-classifier, which output the final prediction. The experimental results showed that the average recognition accuracy of the stacking model reached 94.49%, whereas that of LR, KNN, RF, and GBDT was 89.13%, 83.15%, 87.56%, and 82.36%, respectively. Moreover, the precision, recall, and F1-score of the stacking model for the identification of Taiwan, Hongxia, and Dayuanye were all 100%. The recall of the stacking model was better than that of any single classifier except in the identification of Xiaoyuanye, where it was slightly worse than that of the LR and KNN models. The F1-score of the stacking model was the highest for every strain, showing the excellent overall performance of the model. Therefore, the proposed method can significantly improve the classification of A. roxburghii strains. The findings offer a promising approach to recognizing leaves of plants that are similar in shape and size and whose shape features are difficult to exploit. Further research is still needed to select a proper configuration that improves the learning efficiency of the stacking model.
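The training procedure described above (a 60/40 split, 5-fold predictions from four base classifiers, and a GBDT meta-classifier) maps closely onto scikit-learn's `StackingClassifier`. The sketch below is an illustration only: it runs on synthetic data from `make_classification` rather than the leaf features, the hyperparameters are arbitrary, min-max scaling stands in for the unspecified normalization and is fitted on the training split, and the base classifiers pass class probabilities to the meta-classifier, which the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the feature table: 6 classes, 20 features.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=10, n_classes=6, random_state=0
)

# 60% training, 40% testing, as in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0
)

base_classifiers = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier()),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("gbdt", GradientBoostingClassifier(n_estimators=50, random_state=0)),
]

# cv=5: the meta-classifier is trained on out-of-fold predictions of the
# base classifiers obtained from 5 folds of the training set.
stack = StackingClassifier(
    estimators=base_classifiers,
    final_estimator=GradientBoostingClassifier(n_estimators=50, random_state=0),
    cv=5,
)

model = make_pipeline(MinMaxScaler(), stack)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
macro_f1 = f1_score(y_test, y_pred, average="macro")
print(f"accuracy={accuracy:.4f}, macro F1={macro_f1:.4f}")
```

Training the meta-classifier on out-of-fold predictions keeps it from seeing base-classifier outputs on samples those classifiers were fitted on, which is the usual reason for the 5-subset division.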
Keywords:computer vision  models  Anoectochilus roxburghii  sub-interval segmentation  feature extraction  Stacking ensemble learning  plant leaf
This article is indexed in CNKI, Wanfang Data, and other databases.