
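Lodging resistance prediction of maize varieties based on support vector machine and ReliefF algorithm
Cite this article: Zhang Tianliang, Zhang Dongxing, Cui Tao, Yang Li, Ding Youqiang, Xie Chunji, Du Zhaohui, Zhong Xiangjun. Lodging resistance prediction of maize varieties based on support vector machine and ReliefF algorithm[J]. Transactions of the Chinese Society of Agricultural Engineering, 2021, 37(20): 226-233.
Author names: Zhang Tianliang  Zhang Dongxing  Cui Tao  Yang Li  Ding Youqiang  Xie Chunji  Du Zhaohui  Zhong Xiangjun
Author affiliations: 1. College of Engineering, China Agricultural University, Beijing 100083; 2. Key Laboratory of Soil-Machine-Plant System Technology of Ministry of Agriculture, Beijing 100083
Funding: National Key Research and Development Program of China (2016YFD0300302); Maize Industry Technology System construction project (CARS-02)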
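Abstract: Current methods for identifying lodging resistance in maize varieties are time-consuming and laborious, and breeding lodging-resistant varieties takes a long time. To address this, the study used hyperspectral imaging combined with statistical learning to predict varietal lodging resistance during the vegetative growth period of maize. Field trials were conducted in 2018 and 2019 to collect hyperspectral imaging data from 8 maize varieties with differing lodging resistance. Spectral curves of the region of interest (ROI) were extracted with a region identification method, and the data characteristics of lodging-resistant and non-resistant samples were analyzed. The filter-type feature selection algorithm ReliefF (Relevant Features), and principal component analysis (PCA) combined with ReliefF, were then each used to mine the spectral features that separate lodging-resistant from non-resistant varieties. Finally, cross-validation was used to optimize the number of original spectral features selected by ReliefF and the number of principal component features selected by PCAReliefF. ReliefF-SVM and PCAReliefF-SVM support vector machine (SVM) classification models were built, and the penalty and kernel parameters of the SVM models were optimized to improve prediction. The results showed that, after feature optimization, 40 and 50 features were selected for modeling in the 2018 and 2019 trials, respectively, and that the principal component features selected by PCAReliefF contained almost no redundant features compared with the original spectral features selected by ReliefF. After the penalty and kernel parameters were optimized, the ReliefF-SVM and PCAReliefF-SVM models classified the prediction-set samples with accuracies of 84.17% and 85% in the 2018 trial, and 84.17% and 85.83% in the 2019 trial. Hyperspectral imaging data and statistical learning can therefore predict the lodging resistance of maize varieties at an early stage, and the PCAReliefF-SVM model performs better overall than the ReliefF-SVM model. The trials provide a method and reference for efficient screening of lodging-resistant maize varieties.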
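As an illustration of the filter-type feature selection named in the abstract, the following is a minimal sketch of a simplified two-class ReliefF in NumPy. It is not the authors' implementation: the function name `relieff_binary`, the Manhattan distance, the use of every sample as a query point, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def relieff_binary(X, y, n_neighbors=5):
    """Simplified ReliefF feature weights for a two-class problem.

    Assumes every class contains more than one sample.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_samples, n_features = X.shape

    # Scale each feature to [0, 1] so differences are comparable across bands.
    low = X.min(axis=0)
    span = X.max(axis=0) - low
    span[span == 0] = 1.0
    Xs = (X - low) / span

    indices = np.arange(n_samples)
    weights = np.zeros(n_features)
    for i in range(n_samples):
        diff = np.abs(Xs - Xs[i])   # per-feature difference to every sample
        dist = diff.sum(axis=1)     # Manhattan distance to every sample
        same = np.flatnonzero((y == y[i]) & (indices != i))
        other = np.flatnonzero(y != y[i])
        hits = same[np.argsort(dist[same])[:n_neighbors]]
        misses = other[np.argsort(dist[other])[:n_neighbors]]
        # Reward features that differ on nearest misses, penalize those
        # that differ on nearest hits.
        weights += diff[misses].mean(axis=0) - diff[hits].mean(axis=0)
    return weights / n_samples


# Synthetic demo (not data from the study): only feature 2 carries class information.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 40)
spectra = rng.normal(size=(80, 6))
spectra[:, 2] += 3.0 * labels
weights = relieff_binary(spectra, labels, n_neighbors=5)
ranking = np.argsort(weights)[::-1]  # feature indices, most relevant first
print(ranking[:3], np.round(weights, 3))
```

Keeping the top-ranked entries of `ranking` gives a feature subset; running the same function on PCA scores instead of raw bands would correspond to the PCAReliefF variant described above.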

Keywords: principal component analysis  variety  support vector machine  maize  lodging resistance  ReliefF
Received: 2020/10/24
Revised: 2021/9/10

Lodging resistance prediction of maize varieties based on support vector machine and ReliefF algorithm
Zhang Tianliang,Zhang Dongxing,Cui Tao,Yang Li,Ding Youqiang,Xie Chunji,Du Zhaohui,Zhong Xiangjun.Lodging resistance prediction of maize varieties based on support vector machine and ReliefF algorithm[J].Transactions of the Chinese Society of Agricultural Engineering,2021,37(20):226-233.
Authors:Zhang Tianliang  Zhang Dongxing  Cui Tao  Yang Li  Ding Youqiang  Xie Chunji  Du Zhaohui  Zhong Xiangjun
Institution:1. College of Engineering, China Agricultural University, Beijing 100083, China; 2. Key Laboratory of Soil-Machine-Plant System Technology of Ministry of Agriculture, Beijing 100083, China
Abstract: Maize is one of the world's main food crops, and lodging seriously reduces yield and hampers mechanized harvesting. Current methods for identifying lodging resistance are time-consuming and laborious, which lengthens the breeding cycle for lodging-resistant varieties. In this study, hyperspectral imaging was combined with statistical learning to predict the lodging resistance of maize varieties during the vegetative growth period. Field trials were carried out in 2018 and 2019, and hyperspectral images of the top leaves of 8 maize varieties with differing lodging resistance were collected at the 9-leaf stage. Threshold segmentation was first used to identify the leaf area. K-means clustering then divided the leaf into three areas: normal reflection, dark reflection, and leaf vein. The average spectral curve of the normal reflection area was extracted to analyze the data characteristics of lodging-resistant and non-resistant samples. The Kennard-Stone algorithm was used to order the samples of each variety, which were then split into training and test sets at a ratio of 3:1. The per-variety splits were merged into the final training and test sets so that every variety was evenly represented. This gave 378 training and 120 test samples for the 2018 trial, and 383 training and 120 test samples for the 2019 trial. The filter-type feature selection algorithm ReliefF (Relevant Features) and principal component analysis (PCA) were used to mine the spectral features that separate lodging-resistant from non-resistant varieties. Specifically, ReliefF was run with different numbers of nearest neighbors, and features were chosen according to the stability of the feature variables across those settings.
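The per-variety Kennard-Stone split described above can be sketched as follows. This is a hedged illustration, not the authors' code: the helper names `kennard_stone_order` and `split_by_variety`, the Euclidean distance, and the choice to send the first-selected samples to the training set are assumptions, and the data are synthetic.

```python
import numpy as np


def kennard_stone_order(X):
    """Return sample indices in Kennard-Stone selection order.

    Assumes at least two samples.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Start from the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(n) if k not in selected]
    while remaining:
        # Distance from each remaining sample to its nearest selected sample.
        d_min = dist[np.ix_(remaining, selected)].min(axis=1)
        nxt = remaining[int(np.argmax(d_min))]
        selected.append(nxt)
        remaining.remove(nxt)
    return selected


def split_by_variety(X, variety, train_fraction=0.75):
    """Split each variety 3:1 in Kennard-Stone order, then merge the parts."""
    X = np.asarray(X, dtype=float)
    variety = np.asarray(variety)
    train_idx, test_idx = [], []
    for v in np.unique(variety):
        members = np.flatnonzero(variety == v)
        order = members[kennard_stone_order(X[members])]
        n_train = int(round(train_fraction * len(members)))
        train_idx.extend(order[:n_train].tolist())
        test_idx.extend(order[n_train:].tolist())
    return np.array(train_idx), np.array(test_idx)


# Synthetic demo: two hypothetical varieties with 8 samples each.
rng = np.random.default_rng(1)
spectra = rng.normal(size=(16, 3))
variety = np.repeat(["A", "B"], 8)
train_idx, test_idx = split_by_variety(spectra, variety)
print(len(train_idx), len(test_idx))
```

Splitting within each variety before merging keeps every variety present in both sets, which is the property the abstract emphasizes.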
Because adjacent bands are highly correlated, ReliefF applied to the original spectra often selected redundant features. To avoid this, PCA was first performed on the spectral data, and ReliefF was then used to select principal components, which contain almost no redundant features. Two classification models were established: ReliefF-SVM, a support vector machine (SVM) built on the original spectral features selected by ReliefF, and PCAReliefF-SVM, built on the principal component features selected by PCAReliefF. Cross-validation on the training set was first used to optimize the number of selected features; 40 and 50 features were selected for the 2018 and 2019 trials, respectively, to balance model accuracy against computational complexity. Grid search was used to optimize the penalty and kernel parameters of the SVM. The final models were then trained on all training-set samples with the optimized parameters. The prediction accuracies of the PCAReliefF-SVM model were 85% and 85.83% in 2018 and 2019, respectively, against 84.17% in both years for the ReliefF-SVM model, indicating that the PCAReliefF-SVM model predicted better. ROC curves were also used to evaluate the models: the ROC curve of the PCAReliefF-SVM model almost completely enclosed that of the ReliefF-SVM model, again indicating better performance. Hyperspectral imaging can therefore be used for early classification of maize varieties by lodging resistance. The workflow of spectral extraction, feature analysis, and predictive modeling provides a method and reference for screening lodging-resistant maize varieties.
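The ROC comparison mentioned above can be illustrated with a short sketch. When one ROC curve encloses another, its area under the curve (AUC) is at least as large, so AUC is a convenient summary. The function names, the assumption of untied scores, and the toy labels and scores below are illustrative only and are not results from the study.

```python
import numpy as np


def roc_points(y_true, scores):
    """ROC curve points (FPR, TPR) for binary labels; assumes no tied scores."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")   # highest score first
    y_sorted = y_true[order]
    tp = np.cumsum(y_sorted == 1)                # true positives per threshold
    fp = np.cumsum(y_sorted == 0)                # false positives per threshold
    tpr = np.concatenate(([0.0], tp / tp[-1]))
    fpr = np.concatenate(([0.0], fp / fp[-1]))
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


# Toy data: 1 = lodging-resistant, 0 = non-resistant (hypothetical scores).
y = [1, 1, 0, 1, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]   # hypothetical model A
scores_b = [0.9, 0.3, 0.7, 0.6, 0.4, 0.2]   # hypothetical model B

auc_a = auc(*roc_points(y, scores_a))
auc_b = auc(*roc_points(y, scores_b))
print(round(auc_a, 4), round(auc_b, 4))
```

In this toy case model A ranks 8 of the 9 positive-negative pairs correctly and model B ranks 6, so A has the larger AUC.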
Keywords:Principal component analysis  variety  Support vector machine  maize  lodging resistant  ReliefF