
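Forecasting model for yellowfin tuna fishing grounds in the western and central Pacific based on machine learning
Cite this article: Zhang Cong, Zhou Weifeng, Tang Fenghua, Shi Yongchuang, Fan Wei. Forecasting model for yellowfin tuna fishing grounds in the western and central Pacific based on machine learning[J]. Transactions of the Chinese Society of Agricultural Engineering, 2022, 38(15): 330-338.
Authors: Zhang Cong  Zhou Weifeng  Tang Fenghua  Shi Yongchuang  Fan Wei
Affiliations: 1. East China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Shanghai 200090; 2. Graduate School of Chinese Academy of Agricultural Sciences, Beijing 100081
Funding: National Key Research and Development Program of China (2019YFD0901405); Open Fund of the Key Laboratory of South China Sea Fishery Resources Exploitation and Utilization, Ministry of Agriculture and Rural Affairs (LOF 2022-05); Basic Research Fund for Central Public-Interest Scientific Institutes (2019T09); Technology Development Project for Nuclear Power Plant Cold Source Safety Assurance (21FW018)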
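Abstract: To provide accurate fishing ground forecasts for yellowfin tuna in the western and central Pacific, this study used fishery data from 43 distant-water longline vessels of China Aquatic Group operating in the western and central Pacific (0°-30°S; 110°E-170°W) from 2008 to 2019. After variance inflation factor screening and normalization, 35 features were selected, covering spatiotemporal factors, marine environmental factors, and large-scale climate data. An XGBRF model combining random forest with extreme gradient boosting decision trees was built, and its optimal parameters were determined by five-fold cross-validation. Logistic regression, classification and regression tree, K-nearest neighbor, adaptive boosting, gradient boosting decision tree, extreme gradient boosting decision tree, and random forest models served as controls, giving eight yellowfin tuna fishing ground prediction models that were compared with one another. The results show that the XGBRF model predicted yellowfin tuna fishing grounds in the western and central Pacific better than the other models: its accuracy, fishing-ground recall, fishing-ground F1 score, non-fishing-ground precision, and area under the curve (AUC) were all the highest, at 75.39%, 87.36%, 82.64%, 66.32%, and 79.48%, respectively, and its receiver operating characteristic (ROC) curve lay closer to the upper-left corner. Sea surface temperature was the most important environmental factor affecting the distribution of yellowfin tuna fishing grounds in the western and central Pacific, followed in order by temperature at 300 m, salinity at 50 m, chlorophyll-a concentration, the Southern Oscillation Index, and surface salinity; spatiotemporal factors and the remaining large-scale climate factors had less influence. The fishing grounds predicted by the XGBRF forecasting model were broadly consistent with the actual operating range. The XGBRF ensemble model performs well in forecasting yellowfin tuna fishing grounds in the western and central Pacific and can serve as a reference for fishing ground forecasting.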
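The class-wise scores reported in the abstract (accuracy, fishing-ground recall, fishing-ground F1 score, non-fishing-ground precision) all derive from a binary confusion matrix. The sketch below shows one way they could be computed. The function name `fishing_ground_metrics`, the label coding (1 = fishing ground, 0 = non-fishing ground), and the toy labels are illustrative assumptions, not the authors' code or data.

```python
def fishing_ground_metrics(y_true, y_pred):
    """Scores for a binary fishing-ground classifier.

    Labels are assumed to be 1 (fishing ground) and 0 (non-fishing ground).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fishing ground, predicted as such
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-fishing ground, predicted as such
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarm
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed fishing ground

    def ratio(num, den):
        # Guard against empty classes in very small samples.
        return num / den if den else 0.0

    recall = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "fishing_recall": recall,
        "fishing_f1": ratio(2 * precision * recall, precision + recall),
        # Share of "non-fishing" predictions that were truly non-fishing grounds.
        "non_fishing_precision": ratio(tn, tn + fn),
    }


# Toy example (made-up labels, not data from the study).
y_true = [1, 1, 1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 1, 0, 0]
print(fishing_ground_metrics(y_true, y_pred))
```

Reporting recall for the fishing-ground class alongside precision for the non-fishing class, as the abstract does, separates the two kinds of error: missed fishing grounds and cells wrongly dismissed as unproductive.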

Keywords: machine learning  model  western and central Pacific  yellowfin tuna  fishing ground forecast
Received: 2022/4/29
Revised: 2022/7/19

Forecasting models for yellowfin tuna fishing ground in the central and western Pacific based on machine learning
Zhang Cong, Zhou Weifeng, Tang Fenghua, Shi Yongchuang, Fan Wei. Forecasting models for yellowfin tuna fishing ground in the central and western Pacific based on machine learning[J]. Transactions of the Chinese Society of Agricultural Engineering, 2022, 38(15): 330-338.
Authors: Zhang Cong  Zhou Weifeng  Tang Fenghua  Shi Yongchuang  Fan Wei
Institution: 1. East China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Shanghai 200090, China; 2. Graduate School of Chinese Academy of Agricultural Sciences, Beijing 100081, China
Abstract: Accurate forecasts of yellowfin tuna fishing grounds in the western and central Pacific are of great value to the fishery. However, the large volume and high feature dimensionality of fishery data make many classification models prone to over-fitting. Random forest, a parallel ensemble method, can be expected to complement the strong performance of the extreme gradient boosting decision tree algorithm. This study therefore proposed a hybrid ensemble model, XGBRF, that combines random forest with extreme gradient boosting decision trees (XGBoost). Fishery production data were collected from the operating records of 43 distant-water longline fishing vessels of China Aquatic Group in the western and central Pacific (0°-30°S; 110°E-170°W) from 2008 to 2019, including the catch amount, operation date, and operation latitude and longitude. The fishery data were matched with marine environmental data, including chlorophyll-a concentration, eddy kinetic energy, sea surface height anomaly, and temperature and salinity in the 0-500 m water layer. A total of 36 variables made up the original data set, including the Southern Oscillation Index (SOI), the Arctic Oscillation Index (AOI), the Pacific Decadal Oscillation Index (PDOI), and the North Pacific Gyre Oscillation Index (NPGOI). After variance inflation factor screening and normalization, the data set was divided into a training set (80%) and a test set (20%). The training set was used to train eight models: classification and regression tree, logistic regression, k-nearest neighbor, adaptive boosting, gradient boosting decision tree, XGBoost, random forest, and XGBRF. Five-fold cross-validation was used to determine the optimal parameters of each model. Finally, the model was validated by overlaying its predictions on the actual fishing grounds in the test set.
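The variance inflation factor (VIF) screening step can be sketched as follows. The VIF of feature j equals the j-th diagonal element of the inverse of the feature correlation matrix, so a small pure-Python version only needs a correlation matrix and a matrix inverse. The iterative drop rule, the threshold of 10 (a common rule of thumb), and all function and feature names are assumptions for illustration; the abstract does not state the threshold or procedure the authors used.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix of equal-length, non-constant feature columns."""
    n = len(columns[0])
    unit = []
    for col in columns:
        mean = sum(col) / n
        dev = [v - mean for v in col]
        norm = math.sqrt(sum(d * d for d in dev))
        unit.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in unit] for ci in unit]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("features are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF of each feature: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]


def vif_screen(named_columns, threshold=10.0):
    """Repeatedly drop the highest-VIF feature until every VIF is <= threshold."""
    kept = dict(named_columns)
    while len(kept) > 1:
        scores = vif(list(kept.values()))
        worst_name, worst = max(zip(kept, scores), key=lambda kv: kv[1])
        if worst <= threshold:
            break
        del kept[worst_name]
    return list(kept)


# Toy features (made-up values): "sst" and "t50" are nearly collinear.
features = {
    "sst": [1.0, 2.0, 3.0, 4.0, 5.0],
    "t50": [1.1, 2.0, 3.1, 3.9, 5.0],
    "chl": [2.0, 1.0, 4.0, 3.0, 5.0],
}
print(vif_screen(features))
```

Dropping one feature at a time and recomputing is a design choice: removing a single collinear variable often lowers the VIF of its partners enough that they can stay.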
The experimental results showed that: 1) Catch per unit effort (CPUE) was significantly correlated with the various factors, and collinearity among the variables retained after variance inflation factor screening was greatly reduced. 2) The XGBRF hybrid ensemble model significantly improved on the XGBoost and RF models, achieving the highest accuracy and area under the curve (AUC), at 75.39% and 79.48%, respectively. The receiver operating characteristic (ROC) curve of the XGBRF model was also closest to the upper-left corner, indicating the best performance among the models compared. 3) Sea surface temperature was the most important factor governing the distribution of yellowfin tuna fishing grounds, with a relative importance of 7.573%. The temperature of the 300 m water layer was nearly as important, at 7.369%. The salinity of the 50 m water layer, the SOI, chlorophyll-a concentration, and surface salinity also had relatively large effects. Apart from the SOI, the large-scale climatic factors had relatively little influence. 4) The fishing grounds predicted by the XGBRF model deviated only slightly from the actual fishing grounds, indicating that the predictions were accurate and reliable. Overall, the XGBRF ensemble model performed best in forecasting yellowfin tuna fishing grounds in the western and central Pacific, and the findings can serve as a reference for fishing ground forecasting.
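AUC can be read as the probability that a randomly chosen fishing-ground sample receives a higher score than a randomly chosen non-fishing-ground sample, which is why a ROC curve closer to the upper-left corner goes with a larger AUC. A minimal sketch of that pairwise definition follows; the function name `roc_auc`, the label coding (1 = fishing ground, 0 = non-fishing ground), and the toy scores are assumptions for illustration only.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    correct = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy example (made-up scores, not model output from the study).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(roc_auc(y_true, scores))
```

The pairwise loop is quadratic in sample size, which is fine for a small check; a rank-based computation would be the usual choice for a full test set.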
Keywords:machine learning  model  western and central Pacific  yellowfin tuna  fishing ground forecast