
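Weakly supervised analysis method for big data of featured agricultural product sales reviews
Cite this article: YI Wenlong, ZHANG Li, LIU Muhua, CHENG Xiangping. Weakly supervised analysis method for featured agricultural product sales evaluation big data[J]. Transactions of the Chinese Society of Agricultural Engineering, 2024, 40(12): 183-192.
Authors: YI Wenlong  ZHANG Li  LIU Muhua  CHENG Xiangping
Affiliations: School of Software, Jiangxi Agricultural University, Nanchang 330045; School of Engineering, Jiangxi Agricultural University, Nanchang 330045; Institute of Applied Physics, Jiangxi Academy of Sciences, Nanchang 330096
Funding: National Key Research and Development Program of China (2022YFD1600601); Natural Science Foundation of Jiangxi Province (20212BAB202015); Jiangxi Province 03 Special Project and 5G Project (20232ABC03A18).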
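Abstract: Multi-dimensional analysis of review big data for featured agricultural products is hampered by a shortage of trustworthy labels and by the difficulty of mining consumers' true sentiment in each dimension. To address these problems, this study proposes a deep learning method based on weakly supervised training. First, a topic model analyzes large-scale reviews to extract product evaluation topics and keywords. Next, syntactic dependency and a sentiment lexicon are combined to generate pseudo-labels for the reviews in each dimension. Finally, a multi-label, multi-class deep network is built and trained on the pseudo-labels under weak supervision. The results show that the method achieves 89.2% accuracy and an 80.3% F1 score on the Hongxin (red-heart) pomelo review dataset, improving on the random forest algorithm by 7.1 percentage points in accuracy and 11.5 percentage points in F1. Compared with the Transformer model, accuracy is 5.6 percentage points higher, F1 is 2 percentage points higher, and the parameter count is 92% lower. The method efficiently extracts product evaluation dimensions and consumer concerns from massive volumes of reviews, providing data support for improving agricultural product quality and sales service.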
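The pseudo-labeling step can be illustrated with a minimal sketch. It is not the authors' implementation: the aspect keywords, the sentiment lexicon, the label encoding, and the function name `pseudo_label` are all hypothetical, and clause splitting on punctuation stands in for the syntactic dependency analysis the study actually uses. The example also works on English text for readability, whereas the real reviews are Chinese and would need word segmentation first.

```python
import re

# Hypothetical evaluation dimensions and keywords (not taken from the paper).
ASPECT_KEYWORDS = {
    "taste": {"taste", "flavor", "flesh"},
    "logistics": {"delivery", "shipping", "packaging"},
    "service": {"service", "seller"},
}
# Hypothetical sentiment lexicon.
POSITIVE = {"sweet", "juicy", "fresh", "fast", "good"}
NEGATIVE = {"sour", "bitter", "slow", "damaged", "bad"}
NEGATORS = {"not", "never", "no"}

# Assumed label encoding: one multi-class label per dimension.
NOT_MENTIONED, NEG, NEU, POS = 0, 1, 2, 3


def pseudo_label(review):
    """Return one pseudo-label per dimension for a single review."""
    scores = {}
    # Clause splitting is a crude stand-in for dependency parsing:
    # sentiment words are attributed to aspects in the same clause.
    for clause in re.split(r"[,.;!?]+", review.lower()):
        tokens = clause.split()
        clause_score = 0
        for i, tok in enumerate(tokens):
            polarity = (tok in POSITIVE) - (tok in NEGATIVE)
            # Flip polarity when the previous token is a negator.
            if polarity and i > 0 and tokens[i - 1] in NEGATORS:
                polarity = -polarity
            clause_score += polarity
        for aspect, keywords in ASPECT_KEYWORDS.items():
            if keywords & set(tokens):
                scores[aspect] = scores.get(aspect, 0) + clause_score

    labels = {aspect: NOT_MENTIONED for aspect in ASPECT_KEYWORDS}
    for aspect, score in scores.items():
        labels[aspect] = POS if score > 0 else NEG if score < 0 else NEU
    return labels


example = pseudo_label("The taste is sweet and juicy, but delivery was not fast.")
```

Labels produced this way are noisy by construction, which is why the downstream network is described as weakly supervised rather than supervised.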

Keywords: agricultural product  weakly supervised learning  multi-task model  sentiment analysis  deep learning  big data analysis
Received: 2024/1/1
Revised: 2024/4/5

Weakly supervised analysis method for featured agricultural product sales evaluation big data
YI Wenlong, ZHANG Li, LIU Muhua, CHENG Xiangping. Weakly supervised analysis method for featured agricultural product sales evaluation big data[J]. Transactions of the Chinese Society of Agricultural Engineering, 2024, 40(12): 183-192.
Authors: YI Wenlong  ZHANG Li  LIU Muhua  CHENG Xiangping
Institution:School of Software, Jiangxi Agricultural University, Nanchang 330045, China;School of Engineering, Jiangxi Agricultural University, Nanchang 330045, China; Institute of Applied Physics, Jiangxi Academy of Sciences, Nanchang 330096, China
Abstract: Large-scale analysis of consumer reviews can inform the evaluation of featured agricultural products and help improve both the products and their marketing strategies. Few open-source, strongly labeled Chinese datasets exist for this domain, and manual labeling is costly and time-consuming. This study therefore proposes a weakly supervised deep learning method for evaluating big data on featured agricultural products along different dimensions. First, an incremental crawler collected consumers' reviews of selected featured agricultural products from online sales platforms. Second, a topic model identified the implicit topics and topic keywords in the review data. Third, pseudo-labels were generated for each evaluation dimension by combining syntactic dependency analysis with lexicon-based sentiment judgment. Finally, a multi-label, multi-class deep learning model was constructed and trained on the pseudo-labeled dataset under weak supervision. The trained model can then be applied directly to evaluate agricultural products. Because of its multi-task structure, a single model suffices to predict consumers' sentiment on every evaluation dimension. In the experiment, store and review information was first collected by incremental crawlers from multiple websites related to specialty agricultural products and stored in a database, forming a multi-source heterogeneous dataset. Drawing on several websites made the dataset more representative than a single source and reduced the bias of any one user group. Heterogeneity here means that data from different platforms differ in focus and in composition; the heterogeneous data from the multiple sources were transformed to obtain an extensive dataset of featured agricultural products. Reviews were then retrieved from the database using the keywords "Hongxin pomelo" and "Purple garlic" to build the experimental datasets used for final prediction and comparative analysis. The model achieved 89.2% accuracy and an 80.3% F1-score on the Hongxin pomelo dataset, which is 7.1 percentage points higher in accuracy and 11.5 percentage points higher in F1-score than Random Forest. Compared with the Transformer model, accuracy was 5.6 percentage points higher and the F1-score 2 percentage points higher, while the number of parameters was 92% lower. The method efficiently extracts product evaluation dimensions and consumer concerns from massive numbers of reviews, and the findings can provide data support for improving agricultural product quality and sales service.
Keywords:agricultural product  weak supervision  multi-task model  sentiment analysis  deep learning  big data analytics
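For readers unfamiliar with how accuracy and F1 apply to a multi-label, multi-class setting, the sketch below scores each evaluation dimension separately. The abstract does not say how the reported figures were averaged, so per-dimension accuracy with macro-averaged F1 is an assumption here, and the dimension names, label values, and function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over classes seen in truth or prediction."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def evaluate(true_by_dim, pred_by_dim):
    """Return {dimension: (accuracy, macro_f1)} for a multi-output classifier."""
    return {
        dim: (accuracy(true_by_dim[dim], pred_by_dim[dim]),
              macro_f1(true_by_dim[dim], pred_by_dim[dim]))
        for dim in true_by_dim
    }


# Toy labels for two hypothetical dimensions (0 = not mentioned, 1 = negative, 3 = positive).
truth = {"taste": [3, 3, 1, 0], "logistics": [1, 1, 3, 3]}
preds = {"taste": [3, 1, 1, 0], "logistics": [1, 1, 3, 3]}
report = evaluate(truth, preds)
```

Note that the improvements in the abstract are stated in percentage points, that is, absolute differences between two percentages rather than relative changes.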