
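Multi-label Classification of Food Safety Regulation Questions Based on the BERT-LEAM Model
Cite this article: ZHENG Limin, QIAO Zhenduo, TIAN Lijun, YANG Lu. Multi-label classification of food safety regulation questions based on the BERT-LEAM model[J]. Transactions of the Chinese Society for Agricultural Machinery (农业机械学报), 2021, 52(7): 244-250, 158.
Authors: ZHENG Limin  QIAO Zhenduo  TIAN Lijun  YANG Lu
Affiliation: China Agricultural University
Funding: National Key Research and Development Program of China (2017YFC1601803)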
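Abstract: In a question answering system for food safety regulations, single-label text classification cannot fully capture the useful information contained in a question. To improve on single-label classification, a multi-label text classification method based on BERT-LEAM (Bidirectional Encoder Representations from Transformers with a Label-Embedding Attentive Model) is proposed, reflecting the different food safety perspectives and levels that a question involves. A multi-angle, hierarchical multi-label annotation method assigns several labels to each question text, and the pre-trained BERT language model is introduced to represent contextual feature information. An attention mechanism learns the dependency between labels and text and aggregates the word embeddings, so that label information is used in the classification process. Experiments show that classification on the coarse-grained multi-label dataset is markedly better than on the fine-grained multi-label dataset, and that text feature representation with BERT outperforms Word2Vec. The BERT-LEAM model achieves F1-W values of 93.35% on the coarse-grained multi-label dataset and 79.81% on the fine-grained multi-label dataset, outperforming the other classification models.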

Keywords: food safety regulations  multi-label classification  BERT  BERT-LEAM
Received: 2020-09-29
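
The label-attention step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: toy 2-dimensional vectors stand in for BERT token outputs, the function names `cosine`, `label_attentive_pool`, and `multilabel_probs` are hypothetical, the phrase-window convolution of the original LEAM formulation is omitted, the label vectors are reused as stand-in classifier weights where a trained layer would normally sit, and the 0.5 decision threshold is an arbitrary choice.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def label_attentive_pool(token_vecs, label_vecs):
    """Aggregate token vectors with attention weights derived from labels."""
    # Score each token by its best cosine match against any label embedding.
    scores = [max(cosine(tok, lab) for lab in label_vecs) for tok in token_vecs]
    # Softmax over tokens (shifted by the maximum for numerical stability).
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention-weighted sum of the token vectors gives the text representation.
    dim = len(token_vecs[0])
    pooled = [sum(w * tok[d] for w, tok in zip(weights, token_vecs))
              for d in range(dim)]
    return weights, pooled


def multilabel_probs(pooled, label_vecs):
    """Independent sigmoid score per label (assumption: label vectors act as weights)."""
    return [1.0 / (1.0 + math.exp(-sum(p * c for p, c in zip(pooled, lab))))
            for lab in label_vecs]


# Toy inputs: three token vectors and two hypothetical label embeddings.
token_vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
label_vecs = [[1.0, 0.0], [0.0, 1.0]]

weights, pooled = label_attentive_pool(token_vecs, label_vecs)
probs = multilabel_probs(pooled, label_vecs)
predicted = [int(p >= 0.5) for p in probs]  # a question may receive several labels
```

Each label is scored independently with a sigmoid rather than a softmax across labels, which is what lets a single question carry more than one label.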

Multi-label Classification of Food Safety Regulatory Issues Based on BERT-LEAM
ZHENG Limin, QIAO Zhenduo, TIAN Lijun, YANG Lu. Multi-label Classification of Food Safety Regulatory Issues Based on BERT-LEAM[J]. Transactions of the Chinese Society for Agricultural Machinery, 2021, 52(7): 244-250, 158.
Authors:ZHENG Limin  QIAO Zhenduo  TIAN Lijun  YANG Lu
Institution:China Agricultural University
Abstract: Effective classification of food safety regulation questions is key to building a question answering system for food safety regulations. To improve on single-label text classification, a multi-label text classification method based on the Bidirectional Encoder Representations from Transformers and label-embedding attentive model (BERT-LEAM) was proposed, reflecting the different food safety perspectives and levels that a question involves. A multi-angle, hierarchical multi-label annotation method was used to assign multiple labels to each question text, and the pre-trained BERT language model was introduced to represent contextual feature information. An attention mechanism learned the dependency between labels and text and aggregated the word embeddings, so that label information was used in the text classification process. The experimental results showed that classification on the coarse-grained multi-label dataset was better than on the fine-grained multi-label dataset, and that text feature representation with BERT was better than with Word2Vec. The BERT-LEAM model achieved F1-W values of 93.35% on the coarse-grained multi-label dataset and 79.81% on the fine-grained multi-label dataset, outperforming the other classification models compared. Question classification for the food safety regulation question answering system was thus realized effectively with the BERT-LEAM method, laying the foundation for the subsequent implementation of the question answering system.
Keywords:food safety regulations  multi-label classification  BERT  BERT-LEAM
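
The abstract reports results as F1-W without defining it. Assuming F1-W denotes the support-weighted average of per-label F1 scores (an assumption, since the abstract does not say), a minimal sketch of such a metric for multi-label predictions might look like this. The function name `weighted_f1` and the toy labels are hypothetical and are not taken from the paper's data.

```python
def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-label F1 for multi-label 0/1 matrices.

    y_true and y_pred are lists of rows, one row per question and one
    0/1 entry per label.
    """
    n_labels = len(y_true[0])
    weighted_sum = 0.0
    total_support = 0
    for j in range(n_labels):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        support = tp + fn  # number of true occurrences of label j
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        weighted_sum += support * f1
        total_support += support
    return weighted_sum / total_support if total_support else 0.0


# Toy example: three questions, three hypothetical labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0]]
score = weighted_f1(y_true, y_pred)
```

Weighting by support means frequent labels dominate the score, which is one plausible reason a fine-grained label set with many rare labels would score lower than a coarse-grained one.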