
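Recognition of abnormal pig state sounds using an improved CNN
Cite this article: Geng Yanli, Song Pengshou, Lin Yanbo, Ji Yankai, Yang Shucai. Recognition of abnormal pig state sounds using an improved CNN[J]. Transactions of the Chinese Society of Agricultural Engineering, 2021, 37(20): 187-193.
Authors: Geng Yanli  Song Pengshou  Lin Yanbo  Ji Yankai  Yang Shucai
Affiliations: 1. School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300130; 2. Engineering Research Center of Intelligent Rehabilitation Devices and Detection Technology, Ministry of Education, Tianjin 300130; 3. Tianjin Mojieke Intelligent Technology Co., Ltd., Tianjin 300130
Funding: Key Research and Development Program of Hebei Province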
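Abstract: Pig vocalizations reflect the animals' growth status. To address the misdiagnosis of pig diseases and the time and labor costs caused by manual monitoring of pig sounds, this study investigates a method for recognizing sounds of abnormal pig states based on a convolutional neural network (CNN). First, a real-time pig sound acquisition system was designed, with sound data uploaded to a cloud server over 4G communication; a data set of abnormal pig sounds (sickness, fighting, hunger, etc.) was built under the guidance of specialists, and Mel spectrogram features were extracted from the abnormal sounds. Second, several attention mechanisms were introduced to improve the CNN, and the CBAM (Convolutional Block Attention Module) attention mechanism was optimized to give the proposed _CBAM-CNN network model. Finally, the _CBAM-CNN model was compared with CNNs incorporating the SE_NET (Squeeze and Excitation Network), ECA_NET (Efficient Channel Attention Networks), and CBAM attention mechanisms. The experimental results show that, with Mel spectrogram features computed under the optimal parameters of 128 Mel frequency dimensions, a 2 048-point FFT (Fast Fourier Transform), and a 512-point window shift, the proposed _CBAM-CNN model recognized abnormal pig sounds better than the other models, reaching a recognition rate of 94.46% and verifying the effectiveness of the algorithm. The study supports the monitoring of abnormal pig behavior during breeding and is relevant to building intelligent, modern pig farms.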
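As a rough illustration of what these spectrogram parameters imply, the sketch below works through the framing arithmetic in Python. The 32 kHz sampling rate is taken from the English abstract of this record; the one-second clip length, the HTK-style mel formula, the absence of padding, and all function names are assumptions made for the example, not details reported by the authors.

```python
import math

SAMPLE_RATE_HZ = 32_000  # sampling frequency stated in the English abstract
N_FFT = 2048             # FFT points (window length in samples)
HOP = 512                # window shift in samples
N_MELS = 128             # number of mel bands


def hz_to_mel(f_hz: float) -> float:
    """HTK-style mel scale (assumed; the paper's variant is not stated)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def num_frames(n_samples: int, n_fft: int = N_FFT, hop: int = HOP) -> int:
    """Number of full frames without padding: 1 + floor((n - n_fft) / hop)."""
    if n_samples < n_fft:
        return 0
    return 1 + (n_samples - n_fft) // hop


def mel_band_edges(n_mels: int = N_MELS, sr: int = SAMPLE_RATE_HZ) -> list:
    """n_mels + 2 edge frequencies (Hz), equally spaced in mel from 0 to Nyquist."""
    top = hz_to_mel(sr / 2)
    return [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]


window_ms = 1000 * N_FFT / SAMPLE_RATE_HZ  # duration of one analysis window
hop_ms = 1000 * HOP / SAMPLE_RATE_HZ       # time step between frames
overlap = 1 - HOP / N_FFT                  # fraction shared by adjacent windows
frames_1s = num_frames(SAMPLE_RATE_HZ)     # frames in a hypothetical 1 s clip
edges = mel_band_edges()

print(f"window {window_ms} ms, hop {hop_ms} ms, overlap {overlap:.0%}")
print(f"1 s clip -> {N_MELS} x {frames_1s} mel spectrogram")
```

Under these assumptions each window spans 64 ms, consecutive windows advance by 16 ms (a shift of one quarter of the window), and a one-second clip yields a 128 x 59 spectrogram. Libraries that pad the signal before framing would produce a slightly different frame count.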

Keywords: acoustic signal processing  animals  abnormal sounds  convolutional neural network  SE_NET  CBAM  ECA_NET
Received: 2021/7/20
Revised: 2021/8/31

Voice recognition of abnormal state of pigs based on improved CNN
Geng Yanli,Song Pengshou,Lin Yanbo,Ji Yankai,Yang Shucai.Voice recognition of abnormal state of pigs based on improved CNN[J].Transactions of the Chinese Society of Agricultural Engineering,2021,37(20):187-193.
Authors:Geng Yanli  Song Pengshou  Lin Yanbo  Ji Yankai  Yang Shucai
Institution: 1. School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300130, China; 2. Engineering Research Center of Intelligent Rehabilitation, Ministry of Education, Tianjin 300130, China; 3. Tianjin Mojieke Intelligent Technology Co., Ltd., Tianjin 300130, China
Abstract: Sound is widely used to monitor the health and body condition of pigs. However, manual monitoring cannot meet the demands of modern agriculture because of problems that include zoonotic diseases, misjudgment of pig diseases, and high time and labor costs. In this study, a real-time pig sound acquisition module was designed so that abnormal states could be recognized rapidly using an improved convolutional neural network (CNN). 4G communication was used to upload the collected pig sounds to a cloud server. The TCP/IP protocol was selected, with the acquisition end set as a TCP client that sent data to the server without interruption. Specifically, the TCP cloud server listened (blocking) on the specified port, and data transfer started once the client had connected successfully. The server also sent a restart command to the client to ensure data alignment. Sound was acquired on a single channel at a sampling frequency of 32 kHz with 16-bit quantization. Raw data for several abnormal pig sounds (sickness, fighting, and hunger) were collected under the guidance of pig-breeding experts. The data were preprocessed by framing, windowing, denoising, and endpoint detection to build a sound data set of abnormal states. Subsequently, Mel spectrograms of the sounds were extracted with 128 Mel frequency dimensions, a 2048-point Fast Fourier Transform (FFT), and a 512-point window shift. A classification model was then constructed using the Mel spectrogram features of the pig sound signals. A local feature learning unit was designed using a CNN, which has fewer weights and lower network complexity than a fully connected network. Four layers of local feature units were constructed, with 64, 64, 128, and 128 convolution kernels, respectively.
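The client/server exchange described above might look roughly like the following Python sketch, which runs both ends on localhost. It is only an illustration under stated assumptions: the `RESTART` command bytes, the little-endian sample packing, the timeouts, and every function name are hypothetical, since the abstract does not give the wire format.

```python
import socket
import struct
import threading

RESTART = b"RESTART\n"   # hypothetical command; the real byte format is not given
SAMPLE_RATE_HZ = 32_000  # from the abstract
BYTES_PER_SAMPLE = 2     # 16-bit quantization, single channel
BYTES_PER_SECOND = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE  # raw stream rate


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read up to n bytes, stopping early only if the peer closes."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            break
        buf.extend(chunk)
    return bytes(buf)


def serve_once(listener: socket.socket, n_samples: int, out: list) -> None:
    """Block until one client connects, send the restart command, read samples."""
    conn, _addr = listener.accept()  # blocks on the listening port
    with conn:
        conn.settimeout(5.0)
        conn.sendall(RESTART)        # ask the client to start from a clean boundary
        raw = recv_exact(conn, n_samples * BYTES_PER_SAMPLE)
    usable = len(raw) - len(raw) % BYTES_PER_SAMPLE  # drop a trailing half sample
    count = usable // BYTES_PER_SAMPLE
    out.extend(struct.unpack(f"<{count}h", raw[:usable]))  # assumed little-endian


def run_client(host: str, port: int, samples: list) -> None:
    """Acquisition end: connect, wait for the restart command, then send samples."""
    with socket.create_connection((host, port), timeout=5.0) as sock:
        if recv_exact(sock, len(RESTART)) != RESTART:
            raise RuntimeError("unexpected command from server")
        sock.sendall(struct.pack(f"<{len(samples)}h", *samples))


def demo():
    samples = [0, 1000, -1000, 32767, -32768]
    received = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.settimeout(5.0)
        listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        server = threading.Thread(
            target=serve_once, args=(listener, len(samples), received)
        )
        server.start()
        run_client("127.0.0.1", port, samples)
        server.join(timeout=5.0)
    return samples, received
```

Calling `demo()` returns the sent and received sample lists, which should match. At 32 kHz and 16 bits on one channel, the uncompressed stream amounts to 64 000 bytes per second, which is the load such a link would need to sustain.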
Nevertheless, the CNN inevitably picked up local position information and redundant information when processing each image. Three attention mechanisms were therefore used to improve the CNN: the Squeeze and Excitation Network (SE_NET), Efficient Channel Attention Networks (ECA_NET), and the Convolutional Block Attention Module (CBAM). A fully connected layer with three neurons and a Softmax activation function was used to classify the abnormal pig sounds. The CBAM was then optimized to give the proposed _CBAM-CNN, using ECA_NET as an improvement on SE_NET. The experimental results show that the optimal parameter combination for pig sound recognition was a 128-dimensional Mel frequency, a 2048-point FFT, and a 1/4 window shift (512 points), and that the optimal network model was _CBAM-CNN. The recognition accuracy of the _CBAM-CNN model reached 94.46% for abnormal pig sounds, and the accuracy of pig squeal recognition reached 100%, higher than previously obtained. The attention mechanisms improved recognition performance while reducing model complexity, and the _CBAM-CNN model achieved better recognition than CBAM-CNN at a smaller model size. These findings can support accurate monitoring of abnormal pig behavior during breeding and the construction of intelligent, modern pig farms.
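To give a feel for why an ECA-style channel attention can be lighter than an SE-style one, here is a minimal plain-Python sketch of ECA-like channel weighting on a tiny feature map. It is not the authors' _CBAM-CNN: the kernel values, the feature-map contents, the SE reduction ratio of 16, and all function names are assumptions for illustration, and a real model would use a deep learning framework with learned weights.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def global_avg_pool(fmap: list) -> list:
    """fmap is a list of C channels, each an H x W list of lists -> C means."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]


def eca_weights(pooled: list, kernel: list) -> list:
    """1-D convolution across the channel axis (zero padded), then sigmoid."""
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel size must be odd")
    pad = k // 2
    padded = [0.0] * pad + list(pooled) + [0.0] * pad
    return [
        sigmoid(sum(kernel[j] * padded[c + j] for j in range(k)))
        for c in range(len(pooled))
    ]


def eca(fmap: list, kernel: list) -> list:
    """Rescale every channel by its attention weight."""
    weights = eca_weights(global_avg_pool(fmap), kernel)
    return [[[v * w for v in row] for row in ch] for ch, w in zip(fmap, weights)]


def se_param_count(channels: int, reduction: int = 16) -> int:
    """Weights in the two SE fully connected layers (biases ignored)."""
    return 2 * channels * channels // reduction


# Toy input: 4 channels of 2 x 2, each filled with a constant value.
fmap = [[[value] * 2 for _ in range(2)] for value in (1.0, 2.0, 0.0, -1.0)]
kernel = [0.0, 1.0, 0.0]  # hypothetical weights; here each channel sees only itself
weights = eca_weights(global_avg_pool(fmap), kernel)
out = eca(fmap, kernel)
```

With 128 channels, as in the last two convolution layers mentioned in the abstract, `se_param_count(128)` gives 2048 weights for the assumed SE bottleneck, whereas the ECA-style convolution here uses only the 3 kernel weights. That difference is one plausible reading of the reported reduction in model complexity, not a figure taken from the paper.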
Keywords:acoustic signal processing  animals  abnormal noise  convolutional neural network  SE_NET  CBAM  ECA_NET