
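Pig state audio recognition based on deep neural network and hidden Markov model
Cite this article: PENG Shuo, LIU Dongyang, SHI Guolong, LI Guangbo, MU Jingsheng, GU Lichuan, JIAO Jun. Pig state audio recognition based on deep neural network and hidden Markov model[J]. Journal of China Agricultural University, 2022, 27(6): 172-181.
Authors: PENG Shuo  LIU Dongyang  SHI Guolong  LI Guangbo  MU Jingsheng  GU Lichuan  JIAO Jun
Affiliations: College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China; Mengcheng Jinghui-Meng Agricultural Science and Technology Development Co., Ltd., Bozhou, Anhui 233524, China
Funding: Anhui Province Major Science and Technology Research Project (16030701092); Anhui Province 2019 Major Science and Technology Special Project (201903a06020009)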
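Abstract: To address the low recognition rate of traditional audio recognition methods on pig audio signals, deep neural network and hidden Markov model theory were used as the basis for pig audio signal recognition. The recognition targets were the eating, estrus, howling and humming sounds of Landrace pigs and the panting sound of sick Landrace pigs. Pig audio signals were preprocessed with Kalman filtering and an improved EMD-TEO cepstral distance endpoint detection algorithm, and the extracted 39-dimensional Mel-frequency cepstral coefficients (MFCC) served as the data set for network training and recognition, from which a pig state audio recognition model based on a deep neural network and a hidden Markov model was built. The experimental results showed that: 1) the deep neural network-hidden Markov model (DNN-HMM) with 5 HMM hidden states and 3 DNN hidden layers of 128 nodes each achieved recognition rates of 70%, 95%, 75%, 80% and 95% for the five pig state sounds (eating, howling, humming, estrus and sick-pig panting, respectively), with an overall recognition rate of 83%; 2) compared with the traditional Gaussian mixture model-hidden Markov model (GMM-HMM), DNN-HMM raised the recognition rates of the corresponding sounds by 5%, 5%, 15%, 30% and 30%, respectively, and the overall recognition rate by 17%; 3) the DNN-HMM model recognized all 5 types of pig audio signals well. The DNN-HMM pig audio recognition model therefore recognizes pig audio in different states with higher accuracy and greater reliability.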

Keywords: pig  MFCC  Kalman filter  DNN-HMM  recognition  audio signal
Received: 2021/8/11
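
The abstract states that 39-dimensional MFCC features were used but does not say how the 39 dimensions are composed. A common convention in speech processing, assumed here and not confirmed by the source, is 13 static coefficients plus their first-order (delta) and second-order (delta-delta) differences. The sketch below illustrates that convention on synthetic data; the function names `delta` and `stack_39`, the regression window of 2 frames, and the edge handling are illustrative choices, not the authors' implementation.

```python
from typing import List

Frame = List[float]


def delta(frames: List[Frame], width: int = 2) -> List[Frame]:
    """Regression-based delta over +/- `width` frames.

    Frames beyond either end are replaced by the nearest edge frame
    (an assumed edge-handling choice).
    """
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out: List[Frame] = []
    for t in range(n_frames):
        row: Frame = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_39(static: List[Frame]) -> List[Frame]:
    """Concatenate 13 static, 13 delta and 13 delta-delta values per frame."""
    d1 = delta(static)
    d2 = delta(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]


# Synthetic stand-in for 13 static MFCCs over 10 frames (not real pig audio):
# every coefficient rises by exactly 1.0 per frame.
static = [[float(t + k) for k in range(13)] for t in range(10)]
features = stack_39(static)
print(len(features), len(features[0]))  # 10 39
```

On this linear ramp the interior delta values equal the slope (1.0) and the interior delta-delta values are 0, which is a quick sanity check that the regression formula behaves as intended.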

Pig state audio recognition based on deep neural network and hidden Markov model
PENG Shuo, LIU Dongyang, SHI Guolong, LI Guangbo, MU Jingsheng, GU Lichuan, JIAO Jun. Pig state audio recognition based on deep neural network and hidden Markov model[J]. Journal of China Agricultural University, 2022, 27(6): 172-181.
Authors: PENG Shuo  LIU Dongyang  SHI Guolong  LI Guangbo  MU Jingsheng  GU Lichuan  JIAO Jun
Institution: College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China; Mengcheng Jinghui-Meng Agricultural Science and Technology Development Co., Ltd., Bozhou 233524, China
Abstract: To address the low recognition rate of traditional audio recognition methods on pig audio signals, deep neural network and hidden Markov model theory were used as the basis for pig audio signal recognition. The eating, estrus, howling and humming sounds of Landrace pigs and the panting sound of sick Landrace pigs were used as recognition targets. A Kalman filter and an improved EMD-TEO cepstral distance endpoint detection algorithm were adopted to preprocess the pig audio signals, and 39-dimensional Mel-frequency cepstral coefficients (MFCC) were extracted as the data set for network learning and recognition. A pig state audio recognition model based on a deep neural network and a hidden Markov model was then constructed. The experimental results showed that: 1) With the deep neural network-hidden Markov model (DNN-HMM) configured with five HMM hidden states and three DNN hidden layers of 128 nodes each, the recognition rates for the eating, howling, humming and estrus sounds and for the panting sound of sick pigs were 70%, 95%, 75%, 80% and 95%, respectively, and the overall recognition rate was 83%; 2) Compared with the traditional Gaussian mixture model-hidden Markov model (GMM-HMM), DNN-HMM improved the recognition rates of the corresponding sounds by 5%, 5%, 15%, 30% and 30%, respectively, and the overall recognition rate by 17%; 3) The DNN-HMM model showed a good recognition effect on all 5 types of pig audio signals. The DNN-HMM pig audio recognition model therefore recognizes pig audio in different states with higher accuracy and greater reliability.
Keywords: pig  MFCC  Kalman filter  DNN-HMM  identification  audio signal
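
The figures reported in the abstract can be cross-checked with simple arithmetic. The sketch below rests on two assumptions that the abstract does not state: that each class contributed the same number of test samples, so the overall rate is the plain mean of the per-class rates, and that the reported improvements are percentage-point differences. The GMM-HMM per-class values it derives are implied by those assumptions and are not reported in the source.

```python
# Per-class DNN-HMM recognition rates (%) in the order given in the abstract.
classes = ["eating", "howling", "humming", "estrus", "panting (sick)"]
dnn_hmm = [70, 95, 75, 80, 95]

# Reported improvements over GMM-HMM, read here as percentage points
# (an assumption).
gain = [5, 5, 15, 30, 30]

# Implied GMM-HMM rates: derived values, not figures from the paper.
gmm_hmm = [d - g for d, g in zip(dnn_hmm, gain)]

# Overall rates as unweighted means (assumes equal samples per class).
overall_dnn = sum(dnn_hmm) / len(dnn_hmm)
overall_gmm = sum(gmm_hmm) / len(gmm_hmm)

for name, g, d in zip(classes, gmm_hmm, dnn_hmm):
    print(f"{name:15s} GMM-HMM {g:3d}%  DNN-HMM {d:3d}%")
print(overall_dnn, overall_gmm, overall_dnn - overall_gmm)  # 83.0 66.0 17.0
```

Under these assumptions the unweighted mean of the DNN-HMM rates reproduces the reported 83% overall rate, and the mean gain reproduces the reported 17% overall improvement, so the abstract's figures are internally consistent.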