DNN-HMM based acoustic model for continuous pig cough sound recognition

Jian Zhao; Xuan Li; Wanghong Liu; Yun Gao; Minggang Lei; Hequn Tan; Di Yang

doi:10.25165/ijabe.v13i3.4530

Abstract

To detect respiratory disease in pigs at an early stage from their cough sounds, a novel method based on the Deep Neural Network-Hidden Markov Model (DNN-HMM) was proposed to construct an acoustic model for continuous pig cough sound recognition. Noise in the continuous pig sounds was removed by the Wiener algorithm based on wavelet thresholding of the multitaper spectrum, and the experimental corpus was built from the denoised continuous pig sounds. The 39-dimensional Mel Frequency Cepstral Coefficients (MFCC) extracted from the corpus were used as feature vectors. Sounds in pig farms were divided into pig coughs, non-pig coughs, and silence segments. In the HMM, the numbers of hidden states for pig cough, non-pig cough, and silence segments were 5, 5, and 3, respectively, and the observations were the feature vectors of the continuous pig sound signal. Based on experiments and empirical theory, a DNN with 3 hidden layers and 100 nodes per layer was used to model the correspondence between hidden states and observation sequences. The number of context frames of the DNN input was set to 5 on the basis of experiments. Under the optimal parameter settings, the traditional Gaussian Mixture Model-Hidden Markov Model (GMM-HMM) acoustic model was compared with DNN-HMM in a 5-fold cross-validation experiment. The Word Error Rate (WER) of every group was lower with DNN-HMM than with GMM-HMM, and the average WER was 3.45% lower. The best DNN-HMM result was a WER of 7.54%, and the average WER was 8.03%. The results showed that the DNN-HMM based acoustic model for continuous pig cough sound recognition was stable and reliable.
Keywords: DNN-HMM, acoustic model, continuous pig coughs, recognition, pig industry
DOI: 10.25165/j.ijabe.20201303.4530

Citation: Zhao J, Li X, Liu W H, Gao Y, Lei M G, Tan H Q, et al. DNN-HMM based acoustic model for continuous pig cough sound recognition. Int J Agric & Biol Eng, 2020; 13(3): 186–193.
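
The abstract reports results as Word Error Rate (WER). As a minimal sketch of how such a figure is conventionally computed, the snippet below takes the edit distance (substitutions, deletions, and insertions) between a reference label sequence and a recognized one, and divides it by the reference length. The label names `sil`, `cough`, and `non_cough` and the example sequences are hypothetical illustrations, not data or code from the paper, and the authors' scoring tool may differ in detail.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / len(reference)."""
    if not reference:
        raise ValueError("reference must contain at least one label")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j labels of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_label in enumerate(reference, start=1):
        curr = [i]  # deleting i reference labels to match an empty hypothesis
        for j, hyp_label in enumerate(hypothesis, start=1):
            cost = 0 if ref_label == hyp_label else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


# Hypothetical label sequences (assumed names, not from the paper's corpus).
reference = ["sil", "cough", "non_cough", "cough", "sil"]
hypothesis = ["sil", "cough", "cough", "sil"]
print(f"WER = {word_error_rate(reference, hypothesis):.2%}")
```

In this example the recognizer misses one of the five reference labels, so the sketch counts a single deletion. Because insertions are also counted, WER can exceed 100% when the hypothesis is much longer than the reference.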

This work is licensed under a Creative Commons Attribution 4.0 International License.

2023-2026 Copyright IJABE Editing and Publishing Office
