Author: Safra
Date of Publication: 30th November 2021
Abstract: Speech separation is the problem of segregating a target voice from background interference, which may include non-speech noise, interfering speech, or both, as well as room reverberation. Traditionally, speech separation has been treated as a signal processing problem, but recent research formulates it as a supervised learning problem based on deep neural networks (DNNs), in which discriminative patterns of speech, speakers, and noise are learned from training data. This work summarizes an analysis of supervised speech separation based on deep learning and compares the results with the least mean square (LMS) algorithm. The adaptive noise cancellation strategy is robust to noise sources that move spatially. The signal-to-noise ratio of the output signal is improved by applying adaptive filtering, which exploits the correlation properties of the signals. This research focuses on distinguishing speech from reverberation using DNN-based deep learning. The LMS adaptive filter is a digital filter composed of a tapped delay line and adjustable weights, with an adaptive algorithm controlling the impulse response. The DNN model improves speech quality and significantly improves system stability. Research on speech recognition employs a variety of techniques that seek to improve accuracy, one of which is deep learning, but high-dimensional data remains one of the problems that reduce speech recognition accuracy.
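To make the LMS structure described in the abstract concrete, the following Python sketch implements a basic adaptive noise canceller built from a tapped delay line and adjustable weights, with the LMS update acting as the adaptive algorithm that shapes the impulse response. It is a minimal illustration under assumed settings (the function name lms_noise_canceller, the filter length, the step size, and the toy signals are all illustrative choices, not the implementation evaluated in this work).

```python
# Minimal LMS adaptive noise cancellation sketch (illustrative only).
# Assumes a reference noise input that is correlated with the noise
# corrupting the primary (speech + noise) signal.
import numpy as np

def lms_noise_canceller(primary, reference, num_taps=32, mu=0.01):
    """Suppress correlated noise in `primary` using a reference noise signal.

    primary   : speech + noise captured by the main sensor
    reference : noise-only signal captured by a secondary sensor
    num_taps  : length of the tapped delay line (filter order)
    mu        : LMS step size
    Returns the error signal e[n], which approximates the clean speech.
    """
    weights = np.zeros(num_taps)        # adjustable filter weights
    delay_line = np.zeros(num_taps)     # tapped delay line of reference samples
    output = np.zeros(len(primary))

    for n in range(len(primary)):
        # Shift the newest reference sample into the delay line.
        delay_line[1:] = delay_line[:-1]
        delay_line[0] = reference[n]

        y = np.dot(weights, delay_line)       # filter output: estimated noise
        e = primary[n] - y                    # error: primary minus noise estimate
        weights += 2 * mu * e * delay_line    # LMS weight update (steepest descent)
        output[n] = e                         # error signal approximates clean speech

    return output

# Toy usage: a sinusoidal "speech" signal corrupted by filtered white noise.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
speech = np.sin(2 * np.pi * 440 * t)
noise = rng.standard_normal(len(t))
corrupting_noise = np.convolve(noise, [0.6, 0.3, 0.1], mode="same")
enhanced = lms_noise_canceller(speech + corrupting_noise, noise)
```

The step size mu trades convergence speed against stability and misadjustment; because the filter adapts continuously, the canceller can track noise sources whose spatial position (and hence the noise path) changes over time, which is the robustness property noted in the abstract.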