
Saturday, 9 December 2017

Localizing Speakers in Multiple Rooms by Using Deep Neural Networks


Publication date: Available online 7 December 2017
Source: Computer Speech & Language
Author(s): Fabio Vesperini, Paolo Vecchiotti, Emanuele Principi, Stefano Squartini, Francesco Piazza
Source localization algorithms play a fundamental role in human speech capturing systems. In this paper, a Speaker Localization algorithm (SLOC) based on Deep Neural Networks (DNN) is evaluated and compared with state-of-the-art approaches. The DNN directly determines the speaker position in the room under analysis, which makes the proposed algorithm fully data-driven. Two neural network architectures are investigated: the Multi Layer Perceptron (MLP) and the Convolutional Neural Network (CNN). GCC-PHAT (Generalized Cross Correlation-PHAse Transform) Patterns, computed from the audio signals captured by the microphones, are used as input features for the DNN. The paper addresses a multi-room case study, in which the acoustic scene of each room is influenced by sounds emitted in the other rooms. The algorithm is tested on the home-recorded DIRHA dataset, which provides signals from multiple wall and ceiling microphones in each room. Specifically, the focus is on the speaker localization task in two distinct neighbouring rooms. For comparison, two algorithms proposed in the literature for this application context are evaluated: the Crosspower Spectrum Phase Speaker Localization (CSP-SLOC) and the Steered Response Power using the Phase Transform speaker localization (SRP-SLOC). Besides providing an extensive analysis of the proposed method, the article shows that the DNN-based algorithm significantly outperforms the state-of-the-art approaches evaluated on the DIRHA dataset, with an average localization error, expressed as Root Mean Square Error (RMSE), of 324 mm on the Simulated subset and 367 mm on the Real subset.
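The abstract does not say how the GCC-PHAT Patterns are computed, so the following is only a minimal sketch of the standard textbook GCC-PHAT between two microphone signals, not the authors' feature pipeline. The function name `gcc_phat`, the `max_lag` parameter, and the small regularization constant are assumptions made for illustration.

```python
import numpy as np

def gcc_phat(x, y, max_lag):
    """Textbook GCC-PHAT between two equal-rate signals (illustrative sketch).

    Returns the lags in samples (-max_lag..+max_lag) and the phase-weighted
    cross-correlation at those lags. A peak at a positive lag means that
    x is delayed with respect to y. max_lag must be at least 1.
    """
    n = len(x) + len(y)                # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)             # cross-power spectrum
    cross /= np.abs(cross) + 1e-12     # PHAT weighting: keep phase only (assumed epsilon)
    cc = np.fft.irfft(cross, n=n)
    # Negative lags are stored at the end of the inverse FFT output.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    lags = np.arange(-max_lag, max_lag + 1)
    return lags, cc

# Usage: white noise delayed by 5 samples at the first microphone.
rng = np.random.default_rng(0)
s = rng.standard_normal(1024)
delayed = np.concatenate((np.zeros(5), s[:-5]))
lags, cc = gcc_phat(delayed, s, max_lag=20)
estimated_delay = lags[np.argmax(cc)]
```

In a setup like the one the abstract describes, one such correlation vector per microphone pair would presumably be stacked into the input pattern for the network; the exact arrangement is not given here.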



