Deep Learning Based Two-dimensional Speaker Localization With Large Ad-hoc Microphone Arrays

10/19/2022
by   Shupei Liu, et al.

Deep-learning-based speaker localization has shown advantages in reverberant scenarios. However, most existing work addresses only the direction-of-arrival (DOA) estimation subtask, which yields the DOA of a speaker rather than its 2-dimensional (2D) coordinates. To obtain the 2D coordinates of multiple speakers at random positions, this paper proposes a deep-learning-based 2D speaker localization method with large ad-hoc microphone arrays, where an ad-hoc microphone array is a set of randomly distributed microphone nodes, each of which is a traditional microphone array, e.g. a linear array. Specifically, a convolutional neural network is applied at each node to estimate the DOAs of the speech sources. Then, a triangulation and clustering method integrates the DOA estimates of the nodes to estimate the 2D positions of the speech sources. To further improve the estimation accuracy, we propose a softmax-based node selection algorithm. Experimental results with large-scale ad-hoc microphone arrays show that the proposed method achieves significantly better performance than conventional methods in both simulated and real-world environments, and that the softmax-based node selection improves the performance further.
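
The pipeline described above (per-node DOA estimation, node selection, triangulation) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: a single source, DOAs already expressed as angles in a shared global frame, and node confidence taken as the peak of each node's softmax output. The names `select_nodes` and `triangulate` are hypothetical, and the paper's clustering step is replaced here by a plain least-squares intersection of the bearing lines.

```python
import numpy as np

def select_nodes(softmax_outputs, k):
    """Pick the k nodes whose DOA posterior has the highest peak.

    softmax_outputs: (num_nodes, num_doa_classes) array of per-node
    softmax probabilities (assumed confidence measure, not from the paper).
    """
    confidence = softmax_outputs.max(axis=1)
    return np.argsort(confidence)[::-1][:k]

def triangulate(positions, doas_rad):
    """Least-squares intersection of bearing lines in 2D.

    positions: (num_nodes, 2) node coordinates.
    doas_rad:  (num_nodes,) bearing angles in a shared global frame.
    Each bearing line satisfies n_i . (x - p_i) = 0, where n_i is the
    unit normal to the bearing direction (cos(theta), sin(theta)).
    """
    normals = np.stack([-np.sin(doas_rad), np.cos(doas_rad)], axis=1)
    offsets = np.sum(normals * positions, axis=1)
    xy, *_ = np.linalg.lstsq(normals, offsets, rcond=None)
    return xy
```

In use, one would call `select_nodes` on the stacked softmax outputs, then pass the positions and DOA estimates of the selected nodes to `triangulate`. With multiple speakers, pairwise intersections would have to be grouped per speaker first, which is the role of the clustering step mentioned in the abstract.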
