Publications

Conference (International) Sound Source Localization with Majorization Minimization

Masahito Togami, Robin Scheibler

The 22nd Annual Conference of the International Speech Communication Association (INTERSPEECH 2021)

2021.8.30

We propose a sound source localization technique that estimates a speech source location without precise grid searching. The source location is estimated through parameter optimization that minimizes the steered-response power (SRP) function under the near-field assumption. Because minimizing the SRP function has no closed-form solution, we introduce an auxiliary function of the SRP function based on the majorization-minimization (MM) algorithm. Parameters are updated iteratively to minimize the auxiliary function by alternating between time-difference-of-arrival (TDOA) estimation and range-difference (RD) based localization. When TDOA estimation and RD-based localization are performed in cascade, the accuracy of the estimated source location depends strongly on the accuracy of the estimated TDOA. In contrast, the proposed method corrects the estimated TDOA by referring to the source location estimated in the previous iteration. The proposed method is therefore expected to be robust against the TDOA estimation errors that occur in reverberant environments. Experimental results show that the proposed method outperforms conventional techniques in a reverberant environment.
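To make the link between TDOA and range difference in the abstract concrete, the following is a minimal sketch of the second stage of the cascade baseline only: recovering a 2-D source position from noise-free near-field TDOAs. It is not the authors' MM algorithm. The Gauss-Newton solver, the function names `tdoas` and `localize_from_tdoa`, the square four-microphone layout, and the 343 m/s speed of sound are all assumptions chosen for illustration and do not come from the paper.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the paper


def tdoas(source, mics):
    """Near-field TDOA of each microphone relative to mics[0], in seconds."""
    d0 = math.dist(source, mics[0])
    return [(math.dist(source, m) - d0) / SPEED_OF_SOUND for m in mics[1:]]


def localize_from_tdoa(taus, mics, init, iters=20):
    """Fit a 2-D position to range differences (c * tau) with Gauss-Newton.

    Hypothetical helper: a generic least-squares solver standing in for
    an RD-based localization step.
    """
    x, y = init
    for _ in range(iters):
        d0 = math.dist((x, y), mics[0])
        u0x = (x - mics[0][0]) / d0
        u0y = (y - mics[0][1]) / d0
        a11 = a12 = a22 = b1 = b2 = 0.0
        for m, tau in zip(mics[1:], taus):
            di = math.dist((x, y), m)
            # Residual: modelled range difference minus measured one.
            r = (di - d0) - SPEED_OF_SOUND * tau
            # Gradient of the modelled range difference w.r.t. (x, y).
            jx = (x - m[0]) / di - u0x
            jy = (y - m[1]) / di - u0y
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            b1 += jx * r
            b2 += jy * r
        # Solve the 2x2 normal equations (J^T J) step = J^T r.
        det = a11 * a22 - a12 * a12
        x -= (a22 * b1 - a12 * b2) / det
        y -= (a11 * b2 - a12 * b1) / det
    return x, y


# Assumed geometry: four microphones on the corners of a 1 m square.
mics = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
true_source = (0.3, 0.6)
taus = tdoas(true_source, mics)
estimate = localize_from_tdoa(taus, mics, init=(0.5, 0.5))
print(estimate)
```

With exact TDOAs the estimate should match the true position closely. Perturbing an entry of `taus` before calling `localize_from_tdoa` gives a feel for how TDOA errors propagate into the position estimate in a cascade, which is the sensitivity the abstract says the proposed method is designed to reduce.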

Paper: Sound Source Localization with Majorization Minimization (external site)