Over-Determined Semi-Blind Speech Source Separation - LINE Yahoo Research and Development

Publications

Conference (International) Over-Determined Semi-Blind Speech Source Separation

Masahito Togami, Robin Scheibler

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2021 (APSIPA ASC 2021)

2021.12.14

We propose a semi-blind speech source separation method that jointly optimizes several acoustic functions, i.e., speech source separation (SS), dereverberation (DR), acoustic echo reduction (AE), and background noise reduction (BG). Instead of a cascade connection of SS, AE, and DR, the proposed method performs DR, SS, and AE with a unified time-invariant filter. We assume the over-determined condition, in which the number of microphones $N_m$ is larger than the number of near-end speech sources $N_s$. Joint optimization of DR, SS, AE, and BG can be performed by using the $(N_m - N_s)$-dimensional subspace of the time-invariant filter for BG. Furthermore, we reveal that this subspace can also be utilized for residual acoustic echo reduction (RR), in which the residual acoustic echo signal is reduced by spatial filtering. Two types of joint parameter optimization techniques for DR, SS, AE, BG, and RR are proposed, based on vectorwise coordinate descent and fast multichannel nonnegative matrix factorization. Experimental results show that the proposed methods perform DR, SS, AE, and BG better than a cascade method. When the acoustic echo path between a loudspeaker and a microphone is time-varying, the proposed methods with RR improve performance more than the proposed method without RR.
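To make the idea of a single unified filter more concrete, the following is a minimal sketch for one frequency bin and one time frame. It is not the paper's formulation: the function names (`stack_input`, `apply_unified_filter`), the tap and delay parameters, and the stacking order are all hypothetical, chosen only for illustration. It assumes the filter input stacks the current microphone frame (for SS), delayed microphone frames (for DR), and far-end reference frames (for AE), and that the first N_s outputs are source estimates while the remaining N_m - N_s outputs form the extra subspace. How the filter coefficients are estimated (vectorwise coordinate descent or fast multichannel nonnegative matrix factorization) is not shown.

```python
def stack_input(mics, ref, t, delay, dr_taps, ae_taps):
    """Build the stacked input vector for frame t (one frequency bin).

    mics: list of frames, each a list of n_mics complex values (hypothetical layout)
    ref:  list of far-end reference values, one per frame
    """
    n_mics = len(mics[0])
    stacked = list(mics[t])                    # current frame (separation part)
    for k in range(dr_taps):                   # delayed frames (dereverberation part)
        idx = t - delay - k
        stacked += mics[idx] if idx >= 0 else [0j] * n_mics
    for k in range(ae_taps):                   # reference frames (echo reduction part)
        idx = t - k
        stacked.append(ref[idx] if idx >= 0 else 0j)
    return stacked


def apply_unified_filter(W, stacked, n_src):
    """Apply one filter matrix W (n_mics rows) and split its outputs.

    Returns (source_estimates, extra_subspace), where the extra subspace has
    n_mics - n_src entries (assumed here to carry noise / residual echo).
    """
    out = [sum(w * x for w, x in zip(row, stacked)) for row in W]
    return out[:n_src], out[n_src:]


# Toy usage: 3 microphones, 2 near-end sources, 3 frames.
mics = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ref = [10, 20, 30]
x = stack_input(mics, ref, t=2, delay=1, dr_taps=1, ae_taps=2)
# Arbitrary filter: pass each current-frame channel through unchanged.
W = [[1 if j == i else 0 for j in range(len(x))] for i in range(3)]
sources, extra = apply_unified_filter(W, x, n_src=2)
```

With N_m = 3 and N_s = 2, the stacked vector has 3 + 3·1 + 2 = 8 entries, and the filter output splits into two source estimates and one extra channel, matching the N_m - N_s dimension mentioned in the abstract.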

Speech Processing

Paper: Over-Determined Semi-Blind Speech Source Separation