Simultaneous optimization of forgetting factor and time-frequency mask for Block Online Multi-Channel Speech enhancement - LINE Yahoo Research & Development

Publications

Conference (International) Simultaneous optimization of forgetting factor and time-frequency mask for Block Online Multi-Channel Speech enhancement

Masahito Togami

2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019)

2019.5.12

In this paper, we propose a block-online multi-channel speech enhancement technique that simultaneously optimizes the time-frequency masks and the forgetting factors used to estimate the multi-channel covariance matrices of the desired speech signal and the noise signal, so as to maximize speech enhancement performance when the acoustic environment changes. The proposed method reduces the noise signal with a multi-channel Wiener filter (MWF) computed from the covariance matrices, which are in turn estimated using the forgetting factors and time-frequency masks output by the proposed neural network. All parameters of the neural network are learned so as to maximize speech enhancement performance. Three types of input features for forgetting-factor adaptation are proposed. The first is the magnitude spectrum of the microphone input signal. The second is the output of the MWF adapted in the previous block. The third is the inner product between the microphone input signal and the covariance matrices estimated in the previous block. Experimental results show that the proposed method reduces the noise signal more accurately than conventional equally weighted sample averaging.
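The abstract does not give the update equations, so the following is only a minimal sketch of the general idea, under stated assumptions: a covariance matrix for one frequency bin is updated block by block with a common exponential-forgetting rule, the frames in each block are weighted by a time-frequency mask, and an MWF is then formed from the speech and noise covariances. The function names (`update_covariance`, `mwf_filter`), the specific update rule, and the regularization term `eps` are illustrative assumptions and are not taken from the paper. In the paper the masks and forgetting factors are outputs of a neural network; here they are plain inputs.

```python
import numpy as np

def update_covariance(R_prev, X, mask, alpha):
    """Assumed block update of a spatial covariance matrix for one frequency bin.

    R_prev: (M, M) covariance estimate from the previous block
    X:      (T, M) complex STFT frames of the current block (M microphones)
    mask:   (T,) time-frequency mask values in [0, 1] for this bin
    alpha:  forgetting factor in [0, 1]; larger values keep more of the past
    """
    # Mask-weighted average of the outer products x_t x_t^H over the block.
    R_block = np.einsum("t,tm,tn->mn", mask, X, X.conj()) / max(mask.sum(), 1e-8)
    # Exponential forgetting (an assumed, commonly used form).
    return alpha * R_prev + (1.0 - alpha) * R_block

def mwf_filter(R_speech, R_noise, ref=0, eps=1e-6):
    """Multi-channel Wiener filter w = (R_s + R_n)^{-1} R_s e_ref.

    eps adds a small diagonal loading for numerical stability (an assumption).
    """
    M = R_speech.shape[0]
    R_mix = R_speech + R_noise + eps * np.eye(M)
    W = np.linalg.solve(R_mix, R_speech)  # W = R_mix^{-1} R_speech
    return W[:, ref]                      # column for the reference microphone

def apply_filter(w, X):
    """Apply y_t = w^H x_t to every frame of X with shape (T, M)."""
    return X @ w.conj()
```

In a block-online loop, `update_covariance` would be called once per block for the speech covariance (with the speech mask) and once for the noise covariance (with the complementary mask), each with its own forgetting factor, before `mwf_filter` is recomputed. Setting `alpha` close to 1 changes the estimate slowly, while a small `alpha` lets it track environmental changes quickly.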

Speech Processing

Paper: Simultaneous optimization of forgetting factor and time-frequency mask for Block Online Multi-Channel Speech enhancement