Publications

Conference (International) Multichannel Separation and Classification of Sound Events

Robin Scheibler, Tatsuya Komatsu, Masahito Togami

29th European Signal Processing Conference (EUSIPCO 2021)

2021.8.23

We investigate the use of determined blind source separation for sound event detection (SED) and classification using multichannel recordings. Our proposed system appends a single-channel SED model to each of the output channels of the separation algorithm. We expect the number of events per channel to be reduced and overall performance to increase. Such a system allows a different number of channels at training and test time. We demonstrate the performance on the DCASE 2020 Sound Event Localization and Detection dataset. We compare baseline training on single-channel recordings to training on different combinations of 1-, 2-, 3-, or 4-channel recordings. For the separation, we compare a traditional source prior and a neural prior trained without ground-truth signals. First, we show that with the former, performance increases when using more channels. In addition, mixing different numbers of channels during training yields a system that is robust across varying numbers of channels. Second, while the training scheme proposed for the tailored source prior is not found effective for separation, it seems to be effective for data augmentation. This indicates that multichannel training data is beneficial, even when the target systems are single-channel.
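The structure described in the abstract (separate first, then run one single-channel SED model on every separated output) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `detect_events`, `identity_separation`, and `energy_sed` are hypothetical, the separation and SED steps are toy stand-ins, and merging the per-channel predictions with a maximum is an assumption made only for illustration.

```python
import numpy as np


def detect_events(mixture, separate, sed_model):
    """Run a single-channel SED model on each separated channel.

    mixture:   array of shape (n_channels, n_samples)
    separate:  callable returning an array of the same shape
               (determined separation: as many outputs as inputs)
    sed_model: callable mapping a 1-D signal to an array of shape
               (n_frames, n_classes) with event probabilities
    """
    sources = separate(mixture)
    # The same single-channel model is applied to every output channel,
    # so the channel count may differ between training and test time.
    per_channel = np.stack([sed_model(s) for s in sources])
    # Assumption (not from the paper): merge channels with a maximum.
    return per_channel.max(axis=0)


def identity_separation(x):
    """Toy stand-in for a blind source separation algorithm."""
    return x


def energy_sed(signal, frame=4, n_classes=2):
    """Toy stand-in for an SED model: squashed frame energy per class."""
    usable = len(signal) // frame * frame
    frames = signal[:usable].reshape(-1, frame)
    energy = (frames ** 2).mean(axis=1)
    prob = energy / (1.0 + energy)
    return np.repeat(prob[:, None], n_classes, axis=1)
```

Because the SED model only ever sees one channel, calling `detect_events` with a 1-, 2-, 3-, or 4-channel mixture returns an array of the same shape, (n_frames, n_classes), in every case.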

Paper: Multichannel Separation and Classification of Sound Events (external site)