Multichannel Separation and Classification of Sound Events - LY Corporation R&D


CONFERENCE (INTERNATIONAL) Multichannel Separation and Classification of Sound Events

Robin Scheibler, Tatsuya Komatsu, Masahito Togami

29th European Signal Processing Conference (EUSIPCO 2021)

August 23, 2021

We investigate the use of determined blind source separation for sound event detection (SED) and classification using multichannel recordings. Our proposed system appends a single-channel SED model to each of the output channels of the separation algorithm. We expect separation to reduce the number of events per channel and thereby improve overall performance. Such a system allows the number of channels to differ between training and test time. We demonstrate the performance on the DCASE 2020 Sound Event Localization and Detection dataset. We compare baseline training on single-channel recordings to training on different combinations of 1-, 2-, 3-, or 4-channel recordings. For the separation, we compare a traditional source prior and a neural prior trained without ground-truth signals. First, we show that with the traditional prior, performance increases as more channels are used. In addition, mixing different numbers of channels during training yields a system that is robust to a varying number of channels. Second, while the training scheme proposed for the tailored neural source prior is not found to be effective for separation, it appears to be effective for data augmentation. This indicates that multichannel training data is beneficial even when the target system is single-channel.
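The structure described in the abstract (separation followed by one shared single-channel SED model per output channel) can be illustrated with a minimal sketch. This is not the authors' implementation: `detect_events`, `identity_separation`, and `energy_sed` are hypothetical names, the separation step is a pass-through placeholder rather than a determined blind source separation algorithm, and combining the per-channel outputs with a maximum over channels is an assumption that the abstract does not specify.

```python
import numpy as np


def detect_events(mix, separate, sed_model):
    """Apply a shared single-channel SED model to each separated channel.

    mix: array of shape (n_channels, n_samples).
    Returns an array of shape (n_frames, n_classes).
    """
    separated = separate(mix)  # (n_channels, n_samples)
    per_channel = np.stack([sed_model(ch) for ch in separated])
    # Assumption: merge channels by taking the maximum score per frame and class.
    return per_channel.max(axis=0)


def identity_separation(mix):
    """Hypothetical placeholder: a real system would run determined BSS here."""
    return mix


def energy_sed(channel, frame_len=4, n_classes=2):
    """Hypothetical toy 'SED model': maps frame energy to a score in [0, 1)."""
    n_frames = len(channel) // frame_len
    frames = channel[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = (frames ** 2).mean(axis=1)
    score = energy / (1.0 + energy)
    return np.repeat(score[:, None], n_classes, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # The same model handles any number of input channels.
    for n_channels in (1, 2, 4):
        mix = rng.standard_normal((n_channels, 16))
        out = detect_events(mix, identity_separation, energy_sed)
        print(n_channels, out.shape)
```

Because the SED model only ever sees one channel at a time, the same weights can be used with 1, 2, 3, or 4 input channels, which is the property that lets the channel count differ between training and test time.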

Speech Processing

Paper: Multichannel Separation and Classification of Sound Events