Investigating Factors Related to the Naturalness of Synthesized Unison Singing - LY Corporation R&D

Conference (International): Investigating Factors Related to the Naturalness of Synthesized Unison Singing

Kaito Nishizawa (Nagoya University), Ryuichi Yamamoto, Wen-Chin Huang (Nagoya University), Tomoki Toda (Nagoya University)

2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2025)

2025.4.6

Singing voice synthesis (SVS) technology has progressed rapidly in recent years. However, vocal ensemble synthesis has not yet been widely explored. In this work, we focus on unison singing, in which several singers sing the same melody together. Our goal is to understand which acoustic properties affect the naturalness of synthesized unison singing. We use NNSVS, an SVS toolkit that allows us to manipulate individual acoustic features, including timing, f0, and spectral features, in a fully data-driven manner, to investigate their effect on unison singing synthesis. Listening tests showed that fluctuation in timing and f0 is an important factor in synthesizing natural unison singing. Furthermore, we found that unison singing can potentially be generated using an SVS model trained only on a single-singer dataset.
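
To make the idea of per-voice timing and f0 fluctuation concrete, the following is a toy sketch only. It does not use NNSVS and is not the method of the paper: each "singer" is a plain sine tone, and the function names (`cents_to_ratio`, `render_voice`, `mix_unison`) and all parameter values (onset range, detune spread, sample rate) are hypothetical choices made for illustration.

```python
import math
import random


def cents_to_ratio(cents):
    """Convert a pitch offset in cents to a frequency ratio."""
    return 2.0 ** (cents / 1200.0)


def render_voice(f0_hz, duration_s, sample_rate, onset_s, detune_cents):
    """Render one sine 'voice' with a delayed onset and a detuned f0.

    Assumption: a sine tone stands in for a synthesized singing voice.
    """
    n_total = int(round(duration_s * sample_rate))
    n_onset = int(round(onset_s * sample_rate))
    freq = f0_hz * cents_to_ratio(detune_cents)
    samples = [0.0] * n_total  # silence until the voice enters
    for n in range(n_onset, n_total):
        t = (n - n_onset) / sample_rate
        samples[n] = math.sin(2.0 * math.pi * freq * t)
    return samples


def mix_unison(f0_hz, n_voices, duration_s=0.5, sample_rate=8000,
               max_onset_s=0.03, detune_std_cents=10.0, seed=0):
    """Average n_voices voices, each with its own random timing and f0 offset.

    Assumption: the offset distributions and their sizes are arbitrary
    illustrative values, not figures taken from the paper.
    """
    rng = random.Random(seed)
    n_total = int(round(duration_s * sample_rate))
    mix = [0.0] * n_total
    for _ in range(n_voices):
        onset = rng.uniform(0.0, max_onset_s)       # timing fluctuation
        detune = rng.gauss(0.0, detune_std_cents)   # f0 fluctuation
        voice = render_voice(f0_hz, duration_s, sample_rate, onset, detune)
        for i, s in enumerate(voice):
            mix[i] += s / n_voices
    return mix
```

Calling `mix_unison(220.0, 4)` returns a list of samples for four slightly misaligned, slightly detuned voices; setting `max_onset_s=0.0` and `detune_std_cents=0.0` collapses them into identical copies of one voice, which is the kind of contrast the listening tests in the abstract are about.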

Speech processing

Paper: Investigating Factors Related to the Naturalness of Synthesized Unison Singing