News

Eight papers accepted at ICASSP 2024

2023.12.20

The following papers have been accepted at the [International Conference on Acoustics, Speech, and Signal Processing (ICASSP)](external link).

CROSS-MODAL MULTI-TASKING FOR SPEECH-TO-TEXT TRANSLATION VIA HARD PARAMETER SHARING
Brian Yan (Carnegie Mellon University), Xuankai Chang, Antonios Anastasopoulos, Yuya Fujita, Shinji Watanabe

EXPLORING SPEECH RECOGNITION, TRANSLATION, AND UNDERSTANDING WITH DISCRETE SPEECH UNITS: A COMPARATIVE STUDY
Xuankai Chang (Carnegie Mellon University), Yuya Fujita, Takashi Maekaku, et al.

KEEP DECODING PARALLEL WITH EFFECTIVE KNOWLEDGE DISTILLATION FROM LANGUAGE MODELS TO END-TO-END SPEECH RECOGNISERS
Michael Hentschel (Works Mobile Japan), Yuta Nishikawa (NAIST), Tatsuya Komatsu, Yusuke Fujita

Electrolaryngeal Speech Intelligibility Enhancement Through Robust Linguistic Encoders
Lester Phillip Violeta (Nagoya University), Wen-Chin Huang, Ding Ma, Ryuichi Yamamoto, Kazuhiro Kobayashi, Tomoki Toda

Enhancing Multilingual TTS with Voice Conversion based Data Augmentation and an Optimized Centroid
Hyun-Wook Yoon (NAVER), Jin-Seob Kim, Ryuichi Yamamoto, Ryo Terashima, Chan-Ho Song, Jae-Min Kim, Eunwoo Song

PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions
Reo Shimizu (Tohoku University, LINE), Ryuichi Yamamoto, Masaya Kawamura, Yuma Shirahata, Hironori Doi, Tatsuya Komatsu, Kentaro Tachibana

Audio Difference Learning for Audio Captioning
Tatsuya Komatsu, Yusuke Fujita, Kazuya Takeda, Tomoki Toda

HUBERTOPIC: ENHANCING SEMANTIC REPRESENTATION OF HUBERT THROUGH SELF-SUPERVISION UTILIZING TOPIC MODEL
Takashi Maekaku, Jiatong Shi (Carnegie Mellon University), Xuankai Chang (Carnegie Mellon University), Yuya Fujita, Shinji Watanabe (Carnegie Mellon University)