News
Eight papers accepted at INTERSPEECH 2025
2025.5.23
The following papers have been accepted at INTERSPEECH 2025 (external site):
BitTTS: Highly Compact Text-to-Speech Using 1.58-bit Quantization and Weight Indexing
Masaya Kawamura, Takuya Hasumi, Yuma Shirahata, Ryuichi Yamamoto
Grapheme-Coherent Phonemic and Prosodic Annotation of Speech by Implicit and Explicit Grapheme Conditioning
Hien Ohnaka, Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto
SLASH: Self-Supervised Speech Pitch Estimation Leveraging DSP-derived Absolute Pitch
Ryo Terashima, Yuma Shirahata, Masaya Kawamura
DnR-nonverbal: Cinematic Audio Source Separation Dataset Containing Non-Verbal Sounds
Takuya Hasumi, Yusuke Fujita
Language-Guided Contrastive Audio-Visual Masked Autoencoder with Automatically Generated Audio-Visual-Text Triplets from Videos
Yuchi Ishikawa, Shota Nakada, Hokuto Munakata, Kazuhiro Saito, Tatsuya Komatsu, Yoshimitsu Aoki (Keio University)
Audio-Text Contrastive Learning with Audio-Composed Text Features
Tatsuya Komatsu, Hokuto Munakata, Yuchi Ishikawa
Comparative Analysis of Fast and High-Fidelity Neural Vocoders for Low-Latency Streaming Synthesis in Resource-Constrained Environments
Reo Yoneyama (Nagoya University), Masaya Kawamura, Ryo Terashima, Ryuichi Yamamoto, Tomoki Toda (Nagoya University)
OpusLM: A Family of Open Unified Speech Language Models
Jinchuan Tian (CMU), William Chen (CMU), Yifan Peng (CMU), Jiatong Shi (CMU), Siddhant Arora (CMU), Shikhar Bharadwaj (CMU), Takashi Maekaku, Yusuke Shinohara, Keita Goto, Xiang Yue (CMU), Chao-Han Huck Yang (NVIDIA), Shinji Watanabe (CMU)