Speech Processing
-
- CONFERENCE (DOMESTIC)
- A Comparison of Alignment Methods for Emotional Speech Synthesis
- 蓮実 拓也, 白旗 悠真, Welly Naptali, 山本 龍一, Eunwoo Song (NAVER Cloud), 橘 健太郎, Jae-Min Kim (NAVER Cloud)
- Acoustical Society of Japan 2024 Autumn Meeting (ASJ 2024 autumn)
- September 04, 2024
-
- CONFERENCE (DOMESTIC)
- A Study of Domain Adaptation in Discrete-Token Speech Recognition
- 石井 敬章, 小松 達也, 藤田 雄介, 藤田 悠哉
- Acoustical Society of Japan 2024 Autumn Meeting (ASJ 2024 autumn)
- September 04, 2024
-
- CONFERENCE (INTERNATIONAL)
- LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
- Masaya Kawamura, Ryuichi Yamamoto, Yuma Shirahata, Takuya Hasumi, Kentaro Tachibana
- The 25th Annual Conference of the International Speech Communication Association (INTERSPEECH 2024)
- September 03, 2024
-
- CONFERENCE (INTERNATIONAL)
- Song Data Cleansing for End-to-End Neural Singer Diarization Using Neural Analysis and Synthesis Framework
- Hokuto Munakata, Ryo Terashima, Yusuke Fujita
- The 25th Annual Conference of the International Speech Communication Association (INTERSPEECH 2024)
- September 03, 2024
-
- CONFERENCE (INTERNATIONAL)
- Audio Fingerprinting with Holographic Reduced Representations
- Yusuke Fujita, Tatsuya Komatsu
- The 25th Annual Conference of the International Speech Communication Association (INTERSPEECH 2024)
- September 01, 2024
-
- CONFERENCE (INTERNATIONAL)
- Noise-Robust Voice Conversion by Conditional Denoising Training Using Latent Variables of Recording Quality and Environment
- Takuto Igarashi (The University of Tokyo), Yuki Saito (The University of Tokyo), Kentaro Seki (The University of Tokyo), Shinnosuke Takamichi (The University of Tokyo), Ryuichi Yamamoto, Kentaro Tachibana, Hiroshi Saruwatari (The University of Tokyo)
- The 25th Annual Conference of the International Speech Communication Association (INTERSPEECH 2024)
- September 01, 2024
-
- CONFERENCE (INTERNATIONAL)
- SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark
- Yuki Saito (The University of Tokyo), Takuto Igarashi (The University of Tokyo), Kentaro Seki (The University of Tokyo), Shinnosuke Takamichi (The University of Tokyo), Ryuichi Yamamoto, Kentaro Tachibana, Hiroshi Saruwatari (The University of Tokyo)
- The 25th Annual Conference of the International Speech Communication Association (INTERSPEECH 2024)
- September 01, 2024
-
- CONFERENCE (INTERNATIONAL)
- Universal Score-based Speech Enhancement with High Content Preservation
- Robin Scheibler, Yusuke Fujita, Yuma Shirahata, Tatsuya Komatsu
- The 25th Annual Conference of the International Speech Communication Association (INTERSPEECH 2024)
- September 01, 2024
-
- JOURNAL (INTERNATIONAL)
- MC-Whisper: Extending Speech Foundation Models to Multichannel Distant Speech Recognition
- Xuankai Chang (Carnegie Mellon University), Pengcheng Guo (Northwestern Polytechnical University), Yuya Fujita, Takashi Maekaku, Shinji Watanabe (Carnegie Mellon University)
- IEEE Signal Processing Letters (IEEE SPL)
- August 26, 2024
-
- OTHERS (INTERNATIONAL)
- Song Data Cleansing for End-to-End Neural Singer Diarization Using Neural Analysis and Synthesis Framework
- Hokuto Munakata, Ryo Terashima, Yusuke Fujita
- arXiv.org (arXiv)
- June 24, 2024
-
- CONFERENCE (DOMESTIC)
- On the Occasion of Holding "Ongaku Symposium 2024"
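- 大石 康智 (Nippon Telegraph and Telephone Corporation), 中村 栄太 (Kyushu University), 大町 基, 森川 大輔 (Toyama Prefectural University), 伊藤 信貴 (The University of Tokyo), 森 大毅 (Utsunomiya University)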
- Ongaku Symposium 2024 (140th MUS and 152nd SLP Joint Research Meeting)
- June 13, 2024
-
- CONFERENCE (INTERNATIONAL)
- Audio Difference Learning for Audio Captioning
- Tatsuya Komatsu, Yusuke Fujita, Kazuya Takeda (Nagoya University), Tomoki Toda (Nagoya University)
- 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024)
- April 14, 2024
-
- CONFERENCE (INTERNATIONAL)
- Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing
- Brian Yan (Carnegie Mellon University), Xuankai Chang (Carnegie Mellon University), Antonios Anastasopoulos (George Mason University), Yuya Fujita, Shinji Watanabe (Carnegie Mellon University)
- 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024)
- April 14, 2024
-
- CONFERENCE (INTERNATIONAL)
- Enhancing Multilingual TTS with Voice Conversion Based Data Augmentation and Posterior Embedding
- Hyun-Wook Yoon (NAVER Cloud), Jin-Seob Kim (NAVER Cloud), Ryuichi Yamamoto, Ryo Terashima, Chan-Ho Song (NAVER Cloud), Jae-Min Kim (NAVER Cloud), Eunwoo Song (NAVER Cloud)
- 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024)
- April 14, 2024
-
- CONFERENCE (INTERNATIONAL)
- Keep Decoding Parallel With Effective Knowledge Distillation From Language Models To End-To-End Speech Recognisers
- Michael Hentschel (LINE WORKS Corporation), Yuta Nishikawa (Nara Institute of Science and Technology), Tatsuya Komatsu, Yusuke Fujita
- 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024)
- April 14, 2024
-
- CONFERENCE (INTERNATIONAL)
- PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions
- Reo Shimizu (Tohoku University), Ryuichi Yamamoto, Masaya Kawamura, Yuma Shirahata, Hironori Doi, Tatsuya Komatsu, Kentaro Tachibana
- 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024)
- April 14, 2024
-
- OTHERS (INTERNATIONAL)
- LV-CTC: Non-autoregressive ASR with CTC and latent variable models
- Yuya Fujita, Shinji Watanabe (Carnegie Mellon University), Xuankai Chang (Carnegie Mellon University), Takashi Maekaku
- arXiv.org (arXiv)
- March 28, 2024
-
- CONFERENCE (INTERNATIONAL)
- Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study
- Xuankai Chang (Carnegie Mellon University), Brian Yan (Carnegie Mellon University), Kwanghee Choi (Carnegie Mellon University), Jee-Weon Jung (Carnegie Mellon University), Yichen Lu (Carnegie Mellon University), Soumi Maiti (Carnegie Mellon University), Roshan Sharma (Carnegie Mellon University), Jiatong Shi (Carnegie Mellon University), Jinchuan Tian (Carnegie Mellon University), Shinji Watanabe (Carnegie Mellon University), Yuya Fujita, Takashi Maekaku, Pengcheng Guo (Northwestern Polytechnical University), Yao-Fei Cheng (University of Washington), Pavel Denisov (University of Stuttgart), Kohei Saijo (Waseda University), Hsiu-Hsuan Wang (National Taiwan University)
- 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024)
- March 20, 2024
-
- CONFERENCE (INTERNATIONAL)
- HuBERTopic: Enhancing Semantic Representation of HuBERT Through Self-Supervision Utilizing Topic Model
- Takashi Maekaku, Jiatong Shi (Carnegie Mellon University), Xuankai Chang (Carnegie Mellon University), Yuya Fujita, Shinji Watanabe (Carnegie Mellon University)
- 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024)
- March 20, 2024
-
- CONFERENCE (DOMESTIC)
- An Experimental Evaluation of Contrastive Learning Between Japanese Text and Music
- 蓮実 拓也, 小松 達也, 藤田 雄介, 二又 航介, 橘 健太郎
- Acoustical Society of Japan 2024 Spring Meeting (ASJ 2024 spring)
- March 07, 2024