Speech Processing
-
- OTHERS (INTERNATIONAL)
- Evaluating Self-Supervised Speech Models via Text-Based LLMs
- Takashi Maekaku, Keita Goto, Jinchuan Tian (Carnegie Mellon University), Yusuke Shinohara, Shinji Watanabe (Carnegie Mellon University)
- arXiv.org (arXiv)
- October 07, 2025
-
- CONFERENCE (DOMESTIC)
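- BitTTS: Lightweight Text-to-Speech Synthesis Using 1.58-bit Quantization and Weight Indexing
- Masaya Kawamura, Takuya Hasumi, Yuma Shirahata, Ryuichi Yamamoto
- Acoustical Society of Japan 2025 Autumn Meeting (ASJ 2025 autumn)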
- September 11, 2025
-
- OTHERS (DOMESTIC)
- A Dataset Containing Non-Verbal Sounds for Cinematic Audio Source Separation
- Takuya Hasumi, Yusuke Fujita
- Acoustical Society of Japan 2025 Autumn Meeting (ASJ 2025 autumn)
- September 10, 2025
-
- CONFERENCE (DOMESTIC)
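- Acquiring Phonemic and Prosodic Labels from Speech and Its Applications
- Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto
- Acoustical Society of Japan 2025 Autumn Meeting (ASJ 2025 autumn)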
- September 10, 2025
-
- CONFERENCE (INTERNATIONAL)
- BitTTS: Highly Compact Text-to-Speech Using 1.58-bit Quantization and Weight Indexing
- Masaya Kawamura, Takuya Hasumi, Yuma Shirahata, Ryuichi Yamamoto
- The 26th Annual Conference of the International Speech Communication Association (INTERSPEECH 2025)
- August 21, 2025
-
- CONFERENCE (INTERNATIONAL)
- Comparative Analysis of Fast and High-Fidelity Neural Vocoders for Low-Latency Streaming Synthesis in Resource-Constrained Environments
- Reo Yoneyama (Nagoya University), Masaya Kawamura, Ryo Terashima, Ryuichi Yamamoto (Nagoya University/LY Corporation), Tomoki Toda (Nagoya University)
- The 26th Annual Conference of the International Speech Communication Association (INTERSPEECH 2025)
- August 21, 2025
-
- CONFERENCE (INTERNATIONAL)
- DnR-nonverbal: Cinematic Audio Source Separation Dataset Containing Non-Verbal Sounds
- Takuya Hasumi, Yusuke Fujita
- The 26th Annual Conference of the International Speech Communication Association (INTERSPEECH 2025)
- August 21, 2025
-
- CONFERENCE (INTERNATIONAL)
- Language-Guided Contrastive Audio-Visual Masked Autoencoder with Automatically Generated Audio-Visual-Text Triplets from Videos
- Yuchi Ishikawa, Shota Nakada, Hokuto Munakata, Tatsuya Komatsu, Yoshimitsu Aoki (Keio University)
- The 26th Annual Conference of the International Speech Communication Association (INTERSPEECH 2025)
- August 19, 2025
-
- CONFERENCE (INTERNATIONAL)
- SLASH: Self-Supervised Speech Pitch Estimation Leveraging DSP-derived Absolute Pitch
- Ryo Terashima, Yuma Shirahata, Masaya Kawamura
- The 26th Annual Conference of the International Speech Communication Association (INTERSPEECH 2025)
- August 19, 2025
-
- CONFERENCE (INTERNATIONAL)
- Grapheme-Coherent Phonemic and Prosodic Annotation of Speech by Implicit and Explicit Grapheme Conditioning
- Hien Ohnaka (Nara Institute of Science and Technology), Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto
- The 26th Annual Conference of the International Speech Communication Association (INTERSPEECH 2025)
- August 17, 2025
-
- CONFERENCE (DOMESTIC)
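- On the Occasion of Ongaku Symposium 2025
- Motoi Omachi, Yoshihiko Nankaku (Nagoya Institute of Technology), Eita Nakamura (Kyushu University), Kazuyoshi Yoshii (Kyoto University/RIKEN), Daisuke Morikawa (Toyama Prefectural University), Yoshiaki Bando (National Institute of Advanced Industrial Science and Technology)
- Ongaku Symposium 2025 (joint meeting of the 143rd Music Information Science and 156th Spoken Language Processing research groups)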
- June 13, 2025
-
- CONFERENCE (INTERNATIONAL)
- Language-based Audio Moment Retrieval
- Hokuto Munakata, Taichi Nishimura, Shota Nakada, Tatsuya Komatsu
- 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2025)
- April 11, 2025
-
- CONFERENCE (INTERNATIONAL)
- Description-Based Controllable Text-to-Speech With Cross-Lingual Voice Control
- Ryuichi Yamamoto, Yuma Shirahata, Masaya Kawamura, Kentaro Tachibana
- 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2025)
- April 06, 2025
-
- CONFERENCE (INTERNATIONAL)
- Investigating Factors Related to the Naturalness of Synthesized Unison Singing
- Kaito Nishizawa (Nagoya University), Ryuichi Yamamoto, Wen-Chin Huang (Nagoya University), Tomoki Toda (Nagoya University)
- 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2025)
- April 06, 2025
-
- CONFERENCE (DOMESTIC)
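- Wavehax: Aliasing-Free Neural Vocoder Based on Complex Spectrogram Estimation Using a Harmonic Signal Model and 2D Convolution
- Reo Yoneyama (Nagoya University), Atsushi Miyashita (Nagoya University), Ryuichi Yamamoto, Tomoki Toda (Nagoya University)
- Acoustical Society of Japan 2025 Spring Meeting (ASJ 2025 spring)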
- March 17, 2025
-
- OTHERS (DOMESTIC)
- Memory Reduction for Audio Fingerprinting Using Holographic Reduced Representations
- Yusuke Fujita, Tatsuya Komatsu
- Acoustical Society of Japan 2025 Spring Meeting (ASJ 2025 spring)
- March 17, 2025
-
- CONFERENCE (DOMESTIC)
- Corpus Construction Toward Multimodal Empathetic Dialogue Speech Synthesis
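- Yuki Saito (The University of Tokyo), Jinsheng Chen (The University of Tokyo), Dong Yang (The University of Tokyo), Naoko Tanji (The University of Tokyo), Hironori Doi, Yuma Shirahata, Byeongseon Park, Kentaro Tachibana, Hiroshi Saruwatari (The University of Tokyo)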
- Acoustical Society of Japan 2025 Spring Meeting (ASJ 2025 spring)
- March 17, 2025
-
- OTHERS (INTERNATIONAL)
- Description-based Controllable Text-to-Speech with Cross-Lingual Voice Control
- Ryuichi Yamamoto, Yuma Shirahata, Masaya Kawamura, Kentaro Tachibana
- arXiv.org (arXiv)
- September 27, 2024
-
- CONFERENCE (INTERNATIONAL)
- Audio-conditioned phonemic and prosodic annotation for building text-to-speech models from unlabeled speech data
- Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto, Kentaro Tachibana
- The 25th Annual Conference of the International Speech Communication Association (INTERSPEECH 2024)
- September 04, 2024
-
- CONFERENCE (DOMESTIC)
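- Improving the Semantic Representations of HuBERT Through Unsupervised Learning with a Topic Model
- Takashi Maekaku, Jiatong Shi (Carnegie Mellon University), Xuankai Chang (Carnegie Mellon University), Yuya Fujita, Shinji Watanabe (Carnegie Mellon University)
- Acoustical Society of Japan 2024 Autumn Meeting (ASJ 2024 autumn)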
- September 04, 2024