People
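白旗 悠真 Yuma Shirahata, Software Engineer
Completed the master's program in the Department of Electrical Engineering and Information Systems, Graduate School of Engineering, The University of Tokyo, in 2021. Joined LINE Corporation as a new graduate in April of the same year and has held the current position since October 2023. Works mainly on research and development of speech synthesis, including high-quality emotional speech synthesis models and foundation models for speech generation trained on large-scale unlabeled speech data.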
Publications
-
- Workshop (International)
- CAVIARES: Corpus for Audio-Visual Expressive Voice Agent
- Jinsheng Chen (The University of Tokyo), Yuki Saito (The University of Tokyo), Dong Yang (The University of Tokyo), Naoko Tanji (The University of Tokyo), Hironori Doi, Byeongseon Park, Yuma Shirahata, Kentaro Tachibana, Hiroshi Saruwatari (The University of Tokyo)
- 2025 IEEE Automatic Speech Recognition and Understanding Workshop
- 2025.12.9
-
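- Conference (Domestic)
- BitTTS: Lightweight Text-to-Speech Using 1.58-bit Quantization and Weight Indexing
- Masaya Kawamura, Takuya Hasumi, Yuma Shirahata, Ryuichi Yamamoto
- The 2025 Autumn Meeting of the Acoustical Society of Japan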
- 2025.9.11
-
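- Conference (Domestic)
- Acquiring Phonemic and Prosodic Labels from Speech and Its Applications
- Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto
- The 2025 Autumn Meeting of the Acoustical Society of Japan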
- 2025.9.10
-
- Conference (International)
- BitTTS: Highly Compact Text-to-Speech Using 1.58-bit Quantization and Weight Indexing
- Masaya Kawamura, Takuya Hasumi, Yuma Shirahata, Ryuichi Yamamoto
- The 26th Annual Conference of the International Speech Communication Association
- 2025.8.21
-
- Conference (International)
- SLASH: Self-Supervised Speech Pitch Estimation Leveraging DSP-derived Absolute Pitch
- Ryo Terashima, Yuma Shirahata, Masaya Kawamura
- The 26th Annual Conference of the International Speech Communication Association
- 2025.8.19
-
- Conference (International)
- Grapheme-Coherent Phonemic and Prosodic Annotation of Speech by Implicit and Explicit Grapheme Conditioning
- Hien Ohnaka (Nara Institute of Science and Technology), Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto
- The 26th Annual Conference of the International Speech Communication Association
- 2025.8.17
-
- Conference (International)
- Description-Based Controllable Text-to-Speech With Cross-Lingual Voice Control
- Ryuichi Yamamoto, Yuma Shirahata, Masaya Kawamura, Kentaro Tachibana
- 2025 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2025.4.6
-
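- Conference (Domestic)
- Constructing a Corpus for Multimodal Empathetic Dialogue Speech Synthesis
- Yuki Saito (The University of Tokyo), Jinsheng Chen (The University of Tokyo), Dong Yang (The University of Tokyo), Naoko Tanji (The University of Tokyo), Hironori Doi, Yuma Shirahata, Byeongseon Park, Kentaro Tachibana, Hiroshi Saruwatari (The University of Tokyo)
- The 2025 Spring Meeting of the Acoustical Society of Japan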
- 2025.3.17
-
- Other (International)
- Description-based Controllable Text-to-Speech with Cross-Lingual Voice Control
- Ryuichi Yamamoto, Yuma Shirahata, Masaya Kawamura, Kentaro Tachibana
- arXiv.org
- 2024.9.27
-
- Conference (International)
- Audio-conditioned phonemic and prosodic annotation for building text-to-speech models from unlabeled speech data
- Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto, Kentaro Tachibana
- The 25th Annual Conference of the International Speech Communication Association
- 2024.9.4
-
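- Conference (Domestic)
- A Comparison of Alignment Methods for Emotional Speech Synthesis
- Takuya Hasumi, Yuma Shirahata, Welly Naptali, Ryuichi Yamamoto, Eunwoo Song (NAVER Cloud), Kentaro Tachibana, Jae-Min Kim (NAVER Cloud)
- The 2024 Autumn Meeting of the Acoustical Society of Japan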
- 2024.9.4
-
- Conference (International)
- LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
- Masaya Kawamura, Ryuichi Yamamoto, Yuma Shirahata, Takuya Hasumi, Kentaro Tachibana
- The 25th Annual Conference of the International Speech Communication Association
- 2024.9.3
-
- Conference (International)
- Universal Score-based Speech Enhancement with High Content Preservation
- Robin Scheibler, Yusuke Fujita, Yuma Shirahata, Tatsuya Komatsu
- The 25th Annual Conference of the International Speech Communication Association
- 2024.9.1
-
- Conference (International)
- PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions
- Reo Shimizu (Tohoku University), Ryuichi Yamamoto, Masaya Kawamura, Yuma Shirahata, Hironori Doi, Tatsuya Komatsu, Kentaro Tachibana
- 2024 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2024.4.14
-
- Conference (International)
- Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
- Masaya Kawamura (The University of Tokyo), Yuma Shirahata, Ryuichi Yamamoto, Kentaro Tachibana
- 2023 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2023.6.4
-
- Conference (International)
- Period VITS: Variational Inference With Explicit Pitch Modeling For End-to-End Emotional Speech
- Yuma Shirahata, Ryuichi Yamamoto, Eunwoo Song (NAVER), Ryo Terashima, Jae-Min Kim (NAVER), Kentaro Tachibana
- 2023 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2023.6.4
-
- Conference (International)
- Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation
- Ryo Terashima, Ryuichi Yamamoto, Eunwoo Song (NAVER), Yuma Shirahata, Hyun-Wook Yoon (NAVER), Jae-Min Kim (NAVER), Kentaro Tachibana
- The 23rd Annual Conference of the International Speech Communication Association
- 2022.9.18