Yuma Shirahata - LY Corporation R&D

Yuma Shirahata, Software Engineer

Speech Processing
Machine Learning & Data Science
Generative AI

I received a Master's degree from the Department of Electrical Engineering and Information Systems, Graduate School of Engineering, The University of Tokyo, in 2021, and joined LINE Corporation as a new graduate in April of the same year. I have held my current position since October 2023. I work mainly on research and development of speech synthesis, including high-quality emotional speech synthesis models and foundation models for speech generation that use large-scale unlabeled speech data.

Publications

CONFERENCE (INTERNATIONAL)

CC-G2PNP: Streaming Grapheme-to-Phoneme and Prosody with Conformer-CTC for Unsegmented Languages

Yuma Shirahata, Ryuichi Yamamoto

2026 IEEE International Conference on Acoustics, Speech and Signal Processing

May 07, 2026
CONFERENCE (INTERNATIONAL)

Wave-Trainer-Fit: Neural Vocoder With Trainable Prior And Fixed-Point Iteration Towards High-Quality Speech Generation From SSL Features

Hien Ohnaka (Nara Institute of Science and Technology), Yuma Shirahata, Masaya Kawamura

2026 IEEE International Conference on Acoustics, Speech and Signal Processing

May 05, 2026
CONFERENCE (DOMESTIC)

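Construction and Analysis of a Model for Predicting Speaker-Specific Facial Expressions from Speech Using Neural Audio Codec Features

朴浚鎔 (The University of Tokyo), Jinsheng Chen (The University of Tokyo), Hironori Doi, Byeongseon Park, Yuma Shirahata, Kentaro Tachibana, Dong Yang (The University of Tokyo), Yuki Saito (The University of Tokyo), Hiroshi Saruwatari (The University of Tokyo)

Acoustical Society of Japan 2026 Spring Meeting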

March 19, 2026
WORKSHOP (INTERNATIONAL)

CAVIARES: Corpus for Audio-Visual Expressive Voice Agent

Jinsheng Chen (The University of Tokyo), Yuki Saito (The University of Tokyo), Dong Yang (The University of Tokyo), Naoko Tanji (The University of Tokyo), Hironori Doi, Byeongseon Park, Yuma Shirahata, Kentaro Tachibana, Hiroshi Saruwatari (The University of Tokyo)

2025 IEEE Automatic Speech Recognition and Understanding Workshop

December 09, 2025
CONFERENCE (DOMESTIC)

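BitTTS: Lightweight Text-to-Speech Using 1.58-bit Quantization and Weight Indexing

Masaya Kawamura, Takuya Hasumi, Yuma Shirahata, Ryuichi Yamamoto

Acoustical Society of Japan 2025 Autumn Meeting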

September 11, 2025
CONFERENCE (DOMESTIC)

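Acquiring Phoneme and Prosody Labels from Speech and Its Applications

Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto

Acoustical Society of Japan 2025 Autumn Meeting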

September 10, 2025
CONFERENCE (INTERNATIONAL)

BitTTS: Highly Compact Text-to-Speech Using 1.58-bit Quantization and Weight Indexing

Masaya Kawamura, Takuya Hasumi, Yuma Shirahata, Ryuichi Yamamoto

The 26th Annual Conference of the International Speech Communication Association

August 21, 2025
CONFERENCE (INTERNATIONAL)

SLASH: Self-Supervised Speech Pitch Estimation Leveraging DSP-derived Absolute Pitch

Ryo Terashima, Yuma Shirahata, Masaya Kawamura

The 26th Annual Conference of the International Speech Communication Association

August 19, 2025
CONFERENCE (INTERNATIONAL)

Grapheme-Coherent Phonemic and Prosodic Annotation of Speech by Implicit and Explicit Grapheme Conditioning

Hien Ohnaka (Nara Institute of Science and Technology), Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto

The 26th Annual Conference of the International Speech Communication Association

August 17, 2025
CONFERENCE (INTERNATIONAL)

Description-Based Controllable Text-to-Speech With Cross-Lingual Voice Control

Ryuichi Yamamoto, Yuma Shirahata, Masaya Kawamura, Kentaro Tachibana

2025 IEEE International Conference on Acoustics, Speech and Signal Processing

April 06, 2025
CONFERENCE (DOMESTIC)

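Construction of a Corpus for Multimodal Empathetic Dialogue Speech Synthesis

Yuki Saito (The University of Tokyo), Jinsheng Chen (The University of Tokyo), Dong Yang (The University of Tokyo), Naoko Tanji (The University of Tokyo), Hironori Doi, Yuma Shirahata, Byeongseon Park, Kentaro Tachibana, Hiroshi Saruwatari (The University of Tokyo)

Acoustical Society of Japan 2025 Spring Meeting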

March 17, 2025
OTHERS (INTERNATIONAL)

Description-based Controllable Text-to-Speech with Cross-Lingual Voice Control

Ryuichi Yamamoto, Yuma Shirahata, Masaya Kawamura, Kentaro Tachibana

arXiv.org

September 27, 2024
CONFERENCE (INTERNATIONAL)

Audio-conditioned phonemic and prosodic annotation for building text-to-speech models from unlabeled speech data

Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto, Kentaro Tachibana

The 25th Annual Conference of the International Speech Communication Association

September 04, 2024
CONFERENCE (DOMESTIC)

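Comparison of Alignment Methods for Emotional Speech Synthesis

Takuya Hasumi, Yuma Shirahata, Welly Naptali, Ryuichi Yamamoto, Eunwoo Song (NAVER Cloud), Kentaro Tachibana, Jae-Min Kim (NAVER Cloud)

Acoustical Society of Japan 2024 Autumn Meeting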

September 04, 2024
CONFERENCE (INTERNATIONAL)

LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning

Masaya Kawamura, Ryuichi Yamamoto, Yuma Shirahata, Takuya Hasumi, Kentaro Tachibana

The 25th Annual Conference of the International Speech Communication Association

September 03, 2024
CONFERENCE (INTERNATIONAL)

Universal Score-based Speech Enhancement with High Content Preservation

Robin Scheibler, Yusuke Fujita, Yuma Shirahata, Tatsuya Komatsu

The 25th Annual Conference of the International Speech Communication Association

September 01, 2024
CONFERENCE (INTERNATIONAL)

PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions

Reo Shimizu (Tohoku University), Ryuichi Yamamoto, Masaya Kawamura, Yuma Shirahata, Hironori Doi, Tatsuya Komatsu, Kentaro Tachibana

2024 IEEE International Conference on Acoustics, Speech and Signal Processing

April 14, 2024
CONFERENCE (INTERNATIONAL)

Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform

Masaya Kawamura (The University of Tokyo), Yuma Shirahata, Ryuichi Yamamoto, Kentaro Tachibana

2023 IEEE International Conference on Acoustics, Speech and Signal Processing

June 04, 2023
CONFERENCE (INTERNATIONAL)

Period VITS: Variational Inference With Explicit Pitch Modeling For End-to-End Emotional Speech

Yuma Shirahata, Ryuichi Yamamoto, Eunwoo Song (NAVER), Ryo Terashima, Jae-Min Kim (NAVER), Kentaro Tachibana

2023 IEEE International Conference on Acoustics, Speech and Signal Processing

June 04, 2023
CONFERENCE (INTERNATIONAL)

Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation

Ryo Terashima, Ryuichi Yamamoto, Eunwoo Song (NAVER), Yuma Shirahata, Hyun-Wook Yoon (NAVER), Jae-Min Kim (NAVER), Kentaro Tachibana

The 23rd Annual Conference of the International Speech Communication Association

September 18, 2022