Publications

Other (International)
Description-based Controllable Text-to-Speech with Cross-Lingual Voice Control: Ryuichi Yamamoto, Yuma Shirahata, Masaya Kawamura, Kentaro Tachibana; arXiv.org (arXiv); 2024.9.27

Conference (International)
Audio-conditioned phonemic and prosodic annotation for building text-to-speech models from unlabeled speech data: Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto, Kentaro Tachibana; The 25th Annual Conference of the International Speech Communication Association (INTERSPEECH 2024); 2024.9.4

Conference (International)
Universal Score-based Speech Enhancement with High Content Preservation: Robin Scheibler, Yusuke Fujita, Yuma Shirahata, Tatsuya Komatsu; The 25th Annual Conference of the International Speech Communication Association (INTERSPEECH 2024); 2024.9.1

Conference (International)
Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models: Peifei Zhu, Tsubasa Takahashi, Hirokatsu Kataoka; The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (CVPR 2024); 2024.6.17

Conference (International)
Enhancing Multilingual TTS with Voice Conversion Based Data Augmentation and Posterior Embedding: Hyun-Wook Yoon (NAVER Cloud), Jin-Seob Kim (NAVER Cloud), Ryuichi Yamamoto, Ryo Terashima, Chan-Ho Song (NAVER Cloud), Jae-Min Kim (NAVER Cloud), Eunwoo Song (NAVER Cloud); 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024); 2024.4.14

Conference (International)
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions: Reo Shimizu (Tohoku University), Ryuichi Yamamoto, Masaya Kawamura, Yuma Shirahata, Hironori Doi, Tatsuya Komatsu, Kentaro Tachibana; 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024); 2024.4.14

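Conference (Domestic)
Movie Retrieval via Virtual Movie Reviews Generated by a Generative Language Model [生成型言語モデルによる仮想映画レビューを介した映画検索]: 宮下天祥 (Aoyama Gakuin University), 莊司慶行 (Shizuoka University), 藤田澄男, 大原剛三 (Aoyama Gakuin University); The 16th Forum on Data Engineering and Information Management (The 22nd Annual Meeting of the Database Society of Japan) (DEIM 2024); 2024.4.1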

Conference (International)
Understanding Likelihood of Normalizing Flow and Image Complexity through the Lens of Out-of-Distribution Detection: Genki Osada, Tsubasa Takahashi, Takashi Nishide (University of Tsukuba); Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI-24); 2024.3.24

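Other (Domestic)
Improving Safety Alignment in Constitutional AI [Constitutional AI におけるセーフティアラインメントの改善]: 綿岡晃輝, Thien Q. Tran, 前田若菜, 髙橋翼; The 30th Annual Meeting of the Association for Natural Language Processing (NLP2024); 2024.3.4

Other (Domestic)
Efficient Optimization of Adversarial Prompts for Dialogue Models [対話モデルに対する敵対的プロンプトの効率的な最適化]: 矢野一樹 (Tohoku University), 綿岡晃輝, Thien Q. Tran, 髙橋翼, Seng Pei Liew, 鈴木潤 (Tohoku University / RIKEN); The 30th Annual Meeting of the Association for Natural Language Processing (NLP2024); 2024.3.4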

Workshop (International)
University of Tsukuba Team at the TREC 2023 Deep Learning Track: Kaiyu Yang (University of Tsukuba), Sumio Fujita, Haitao Yu (University of Tsukuba), Hideo Joho (University of Tsukuba); The Thirty-Second Text REtrieval Conference (TREC 2023); 2024.3.1

Workshop (International)
University of Tsukuba Team at the TREC 2023 Interactive Knowledge Assistance Track: Lingzhen Zheng (University of Tsukuba), Kaiyu Yang (University of Tsukuba), Haitao Yu (University of Tsukuba), Sumio Fujita, Hideo Joho (University of Tsukuba); The Thirty-Second Text REtrieval Conference (TREC 2023); 2024.3.1

Workshop (International)
A Comparative Study of Voice Conversion Models with Large-Scale Speech and Singing Data: The T13 Systems for the Singing Voice Conversion Challenge 2023: Ryuichi Yamamoto (Nagoya University / LINE Corp.), Reo Yoneyama (Nagoya University), Lester Phillip Violeta (Nagoya University), Wen-Chin Huang (Nagoya University), Tomoki Toda (Nagoya University); The 2023 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2023); 2023.12.19

Workshop (International)
Verbosity Bias in Preference Labeling by Large Language Models: Keita Saito (University of Tsukuba), Akifumi Wachi, Koki Wataoka, Youhei Akimoto (University of Tsukuba); Workshop on Instruction Tuning and Instruction Following at NeurIPS 2023 (Instruction Workshop @ NeurIPS 2023); 2023.12.16

Conference (International)
Role-aware Interaction Generation from Textual Description: Mikihiro Tanaka, Kent Fujiwara; 2023 International Conference on Computer Vision (ICCV 2023); 2023.10.2

Conference (Domestic)
Foley Sound Synthesis with a Class-Conditioned Latent Diffusion Model and FAD-Based Post-filtering: Robin Scheibler, Takuya Hasumi, Yusuke Fujita, Tatsuya Komatsu, Ryuichi Yamamoto, Kentaro Tachibana; Acoustical Society of Japan 2023 Autumn Meeting (ASJ 2023 autumn); 2023.9.26

Workshop (International)
Foley Sound Synthesis with a Class-conditioned Latent Diffusion Model: Robin Scheibler, Takuya Hasumi, Yusuke Fujita, Tatsuya Komatsu, Ryuichi Yamamoto, Kentaro Tachibana; Detection and Classification of Acoustic Scenes and Events (DCASE 2023); 2023.9.20

Conference (International)
An Open-Domain Avatar Chatbot by Exploiting a Large Language Model: Takato Yamazaki, Tomoya Mizumoto, Katsumasa Yoshikawa, Masaya Ohagi, Toshiki Kawamoto (LINE/Tokyo Institute of Technology), Toshinori Sato; 24th Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL 2023); 2023.9.11

Journal (International)
Building a hospitable and reliable dialogue system for android robots: a scenario-based approach with large language models: Takato Yamazaki, Katsumasa Yoshikawa, Toshiki Kawamoto (LINE/Tokyo Institute of Technology), Tomoya Mizumoto, Masaya Ohagi, Toshinori Sato; Advanced Robotics; 2023.8.22

Conference (International)
ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings: Yuki Saito (The University of Tokyo), Shinnosuke Takamichi (The University of Tokyo), Eiji Iimori (The University of Tokyo), Kentaro Tachibana, Hiroshi Saruwatari (The University of Tokyo); The 24th Annual Conference of the International Speech Communication Association (INTERSPEECH 2023); 2023.8.20