Speech Processing
-
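- Conference (domestic)
- Demixing Matrix Update Using Iterative Source Steering in Independent Low-Rank Matrix Analysis Based on a Heavy-Tailed Generative Model
- Takuya Hasumi, Robin Scheibler
- Acoustical Society of Japan 2023 Spring Meeting (ASJ 2023 spring)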
- 2023.3.15
-
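- Conference (domestic)
- End-to-End Diarization Using Intermediate-Layer Predictions
- Yusuke Fujita, Tatsuya Komatsu, Robin Scheibler, Yusuke Kida, Tetsuji Ogawa (Waseda University)
- Acoustical Society of Japan 2023 Spring Meeting (ASJ 2023 spring)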
- 2023.3.15
-
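- Conference (domestic)
- Minimum Latency Training of RNN Transducer for Streaming End-to-End Speech Recognition
- Yusuke Shinohara, Shinji Watanabe (Carnegie Mellon University)
- Acoustical Society of Japan 2023 Spring Meeting (ASJ 2023 spring)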
- 2023.3.15
-
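- Other (domestic)
- Target Vocabulary Extraction via Vocabulary Set Partitioning and Multi-Task Learning in Japanese Speech Recognition
- Aoi Ito (LINE/Hosei University), Tatsuya Komatsu, Yusuke Fujita
- IEICE/ASJ Technical Committee on Speech (SP)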
- 2023.2.28
-
- Conference (international)
- Alternate Intermediate Conditioning with Syllable-level and Character-level Targets for Japanese ASR
- Yusuke Fujita, Tatsuya Komatsu, Yusuke Kida
- The 2022 IEEE Spoken Language Technology Workshop (SLT 2022)
- 2023.1.9
-
- Conference (international)
- End-to-End Multi-speaker ASR with Independent Vector Analysis
- Robin Scheibler, Wangyou Zhang (Shanghai Jiao Tong University), Xuankai Chang (Carnegie Mellon University), Shinji Watanabe (Carnegie Mellon University), Yanmin Qian (Shanghai Jiao Tong University)
- The 2022 IEEE Spoken Language Technology Workshop (SLT 2022)
- 2023.1.9
-
- Conference (international)
- Inter-Decoder: Using Attention-Decoder losses as Intermediate Regularization for CTC-based Speech Recognition
- Tatsuya Komatsu, Yusuke Fujita
- The 2022 IEEE Spoken Language Technology Workshop (SLT 2022)
- 2023.1.9
-
- Conference (international)
- Adaptive Noise Canceller Algorithm with an SNR-Based Stepsize and Controlled Averaging
- Akihiko Sugiyama
- IEEE International Conference on Consumer Electronics (ICCE)
- 2023.1.6
-
- Other (domestic)
- Raw or cooked? That is the Question in Adaptive Noise Cancelling
- Akihiko Sugiyama
- The 37th IEICE Signal Processing Symposium (SIP Symposium)
- 2022.12.13
-
- Other (international)
- Align, Write, Re-order: Explainable End-to-End Speech Translation via Operation Sequence Generation
- Motoi Omachi, Brian Yan (Carnegie Mellon University), Siddharth Dalmia (Carnegie Mellon University), Yuya Fujita, Shinji Watanabe (Carnegie Mellon University)
- arXiv.org (arXiv)
- 2022.11.14
-
- Conference (international)
- How Information on Acoustic Scenes and Sound Events Mutually Benefits Event Detection and Scene Classification Tasks
- Ami Igarashi (Doshisha University), Keisuke Imoto (Doshisha University), Yuka Komatsu (Doshisha University), Shunsuke Tsubaki (Doshisha University), Shuto Hario (Doshisha University), Tatsuya Komatsu
- Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2022 (APSIPA ASC 2022)
- 2022.11.7
-
- Conference (international)
- On Sorting and Padding Multiple Targets for Sound Event Localization and Detection with Permutation Invariant and Location-based Training
- Robin Scheibler, Tatsuya Komatsu, Yusuke Fujita, Michael Hentschel
- Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2022 (APSIPA ASC 2022)
- 2022.11.7
-
- Workshop (international)
- Sound Event Localization and Detection with pre-trained Audio Spectrogram Transformer and Multichannel Separation Network
- Robin Scheibler, Tatsuya Komatsu, Yusuke Fujita, Michael Hentschel
- Detection and Classification of Acoustic Scenes and Events (DCASE 2022)
- 2022.11.3
-
- Conference (international)
- Attention Weight Smoothing Using Prior Distributions for Transformer-Based End-to-End ASR
- Takashi Maekaku, Yuya Fujita, Yifan Peng (Carnegie Mellon University), Shinji Watanabe (Carnegie Mellon University)
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- 2022.9.19
-
- Conference (international)
- End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation
- Xuankai Chang (Carnegie Mellon University), Takashi Maekaku, Yuya Fujita, Shinji Watanabe (Carnegie Mellon University)
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- 2022.9.19
-
- Conference (international)
- A Unified Accent Estimation Method Based on Multi-Task Learning for Japanese Text-to-Speech
- Byeongseon Park, Ryuichi Yamamoto, Kentaro Tachibana
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- 2022.9.18
-
- Conference (international)
- Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History
- Yuto Nishimura (The University of Tokyo), Yuki Saito (The University of Tokyo), Shinnosuke Takamichi (The University of Tokyo), Kentaro Tachibana, Hiroshi Saruwatari (The University of Tokyo)
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- 2022.9.18
-
- Conference (international)
- Better Intermediates Improve CTC Inference
- Tatsuya Komatsu, Yusuke Fujita, Jaesong Lee (NAVER), Lukas Lee (NAVER), Shinji Watanabe (Carnegie Mellon University), Yusuke Kida
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- 2022.9.18
-
- Conference (international)
- Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation
- Ryo Terashima, Ryuichi Yamamoto, Eunwoo Song (NAVER), Yuma Shirahata, Hyun-Wook Yoon (NAVER), Jae-Min Kim (NAVER), Kentaro Tachibana
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- 2022.9.18
-
- Conference (international)
- DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning
- Takaaki Saeki (The University of Tokyo), Kentaro Tachibana, Ryuichi Yamamoto
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- 2022.9.18