Speech Processing
-
- CONFERENCE (INTERNATIONAL)
- Adaptive Noise Canceller Algorithm with an SNR-Based Stepsize and Controlled Averaging
- Akihiko Sugiyama
- IEEE International Conference on Consumer Electronics (ICCE)
- January 06, 2023
-
- OTHERS (DOMESTIC)
- Raw or cooked? That is the Question in Adaptive Noise Cancelling
- Akihiko Sugiyama
- The 37th IEICE Signal Processing Symposium (SIP Symposium)
- December 13, 2022
-
- OTHERS (INTERNATIONAL)
- Align, Write, Re-order: Explainable End-to-End Speech Translation via Operation Sequence Generation
- Motoi Omachi, Brian Yan (Carnegie Mellon University), Siddharth Dalmia (Carnegie Mellon University), Yuya Fujita, Shinji Watanabe (Carnegie Mellon University)
- arXiv.org (arXiv)
- November 14, 2022
-
- CONFERENCE (INTERNATIONAL)
- How Information on Acoustic Scenes and Sound Events Mutually Benefits Event Detection and Scene Classification Tasks
- Ami Igarashi (Doshisha University), Keisuke Imoto (Doshisha University), Yuka Komatsu (Doshisha University), Shunsuke Tsubaki (Doshisha University), Shuto Hario (Doshisha University), Tatsuya Komatsu
- Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2022 (APSIPA ASC 2022)
- November 07, 2022
-
- CONFERENCE (INTERNATIONAL)
- On Sorting and Padding Multiple Targets for Sound Event Localization and Detection with Permutation Invariant and Location-based Training
- Robin Scheibler, Tatsuya Komatsu, Yusuke Fujita, Michael Hentschel
- Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2022 (APSIPA ASC 2022)
- November 07, 2022
-
- WORKSHOP (INTERNATIONAL)
- Sound Event Localization and Detection with pre-trained Audio Spectrogram Transformer and Multichannel Separation Network
- Robin Scheibler, Tatsuya Komatsu, Yusuke Fujita, Michael Hentschel
- Detection and Classification of Acoustic Scenes and Events (DCASE 2022)
- November 03, 2022
-
- CONFERENCE (INTERNATIONAL)
- Attention Weight Smoothing Using Prior Distributions for Transformer-Based End-to-End ASR
- Takashi Maekaku, Yuya Fujita, Yifan Peng (Carnegie Mellon University), Shinji Watanabe (Carnegie Mellon University)
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- September 19, 2022
-
- CONFERENCE (INTERNATIONAL)
- End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation
- Xuankai Chang (Carnegie Mellon University), Takashi Maekaku, Yuya Fujita, Shinji Watanabe (Carnegie Mellon University)
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- September 19, 2022
-
- CONFERENCE (INTERNATIONAL)
- A Unified Accent Estimation Method Based on Multi-Task Learning for Japanese Text-to-Speech
- Byeongseon Park, Ryuichi Yamamoto, Kentaro Tachibana
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- September 18, 2022
-
- CONFERENCE (INTERNATIONAL)
- Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History
- Yuto Nishimura (The University of Tokyo), Yuki Saito (The University of Tokyo), Shinnosuke Takamichi (The University of Tokyo), Kentaro Tachibana, Hiroshi Saruwatari (The University of Tokyo)
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- September 18, 2022
-
- CONFERENCE (INTERNATIONAL)
- Better Intermediates Improve CTC Inference
- Tatsuya Komatsu, Yusuke Fujita, Jaesong Lee (NAVER), Lukas Lee (NAVER), Shinji Watanabe (Carnegie Mellon University), Yusuke Kida
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- September 18, 2022
-
- CONFERENCE (INTERNATIONAL)
- Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation
- Ryo Terashima, Ryuichi Yamamoto, Eunwoo Song (NAVER), Yuma Shirahata, Hyun-Wook Yoon (NAVER), Jae-Min Kim (NAVER), Kentaro Tachibana
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- September 18, 2022
-
- CONFERENCE (INTERNATIONAL)
- DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning
- Takaaki Saeki (The University of Tokyo), Kentaro Tachibana, Ryuichi Yamamoto
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- September 18, 2022
-
- CONFERENCE (INTERNATIONAL)
- ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding
- Yen-Ju Lu (Academia Sinica), Xuankai Chang (CMU), Chenda Li (SJTU), Wangyou Zhang (SJTU), Samuele Cornell (Università Politecnica delle Marche), Zhaoheng Ni (Meta AI), Yoshiki Masuyama (CMU/TMU), Brian Yan (CMU), Robin Scheibler, Zhong-Qiu Wang (CMU), Yu Tsao (Academia Sinica), Yanmin Qian (SJTU), Shinji Watanabe (CMU)
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- September 18, 2022
-
- CONFERENCE (INTERNATIONAL)
- Independence-based Joint Dereverberation and Separation with Neural Source Model
- Kohei Saijo (Waseda University), Robin Scheibler
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- September 18, 2022
-
- CONFERENCE (INTERNATIONAL)
- InterAug: Augmenting Noisy Intermediate Predictions for CTC-based ASR
- Yu Nakagome, Tatsuya Komatsu, Yusuke Fujita, Shuta Ichimura, Yusuke Kida
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- September 18, 2022
-
- CONFERENCE (INTERNATIONAL)
- Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems
- Hyun-Wook Yoon (NAVER), Ohsung Kwon (NAVER), Hoyeon Lee (NAVER), Ryuichi Yamamoto, Eunwoo Song (NAVER), Jae-Min Kim (NAVER), Min-Jae Hwang (NAVER)
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- September 18, 2022
-
- CONFERENCE (INTERNATIONAL)
- Minimum Latency Training of Sequence Transducers for Streaming End-to-End Speech Recognition
- Yusuke Shinohara, Shinji Watanabe (Carnegie Mellon University)
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- September 18, 2022
-
- CONFERENCE (INTERNATIONAL)
- Spatial Loss for Unsupervised Multi-channel Source Separation
- Kohei Saijo (Waseda University), Robin Scheibler
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- September 18, 2022
-
- CONFERENCE (INTERNATIONAL)
- STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent
- Yuki Saito (The University of Tokyo), Yuto Nishimura (The University of Tokyo), Shinnosuke Takamichi (The University of Tokyo), Kentaro Tachibana, Hiroshi Saruwatari (The University of Tokyo)
- The 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022)
- September 18, 2022