People
Akifumi Wachi, Senior Chief Researcher, LY Corporation Research
Akifumi Wachi is a research scientist at LY Corporation Research. His research interests lie primarily in reinforcement learning and span the entire theory-to-application spectrum, from fundamental advances to deployment in real-world systems. In particular, he is interested in how a policy should (and can) be trained and deployed in safety-critical problems. See https://akifumi-wachi-4.github.io/website/ for details.
Publications
-
- OTHERS (INTERNATIONAL)
- Inference-Aware Meta-Alignment of LLMs via Non-Linear GRPO
- Shokichi Takakura, Akifumi Wachi, Rei Higuchi (The University of Tokyo/RIKEN AIP), Kohei Miyaguchi, Taiji Suzuki (The University of Tokyo/RIKEN AIP)
- arXiv.org
- February 03, 2026
-
- OTHERS (INTERNATIONAL)
- A Relative-Budget Theory for Reinforcement Learning with Verifiable Rewards in Large Language Model Reasoning
- Akifumi Wachi, Hirota Kinoshita (Toyota Technological Institute at Chicago), Shokichi Takakura, Rei Higuchi (The University of Tokyo/RIKEN AIP), Taiji Suzuki (The University of Tokyo/RIKEN AIP)
- arXiv.org
- February 02, 2026
-
- CONFERENCE (INTERNATIONAL)
- A Provable Approach for End-to-End Safe Reinforcement Learning
- Akifumi Wachi, Kohei Miyaguchi, Takumi Tanabe, Rei Sato, Youhei Akimoto (University of Tsukuba, RIKEN AIP)
- The Thirty-Ninth Annual Conference on Neural Information Processing Systems
- December 05, 2025
-
- CONFERENCE (INTERNATIONAL)
- Offline Guarded Safe Reinforcement Learning for Medical Treatment Optimization Strategies
- Runze Yan (Emory University), Xun Shen (Tokyo University of Agriculture and Technology), Akifumi Wachi, Sebastien Gros (Norwegian University of Science and Technology), Anni Zhao (Emory University), Xiao Hu (Emory University)
- The Thirty-Ninth Annual Conference on Neural Information Processing Systems
- December 03, 2025
-
- OTHERS (INTERNATIONAL)
- Vulnerability Mitigation for Safety-Aligned Language Models via Debiasing
- Thien Q. Tran, Akifumi Wachi, Rei Sato, Takumi Tanabe, Youhei Akimoto (University of Tsukuba, RIKEN AIP)
- arXiv.org
- February 04, 2025
-
- CONFERENCE (INTERNATIONAL)
- Flipping-based Policy for Chance-Constrained Markov Decision Processes
- Xun Shen (Osaka University), Shuo Jiang (Osaka University), Akifumi Wachi, Kazumune Hashimoto (Osaka University), Sebastien Gros (Norwegian University of Science and Technology)
- The 38th Annual Conference on Neural Information Processing Systems
- December 13, 2024
-
- CONFERENCE (INTERNATIONAL)
- Stepwise Alignment for Constrained Language Model Policy Optimization
- Akifumi Wachi, Thien Q. Tran, Rei Sato, Takumi Tanabe, Youhei Akimoto (University of Tsukuba)
- The 38th Annual Conference on Neural Information Processing Systems
- December 11, 2024
-
- CONFERENCE (INTERNATIONAL)
- A Survey of Constraint Formulations in Safe Reinforcement Learning
- Akifumi Wachi, Xun Shen (Osaka University), Yanan Sui (Tsinghua University)
- The 33rd International Joint Conference on Artificial Intelligence
- August 03, 2024
-
- CONFERENCE (INTERNATIONAL)
- Safe Reinforcement Learning Using Model Predictive Control with Probabilistic Control Barrier Function
- Xun Shen (Osaka University), Akifumi Wachi, Wataru Hashimoto (Osaka University), Kazumune Hashimoto (Osaka University), Shigemasa Takai (Osaka University)
- 2024 American Control Conference
- July 10, 2024
-
- CONFERENCE (INTERNATIONAL)
- Long-term Safe Reinforcement Learning with Binary Feedback
- Akifumi Wachi, Wataru Hashimoto (Osaka University), Kazumune Hashimoto (Osaka University)
- Thirty-Eighth AAAI Conference on Artificial Intelligence
- March 24, 2024