Mahjax: A GPU-Accelerated Mahjong Simulator for Reinforcement Learning in JAX
Mahjax is a GPU-accelerated Riichi Mahjong simulator written in JAX for reinforcement learning research. It supports large-scale parallelization and includes a visualization tool for debugging. The reported experiments show high simulation throughput and indicate that the simulator is practical for training agents.
- Mahjax is implemented in JAX to facilitate reinforcement learning research.
- The simulator achieves throughputs of up to 2 million steps per second on NVIDIA A100 GPUs.
- Agents trained in Mahjax can effectively improve their rank against baseline policies.
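
One common way JAX simulators obtain this kind of parallelism is to write the transition for a single environment and then batch it with `jax.vmap` and compile it with `jax.jit`. The sketch below illustrates that pattern on a toy tile-drawing environment. It is not Mahjax's API: `init`, `step`, the state layout, and the batch size are all hypothetical, and the toy ignores real Mahjong rules such as the limit of four copies per tile.

```python
import jax
import jax.numpy as jnp

NUM_TILE_TYPES = 34  # Riichi Mahjong has 34 distinct tile kinds


def init(key):
    """Hypothetical toy state: a PRNG key plus per-kind tile counts for one hand."""
    return {"key": key, "hand": jnp.zeros(NUM_TILE_TYPES, dtype=jnp.int32)}


def step(state, action):
    """Toy transition: discard tile kind `action` if held, then draw a random tile."""
    key, subkey = jax.random.split(state["key"])
    hand = state["hand"]
    # Discard only when the tile is actually in the hand (branch-free for jit).
    held = (hand[action] > 0).astype(jnp.int32)
    hand = hand.at[action].add(-held)
    # Draw one tile kind uniformly at random (toy assumption, no wall tracking).
    drawn = jax.random.randint(subkey, (), 0, NUM_TILE_TYPES)
    hand = hand.at[drawn].add(1)
    return {"key": key, "hand": hand}


# Batch the single-environment functions across many environments.
batch_size = 1024  # illustrative value only
keys = jax.random.split(jax.random.PRNGKey(0), batch_size)
batched_init = jax.jit(jax.vmap(init))
batched_step = jax.jit(jax.vmap(step))

states = batched_init(keys)
actions = jnp.zeros(batch_size, dtype=jnp.int32)  # every env tries to discard kind 0
states = batched_step(states, actions)
```

After one call to `batched_step`, every environment has advanced by one step in a single compiled kernel launch. Steps per second can then be estimated by timing repeated calls and multiplying the call rate by the batch size.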
Opening excerpt (first ~120 words)
arXiv:2605.20577 (cs), Computer Science > Artificial Intelligence. Submitted on 20 May 2026.
Authors: Soichiro Nishimori, Shinri Okano, Keigo Habara, Sotetsu Koyamada, Eason Yu, Masashi Sugiyama
Abstract: Riichi Mahjong is a multi-player, imperfect-information game characterized by stochasticity and high-dimensional state spaces. These attributes present a unique combination of challenges that mirror complex real-world decision-making problems in reinforcement learning.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.