Introducing AutoMuon, a one-line drop-in for AdamW [P]

Apr 26, 2026 · 3:23 AM UTC · 0 reactions · 0 comments · 7 views

Hey everyone, I've been working on a small Python package called AutoMuon that makes the Muon optimizer usable as a drop-in replacement for AdamW in arbitrary PyTorch training pipelines. The core idea is relatively simple: Muon is meant for the 2D weight matrices of hidden layers (linear projections, conv layers), but you still need AdamW for embeddings, norms, biases, and so on. AutoMuon scans your model at init and automatically figures out the right optimizer for each parameter. I am open to PRs.
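To make the routing idea concrete, here is a minimal sketch of how parameters could be split by shape and name. This is not AutoMuon's actual code or API: `split_params` and `ADAMW_NAME_HINTS` are hypothetical names, and the rule itself (matrix-shaped weights go to Muon unless the name looks like an embedding, norm, or bias) is an assumption based on the description above. It works on plain (name, shape) pairs so it runs without PyTorch.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical name hints for parameters that stay on AdamW (an assumption,
# not AutoMuon's real rule set).
ADAMW_NAME_HINTS = ("embed", "norm", "bias")


def split_params(
    named_shapes: Iterable[Tuple[str, Tuple[int, ...]]]
) -> Dict[str, List[str]]:
    """Route each parameter name to 'muon' or 'adamw' by shape and name."""
    groups: Dict[str, List[str]] = {"muon": [], "adamw": []}
    for name, shape in named_shapes:
        # Linear weights are 2D; conv kernels are 4D and can be flattened to 2D.
        is_matrix = len(shape) >= 2
        # Embedding tables are 2D too, so they are excluded by name.
        excluded = any(hint in name.lower() for hint in ADAMW_NAME_HINTS)
        target = "muon" if is_matrix and not excluded else "adamw"
        groups[target].append(name)
    return groups


# Made-up parameter names and shapes, for illustration only.
example = [
    ("tok_embed.weight", (50257, 768)),
    ("blocks.0.attn.qkv.weight", (2304, 768)),
    ("blocks.0.attn.qkv.bias", (2304,)),
    ("blocks.0.norm1.weight", (768,)),
    ("stem.conv.weight", (64, 3, 3, 3)),
]
groups = split_params(example)
print(groups["muon"])
print(groups["adamw"])
```

With a real model you could feed it `((name, tuple(p.shape)) for name, p in model.named_parameters())` and then build one optimizer per group. Name matching is brittle, so a real implementation would probably look at module types as well.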
