From SGD to Muon: Adaptive Optimization via Schatten-p Norms

#artificial intelligence #optimization #deep learning
⚡ TL;DR · AI summary

The article introduces an adaptive optimization framework for deep neural networks built on Schatten-p norms. Instead of fixing the update geometry in advance, as existing optimizers do, the framework selects it dynamically from runtime data. The proposed method is reported to be competitive with established optimizers such as Muon and AdamW across several training scenarios.

Original article
arXiv cs.AI

arXiv:2605.19781 (cs), Computer Science > Artificial Intelligence
Submitted on 19 May 2026

Title: From SGD to Muon: Adaptive Optimization via Schatten-p Norms
Authors: Thomas Massena (IRIT, DTIPG - SNCF, UT3), Corentin Friedrich, Mathieu Serrurier (IRIT)

Abstract: Modern optimizers, like Muon, impose matrix-wise geometry constraints on their updates. These matrix-wise constraints can be unified under Linear Minimization Oracle (LMO) theory. However, all current methods impose fixed LMO geometries for the update rules, chosen by-design or empirically, which are not necessarily optimal according to the problem's geometry.

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.
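
The excerpt does not spell out what an LMO over a Schatten-p ball looks like, so here is a minimal sketch based on the standard closed form, not on the paper's algorithm. For a gradient G = U diag(sigma) V^T, the minimizer of <G, X> over the unit Schatten-p ball is X = -U diag(s) V^T, where s depends only on the singular values sigma and on the dual exponent q with 1/p + 1/q = 1. The function name `schatten_lmo_spectrum` is hypothetical, and the sketch covers only the map from sigma to s; the SVD, momentum, step size, and the paper's adaptive choice of p are all left out.

```python
import math


def schatten_lmo_spectrum(sigma, p):
    """Singular values s of the LMO direction for the unit Schatten-p ball.

    sigma: nonnegative singular values of the gradient matrix.
    p:     Schatten exponent, 1 <= p <= math.inf.
    The full update direction would be -U diag(s) V^T (assumed, not computed here).
    """
    if p < 1:
        raise ValueError("p must be >= 1")
    top = max(sigma)
    if top == 0.0:
        # Zero gradient: no descent direction.
        return [0.0 for _ in sigma]
    if math.isinf(p):
        # Spectral norm ball: every nonzero singular value becomes 1,
        # i.e. the orthogonalized direction U V^T that Muon approximates.
        return [1.0 if s > 0 else 0.0 for s in sigma]
    if p == 1:
        # Nuclear norm ball: rank-one update along the top singular pair
        # (ties broken by taking the first maximum).
        idx = sigma.index(top)
        return [1.0 if i == idx else 0.0 for i in range(len(sigma))]
    q = p / (p - 1.0)  # dual exponent, 1/p + 1/q = 1
    scaled = [s / top for s in sigma]  # rescale for numerical stability
    norm_q = sum(x ** q for x in scaled) ** (1.0 / q)
    # s_i = (sigma_i / ||sigma||_q) ** (q - 1); the rescaling cancels out.
    return [(x / norm_q) ** (q - 1.0) for x in scaled]


if __name__ == "__main__":
    sigma = [3.0, 4.0]
    for p in (1, 2, 4, 100, math.inf):
        print(p, schatten_lmo_spectrum(sigma, p))
```

With p = 2 the output is sigma divided by its Euclidean norm, which corresponds to a Frobenius-normalized gradient step (normalized SGD). As p grows the singular values flatten toward 1, approaching the orthogonalized update at p = infinity. This is one way to read the "From SGD to Muon" in the title, though the excerpt itself does not state the interpolation explicitly.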
