Softmax in front of CrossEntropyLoss: 16 other bugs PyTorch won't catch

Xin Gao · 7 min read
Tags: machine learning, pytorch, neural networks, software linter, deep learning
⚡ TL;DR · AI summary

PyTorch does not reject certain architectural mistakes when a model is defined, so the resulting problems only show up during or after training. The design-time linter inside Neurarch applies 17 rules that flag common structural failure modes in neural networks before training begins. These include incorrect layer ordering, missing components, and inefficient configurations that degrade performance or stability.
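To make "design-time" concrete, here is a minimal sketch of what one structural rule could look like. Neurarch's actual rule implementation is not shown in this excerpt, so the function name and the list-of-strings model representation below are hypothetical, chosen only for illustration.

```python
# Hypothetical sketch of a single design-time rule; this is NOT Neurarch's API.
# Assumption: the architecture is described as an ordered list of layer type
# names plus the name of the loss function, with no tensors involved.

def lint_softmax_before_cross_entropy(layers, loss):
    """Return warnings for a Softmax that feeds directly into CrossEntropyLoss."""
    warnings = []
    if loss == "CrossEntropyLoss" and layers and layers[-1] == "Softmax":
        warnings.append(
            "Final layer is Softmax but the loss is CrossEntropyLoss, "
            "which already applies log-softmax; remove the Softmax."
        )
    return warnings


buggy_model = ["Linear", "ReLU", "Linear", "Softmax"]
fixed_model = ["Linear", "ReLU", "Linear"]

print(lint_softmax_before_cross_entropy(buggy_model, "CrossEntropyLoss"))
print(lint_softmax_before_cross_entropy(fixed_model, "CrossEntropyLoss"))
```

The check needs only the layer order and the loss name, which is why this class of bug can be flagged before any training run.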

Original article: Xin Gao, via Hacker News (Newest)

You can put a Softmax in front of CrossEntropyLoss. PyTorch won’t stop you. Here are 16 other architecture bugs it won’t catch.

A walkthrough of the 17-rule design-time linter inside Neurarch: what each rule catches, why it matters, and where static analysis stops being useful for neural networks.

Xin Gao, May 17, 2026

The bug that started this

You can put a Softmax in front of CrossEntropyLoss in PyTorch. The model trains. The loss curve looks fine. You ship it. Accuracy is bad, and you spend the next day finding out why.

The bug is that nn.CrossEntropyLoss applies log-softmax internally, so the explicit Softmax causes double-application and degrades training stability. The bug is visible from the architecture diagram in two seconds.
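One way to see the effect of the double application is to compute the loss by hand. The sketch below uses plain Python rather than PyTorch; its cross_entropy helper is a stand-in that mirrors what nn.CrossEntropyLoss does for a single sample (log-softmax followed by negative log-likelihood). The helper names and the example logits are assumptions made for illustration.

```python
import math

def log_softmax(xs):
    """Numerically stable log-softmax over a list of floats."""
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]

def softmax(xs):
    return [math.exp(v) for v in log_softmax(xs)]

def cross_entropy(inputs, target):
    """Stand-in for nn.CrossEntropyLoss on one sample: log-softmax, then NLL."""
    return -log_softmax(inputs)[target]

# A confident, correct prediction for class 0 out of 3 classes.
logits = [10.0, 0.0, 0.0]
target = 0

# Intended usage: raw logits go straight into the loss.
correct_loss = cross_entropy(logits, target)

# Buggy usage: an explicit Softmax runs first, so softmax is applied twice.
buggy_loss = cross_entropy(softmax(logits), target)

# Softmax outputs lie in [0, 1], so even a perfect prediction [1, 0, 0]
# cannot push the buggy loss below this floor.
floor = math.log(math.e + 2) - 1

print("loss on logits:        ", correct_loss)
print("loss on softmax output:", buggy_loss)
print("floor for 3 classes:   ", floor)
```

With raw logits the loss for this prediction is close to zero. With the extra Softmax the same prediction cannot score below the floor of roughly 0.55, because the inputs to the loss are confined to [0, 1]. The loss still decreases during training, which is consistent with the loss curve looking fine while the model learns poorly.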

Excerpt limited to ~120 words for fair-use compliance. The full article is at Hacker News (Newest).
