WeSearch

PyTorch 2.12 Release

⚡ TL;DR · AI summary

The release of PyTorch 2.12 introduces significant performance improvements and new features. Key enhancements include a batched linalg.eigh on CUDA that is up to 100x faster and a new device-agnostic Graph API for unified graph capture and replay. This version continues to evolve PyTorch into a versatile platform for production training and inference across various hardware.
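To make the batched linalg.eigh item concrete, here is a minimal sketch of the kind of call the speedup applies to. The batch size, matrix size, and variable names are illustrative assumptions, not taken from the release notes. The code falls back to CPU when no CUDA device is present, and it makes no claim about the actual speedup on any given machine.

```python
import torch

# Use CUDA when available; the summary's speedup claim concerns the CUDA path.
device = "cuda" if torch.cuda.is_available() else "cpu"
torch.manual_seed(0)

# Hypothetical workload: a batch of 64 small symmetric 8x8 matrices.
batch, n = 64, 8
a = torch.randn(batch, n, n, dtype=torch.float64, device=device)
sym = (a + a.transpose(-1, -2)) / 2  # symmetrize so eigh's precondition holds

# One call decomposes every matrix in the batch.
eigenvalues, eigenvectors = torch.linalg.eigh(sym)

# Sanity check: Q diag(w) Q^T should reproduce the input matrices.
reconstructed = (
    eigenvectors @ torch.diag_embed(eigenvalues) @ eigenvectors.transpose(-1, -2)
)
max_error = (reconstructed - sym).abs().max().item()
print(eigenvalues.shape, max_error)
```

Passing the whole batch in one call, rather than looping over matrices in Python, is what lets the backend choose a batched solver.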

Opening excerpt from the original PyTorch article (first ~120 words):

We are excited to announce the release of PyTorch® 2.12 (release notes)! The PyTorch 2.12 release features the following changes:

- Batched linalg.eigh on CUDA is up to 100x faster due to updated cuSolver backend selection
- New torch.accelerator.Graph API unifies graph capture and replay across CUDA, XPU, and out-of-tree backends
- torch.export.save now supports Microscaling (MX) quantization formats, enabling full export of aggressively compressed models
- Adagrad now supports fused=True, joining Adam, AdamW, and SGD with a single-kernel optimizer implementation
- torch.cond control flow can now be captured and replayed inside CUDA Graphs
- ROCm users gain expandable memory segments, rocSHMEM symmetric memory collectives, and FlexAttention pipelining

This release is composed of…

Excerpt limited to ~120 words for fair-use compliance. The full article is available from PyTorch.
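The Adagrad item in the excerpt can be sketched as follows. This is an illustrative example under stated assumptions: the toy regression problem, the learning rate, and the signature check are ours, not from the article. Because builds that predate the option may not accept a `fused` argument, the sketch only passes it when the installed `torch.optim.Adagrad` exposes it.

```python
import inspect
import torch

torch.manual_seed(0)

# Hypothetical toy problem: recover a fixed linear map from 32 samples.
x = torch.randn(32, 4)
true_w = torch.tensor([[1.0], [-2.0], [3.0], [0.5]])
y = x @ true_w

model = torch.nn.Linear(4, 1)

# Assumption: only request the fused kernel if this build accepts the argument.
accepts_fused = "fused" in inspect.signature(torch.optim.Adagrad).parameters
extra = {"fused": True} if accepts_fused else {}
optimizer = torch.optim.Adagrad(model.parameters(), lr=0.1, **extra)

loss_fn = torch.nn.MSELoss()
initial_loss = loss_fn(model(x), y).item()

# Plain training loop; the fused option changes the kernel, not the API.
for _ in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

final_loss = loss_fn(model(x), y).item()
print(accepts_fused, initial_loss, final_loss)
```

The training loop is identical with or without the fused path, so the option can be toggled without touching the rest of the code.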
