
Show HN: Auto GPU Kernel – Autonomous GPU-kernel discovery and optimizer

⚡ TL;DR · AI summary

Auto GPU Kernel is an autonomous GPU-kernel discovery and optimization tool. It ranked #1 in the MLSys 2026 FlashInfer AI Kernel Generation Contest on the DeepSeek Sparse Attention (DSA) track, with an average speedup of 34.93x. Setup involves copying the template directory into a separate folder or git repository so that the agents work in an isolated environment. The kernel agent is compatible with the FlashInfer format and can run without a local GPU, in the cloud, using Modal.

Original article: GitHub
Opening excerpt (first ~120 words)

Auto GPU Kernel 🏆

Autonomous GPU-kernel discovery & optimizer. Technical Report

Ranked #1 on MLSys 2026 - FlashInfer AI Kernel Generation Contest for the DeepSeek Sparse Attention (DSA) track with an average speedup of 34.93x. Submissions can be found at:

Kernel                                                                      Runtime (ms)
dsa_sparse_attention_h16_ckv512_kpe64_topk2048_ps64 (DSA Sparse Attention)  0.010
dsa_topk_indexer_fp8_h64_d128_topk2048_ps64 (DSA TopK Indexer)              0.016

Setup

Copy the template directory into a separate folder / git repository to make sure your agents work in an isolated environment. The kernel agent is compatible with FlashInfer format and can run without a local GPU on cloud using Modal. Requires Claude Code CLI.
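
The copy step described under Setup could be scripted roughly as follows. This is a minimal sketch, not the project's own tooling: the function name `make_isolated_workspace` and both paths are hypothetical, and starting a fresh git repository in the copy is an assumption about what "separate folder / git repository" means in practice.

```python
import shutil
import subprocess
from pathlib import Path


def make_isolated_workspace(template_dir, dest_dir, init_git=True):
    """Copy template_dir to dest_dir and optionally start a fresh git repo there.

    Hypothetical helper: the excerpt only says to copy the template
    directory into a separate folder / git repository.
    """
    src = Path(template_dir)
    dest = Path(dest_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"template directory not found: {src}")
    if dest.exists():
        # Never write into an existing folder; isolation is the whole point.
        raise FileExistsError(f"refusing to overwrite existing path: {dest}")
    # Skip any .git directory so the copy shares no history with the source repo.
    shutil.copytree(src, dest, ignore=shutil.ignore_patterns(".git"))
    if init_git:
        # Assumes the git CLI is installed and on PATH.
        subprocess.run(["git", "init"], cwd=dest, check=True)
    return dest
```

A call such as `make_isolated_workspace("template", "../kernel-workspace")` (both paths are placeholders) would leave the agents with a working copy that has no link back to the original checkout. Refusing to reuse an existing destination is a deliberate choice here, so a stale workspace is never silently mixed with a new one.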

Excerpt limited to ~120 words for fair-use compliance. The full article is at GitHub.
