
Qwen3.6-35B-A3B KLDs - INTs and NVFPs


KLD for INTs and NVFP4s. As always, use case is important: accuracy versus speed versus native kernels on your GPUs.

Things to note again: This is done in vLLM, with real logits. My repo ( ) has made changes in the vLLM "hot path", so it's real, it's on GPU, and it takes ~3-5 minutes on RTX 6000s. KLD does not lie; it's just raw math against logits. KLD tells a story of divergence. Evals are still important for use-case specifics. A quant can have a worse KLD and get a better eval on a test versus a b
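For anyone wondering what "raw math against logits" looks like, here is a minimal sketch in plain Python. It is not the code from the repo mentioned above; the function names (`log_softmax`, `kl_divergence`, `mean_kld`) are made up for illustration, and it assumes the usual convention of measuring KL(reference || quantized) per token position and averaging over positions. A real run would do this on GPU tensors over the full vocabulary.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized row of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def kl_divergence(ref_logits, quant_logits):
    """KL(P_ref || Q_quant) in nats for a single token position.

    Assumption: the reference (unquantized) model is P and the quant is Q.
    """
    log_p = log_softmax(ref_logits)
    log_q = log_softmax(quant_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def mean_kld(ref_rows, quant_rows):
    """Average the per-position KLD over a sequence of token positions."""
    values = [kl_divergence(r, q) for r, q in zip(ref_rows, quant_rows)]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Toy 4-token vocabulary, two positions; the numbers are illustrative only.
    reference = [[2.0, 1.0, 0.5, -1.0], [0.1, 0.2, 3.0, 0.0]]
    quantized = [[1.9, 1.1, 0.4, -0.8], [0.0, 0.3, 2.7, 0.1]]
    print(f"mean KLD: {mean_kld(reference, quantized):.6f} nats")
```

Identical logits give a KLD of zero, and so does adding a constant to every logit in a row, since softmax is shift-invariant. That is why the number reflects divergence in the predicted distribution rather than raw numeric error in the logits.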

Original article
Reddit