Qwen3.6-35B-A3B KLDs - INTs and NVFPs
KLD for INTs and NVFP4s. As always, use case is important: accuracy versus speed versus native kernels on your GPUs.

Things to note again:
- This is done in vLLM, with REAL logits.
- My repo ( ) has made changes in the vLLM "hot path", so it's real, it's on GPU, and it takes ~3-5 minutes on RTX 6000s.
- KLD does not lie; it's just raw math against logits.
- KLD tells a story of divergence. Evals are still important for anything use-case specific.
- A quant can have a worse KLD and get a better eval on a test versus a b
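
For anyone wondering what "raw math against logits" looks like, here is a minimal sketch of per-token KLD, assuming you already have the reference model's logits and the quant's logits for the same token positions. It is not the code from the repo or from vLLM; the function names (`log_softmax`, `token_kld`, `mean_kld`) are made up for illustration, and the direction KL(reference || quant), measured in nats, is an assumption.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized row of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def token_kld(ref_logits, quant_logits):
    """KL(P_ref || Q_quant) in nats for a single token position (hypothetical helper)."""
    log_p = log_softmax(ref_logits)
    log_q = log_softmax(quant_logits)
    # sum_i p_i * (log p_i - log q_i)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def mean_kld(ref_rows, quant_rows):
    """Average the per-token KLD over all positions (assumed aggregation)."""
    klds = [token_kld(r, q) for r, q in zip(ref_rows, quant_rows)]
    return sum(klds) / len(klds)

# Toy example: two token positions, vocabulary of three tokens.
ref = [[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]]
quant = [[1.9, 1.1, 0.2], [0.4, 0.7, 2.8]]
print(mean_kld(ref, quant))
```

A value of 0 means the quant's next-token distribution matches the reference exactly at every position; larger values mean more divergence. It says nothing by itself about whether the divergence hurts a particular task, which is why evals still matter.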