GBNF grammar tweak for faster Qwen3.6 35B-A3B and Qwen3.6 27B

April 27, 2026 at 3:58 PM

Hi folks, enjoy an optimised Qwen3.6 35B-A3B and Qwen3.6 27B for coding and general-purpose use; it also solves puzzles correctly more often. The initial intent was to optimise the 35B-A3B reasoning traces, since it's the most efficient model on my 5090 setup and lets me run parallel jobs with llama.cpp in prod. I love the 27B's consistency, but the prefill churn on long-horizon work is painful. I tweaked the GBNF grammar and ran a basic prompt against my custom Rust/Next.js bench to see the improvements, and I have

Original article

Read full at Reddit →
