GBNF grammar tweak for faster Qwen3.6 35B-A3B and Qwen3.6 27B
·
0 reactions
·
0 comments
·
1 view
Hi folks, Enjoy an optimised Qwen3.6 35B-A3B and Qwen3.6 27B for coding and general purpose - it's able to solve puzzles correctly more often too. The initial intent was to optimise the 35B-A3B reasoning traces since it's the most efficient on my 5090 setup as I can perform parallel jobs with llama.cpp on my prod. Love 27B consistency, but the prefill churn on long horizon work is painful. Tweaked the GBNF and tested a basic prompt to my custom Rust/Next.js bench to see improvements, and I have
Original article
Reddit
Anonymous · no account needed