GBNF grammar tweak for faster Qwen3.6 35B-A3B and Qwen3.6 27B

Hi folks, enjoy an optimised Qwen3.6 35B-A3B and Qwen3.6 27B for coding and general-purpose use; it also solves puzzles correctly more often. The initial intent was to optimise the 35B-A3B reasoning traces, since that model is the most efficient on my 5090 setup, where I can run parallel jobs with llama.cpp in prod. I love the 27B's consistency, but the prefill churn on long-horizon work is painful. I tweaked the GBNF and ran a basic prompt against my custom Rust/Next.js bench to see improvements, and I have

Original article
Reddit