WeSearch

[7900XT] Qwen3.6 27B for OpenCode


I'm just looking for some advice on optimally setting up Qwen3.6 27B for OpenCode. VRAM is a little scarce, but this is what I've ended up with so far:

```shell
llama-server --model models/Qwen3.6-27B-IQ4_XS.gguf \
  --port 8080 \
  --host 127.0.0.1 \
  --top-p 0.95 \
  --top-k 20 \
  --min-p 0.0 \
  --temperature 0.6 \
  --flash-attn on \
  --cache-type-k q8_0 \
  --cache-type-v q8_0 \
  --presence-penalty 0.0 \
  --repeat-penalty 1.0 \
  --ctx-size 65536 \
  --chat-template-kwargs '{"preserve_thinking": true}'
```

With this my VRAM
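Since the question is about squeezing into limited VRAM, a quick back-of-envelope for the KV cache at 65536 context may help. This is a sketch: the layer count, KV head count, and head dimension below are placeholder values, not the real Qwen3.6 27B architecture numbers; substitute the values from the GGUF metadata or model card.

```python
# Rough KV cache size for llama-server with quantized K/V cache.
# NOTE: n_layers, n_kv_heads, head_dim are hypothetical placeholders --
# read the real values from the GGUF metadata before trusting the total.

def kv_cache_gib(ctx, n_layers, n_kv_heads, head_dim, bytes_per_elem):
    # K and V each store ctx * n_kv_heads * head_dim elements per layer.
    elems = 2 * n_layers * ctx * n_kv_heads * head_dim
    return elems * bytes_per_elem / 1024**3

# q8_0 packs 32 values into 34 bytes (32 int8 + 2-byte scale) -> 1.0625 B/elem.
Q8_0_BYTES = 34 / 32
F16_BYTES = 2.0

ctx = 65536
n_layers, n_kv_heads, head_dim = 48, 8, 128  # placeholder dims

q8 = kv_cache_gib(ctx, n_layers, n_kv_heads, head_dim, Q8_0_BYTES)
f16 = kv_cache_gib(ctx, n_layers, n_kv_heads, head_dim, F16_BYTES)
print(f"q8_0 KV cache: {q8:.2f} GiB (f16 would be {f16:.2f} GiB)")
```

With these placeholder dims the q8_0 cache comes out to roughly half the f16 footprint, which is the point of the `--cache-type-k q8_0 --cache-type-v q8_0` flags at 64K context.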

Original article: LocalLlama
