WeSearch
TAG · #QWEN3

Qwen3 coverage.

All 25 stories in the WeSearch catalog tagged with #qwen3, listed in publish-time order with view counts. Tag pages update as new stories are ingested; subscribe to the per-tag RSS feed to follow this topic in your reader of choice.

RSS feed for this tag → or search "Qwen3"

RELATED TAGS
#ollama (2) · #claude (2) · #qwen3-6-27b (1) · #coding-model (1) · #open-weight (1) · #llama-cpp (1) · #quantization (1) · #ai (1) · #llms (1) · #agents (1) · #local-models (1) · #coding (1)
LOCALLLAMA

Quant Qwen3.6-27B on 16GB VRAM with 100k context length

I experimented with running Qwen3.6-27B on my laptop with an A5000 16GB GPU. I created my own IQ4_XS GGUF "qwen3.6-27b-IQ4_XS-pure.gguf" with the Unsloth imatrix and compar…

13 views ·
REDDIT

Qwen3.6-35B-A3B KLDs - INTs and NVFPs

KLD for INTs and NVFP4s. AS ALWAYS - Use Case is important. Accuracy versus speed versus native kernels on your GPUs. Things to note again: This is done in VLLM, with REAL logits. …

13 views ·
PRISMML

Bonsai: The First Commercially Viable 1-Bit LLM

Today, we are announcing 1-bit Bonsai models that bring advanced intelligence to the devices where people actually live and work.…

6 views ·
#ai efficiency#edge computing#model compression
LOCALLLAMA

Got DFlash speculative decoding working on Qwen3.5-35B-A3B with an RTX 2080 SUPER 8GB

I managed to get **DFlash speculative decoding** working in llama.cpp on a pretty VRAM-limi…

4 views ·
LOCALLLAMA

Qwen3.6-27B - Closed-loop SVG Images

Yesterday, I saw an impressive presentation of Qwen 3.6 27B's SVG capabilities on the sub. To maximize the model's capabilities in terms of SVG generation, I put together a closed…

3 views ·
XDA

I replaced ChatGPT and Claude with this powerful local LLM and saved over $20 a month while gaining full control

Qwen3.6 runs on my old GPU and does what ChatGPT does for free…

6 views ·
#local llm#ai privacy#cost savings
R/LOCALLLAMA

Follow-up: Qwen3.6-27B on 1× RTX 3090 — pushing to ~218K context + ~50–66 TPS, tool calls now stable (PN12 fix)

5 views ·
X (FORMERLY TWITTER)

Post-trained Qwen3-Coder with a debugger: 70% → 89% solve rate, 59% fewer turns

8 views ·
LOCALLLAMA

[7900XT] Qwen3.6 27B for OpenCode

I'm just looking for some advice on optimally setting up Qwen3.6 27B for OpenCode. The VRAM is a little bit scarce, but I ended up with this so far: llama-server --model models/Qwe…

10 views ·
WILLIAMANGEL

Offline Agentic Coding

Offline Agentic Coding: Ollama and Claude code…

5 views ·
#ai#llms#agents
REDDIT

GBNF grammar tweak for faster Qwen3.6 35B-A3B and Qwen3.6 27B

Hi folks, enjoy an optimised Qwen3.6 35B-A3B and Qwen3.6 27B for coding and general purpose use - it's able to solve puzzles correctly more often too. The initial intent was to optimis…

9 views ·
REDDIT

Used a Claude Code skill to fine-tune Qwen3-1.7B from 327 noisy traces, matches GLM-5

Had 327 production traces from a restaurant-reservation agent I wanted to retrain. The plan was to fine-tune a smaller self-hostable model so I could ditch the frontier-API bill. T…

12 views ·
REDDIT

Luce DFlash: Qwen3.6-27B at up to 2x throughput on a single RTX 3090

Hey fellow Llamas, your time is precious, so I'll keep it short. We built a GGUF port of DFlash speculative decoding. Standalone C++/CUDA stack on top of ggml, runs on a single 24 …

46 views ·
LOCALLLAMA

Simple to use vLLM Docker Container for Qwen3.6 27b with Lorbus AutoRound INT4 quant and MTP speculative decoding - 118 tokens/second on 2x 3090s

8 views ·
REDDIT

Qwen3.6-27B-3bit-mlx · Hugging Face: 3- & 5-bit mixed quant for RAM-poor Mac users.

Just dropped a 3-bit mixed quant (5-bit for embeds and prediction layers) for Mac users. There was only one 3-bit version of this model (from Unsloth), but it was very heavy and pain…

10 views ·
REDDIT

Brief Ngram-Mod Test Results - R9700/Qwen3.6 27B

Decided to try out the new --spec-type ngram-mod feature in llama.cpp using Qwen3.6 27B during an OpenCode bug chasing session. TLDR: Performance is variable, but so far it seems t…

11 views ·
REDDIT

Switched from Qwen3.6 35b-a3b to Qwen3.6 27b mid coding and it's noticeably better!

A bit of context. I was coding up a little html tower defense game where you can alter the path by placing additional waypoints. My setup: 32gb ram with 16gb vram 5070 ti. Using Ae…

12 views ·
SIMON WILLISON'S WEBLOG

Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model

Big claims from Qwen about their latest open weight model: Qwen3.6-27B delivers flagship-level agentic coding performance, s…

14 views ·
#qwen3.6-27b#coding model#open-weight
REDDIT

Qwen3.6-27B-INT4 clocking 100 tps with 256k context length on 1x RTX 5090 via vllm 0.19

Thanks to the community, the Qwen3.6-27B speed keeps getting better. The following improves upon my recipe from yesterday and delivered a whopping 100+ tps (TG). Model: - MTP suppor…

10 views ·
REDDIT

Qwen3.6 35B A3B Heretic (KLD 0.0015!) Incredible model. Best 35B I have found!

Been using this for a few days. It is BY FAR the best uncensored model I have found for Qwen 3.6 35B. With IQ4XS, Q8 KVcache, 262K context, it fits in 24GB of VRAM and does not fai…

10 views ·
REDDIT

Qwen3.5/3.6 Coder?

With practically all of LocalLlama glazing Qwen 3.5/3.6 for its coding skills, along with the fact that Alibaba themselves are focusing on making Qwen a reliable coding agent, doe…

93 views ·
REDDIT

[Qwen3.6 35b a3b] Used the top config for my setup 8gb vram and 32gb ram, and found that somehow the Q4_K_XL model from Unsloth runs just slightly faster and used less tokens for output compared to Q4_K_M despite more memory usage

Config: CtxSize 131,072 · GpuLayers 99 · CpuMoeLayers 38 · Threads 16 · BatchSize/UBatchSize 4096/4096 · CacheType K/V q8_0 · Tool Context: file mode (tools.kilocode.official.md) · Metric M…

8 views ·
REDDIT

Qwen3.6-27B at ~80 tps with 218k context window on 1x RTX 5090 served by vllm 0.19

Qwen3.6-27B has been out for a few days, and the NVFP4 with MTP dropped earlier on HF. You can follow the same recipe I used for Qwen3.5-27B to achieve ~80 tps on a single RTX 5090 at 218k…

10 views ·
REDDIT

Qwen3.6 35b a3b Particle System

Started testing Qwen3.6 35b a3b. I let it code a particle system with my Pi Agent. It made just one little ValueError, but I was impressed by how fast it got it right. Which task are y…

10 views ·
REDDIT

Qwen3.5/3.6 Coder?

8 views ·