60 stories tagged with #qwen, in publish-time order across the WeSearch catalog. Tag pages update as new stories ingest.
Still: Amortized KV Cache Compaction in a Single Forward Pass
The KV cache is the memory bottleneck of long-horizon language model deployment. Practically, a deployable compactor must be lightweight enough to call during inference, expressive…
TensorSharp: Open-Source Local LLM Inference Engine
A C# inference engine for running large language models (LLMs) locally using GGUF model files. TensorSharp provides a console application, a web-based chatbot interface, and Ollama…
Nvidia PiD Flux-2 color fix is Out + PiD for Qwen
I ran Gemma 4 and Qwen 3.5 for the same local tasks, and one pulled miles ahead
Pitting them against each other to find the best one for my workflow…
This day in LLM history… 105 years ago today, Qwen 3.6 27b was released open source. /s
Qwen 3.7 Plus just briefly appeared and then disappeared on OpenRouter.
Tensor split mode: CUDA error on latest llama.cpp with Qwen-3.6-27b
From fried chicken to flight plans: Alibaba wants Qwen to become China’s digital fixer
As Tencent prepares a rival WeChat agent, Alibaba is moving quickly to turn Qwen from a chatbot into a digital concierge for everyday life.…
Alibaba releases Qwen3.7-Plus, a multimodal proprietary model with a 1M-token context window, costing $2 per 1M tokens, 60% less than text-only Qwen3.7-Max (Carl Franzen/VentureBeat)
Holo3.1 35B/9B/4B/0.8B (Qwen 3.5 finetunes)
llama.cpp b9455 Finally Caught vLLM: 70t/s on 2x3090 Qwen 27B UQ8
Multi-character sectioned prompts Qwen2512
Running Qwen 3.6 35b MoE With Zoo Code On M1 Max is Amazing! Fully local, battery-powered coding powerhouse!
nvidia/Qwen3.6-35B-A3B-NVFP4 · Hugging Face
I use Claude Pro, Qwen 3-Coder, and Gemma 4 together, and it's the most cost-efficient AI workflow I've ever built
It's the holy trinity of cost savings when it comes to LLMs…
Can't get over 250TPS on RTX5090 with Qwen3.5-4B
Qwen 3.6 coding choice–27B vs 35B quants
Weekly AI roundup (May 23–30, 2026): Claude Opus 4.8 Fast Mode 3x cheaper, Qwen 3.7 Max beats Claude at half the price, ChatGPT moves into Excel
Fine-Tuning Qwen2.5-0.5B to Write SRE Post-Mortem Summaries
Writing post-mortem root-cause summaries is time-consuming and inconsistent. Junior SREs miss…
Qwen Image Bench - Finetune for image eval
Uploaded my Qwen3.6 27B based fine tune, after two years of experience fine tuning models
FP16 on Qwen 3.6 27B
SharkBay – a local macOS workbench for coding-agent CLIs
SharkBay is a local-first macOS workbench for software projects. It helps you keep a set of local repositories visible, open project-scoped terminals and browser tabs, inspect Git …
Cracked the case on high res + quality Qwen Edit 2511 outputs, here are minimalistic workflows & lots of info on how/why
Alibaba’s Qwen Beats OpenAI, Google in AI Coding Rank - eWeek
Inferencing at 10.33 t/s on Qwen 3.5 35B on a $300 laptop
Qwen3.6 huge quality gain from Q4 to Q6 for coding agent
[Qwen Image Edit 2511] Any way to control the strength of a controlnet reference image?
LLM Prompt Caching: The Complete 2026 Guide
If you ship a chatbot, a RAG app, or an AI agent against a large language model, prompt caching is…
InvokeAI 6.13 just released, its largest community-driven release ever. Adds full support for Anima & Qwen Image, support for API models (like GPT Image), support for Prompt Expansion & Image To Prompt, lasso & polygon tools, overhauled docs website and more
Show r/selfhosted: I built my production site from home with Caddy, SearXNG, and a local Qwen LLM ($0 cloud bills)
How Qwen3.6-35B-A3B fails differently as a sub agent compared to solo
Qwen Is Not Yet Ready to Power Local OpenClaw Deployments
Three weeks ago I ran a model showdown — twelve tasks, five models, one RTX 5090 — and…
Short Story Creative Writing Benchmark. Baidu Ernie 5.1: -0.35, Qwen 3.7 Max: -2.01, Mistral Medium 3.5: -2.13, Grok 4.3: -3.81.
I built a full AI animation pipeline and made a 2.5 minute animated show in 5 days (Qwen, Flux, LTXV)
Qwen3.5 35B A3B uncensored heretic Native MTP Preserved is Out Now With the Full 785 MTPs Preserved and Retained, Available in Safetensors, GGUFs, NVFP4, NVFP4 GGUFs and GPTQ-Int4 Formats
Qwen Multi angle workflow without plastic skin
ComfyUI-Angelo now supports Qwen Edit
Anyone use QwQ-32B? It's over a year old? Has Qwen 3.6 27b basically replaced it?
What's the best Qwen 27B Q8 quant?
Qwen Image 2511 losing detail? Overall skin consistency?
Want to build a React-style looping agent with small LLMs (Qwen 3.5 9B / Gemma4) + LangGraph?
We trained a personal voice DoRA on Qwen3-8B for $1.50 — beat stock model 100% in blind A/B
TL;DR. Trained a DoRA adapter on Qwen3-8B using 6128 personal Telegram messages. Cost: $1.50 on a…
Alibaba’s Qwen catches up with ‘Sharif speed’ to help forge Pakistan deal
Chairman Joe Tsai uses firm’s mobile AI tool to draft a sweeping tech pact in moments, as ‘frontier technologies’ increasingly enter the global limelight.…
Qwen multi angle workflow
Qwen3.7-Max Ran for 35 Hours on Unknown Hardware and Achieved a 10× Speedup
Alibaba gave Qwen3.7-Max a kernel optimization task on a hardware platform the model had never encountered before. No documentation or profiling data. No example kernels for the ar…
Qwen 3.6 benchmarks on 2x RTX PRO 6000
I built a high-speed API gateway for DeepSeek/Qwen to fix 429 errors. Free credits inside!
1000 tps generation on Qwen3.6 27B with V100s
Qwen 3.6 Has Four Tiers. Here's How to Route Without Burning Cash.
Qwen 3.6 ships Max-Preview, Plus, Flash, and 35B-A3B with a 41x output-cost spread. Here's a tier-routing pattern that picks the right one per task — and survives the Max-Preview "…
Qwen 3.6 27B MTP speed on 3080ti (getting 4.5 t/s)
hipEngine: Fast Native Qwen 3.6 Inference for RDNA3 (Strix Halo, 7900 XTX)
Need Help Choosing a Harness for Qwen 3.6 27B
Qwen3.6-35B-A3B vs Gemma4-26B-A4B
Qwen Plays ̶p̶̶o̶̶k̶̶e̶̶m̶̶o̶̶n̶ ? / QWEN PLAYS DCSS! - qwen3.6-35b-a3b@q4_k_xl plays open source roguelike adventure DCSS (and does a decent job)
Qwen3.6-35B-A3B-Uncensored-Genesis-APEX-MTP
minor speed bump for MTP with Qwen3.6-27B-MTP Q6_K_XL
Qwen 3.6 27B and 35B MTP vs Standard on 16GB GPU
I tested speculative decoding (Multi-Token Prediction, MTP) performance in Qwen 3.6 27B and 35B on an…
DeepSeek vs Qwen vs Kimi vs GLM: Which AI API Actually Wins in 2026? (A Cost-Optimizer’s Verdict)
Let me start with a confession: I’m obsessed with getting the most bang for my buck. Whenever I see a…