Simple-to-use vLLM Docker container for Qwen3.6 27B with Lorbus AutoRound INT4 quant and MTP speculative decoding: 118 tokens/second on 2x 3090s
Original article
LocalLlama