4 results for "local inference"
TOM'S HARDWARE
Ubuntu's AI roadmap revealed, universal AI 'kill switch' and forced AI integration are not part of the plan — cloud tracking, local inference, and agentic system tools take center stage
AI is coming to Ubuntu…
REDDIT
Skymizer Taiwan Inc. Unveils Breakthrough Architecture Enabling Ultra-Large LLM Inference on a Single Card
Source Article excerpt: With a single PCIe card — powered by six HTX301 chips and 384 GB of memory — enterprises can now run 700B-parameter model inference locally at just ~240W per card. The memory-b…
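A quick back-of-envelope check (my own arithmetic, not from the article) of why those numbers are plausible together: at 4-bit weights, an assumption the excerpt doesn't confirm, a 700B-parameter model occupies about 350 GB, which just fits in the card's 384 GB. The bandwidth figure below is a pure placeholder, since the HTX301's spec isn't given.

```python
# Sanity-check the excerpt's claim: 700B params on a 384 GB card.
PARAMS = 700e9          # 700B parameters (from the excerpt)
BITS_PER_WEIGHT = 4     # assumption: 4-bit quantized weights
CARD_MEMORY_GB = 384    # from the excerpt

weights_gb = PARAMS * BITS_PER_WEIGHT / 8 / 1e9   # ~350 GB
print(f"weight footprint: {weights_gb:.0f} GB of {CARD_MEMORY_GB} GB")

# In a memory-bound decode, each token reads every weight once,
# so tokens/s is roughly bandwidth / bytes-per-token.
assumed_bandwidth_gbps = 1000.0   # hypothetical aggregate GB/s, not a spec
print(f"~{assumed_bandwidth_gbps / weights_gb:.1f} tok/s at {assumed_bandwidth_gbps:.0f} GB/s")
```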
REDDIT
AMD Hipfire - a new inference engine optimized for AMD GPUs
Came across hipfire the other day. It's a brand-new inference engine focused on all AMD GPUs (not just the latest). GitHub. It uses a special mq4 quantization method. The hipfire creator is pumping o…
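The excerpt only names "mq4" without describing it. For orientation, here is a minimal sketch of generic symmetric 4-bit group quantization, the pattern most q4-style schemes build on; the function names and group size are my own, not hipfire's.

```python
import numpy as np

GROUP = 32  # hypothetical group size, not documented for mq4

def quantize_q4(w: np.ndarray):
    """Quantize a 1-D fp32 weight vector in groups of GROUP values."""
    w = w.reshape(-1, GROUP)
    # One scale per group, mapping the group's max magnitude to int4 range.
    scale = np.maximum(np.abs(w).max(axis=1, keepdims=True) / 7.0, 1e-12)
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_q4(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return (q.astype(np.float32) * scale).reshape(-1)

w = np.random.randn(128).astype(np.float32)
q, s = quantize_q4(w)
print("max abs error:", np.abs(dequantize_q4(q, s) - w).max())
```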
LOCALLLAMA
GLM 5.1 Locally: 40tps, 2000+ pp/s
After some sglang patching and countless experiments, I managed to get the REAP-ed NVFP4 version running stable and FAST on 4x RTX 6000 Pros (power-limited to 350W). Very happy with the performance and quality. Infe…
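A minimal sketch of the kind of setup the post describes: cap the four cards' power draw, then serve the model with sglang tensor parallelism. The model path is a placeholder, and since the post says custom sglang patches were needed, stock flags alone won't reproduce the NVFP4 result.

```python
import subprocess

# Power-limit each of the 4 GPUs to 350 W (nvidia-smi -pl needs root).
for i in range(4):
    subprocess.run(["nvidia-smi", "-i", str(i), "-pl", "350"], check=True)

# Launch the sglang server, sharding the model across all 4 GPUs.
subprocess.run([
    "python", "-m", "sglang.launch_server",
    "--model-path", "/models/glm-placeholder",  # hypothetical path
    "--tp-size", "4",                           # tensor parallelism degree
    "--port", "30000",
], check=True)
```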