Kimi K2.5 runs on RTX 3060 with 768GB Intel Optane memory at 4 tokens per second

Editorial Team · May 24, 2026 · 6:16 AM UTC · 2 min read

#artificial intelligence #hardware #technology

TL;DR · WeSearch summary

A Chinese AI enthusiast ran the Kimi K2.5 model on an Nvidia RTX 3060 GPU paired with 768 GB of Intel Optane memory. The setup generated four tokens per second, showing that a trillion-parameter model can run on consumer hardware built from second-hand parts. The experiment suggests that legacy components can run advanced AI models that are usually reserved for high-end infrastructure.

Key facts

▪ The Kimi K2.5 model has 1 trillion parameters in total but activates only 32 billion of them for each token it generates.
▪ The full model is approximately 630 GB, which is why the build uses 768 GB of Intel Optane Persistent Memory to hold it.
▪ APFrisco's demonstration was notable because it used a mid-range GPU designed for gaming rather than for AI workloads.
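
The figures in the key facts can be cross-checked with some quick arithmetic. The sketch below is illustrative only and is not taken from the original article: it assumes 1 GB = 10^9 bytes, treats the 630 GB as spread evenly across all parameters, and assumes every active weight is read from memory once per token with no caching on the GPU. The function name `footprint_summary` is hypothetical.

```python
# Figures quoted in the key facts above.
TOTAL_PARAMS = 1e12       # 1 trillion parameters in total
ACTIVE_PARAMS = 32e9      # 32 billion parameters active per token
MODEL_SIZE_GB = 630       # approximate full model size
OPTANE_GB = 768           # installed Intel Optane Persistent Memory
TOKENS_PER_SECOND = 4     # reported generation speed


def footprint_summary(total_params, active_params, model_gb, memory_gb, tok_per_s):
    """Back-of-the-envelope numbers derived from the quoted figures.

    Assumptions (not from the article): 1 GB = 1e9 bytes, weights are
    evenly sized across parameters, and each active weight is read
    from memory once per generated token.
    """
    bytes_per_param = model_gb * 1e9 / total_params
    active_gb_per_token = active_params * bytes_per_param / 1e9
    return {
        "bits_per_param": bytes_per_param * 8,
        "active_fraction": active_params / total_params,
        "active_gb_per_token": active_gb_per_token,
        "headroom_gb": memory_gb - model_gb,
        # Rough upper bound on weight traffic, not a measured bandwidth.
        "implied_read_gb_per_s": active_gb_per_token * tok_per_s,
    }


summary = footprint_summary(
    TOTAL_PARAMS, ACTIVE_PARAMS, MODEL_SIZE_GB, OPTANE_GB, TOKENS_PER_SECOND
)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the model averages roughly 5 bits per parameter, about 3.2% of the parameters are active for each token, and the 768 GB of memory leaves about 138 GB of headroom over the 630 GB model. The implied read rate of around 80 GB/s is only an estimate of how much weight data would move at four tokens per second if nothing were cached.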

Original article

Crypto Briefing · Editorial Team

Opening excerpt (first ~120 words)

A Chinese AI enthusiast squeezed a trillion-parameter model onto consumer hardware using second-hand memory DIMMs, and the implications go far beyond the stunt itself.

A trillion-parameter AI model just ran on a graphics card that most gamers would…

Excerpt limited to ~120 words for fair-use compliance. The full article is at Crypto Briefing.
