Running 35B–400B LLMs on a GPU-less Cluster to Mine 10,000 Papers — and the 4 Bugs That Almost Ruined the Data
A team built a CPU-only distributed LLM pipeline to extract structured data from roughly 10,000 research papers. Four silent data-quality bugs surfaced along the way and nearly compromised the dataset. The architecture used open-source tools and showed that LLM extraction is feasible without GPUs when correctness is prioritized over speed.
- The team operated an internal research cluster with older x86 servers and no GPUs.
- They aimed to extract structured data for meta-analysis from approximately 10,000 full-text research papers.
- The architecture included a MoE model and a vector database, relying solely on CPU resources.
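On a CPU-only cluster, system RAM rather than GPU memory bounds which models can be loaded, so a rough weight-size estimate helps put the 35B–400B range from the title in context. The sketch below is a back-of-the-envelope calculation, not the team's method: the function name `weight_memory_gib` is hypothetical, the bit widths are illustrative assumptions (this summary does not state which quantization levels were used), and the estimate covers weights only, ignoring the KV cache and runtime overhead.

```python
def weight_memory_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate RAM needed for model weights alone, in GiB.

    Assumes every parameter is stored at the same bit width, which real
    quantization formats only approximate.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Parameter counts come from the article title; bit widths are assumed.
    for params in (35, 400):
        for bits in (16, 8, 4):
            size = weight_memory_gib(params, bits)
            print(f"{params}B parameters at {bits}-bit: ~{size:.0f} GiB")
```

Under these assumptions, a 35B model at 4 bits needs roughly 16 GiB for weights, while a 400B model at the same width needs on the order of 186 GiB. For a MoE model the full weight set still has to be resident in memory even though only a subset of experts is active per token.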
Opening excerpt (first ~120 words)
byeongsoo kang. Posted on Jun 3. Originally published at bric.pe.kr.
#llm #machinelearning #python #infrastructure
A field report from building a CPU-only, distributed LLM pipeline for large-scale scientific literature extraction. No GPUs. A lot of quantization. And four silent data-quality bugs that taught me more than the happy path ever did.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at DEV.to.