RAG in Practice — Part 8: RAG in Production — What Breaks After Launch

Why production RAG drifts, degrades, and quietly fails — and the patterns and discipline that prevent it.

Original article
DEV Community

Gursharan Singh
Posted on Apr 28

#rag #ai #architecture #webdev

RAG Article Series (8 Part Series)
1 RAG in Practice — Part 1: Why AI Gets Things Wrong
2 RAG in Practice — Part 2: What RAG Is and Why It Works
3 RAG in Practice — Part 3: How RAG Works — The Complete Pipeline
4 RAG in Practice — Part 4: Chunking, Retrieval, and the Decisions That Break RAG
5 RAG in Practice — Part 5: Build a RAG System in Practice
6 RAG in Practice — Part 6: RAG, Fine-Tuning, or Long Context?
7 RAG in Practice — Part 7: Your RAG System Is Wrong. Here's How to Find Out Why.
8 RAG in Practice — Part 8: RAG in Production — What Breaks After Launch

Part 8 of 8 — RAG Article Series
Previous: Your RAG System Is Wrong. Here's How to Find Out Why. (Part 7)

The System That Stopped Being Right

TechNova's RAG system was correct at launch. Three months later, it was confidently wrong. The return policy had changed. The firmware changelog had new versions. The warranty terms had been revised. The documents in the CMS were current. The chunks in the vector index were not.

A production RAG system does not fail all at once. It drifts, degrades quietly, and keeps sounding confident while its retrieval quality gets worse. The model does not know the data is stale. The retriever does not know the documents changed. The user sees the same fluent, authoritative tone delivering answers that were right last quarter.

Most RAG systems that fail in production fail because of stale data, not bad models. That is the operational opinion this article is built around.
Data Freshness and Embedding Drift

The TechNova scenario from the opening is not hypothetical. Every RAG system with changing source data will face this problem. The question is not whether the index will go stale. It is whether you will detect it before your users do.

Three re-indexing strategies, in order of complexity.

Scheduled re-indexing: re-run the full ingestion pipeline on a fixed cadence (nightly or weekly) or after every document update. Simple, reliable, and sufficient for most teams.

Incremental re-indexing: detect which documents changed and re-embed only those chunks. Faster and cheaper, but requires change-detection logic.

Event-driven re-indexing: trigger re-indexing automatically when documents are updated in the CMS (content management system). The most responsive, but the most complex to build and operate.

Document freshness is only half of the story. Embedding models change too. If you switch from one embedding model to another, the vectors already stored in your index come from a different vector space and cannot be reliably compared with query vectors from the new model, even if the documents themselves never changed. That is its own form of drift. When a provider deprecates a model or you upgrade for quality or cost reasons, re-embedding the corpus is not optional. It is a full re-indexing event.

Over time, drift is not only about stale documents. Index drift can also come from changed chunk boundaries, new metadata rules, or embedding-model changes that quietly alter retrieval behavior.

Whichever strategy you choose, the diagnostic signal from Part 7 applies here:…
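
As a rough illustration of the incremental strategy described above, the sketch below uses content hashes for change detection and stores the embedding model's name next to each hash, so that switching models marks every document for re-embedding. The record schema, the function names (`content_hash`, `plan_reindex`), the document IDs, the sample texts, and the model labels `embed-v1` and `embed-v2` are all hypothetical and not taken from the article. A real pipeline would likely track chunks rather than whole documents and read these records from the vector store's metadata.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedRecord:
    """What the index remembers about one document (hypothetical schema)."""
    content_hash: str
    embedding_model: str


def content_hash(text: str) -> str:
    """Return a stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(
    source_docs: dict[str, str],
    indexed: dict[str, IndexedRecord],
    current_model: str,
) -> dict[str, list[str]]:
    """Compare source documents with index records and group IDs by action.

    "embed":  new, changed, or embedded with a different model
    "delete": present in the index but no longer in the source
    "skip":   unchanged and embedded with the current model
    """
    plan: dict[str, list[str]] = {"embed": [], "delete": [], "skip": []}
    for doc_id, text in source_docs.items():
        record = indexed.get(doc_id)
        if (
            record is None
            or record.content_hash != content_hash(text)
            or record.embedding_model != current_model
        ):
            plan["embed"].append(doc_id)
        else:
            plan["skip"].append(doc_id)
    # Anything the index still holds that the source dropped is stale.
    plan["delete"] = [doc_id for doc_id in indexed if doc_id not in source_docs]
    return plan


if __name__ == "__main__":
    # Made-up sample data: one changed, one unchanged, one new, one removed.
    docs = {
        "return-policy": "return policy, revision B",
        "warranty": "warranty terms, revision A",
        "firmware-changelog": "changelog, revision A",
    }
    index = {
        "return-policy": IndexedRecord(content_hash("return policy, revision A"), "embed-v1"),
        "warranty": IndexedRecord(content_hash("warranty terms, revision A"), "embed-v1"),
        "old-faq": IndexedRecord(content_hash("retired page"), "embed-v1"),
    }
    print(plan_reindex(docs, index, "embed-v1"))
    # Same documents, different model: every source document is re-embedded.
    print(plan_reindex(docs, index, "embed-v2"))
```

With the same model, only the changed and new documents land in `embed` and the removed page lands in `delete`. With a different model label, nothing is skipped, which matches the point that a model change is a full re-indexing event. Keeping the model name in the record is one possible design choice: it lets the same comparison catch both kinds of drift.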

This excerpt is published under fair use for community discussion. Read the full article at DEV Community.
