60 stories tagged with #retrieval, in publish-time order across the WeSearch catalog. Tag pages update as new stories are ingested.
My RAG pipeline couldn't find the CEO — here's how I fixed it with hybrid retrieval
In my last post, I built a RAG pipeline from scratch: no LangChain, just FastAPI + FAISS. It scored…
Retrieval Found the Sensitive Memory. That Made It More Dangerous.
This continues the research on why relevance alone is insufficient for agent memory safety. Article…
Authorization Before Retrieval: Making RAG Safe by Construction
Retrieval-augmented generation makes language models far more useful by grounding them in real data. But it also raises a hard question: who is allowed to see what? This post shows…
Embeddings Aren’t Magic: The Predictable Failure Modes of RAG Retrieval
Enterprise Document Intelligence [Vol. 1 #2] Why the same vector search that handles synonyms and paraphrase silently fails on negation, exact identifiers, and your company’s acron…
ON1 (G116 V8): 38μs Black-Box AI Memory Retrieval on Virtual Chip ISA
G116 v8: 38μs Black-box AI Memory Retrieval on Virtual Chip ISA (Latency-Separated Fetch/Compute/ANN) — Live Tunnel Inside - ON1-Hao/ON1…
A graph-theoretic approach to building reliable LLM judges for retrieval
Georgian x turbopuffer: Evaluating Retrieval Without Ground Truth…
📄Paper: RORA-VLM: Robust Retrieval Augmentation for Vision Language Models
Published at International Conference on Learning Representations (ICLR) 2025 💡 Why I read…
Dual Encoder vs Cross-Encoder: Why Your RAG Pipeline Needs Both
My RAG pipeline looked fine on paper. Fast retrieval. Decent cosine scores. But when I tested it with…
A file-level tree that lets an LLM reason over a document corpus
Introducing PageIndex File System: the vectorless retrieval engine now scales to millions of documents in a single index.…
I made a small tool to inspect retrieval results before feeding them into RAG
Does Engram Do Memory Retrieval in Autoregressive Image Generation?
From Norms to Indicators (N2I-RAG): An Agentic Retrieval-Augmented Generation Framework for Legal Indicator Computation
Computing legal indicators from normative texts is a key task in legal monitoring and policy evaluation, but presents significant challenges due to the complexity, scale, and inter…
Detecting Is Not Resolving: The Monitoring Control Gap in Retrieval Augmented LLMs
Retrieval-augmented LLMs are deployed for tasks where evidence quality determines action safety, yet evaluation protocols assume that single-turn robustness predicts robustness whe…
Natural Language Query to Configuration for Retrieval Agents
Modern retrieval agents expose many configuration choices -- LLM, retriever, number of documents, number of hops, and synthesis strategy -- each shaping both answer quality and ser…
We reduced RAG retrieval cost 10× with a hippocampus-inspired memory substrate
Build smarter. Grow faster. AI systems for ambitious businesses.…
RAG - Sparse Embedding
Sparse means thinly spread, scattered, or not dense. In sparse embeddings, chunks are converted into…
What is RAG? A Beginner's Guide to Retrieval-Augmented Generation (For Engineers Who Actually Build It)
RAG sounds complicated. It's not. But a lot of introductions to RAG make it sound more mysterious…
Layered retrieval beats grep alone for LLM-generated engineering docs
Empirical study: layered retrieval (typed→semantic→grep) scores 0.954 for LLM-generated engineering artifacts. 5 conditions, 3 model tiers, 36 generated ADRs, 23 score files. - rdu…
TIGER: Text-Informed Generalized Enzyme-Reaction Retrieval
Enzyme-reaction retrieval is a fundamental problem in computational biology, underpinning enzyme characterization, reaction mechanism elucidation, and the rational design of metabo…
Privacy-Preserving Local Language Models for Longitudinal Data Retrieval in Chronic Dermatologic Disease: Implementation in Pemphigus Patients
Chronic dermatologic diseases such as pemphigus require long-term follow-up, generating extensive longitudinal clinical documentation that is difficult to review comprehensively du…
Aiki my local Wikipedia Retrieval-Augmented Generation system [R]
Why prompt debt, retrieval debt, and evaluation debt are quietly reshaping enterprise AI risk
RAG Explained: How Retrieval-Augmented Generation Actually Works
A visual walkthrough of RAG's two pipelines — ingestion and query — covering chunking, embeddings, vector databases, and why it beats sending all your text to an LLM.…
LFRAG: Layout-oriented Fine-grained Retrieval-Augmented Generation on Multimodal Document Understanding
Multimodal Retrieval-Augmented Generation (RAG) has emerged as an effective paradigm for enhancing Large Language Models (LLMs) with external knowledge. However, existing multimoda…
RAG4Outcome: A Retrieval-Augmented Multimodal Framework for Prognostic Prediction in Chronic Osteomyelitis
Chronic osteomyelitis presents substantial prognostic challenges due to its high recurrence risk and complex postoperative recovery trajectories. Traditional assessment often relie…
ObjectCache: Layerwise Object-Storage Retrieval for KV Cache Reuse
Prefix KV caching has become a key mechanism in LLM serving: it reduces time to first token (TTFT) by avoiding redundant computation across requests that share a prefix (i.e., the …
A measurement substrate for agentic Kubernetes operations: Methodology and a case study in retrieval-compounding falsification
Empirical claims about autonomous Kubernetes operations agents are largely unfalsifiable. Published work reports observational results without controlled comparisons against an age…
When recall plateaus: the late-interaction technique most teams skip
A founder we work with had been stuck on the same problem for two months. Their RAG retrieval recall…
Long-Context Models Killed RAG. Except for the 6 Cases Where They Made It Worse.
Long-context didn't kill retrieval. It buried it in cases where retrieval still beats a 1M token window on accuracy, not just price.…
AI Visibility Engineering Glossary – AEO, GEO, LLM Retrieval
51 canonical definitions for the AI Visibility Engineering discipline. The official terminology reference for the AIMENSION™ Protocol by Axon System.…
Photo retrieval from iPhone
From Manual RAG to Real Retrieval — Embedding-Based RAG with NVIDIA NIM
Replace hardcoded context with real retrieval using NVIDIA's nv-embedqa-e5-v5 embedding model. Cosine similarity, the query vs passage input distinction most beginners get wrong, n…
I built a self-hosted RAG system for Journalism — What Production Retrieval Taught Me
Over the last few months, I built Atlas, a fully self-hosted retrieval system designed for…
AgentCo-op: Retrieval-Based Synthesis of Interoperable Multi-Agent Workflows
Designing multi-agent workflows is especially difficult in open-ended scientific settings where tasks lack curated training sets, reliable scalar evaluation metrics, and standardiz…
Efficient Table QA via TableGrid Navigation and Progressive Inference Prompting
Large Language Models (LLMs) have shown promising results on NLP tasks, however, their performance on tabular data still needs research attention, because Table Question-Answering …
Retrieval-Augmented Long-Context Translation for Cultural Image Captioning: Gators submission for AmericasNLP 2026 shared task
We present the University of Florida Gators submission to the AmericasNLP 2026 shared task on cultural image captioning for Indigenous languages. Our two-stage pipeline generates a…
DIVE: Embedding Compression via Self-Limiting Gradient Updates
High-dimensional embeddings from large language models impose significant storage and computational costs on vector search systems. Recent embedding compression methods, including …
GraphRAG on Consumer Hardware: Benchmarking Local LLMs for Healthcare EHR Schema Retrieval
Graph-based Retrieval Augmented Generation (GraphRAG) extends retrieval-augmented generation to support structured reasoning over complex corpora, but its reliability under resourc…
I rebuilt my Financial Mentor retrieval from scratch. Here's everything the RAG stack taught me
From stuffing JSON into Claude to GraphRAG, hybrid search, CRAG, and adversarial evaluation: the…
Man joins father in jail after being found guilty of deadly cocaine retrieval
A young Gold Coast man has joined his father in jail after being found guilty of a deadly underwater cocaine retrieval at Newcastle.…
DOTRAG: Retrieval-Time Reasoning Along Paths
Graph Retrieval-Augmented Generation (GraphRAG) is dominated by a retrieve-then-reason paradigm, where context is retrieved using heuristics and then reasoned over. Such methods st…
ALDEN: Boosting Private Data Extraction from Retrieval-Augmented Generation Systems via Active Learning and Distribution Estimation
Retrieval-Augmented Generation (RAG) is widely used to augment large language models with external knowledge retrieval to improve reliability and generalization. However, recent st…
Query-Conditioned Graph Retrieval for Contextualized LLM Reasoning in Personalized Wearable Data
Large language models (LLMs) are increasingly applied to analyzing wearable sensing data, which are long-term, multimodal, and highly personalized. A key challenge is context selec…
STAR: Semantic-Tuned and Tail-Adaptive Retriever for Graph-Augmented Generation
To augment Large Language Models (LLMs) for multi-hop question answering, a mainstream solution within Graph Retrieval Augmented Generation (GraphRAG) leverages lightweight retriev…
Retrieve Only Relevant Tables Whether Few or Many: Adaptive Table Retrieval Method
Retrieving relevant tables from extensive databases for a given natural language query is essential for accurately answering questions in tasks such as text-to-SQL. Existing table …
DualView: Adaptive Local-Global Fusion for Multi-Hop Document Reranking
Multi-hop question answering requires aggregating information from multiple documents, a critical capability for knowledge-intensive applications. A fundamental challenge lies in e…
ClusterRAG: Cluster-Based Collaborative Filtering for Personalized Retrieval-Augmented Generation
Personalized Retrieval-Augmented Generation (RAG) relies on accurately selecting user-relevant documents. In practice, existing RAG approaches often suffer from high retrieval cost…
Agentic GraphRAG: Navigating Unstructured Financial Data with Collaborative AI
We present a collaborative agentic GraphRAG framework for expert analysis of commercial registry data. Public registries are often formally accessible, yet difficult to use in prac…
Improving Retrieval-Augmented Generation without Taxonomy-based Error Categorization
Retrieval-Augmented Generation (RAG) improves the factual accuracy of large language model (LLM) outputs by grounding generation in external knowledge. Recent agentic RAG systems e…
M3DocDep: Multi-modal, Multi-page, Multi-document Dependency Chunking with Large Vision-Language Models
In long, multi-page industrial documents, retrieval-augmented generation (RAG) depends heavily on whether chunk boundaries follow the document's true structure. Existing text-centr…
Query-Aware Flow Diffusion for Graph-Based RAG with Retrieval Guarantees
Graph-based Retrieval-Augmented Generation (RAG) systems leverage interconnected knowledge structures to capture complex relationships that flat retrieval struggles with, enabling …
Mask-to-Correct$^+$: Leveraging Retriever Diversity for Masking-guided Faithful Fact Correction
The rapid spread of misinformation on social media highlights the need for robust, automated fact correction frameworks. However, existing works rely on supervised learning from ma…
A Reproducibility Analysis of PO4ISR: Diagnosing and Mitigating Semantic Drift in LLM-Based Session Recommendation
Reasoning-based Large Language Models (LLMs) like PO4ISR have set new benchmarks in session-based recommendation. However, the reproducibility of their reasoning capabilities acros…
RecoAtlas: From Semantic Plausibility to Set-Level Utility in LLM Recommendation Agents
LLM recommendation agents increasingly produce structured recommendation reports: sets of items accompanied by natural-language justifications. Yet existing evaluations often reduc…
KadiAssistant: A conversational AI Agent for information retrieval in Kadi4Mat
We introduce KadiAssistant, a privacy-by-design AI assistant integrated into the Kadi research data ecosystem, enabling researchers to efficiently access, aggregate, and synthesize…
The 99% Success Paradox: When Near-Perfect Retrieval Equals Random Selection
For most of the history of information retrieval (IR), search results were designed for human consumers who could scan, filter, and discard irrelevant information on their own. Thi…
Show HN: Nano-RAG – Agentic multi-hop retrieval without graph database
RAGA: Reading-And-Graph-building-Agent for Autonomous Knowledge Graph Construction and Retrieval-Augmented Generation
Existing LLM-driven knowledge graph (KG) construction methods predominantly employ stateless batch processing pipelines, exhibiting structural deficiencies in cross-chunk semantic …
Surface-Form Neural Sparse Retrieval: Robust Fuzzy Matching for Industrial Music Search
Music search at the scale of Amazon Music presents a unique challenge: queries frequently deviate from indexed metadata due to misspellings, transpositions, and phonetic variations…
LAST-RAG: Literature-Anchored Stochastic Trajectory Retrieval-Augmented Generation for Knowledge-Conditioned Degradation Model Selection
Stochastic-process-based degradation modeling is a core approach for estimating the distribution of remaining useful life (RUL); however, the selection of an appropriate stochastic…