26 stories tagged with #information-retrieval, in publish-time order across the WeSearch catalog. Tag pages update as new stories ingest.
RAG - Sparse Embedding
Sparse means thinly spread, scattered, or not dense. In sparse embeddings, chunks are converted into…
LFRAG: Layout-oriented Fine-grained Retrieval-Augmented Generation on Multimodal Document Understanding
Multimodal Retrieval-Augmented Generation (RAG) has emerged as an effective paradigm for enhancing Large Language Models (LLMs) with external knowledge. However, existing multimoda…
Efficient Table QA via TableGrid Navigation and Progressive Inference Prompting
Large Language Models (LLMs) have shown promising results on NLP tasks; however, their performance on tabular data still needs research attention, because Table Question-Answering …
DIVE: Embedding Compression via Self-Limiting Gradient Updates
High-dimensional embeddings from large language models impose significant storage and computational costs on vector search systems. Recent embedding compression methods, including …
DOTRAG: Retrieval-Time Reasoning Along Paths
Graph Retrieval-Augmented Generation (GraphRAG) is dominated by a retrieve-then-reason paradigm, where context is retrieved using heuristics and then reasoned over. Such methods st…
ALDEN: Boosting Private Data Extraction from Retrieval-Augmented Generation Systems via Active Learning and Distribution Estimation
Retrieval-Augmented Generation (RAG) is widely used to augment large language models with external knowledge retrieval to improve reliability and generalization. However, recent st…
Query-Conditioned Graph Retrieval for Contextualized LLM Reasoning in Personalized Wearable Data
Large language models (LLMs) are increasingly applied to analyzing wearable sensing data, which are long-term, multimodal, and highly personalized. A key challenge is context selec…
STAR: Semantic-Tuned and Tail-Adaptive Retriever for Graph-Augmented Generation
To augment Large Language Models (LLMs) for multi-hop question answering, a mainstream solution within Graph Retrieval Augmented Generation (GraphRAG) leverages lightweight retriev…
Retrieve Only Relevant Tables Whether Few or Many: Adaptive Table Retrieval Method
Retrieving relevant tables from extensive databases for a given natural language query is essential for accurately answering questions in tasks such as text-to-SQL. Existing table …
DualView: Adaptive Local-Global Fusion for Multi-Hop Document Reranking
Multi-hop question answering requires aggregating information from multiple documents, a critical capability for knowledge-intensive applications. A fundamental challenge lies in e…
ClusterRAG: Cluster-Based Collaborative Filtering for Personalized Retrieval-Augmented Generation
Personalized Retrieval-Augmented Generation (RAG) relies on accurately selecting user-relevant documents. In practice, existing RAG approaches often suffer from high retrieval cost…
Agentic GraphRAG: Navigating Unstructured Financial Data with Collaborative AI
We present a collaborative agentic GraphRAG framework for expert analysis of commercial registry data. Public registries are often formally accessible, yet difficult to use in prac…
Improving Retrieval-Augmented Generation without Taxonomy-based Error Categorization
Retrieval-Augmented Generation (RAG) improves the factual accuracy of large language model (LLM) outputs by grounding generation in external knowledge. Recent agentic RAG systems e…
M3DocDep: Multi-modal, Multi-page, Multi-document Dependency Chunking with Large Vision-Language Models
In long, multi-page industrial documents, retrieval-augmented generation (RAG) depends heavily on whether chunk boundaries follow the document's true structure. Existing text-centr…
Query-Aware Flow Diffusion for Graph-Based RAG with Retrieval Guarantees
Graph-based Retrieval-Augmented Generation (RAG) systems leverage interconnected knowledge structures to capture complex relationships that flat retrieval struggles with, enabling …
Mask-to-Correct$^+$: Leveraging Retriever Diversity for Masking-guided Faithful Fact Correction
The rapid spread of misinformation on social media highlights the need for robust, automated fact correction frameworks. However, existing works rely on supervised learning from ma…
A Reproducibility Analysis of PO4ISR: Diagnosing and Mitigating Semantic Drift in LLM-Based Session Recommendation
Reasoning-based Large Language Models (LLMs) like PO4ISR have set new benchmarks in session-based recommendation. However, the reproducibility of their reasoning capabilities acros…
RecoAtlas: From Semantic Plausibility to Set-Level Utility in LLM Recommendation Agents
LLM recommendation agents increasingly produce structured recommendation reports: sets of items accompanied by natural-language justifications. Yet existing evaluations often reduc…
KadiAssistant: A conversational AI Agent for information retrieval in Kadi4Mat
We introduce KadiAssistant, a privacy-by-design AI assistant integrated into the Kadi research data ecosystem, enabling researchers to efficiently access, aggregate, and synthesize…
The 99% Success Paradox: When Near-Perfect Retrieval Equals Random Selection
For most of the history of information retrieval (IR), search results were designed for human consumers who could scan, filter, and discard irrelevant information on their own. Thi…
SD-Search: On-Policy Hindsight Self-Distillation for Search-Augmented Reasoning
Search-augmented reasoning agents interleave internal reasoning with calls to an external retriever, and their performance relies on the quality of each issued query. However, unde…
Retrieval vs Representation in Knowledge Systems
Most modern knowledge systems optimize retrieval, and that is understandable. Search is visible, easy…
X-SYNTH: Beyond Retrieval -- Enterprise Context Synthesis from Observed Human Attention
In enterprise operations, the context required for an AI agent task is scattered across systems of record, static information stores, and communication channels. What is stored is …
Agent4POI: Agentic Context-Conditioned Affordance Reasoning for Multimodal Point-of-Interest Recommendation
We introduce Agent4POI, the first POI recommendation framework that generates context-conditioned multimodal representations at recommendation time, rather than relying on static P…
Fortress: A Case Study in Stabilizing Search Recommendations via Temporal Data Augmentation and Feature Pruning
In search and recommendation systems, predictive models often suffer from temporal instability when certain input features introduce volatility in output scores. This instability c…
Differentially Private Motif-Preserving Multi-modal Hashing
Cross-modal hashing enables efficient retrieval by encoding images and text into compact binary codes. State-of-the-art methods rely on semantic similarity graphs derived from user…