WeSearch
Hub / Search / llm agent
SEARCH · LLM AGENT

Results for "llm agent".

30 stories match your query across our 700+ source catalog. Ranked by relevance and recency.

30 results for "llm agent"

ARXIV.ORG

Towards Automated Ontology Generation from Unstructured Text: A Multi-Agent LLM Approach

Automatically generating formal ontologies from unstructured natural language remains a central challenge in knowledge engineering. While large language models (LLMs) show promise, it remains unclear …

· 3 views
ARXIV.ORG

GSAR: Typed Grounding for Hallucination Detection and Recovery in Multi-Agent LLMs

Autonomous multi-agent LLM systems are increasingly deployed to investigate operational incidents and produce structured diagnostic reports. Their trustworthiness hinges on whether each claim is groun…

· 3 views
ARXIV.ORG

ClawTrace: Cost-Aware Tracing for LLM Agent Skill Distillation

Skill-distillation pipelines learn reusable rules from LLM agent trajectories, but they lack a key signal: how much each step costs. Without per-step cost, a pipeline cannot distinguish adding a missi…

· 3 views
ARXIV.ORG

LLM-Guided Agentic Floor Plan Parsing for Accessible Indoor Navigation of Blind and Low-Vision People

Indoor navigation remains a critical accessibility challenge for the blind and low-vision (BLV) individuals, as existing solutions rely on costly per-building infrastructure. We present an agentic fra…

· 3 views
ARXIV.ORG

The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications

Given the increased use of LLMs in financial systems today, it becomes important to evaluate the safety and robustness of such systems. One failure mode that LLMs frequently display in general domain …

· 3 views
REDDIT

ast-outline: a parallel structural code summarizer written in Rust (5–10x token savings for LLM agents)

I just open-sourced ast-outline – a fast, zero-dependency CLI tool that extracts the structural outline of source files (classes, functions, signatures, fields, doc comments + line numbers) and drops …

· 7 views
GITHUB

Show HN: VoiceGoat – A vulnerable voice agent for practicing LLM attacks

A purposely vulnerable voice agent application for security practitioners to practice exploiting voice-based (and text based) AI systems. - redcaller/voice-goat…

· 1 view
ARXIV.ORG

HeLa-Mem: Hebbian Learning and Associative Memory for LLM Agents

Long-term memory is a critical challenge for Large Language Model agents, as fixed context windows cannot preserve coherence across extended interactions. Existing memory systems represent conversatio…

· 12 views
ARXIV.ORG

From Coarse to Fine: Self-Adaptive Hierarchical Planning for LLM Agents

Large language model-based agents have recently emerged as powerful approaches for solving dynamic and multi-step tasks. Most existing agents employ planning mechanisms to guide long-term actions in d…

· 3 views
DEV COMMUNITY

TradingAgents v0.2.4: A Multi-Agent LLM Framework That Simulates an Entire Trading Firm

TL;DR UCLA Tauric Research released TradingAgents v0.2.4 (2026-04-25) — a LangGraph-based...…

· 3 views
GITHUB

NARE: An LLM agent that amortizes reasoning into memory and executable rules

Contribute to starface77/Neuro-Adaptive-Reasoning-Engine development by creating an account on GitHub.…

· 3 views
SELF-HOSTED ALTERNATIVES TO PO

Self-hosting an LLM agent for incident response — does anyone here actually do this? What's working / not working?

· 3 views
ARXIV.ORG

LLMs Corrupt Your Documents When You Delegate

Large Language Models (LLMs) are poised to disrupt knowledge work, with the emergence of delegated work as a new interaction paradigm (e.g., vibe coding). Delegation requires trust - the expectation t…

· 3 views
DEV COMMUNITY

Two Nasty Gotchas When Building Multi-Agent Systems with Google ADK

Google's Agent Development Kit (ADK) makes it straightforward to compose LlmAgent instances into...…

· 3 views
GIZMODO

Claude-Powered Agent Apparently Deletes Company Database, Debases Itself Further in Confession

AI agents are powered by the same obsequious LLMs as consumer chatbots.…

· 4 views
ARXIV.ORG

Mitigating Belief Inertia via Active Intervention in Embodied Agents

Recent advancements in large language models (LLMs) have enabled agents to tackle complex embodied tasks through environmental interaction. However, these agents still make suboptimal decisions and pe…

· 2 views
ARXIV.ORG

PExA: Parallel Exploration Agent for Complex Text-to-SQL

LLM-based agents for text-to-SQL often struggle with latency-performance trade-off, where performance improvements come at the cost of latency or vice versa. We reformulate text-to-SQL generation with…

· 3 views
ARXIV.ORG

FormalScience: Scalable Human-in-the-Loop Autoformalisation of Science with Agentic Code Generation in Lean

Formalising informal mathematical reasoning into formally verifiable code is a significant challenge for large language models. In scientific fields such as physics, domain-specific machinery (\textit…

· 3 views
ARXIV.ORG

Don't Make the LLM Read the Graph: Make the Graph Think

We investigate whether explicit belief graphs improve LLM performance in cooperative multi-agent reasoning. Through 3,000+ controlled trials across four LLM families in the cooperative card game Hanab…

· 3 views
ARXIV.ORG

Analytica: Soft Propositional Reasoning for Robust and Scalable LLM-Driven Analysis

Large language model (LLM) agents are increasingly tasked with complex real-world analysis (e.g., in financial forecasting, scientific discovery), yet their reasoning suffers from stochastic instabili…

· 3 views
ARXIV.ORG

PhySE: A Psychological Framework for Real-Time AR-LLM Social Engineering Attacks

The emerging threat of AR-LLM-based Social Engineering (AR-LLM-SE) attacks (e.g. SEAR) poses a significant risk to real-world social interactions. In such an attack, a malicious actor uses Augmented R…

· 3 views
ARXIV.ORG

Discovering Agentic Safety Specifications from 1-Bit Danger Signals

Can large language model agents discover hidden safety objectives through experience alone? We introduce EPO-Safe (Experiential Prompt Optimization for Safe Agents), a framework where an LLM iterative…

· 2 views
ARXIV.ORG

CAP-CoT: Cycle Adversarial Prompt for Improving Chain of Thoughts in LLM Reasoning

Chain-of-Thought (CoT) prompting has emerged as a simple and effective way to elicit step-by-step solutions from large language models (LLMs). However, CoT reasoning can be unstable across runs on lon…

· 3 views
ARXIV.ORG

LEGO: An LLM Skill-Based Front-End Design Generation Platform

Existing LLM-based EDA agents are often isolated task-specific systems. This leads to repeated engineering effort and limited reuse of successful design and debugging strategies. We present LEGO, a un…

· 3 views
ARXIV.ORG

SoccerRef-Agents: Multi-Agent System for Automated Soccer Refereeing

Refereeing is vital in sports, where fair, accurate, and explainable decisions are fundamental. While intelligent assistant technologies are being widely adopted in soccer refereeing, current AI-assis…

· 2 views
ARXIV.ORG

Agentic Adversarial Rewriting Exposes Architectural Vulnerabilities in Black-Box NLP Pipelines

Multi-component natural language processing (NLP) pipelines are increasingly deployed for high-stakes decisions, yet no existing adversarial method can test their robustness under realistic conditions…

· 3 views
ARXIV.ORG

MarketBench: Evaluating AI Agents as Market Participants

Markets are a promising way to coordinate AI agent activity for similar reasons to those used to justify markets more broadly. In order to effectively participate in markets, agents need to have infor…

· 3 views
ARXIV.ORG

QED: An Open-Source Multi-Agent System for Generating Mathematical Proofs on Open Problems

We explore a central question in AI for mathematics: can AI systems produce original, nontrivial proofs for open research problems? Despite strong benchmark performance, producing genuinely novel proo…

· 5 views
ARXIV.ORG

Agentic clinical reasoning over longitudinal myeloma records: a retrospective evaluation against expert consensus

Multiple myeloma is managed through sequential lines of therapy over years to decades, with each decision depending on cumulative disease history distributed across dozens to hundreds of heterogeneous…

· 3 views
ARXIV.ORG

Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols

As LLM agents transition to autonomous digital coworkers, maintaining deterministic goal-directedness in non-linear multi-turn conversations emerged as an architectural bottleneck. We identify and for…

· 3 views