Search: "llm agent" — WeSearch Press

ARXIV.ORG

Towards Automated Ontology Generation from Unstructured Text: A Multi-Agent LLM Approach

Automatically generating formal ontologies from unstructured natural language remains a central challenge in knowledge engineering. While large language models (LLMs) show promise, it remains unclear …

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

GSAR: Typed Grounding for Hallucination Detection and Recovery in Multi-Agent LLMs

Autonomous multi-agent LLM systems are increasingly deployed to investigate operational incidents and produce structured diagnostic reports. Their trustworthiness hinges on whether each claim is groun…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

ClawTrace: Cost-Aware Tracing for LLM Agent Skill Distillation

Skill-distillation pipelines learn reusable rules from LLM agent trajectories, but they lack a key signal: how much each step costs. Without per-step cost, a pipeline cannot distinguish adding a missi…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

LLM-Guided Agentic Floor Plan Parsing for Accessible Indoor Navigation of Blind and Low-Vision People

Indoor navigation remains a critical accessibility challenge for blind and low-vision (BLV) individuals, as existing solutions rely on costly per-building infrastructure. We present an agentic fra…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications

Given the increased use of LLMs in financial systems today, it becomes important to evaluate the safety and robustness of such systems. One failure mode that LLMs frequently display in general domain …

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ast-outline: a parallel structural code summarizer written in Rust (5–10x token savings for LLM agents)

I just open-sourced ast-outline – a fast, zero-dependency CLI tool that extracts the structural outline of source files (classes, functions, signatures, fields, doc comments + line numbers) and drops …

Sun, 26 Apr 2026 05:36:44 GMT · 7 views

GITHUB

Show HN: VoiceGoat – A vulnerable voice agent for practicing LLM attacks

A purposely vulnerable voice agent application for security practitioners to practice exploiting voice-based (and text-based) AI systems. - redcaller/voice-goat…

Tue, 28 Apr 2026 14:55:00 GMT · 1 view

ARXIV.ORG

HeLa-Mem: Hebbian Learning and Associative Memory for LLM Agents

Long-term memory is a critical challenge for Large Language Model agents, as fixed context windows cannot preserve coherence across extended interactions. Existing memory systems represent conversatio…

Tue, 28 Apr 2026 13:14:59 GMT · 12 views

ARXIV.ORG

From Coarse to Fine: Self-Adaptive Hierarchical Planning for LLM Agents

Large language model-based agents have recently emerged as powerful approaches for solving dynamic and multi-step tasks. Most existing agents employ planning mechanisms to guide long-term actions in d…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

DEV COMMUNITY

TradingAgents v0.2.4: A Multi-Agent LLM Framework That Simulates an Entire Trading Firm

TL;DR UCLA Tauric Research released TradingAgents v0.2.4 (2026-04-25), a LangGraph-based…

Tue, 28 Apr 2026 01:54:30 GMT · 3 views

GITHUB

NARE: An LLM agent that amortizes reasoning into memory and executable rules

Contribute to starface77/Neuro-Adaptive-Reasoning-Engine development by creating an account on GitHub.…

Tue, 28 Apr 2026 12:09:59 GMT · 3 views

SELF-HOSTED ALTERNATIVES TO PO

Self-hosting an LLM agent for incident response — does anyone here actually do this? What's working / not working?

Tue, 28 Apr 2026 03:24:30 GMT · 3 views

ARXIV.ORG

LLMs Corrupt Your Documents When You Delegate

Large Language Models (LLMs) are poised to disrupt knowledge work, with the emergence of delegated work as a new interaction paradigm (e.g., vibe coding). Delegation requires trust - the expectation t…

Tue, 28 Apr 2026 12:54:59 GMT · 3 views

DEV COMMUNITY

Two Nasty Gotchas When Building Multi-Agent Systems with Google ADK

Google's Agent Development Kit (ADK) makes it straightforward to compose LlmAgent instances into…

Tue, 28 Apr 2026 09:54:13 GMT · 3 views

GIZMODO

Claude-Powered Agent Apparently Deletes Company Database, Debases Itself Further in Confession

AI agents are powered by the same obsequious LLMs as consumer chatbots.…

Tue, 28 Apr 2026 09:34:13 GMT · 4 views

ARXIV.ORG

Mitigating Belief Inertia via Active Intervention in Embodied Agents

Recent advancements in large language models (LLMs) have enabled agents to tackle complex embodied tasks through environmental interaction. However, these agents still make suboptimal decisions and pe…

Tue, 28 Apr 2026 08:54:13 GMT · 2 views

ARXIV.ORG

PExA: Parallel Exploration Agent for Complex Text-to-SQL

LLM-based agents for text-to-SQL often struggle with a latency-performance trade-off, where performance improvements come at the cost of latency or vice versa. We reformulate text-to-SQL generation with…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

FormalScience: Scalable Human-in-the-Loop Autoformalisation of Science with Agentic Code Generation in Lean

Formalising informal mathematical reasoning into formally verifiable code is a significant challenge for large language models. In scientific fields such as physics, domain-specific machinery (…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

Don't Make the LLM Read the Graph: Make the Graph Think

We investigate whether explicit belief graphs improve LLM performance in cooperative multi-agent reasoning. Through 3,000+ controlled trials across four LLM families in the cooperative card game Hanab…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

Analytica: Soft Propositional Reasoning for Robust and Scalable LLM-Driven Analysis

Large language model (LLM) agents are increasingly tasked with complex real-world analysis (e.g., in financial forecasting, scientific discovery), yet their reasoning suffers from stochastic instabili…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

PhySE: A Psychological Framework for Real-Time AR-LLM Social Engineering Attacks

The emerging threat of AR-LLM-based Social Engineering (AR-LLM-SE) attacks (e.g. SEAR) poses a significant risk to real-world social interactions. In such an attack, a malicious actor uses Augmented R…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

Discovering Agentic Safety Specifications from 1-Bit Danger Signals

Can large language model agents discover hidden safety objectives through experience alone? We introduce EPO-Safe (Experiential Prompt Optimization for Safe Agents), a framework where an LLM iterative…

Tue, 28 Apr 2026 04:13:21 GMT · 2 views

ARXIV.ORG

CAP-CoT: Cycle Adversarial Prompt for Improving Chain of Thoughts in LLM Reasoning

Chain-of-Thought (CoT) prompting has emerged as a simple and effective way to elicit step-by-step solutions from large language models (LLMs). However, CoT reasoning can be unstable across runs on lon…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

LEGO: An LLM Skill-Based Front-End Design Generation Platform

Existing LLM-based EDA agents are often isolated task-specific systems. This leads to repeated engineering effort and limited reuse of successful design and debugging strategies. We present LEGO, a un…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

SoccerRef-Agents: Multi-Agent System for Automated Soccer Refereeing

Refereeing is vital in sports, where fair, accurate, and explainable decisions are fundamental. While intelligent assistant technologies are being widely adopted in soccer refereeing, current AI-assis…

Tue, 28 Apr 2026 04:13:21 GMT · 2 views

ARXIV.ORG

Agentic Adversarial Rewriting Exposes Architectural Vulnerabilities in Black-Box NLP Pipelines

Multi-component natural language processing (NLP) pipelines are increasingly deployed for high-stakes decisions, yet no existing adversarial method can test their robustness under realistic conditions…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

MarketBench: Evaluating AI Agents as Market Participants

Markets are a promising way to coordinate AI agent activity for similar reasons to those used to justify markets more broadly. In order to effectively participate in markets, agents need to have infor…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

QED: An Open-Source Multi-Agent System for Generating Mathematical Proofs on Open Problems

We explore a central question in AI for mathematics: can AI systems produce original, nontrivial proofs for open research problems? Despite strong benchmark performance, producing genuinely novel proo…

Tue, 28 Apr 2026 04:13:21 GMT · 5 views

ARXIV.ORG

Agentic clinical reasoning over longitudinal myeloma records: a retrospective evaluation against expert consensus

Multiple myeloma is managed through sequential lines of therapy over years to decades, with each decision depending on cumulative disease history distributed across dozens to hundreds of heterogeneous…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols

As LLM agents transition to autonomous digital coworkers, maintaining deterministic goal-directedness in non-linear multi-turn conversations has emerged as an architectural bottleneck. We identify and for…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views
