27 results for "llm agents"
Your Reviews Replicate You: LLM-Based Agents as Customer Digital Twins for Conjoint Analysis
Conjoint analysis is a cornerstone of market research for estimating consumer preferences; however, traditional methods face persistent challenges regarding time, cost, and respondent fatigue. To addr…
ast-outline: a parallel structural code summarizer written in Rust (5–10x token savings for LLM agents)
I just open-sourced ast-outline – a fast, zero-dependency CLI tool that extracts the structural outline of source files (classes, functions, signatures, fields, doc comments + line numbers) and drops …
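The snippet describes structural summarization (extracting classes, functions, signatures, and line numbers so an LLM sees an outline instead of the full file). As a rough illustration of the idea only — ast-outline itself is written in Rust and this is not its implementation — here is a minimal Python sketch using the stdlib `ast` module:

```python
import ast

# Sample source to outline (assumed input, for illustration only).
SOURCE = '''
class Greeter:
    """Says hello."""
    def greet(self, name: str) -> str:
        return f"hello, {name}"

def main() -> None:
    print(Greeter().greet("world"))
'''

def outline(source: str) -> list[str]:
    """Return one summary line per class/function: line number, kind, name, args."""
    entries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            entries.append((node.lineno, f"L{node.lineno} class {node.name}"))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            entries.append((node.lineno, f"L{node.lineno} def {node.name}({args})"))
    return [text for _, text in sorted(entries)]

for line in outline(SOURCE):
    print(line)
```

Emitting only the outline (three short lines here) instead of the full body is where the token savings come from on large files.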
HeLa-Mem: Hebbian Learning and Associative Memory for LLM Agents
Long-term memory is a critical challenge for Large Language Model agents, as fixed context windows cannot preserve coherence across extended interactions. Existing memory systems represent conversatio…
From Coarse to Fine: Self-Adaptive Hierarchical Planning for LLM Agents
Large language model-based agents have recently emerged as powerful approaches for solving dynamic and multi-step tasks. Most existing agents employ planning mechanisms to guide long-term actions in d…
TradingAgents v0.2.4: A Multi-Agent LLM Framework That Simulates an Entire Trading Firm
TL;DR: UCLA Tauric Research released TradingAgents v0.2.4 (2026-04-25) — a LangGraph-based…
Complete Cyclic Subtask Graphs for Tool-Using LLM Agents: Flexibility, Cost, and Bottlenecks in Multi-Agent Workflows
Long-horizon tool-using tasks sometimes benefit from revisiting earlier subtasks for recovery and exploration, but added multi-agent workflow flexibility can also introduce coordination overhead and s…
Can Current Agents Close the Discovery-to-Application Gap? A Case Study in Minecraft
Discovering causal regularities and applying them to build functional systems (the discovery-to-application loop) is a hallmark of general intelligence, yet evaluating this capacity has been hindered …
How Do AI Agents Spend Your Money? Analyzing and Predicting Token Consumption in Agentic Coding Tasks
The wide adoption of AI agents in complex human workflows is driving rapid growth in LLM token consumption. When agents are deployed on tasks that require a significant amount of tokens, three questio…
Quantifying Divergence in Inter-LLM Communication Through API Retrieval and Ranking
Large language models (LLMs) increasingly operate as autonomous agents that reason over external APIs to perform complex tasks. However, their reliability and agreement remain poorly characterized. We…
Mitigating Belief Inertia via Active Intervention in Embodied Agents
Recent advancements in large language models (LLMs) have enabled agents to tackle complex embodied tasks through environmental interaction. However, these agents still make suboptimal decisions and pe…
Analytica: Soft Propositional Reasoning for Robust and Scalable LLM-Driven Analysis
Large language model (LLM) agents are increasingly tasked with complex real-world analysis (e.g., in financial forecasting, scientific discovery), yet their reasoning suffers from stochastic instabili…
PhySE: A Psychological Framework for Real-Time AR-LLM Social Engineering Attacks
The emerging threat of AR-LLM-based Social Engineering (AR-LLM-SE) attacks (e.g. SEAR) poses a significant risk to real-world social interactions. In such an attack, a malicious actor uses Augmented R…
LEGO: An LLM Skill-Based Front-End Design Generation Platform
Existing LLM-based EDA agents are often isolated task-specific systems. This leads to repeated engineering effort and limited reuse of successful design and debugging strategies. We present LEGO, a un…
SoccerRef-Agents: Multi-Agent System for Automated Soccer Refereeing
Refereeing is vital in sports, where fair, accurate, and explainable decisions are fundamental. While intelligent assistant technologies are being widely adopted in soccer refereeing, current AI-assis…
MarketBench: Evaluating AI Agents as Market Participants
Markets are a promising way to coordinate AI agent activity, for the same reasons that justify markets more broadly. In order to effectively participate in markets, agents need to have infor…
Tool for inline annotation of LLM-generated specs and prompts (works with any MCP client)
I'm a product manager and spend a lot of time iterating on long prompts and specs that AI agents then act on. The review loop has been the worst part. When the model gives me a 5-page draft, leaving f…
Claude Leak Confirms It: LLM Systems Are Architecture, Not Prompts (Orca)
Agents should execute whenever possible — runtime for composable AI agent skills - gfernandf/agent-skills…
Agents Are Microservices with a Brain
We solved this in 2010. It was called microservices. Now we're making the same mistakes with LLMs.…
Learning in Blocks: A Multi-Agent Debate-Assisted Personalized Adaptive Learning Framework for Language Learning
Most digital language learning curricula rely on discrete-item quizzes that test recall rather than applied conversational proficiency. When progression is driven by quiz performance, learners can adv…
Claude-Powered Agent Apparently Deletes Company Database, Debases Itself Further in Confession
AI agents are powered by the same obsequious LLMs as consumer chatbots.…
PExA: Parallel Exploration Agent for Complex Text-to-SQL
LLM-based agents for text-to-SQL often struggle with a latency-performance trade-off, where performance improvements come at the cost of latency or vice versa. We reformulate text-to-SQL generation with…
Discovering Agentic Safety Specifications from 1-Bit Danger Signals
Can large language model agents discover hidden safety objectives through experience alone? We introduce EPO-Safe (Experiential Prompt Optimization for Safe Agents), a framework where an LLM iterative…
MetaGAI: A Large-Scale and High-Quality Benchmark for Generative AI Model and Data Card Generation
The rapid proliferation of Generative AI necessitates rigorous documentation standards for transparency and governance. However, manual creation of Model and Data Cards is not scalable, while automate…
Vibe Medicine: Redefining Biomedical Research Through Human-AI Co-Work
With the emergence of large language models (LLMs) and AI agent frameworks, the human-AI co-work paradigm known as Vibe Coding is changing how people code, making it more accessible and productive. In…
Grounding Before Generalizing: How AI Differs from Humans in Causal Transfer
Extracting abstract causal structures and applying them to novel situations is a hallmark of human intelligence. While Large Language Models (LLMs) and Vision Language Models (VLMs) have shown strong …
Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols
As LLM agents transition to autonomous digital coworkers, maintaining deterministic goal-directedness in non-linear multi-turn conversations has emerged as an architectural bottleneck. We identify and for…
Evaluating whether AI models would sabotage AI safety research
We evaluate the propensity of frontier models to sabotage or refuse to assist with safety research when deployed as AI research agents within a frontier AI company. We apply two complementary evaluati…