WeSearch

Results for "llm agents".

27 stories match your query across our 700+ source catalog. Ranked by relevance and recency.

ARXIV CS.AI

Your Reviews Replicate You: LLM-Based Agents as Customer Digital Twins for Conjoint Analysis

Conjoint analysis is a cornerstone of market research for estimating consumer preferences; however, traditional methods face persistent challenges regarding time, cost, and respondent fatigue. To addr…

· 2 views
REDDIT

ast-outline: a parallel structural code summarizer written in Rust (5–10x token savings for LLM agents)

I just open-sourced ast-outline – a fast, zero-dependency CLI tool that extracts the structural outline of source files (classes, functions, signatures, fields, doc comments + line numbers) and drops …

· 8 views
ARXIV.ORG

HeLa-Mem: Hebbian Learning and Associative Memory for LLM Agents

Long-term memory is a critical challenge for Large Language Model agents, as fixed context windows cannot preserve coherence across extended interactions. Existing memory systems represent conversatio…

· 13 views
ARXIV.ORG

From Coarse to Fine: Self-Adaptive Hierarchical Planning for LLM Agents

Large language model-based agents have recently emerged as powerful approaches for solving dynamic and multi-step tasks. Most existing agents employ planning mechanisms to guide long-term actions in d…

· 3 views
DEV COMMUNITY

TradingAgents v0.2.4: A Multi-Agent LLM Framework That Simulates an Entire Trading Firm

TL;DR: UCLA Tauric Research released TradingAgents v0.2.4 (2026-04-25), a LangGraph-based…

· 3 views
ARXIV CS.AI

Complete Cyclic Subtask Graphs for Tool-Using LLM Agents: Flexibility, Cost, and Bottlenecks in Multi-Agent Workflows

Long-horizon tool-using tasks sometimes benefit from revisiting earlier subtasks for recovery and exploration, but added multi-agent workflow flexibility can also introduce coordination overhead and s…

· 1 view
ARXIV CS.AI

Can Current Agents Close the Discovery-to-Application Gap? A Case Study in Minecraft

Discovering causal regularities and applying them to build functional systems--the discovery-to-application loop--is a hallmark of general intelligence, yet evaluating this capacity has been hindered …

· 1 view
ARXIV CS.AI

How Do AI Agents Spend Your Money? Analyzing and Predicting Token Consumption in Agentic Coding Tasks

The wide adoption of AI agents in complex human workflows is driving rapid growth in LLM token consumption. When agents are deployed on tasks that require a significant amount of tokens, three questio…

· 1 view
ARXIV CS.AI

Quantifying Divergence in Inter-LLM Communication Through API Retrieval and Ranking

Large language models (LLMs) increasingly operate as autonomous agents that reason over external APIs to perform complex tasks. However, their reliability and agreement remain poorly characterized. We…

· 1 view
ARXIV.ORG

Mitigating Belief Inertia via Active Intervention in Embodied Agents

Recent advancements in large language models (LLMs) have enabled agents to tackle complex embodied tasks through environmental interaction. However, these agents still make suboptimal decisions and pe…

· 4 views
ARXIV.ORG

Analytica: Soft Propositional Reasoning for Robust and Scalable LLM-Driven Analysis

Large language model (LLM) agents are increasingly tasked with complex real-world analysis (e.g., in financial forecasting, scientific discovery), yet their reasoning suffers from stochastic instabili…

· 3 views
ARXIV.ORG

PhySE: A Psychological Framework for Real-Time AR-LLM Social Engineering Attacks

The emerging threat of AR-LLM-based Social Engineering (AR-LLM-SE) attacks (e.g. SEAR) poses a significant risk to real-world social interactions. In such an attack, a malicious actor uses Augmented R…

· 3 views
ARXIV.ORG

LEGO: An LLM Skill-Based Front-End Design Generation Platform

Existing LLM-based EDA agents are often isolated task-specific systems. This leads to repeated engineering effort and limited reuse of successful design and debugging strategies. We present LEGO, a un…

· 3 views
ARXIV.ORG

SoccerRef-Agents: Multi-Agent System for Automated Soccer Refereeing

Refereeing is vital in sports, where fair, accurate, and explainable decisions are fundamental. While intelligent assistant technologies are being widely adopted in soccer refereeing, current AI-assis…

· 4 views
ARXIV.ORG

MarketBench: Evaluating AI Agents as Market Participants

Markets are a promising way to coordinate AI agent activity for similar reasons to those used to justify markets more broadly. In order to effectively participate in markets, agents need to have infor…

· 3 views
PROMPTENGINEERING

Tool for inline annotation of LLM-generated specs and prompts (works with any MCP client)

I'm a product manager and spend a lot of time iterating on long prompts and specs that AI agents then act on. The review loop has been the worst part. When the model gives me a 5-page draft, leaving f…

· 14 views
GITHUB

Claude Leak Confirms It: LLM Systems Are Architecture, Not Prompts (Orca)

Agents should execute whenever possible — runtime for composable AI agent skills - gfernandf/agent-skills…

· 6 views
MEDIUM

Agents Are Microservices with a Brain

We solved this in 2010. It was called microservices. Now we're making the same mistakes with LLMs.…

· 3 views
ARXIV CS.AI

Learning in Blocks: A Multi Agent Debate Assisted Personalized Adaptive Learning Framework for Language Learning

Most digital language learning curricula rely on discrete-item quizzes that test recall rather than applied conversational proficiency. When progression is driven by quiz performance, learners can adv…

· 1 view
GIZMODO

Claude-Powered Agent Apparently Deletes Company Database, Debases Itself Further in Confession

AI agents are powered by the same obsequious LLMs as consumer chatbots.…

· 5 views
ARXIV.ORG

PExA: Parallel Exploration Agent for Complex Text-to-SQL

LLM-based agents for text-to-SQL often struggle with a latency-performance trade-off, where performance improvements come at the cost of latency or vice versa. We reformulate text-to-SQL generation with…

· 3 views
ARXIV.ORG

Discovering Agentic Safety Specifications from 1-Bit Danger Signals

Can large language model agents discover hidden safety objectives through experience alone? We introduce EPO-Safe (Experiential Prompt Optimization for Safe Agents), a framework where an LLM iterative…

· 3 views
ARXIV.ORG

MetaGAI: A Large-Scale and High-Quality Benchmark for Generative AI Model and Data Card Generation

The rapid proliferation of Generative AI necessitates rigorous documentation standards for transparency and governance. However, manual creation of Model and Data Cards is not scalable, while automate…

· 4 views
ARXIV.ORG

Vibe Medicine: Redefining Biomedical Research Through Human-AI Co-Work

With the emergence of large language models (LLMs) and AI agent frameworks, the human-AI co-work paradigm known as Vibe Coding is changing how people code, making it more accessible and productive. In…

· 4 views
ARXIV.ORG

Grounding Before Generalizing: How AI Differs from Humans in Causal Transfer

Extracting abstract causal structures and applying them to novel situations is a hallmark of human intelligence. While Large Language Models (LLMs) and Vision Language Models (VLMs) have shown strong …

· 3 views
ARXIV.ORG

Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols

As LLM agents transition to autonomous digital coworkers, maintaining deterministic goal-directedness in non-linear multi-turn conversations emerged as an architectural bottleneck. We identify and for…

· 3 views
ARXIV.ORG

Evaluating whether AI models would sabotage AI safety research

We evaluate the propensity of frontier models to sabotage or refuse to assist with safety research when deployed as AI research agents within a frontier AI company. We apply two complementary evaluati…

· 3 views