30 results for "agent systems"
Parallel Web Systems, founded by former Twitter CEO Parag Agrawal and which offers web search tools for AI agents, raised a $100M Series B at a $2B valuation (Belle Lin/Wall Street Journal)
Belle Lin / Wall Street Journal : Parallel Web Systems, founded by former Twitter CEO Parag Agrawal and which offers web search tools for AI agents, raised a $100M Series B at a $2B valuation — Parall…
Multi-Agent AI Systems Are Eating Single Agents
Single-agent architectures hit a wall the moment your task needs planning, research, and execution in parallel. Multi-agent systems solve this — but most tutorials skip the hard parts. This guide does…
I Built Multi-Agent Systems Before NEXT '26 — Here's What the New ADK, MCP & A2A Stack Actually Changes
This is a submission for the Google Cloud NEXT Writing Challenge I Built Multi-Agent...…
Two Nasty Gotchas When Building Multi-Agent Systems with Google ADK
Google's Agent Development Kit (ADK) makes it straightforward to compose LlmAgent instances into...…
OpenAI Agents SDK Tutorial: Build Multi-Agent AI Systems in Python (2025)
How to move beyond single-prompt chatbots and create AI workflows that plan, collaborate, and get things done — with working code you can run today.…
The Controllability Trap: A Governance Framework for Military AI Agents
Agentic AI systems - capable of goal interpretation, world modeling, planning, tool use, long-horizon operation, and autonomous coordination - introduce distinct control failures not addressed by exis…
NVIDIA Launches Nemotron 3 Nano Omni Model, Unifying Vision, Audio and Language for up to 9x More Efficient AI Agents
AI agent systems today juggle separate models for vision, speech and language — losing time and context as they pass data from one model to the other. Unveiled today, NVIDIA Nemotron 3 Nano Omni is an…
Architectural Requirements for Agentic AI Containment
The April 2026 disclosure that a frontier large language model escaped its security sandbox, executed unauthorized actions, and concealed its modifications to version control history demonstrates that…
Show HN: VoiceGoat – A vulnerable voice agent for practicing LLM attacks
A purposely vulnerable voice agent application for security practitioners to practice exploiting voice-based (and text based) AI systems. - redcaller/voice-goat…
HeLa-Mem: Hebbian Learning and Associative Memory for LLM Agents
Long-term memory is a critical challenge for Large Language Model agents, as fixed context windows cannot preserve coherence across extended interactions. Existing memory systems represent conversatio…
FormalScience: Scalable Human-in-the-Loop Autoformalisation of Science with Agentic Code Generation in Lean
Formalising informal mathematical reasoning into formally verifiable code is a significant challenge for large language models. In scientific fields such as physics, domain-specific machinery (\textit…
A Decoupled Human-in-the-Loop System for Controlled Autonomy in Agentic Workflows
AI agents are increasingly deployed to execute tasks and make decisions within agentic workflows, introducing new requirements for safe and controlled autonomy. Prior work has established the importan…
Active Inference: A method for Phenotyping Agency in AI systems?
The proliferation of agentic artificial intelligence has outpaced the conceptual tools needed to characterize agency in computational systems. Prevailing definitions mainly rely on autonomy and goal-d…
GSAR: Typed Grounding for Hallucination Detection and Recovery in Multi-Agent LLMs
Autonomous multi-agent LLM systems are increasingly deployed to investigate operational incidents and produce structured diagnostic reports. Their trustworthiness hinges on whether each claim is groun…
Agentic Adversarial Rewriting Exposes Architectural Vulnerabilities in Black-Box NLP Pipelines
Multi-component natural language processing (NLP) pipelines are increasingly deployed for high-stakes decisions, yet no existing adversarial method can test their robustness under realistic conditions…
Structural Enforcement of Goal Integrity in AI Agents via Separation-of-Powers Architecture
Recent evidence suggests that frontier AI systems can exhibit agentic misalignment, generating and executing harmful actions derived from internally constructed goals, even without explicit user reque…
ZenBrain: A Neuroscience-Inspired 7-Layer Memory Architecture for Autonomous AI Systems
Despite a century of empirical memory research, existing AI agent memory systems rely on system-engineering metaphors (virtual-memory paging, flat LLM storage, Zettelkasten notes), none integrating pr…
QED: An Open-Source Multi-Agent System for Generating Mathematical Proofs on Open Problems
We explore a central question in AI for mathematics: can AI systems produce original, nontrivial proofs for open research problems? Despite strong benchmark performance, producing genuinely novel proo…
Agentic clinical reasoning over longitudinal myeloma records: a retrospective evaluation against expert consensus
Multiple myeloma is managed through sequential lines of therapy over years to decades, with each decision depending on cumulative disease history distributed across dozens to hundreds of heterogeneous…
FastOMOP: A Foundational Architecture for Reliable Agentic Real-World Evidence Generation on OMOP CDM data
The Observational Medical Outcomes Partnership Common Data Model (OMOP CDM), maintained by the Observational Health Data Sciences and Informatics (OHDSI) collaboration, enabled the harmonisation of el…
The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications
Given the increased use of LLMs in financial systems today, it becomes important to evaluate the safety and robustness of such systems. One failure mode that LLMs frequently display in general domain …
I built Claude Code skills for writing agent prompts, grounded in prompt research
I've been building agentic systems for a while and wanted a more systematic approach to writing prompts. So I gathered papers, did some deep research and created guides on structure, format and prompt…
A 14-day “Growth Forge” sprint: build an AI-powered growth agent on a real stack
Sharing something that sits at the intersection of AI agents and growth systems. VideoDB (backend for video/audio for AI agents) is running a 14-day sprint called Growth Forge for 5 builders to design…
Claude Leak Confirms It: LLM Systems Are Architecture, Not Prompts (Orca)
Agents should execute whenever possible — runtime for composable AI agent skills - gfernandf/agent-skills…
Nvidia Nemotron 3 Nano Omni
Agentic systems often reason across screens, documents, audio, video, and text within a single perception‑to‑action loop. However, they still rely on fragmented model chains—separate stacks for vision…
LLMs Corrupt Your Documents When You Delegate
Large Language Models (LLMs) are poised to disrupt knowledge work, with the emergence of delegated work as a new interaction paradigm (e.g., vibe coding). Delegation requires trust - the expectation t…
Show HN: Minimal Linux sandboxes to manage AI-Generated Code with ease
Minimal Linux sandboxes for running untrusted code. Built for AI agents, build systems, and any scenario where you need to execute code you didn't write.…
A Systematic Approach for Large Language Models Debugging
Large language models (LLMs) have become central to modern AI workflows, powering applications from open-ended text generation to complex agent-based reasoning. However, debugging these models remains…
LEGO: An LLM Skill-Based Front-End Design Generation Platform
Existing LLM-based EDA agents are often isolated task-specific systems. This leads to repeated engineering effort and limited reuse of successful design and debugging strategies. We present LEGO, a un…
Information-Theoretic Measures in AI: A Practical Decision Guide
Information-theoretic (IT) measures are ubiquitous in artificial intelligence: entropy drives decision-tree splits and uncertainty quantification, cross-entropy is the default classification loss, mut…