14 results for "agent safety"
Discovering Agentic Safety Specifications from 1-Bit Danger Signals
Can large language model agents discover hidden safety objectives through experience alone? We introduce EPO-Safe (Experiential Prompt Optimization for Safe Agents), a framework where an LLM iterative…
Alpha Vision Introduces AI Agent for Construction Safety and Operations at ENR Future Tech 2026 - Morningstar
Comprehensive up-to-date news coverage, aggregated from sources all over the world by Google News.…
The Controllability Trap: A Governance Framework for Military AI Agents
Agentic AI systems - capable of goal interpretation, world modeling, planning, tool use, long-horizon operation, and autonomous coordination - introduce distinct control failures not addressed by exis…
Architectural Requirements for Agentic AI Containment
The April 2026 disclosure that a frontier large language model escaped its security sandbox, executed unauthorized actions, and concealed its modifications to version control history demonstrates that…
Structural Enforcement of Goal Integrity in AI Agents via Separation-of-Powers Architecture
Recent evidence suggests that frontier AI systems can exhibit agentic misalignment, generating and executing harmful actions derived from internally constructed goals, even without explicit user reque…
LLM-Guided Agentic Floor Plan Parsing for Accessible Indoor Navigation of Blind and Low-Vision People
Indoor navigation remains a critical accessibility challenge for blind and low-vision (BLV) individuals, as existing solutions rely on costly per-building infrastructure. We present an agentic fra…
FastOMOP: A Foundational Architecture for Reliable Agentic Real-World Evidence Generation on OMOP CDM data
The Observational Medical Outcomes Partnership Common Data Model (OMOP CDM), maintained by the Observational Health Data Sciences and Informatics (OHDSI) collaboration, enabled the harmonisation of el…
Evaluating whether AI models would sabotage AI safety research
We evaluate the propensity of frontier models to sabotage or refuse to assist with safety research when deployed as AI research agents within a frontier AI company. We apply two complementary evaluati…
The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications
Given the increased use of LLMs in financial systems today, it is important to evaluate the safety and robustness of such systems. One failure mode that LLMs frequently display in general domain…
Governing What You Cannot Observe: Adaptive Runtime Governance for Autonomous AI Agents
Autonomous AI agents can remain fully authorized and still become unsafe as behavior drifts, adversaries adapt, and decision patterns shift without any code change. We propose the Informationa…
Witnesses recount chaos at WHCA Dinner after shooting, Secret Service agents drew guns to evacuate Trump
Witnesses described chaos inside the ballroom as Secret Service rushed Trump and officials to safety during the White House Correspondents' Dinner shooting.…
The Pious Little Delete Button
A satirical look at AI safety theatre, agentic overreach, and the strange ritual of blaming users after the database is gone.…
Humanitarian aid turns to AI as crises outpace capacity
Purpose-designed AI agents with a focus on safety can provide critical assistance to vulnerable populations.…
CAP-CoT: Cycle Adversarial Prompt for Improving Chain of Thoughts in LLM Reasoning
Chain-of-Thought (CoT) prompting has emerged as a simple and effective way to elicit step-by-step solutions from large language models (LLMs). However, CoT reasoning can be unstable across runs on lon…