WeSearch
SEARCH · SYSTEM FAILURE

Results for "system failure".

17 stories match your query across our 700+ source catalog. Ranked by relevance and recency.

ARTIFICIAL INTELLIGENCE (AI)

The One Substrate Failure Behind Every AI System in 2026

· 3 views
GRITH

Five AI Agent Failures in 36 Days. Zero Times the Agent Caught It

Between March 18 and April 22, 2026, public failures at Meta, Mercor, CrewAI, Vercel, and Bitwarden all pointed at the same missing layer: the system acted, and someone else noticed later.…

· 12 views
WESPISER

AI Can Find the Code. It Didn't Know How the System Worked

21 bug fixes, two models, same failures. Better LLMs improved things marginally but still failed at system boundaries and integration.…

· 3 views
ARXIV.ORG

IndustryAssetEQA: A Neurosymbolic Operational Intelligence System for Embodied Question Answering in Industrial Asset Maintenance

Industrial maintenance environments increasingly rely on AI systems to assist operators in understanding asset behavior, diagnosing failures, and evaluating interventions. Although large language mode…

· 3 views
ARXIV.ORG

Failure-Centered Runtime Evaluation for Deployed Trilingual Public-Space Agents

This paper presents PSA-Eval, a failure-centered runtime evaluation framework for deployed trilingual public-space agents. The central claim is that, when the evaluation object shifts from a static in…

· 5 views
ARXIV.ORG

QED: An Open-Source Multi-Agent System for Generating Mathematical Proofs on Open Problems

We explore a central question in AI for mathematics: can AI systems produce original, nontrivial proofs for open research problems? Despite strong benchmark performance, producing genuinely novel proo…

· 6 views
ARXIV.ORG

The Controllability Trap: A Governance Framework for Military AI Agents

Agentic AI systems - capable of goal interpretation, world modeling, planning, tool use, long-horizon operation, and autonomous coordination - introduce distinct control failures not addressed by exis…

· 1 view
THE INDEPENDENT

'It took nine seconds': Claude AI agent deletes company's database

PocketOS founder says ‘systemic failures’ with AI infrastructure made catastrophic failure inevitable…

· 1 view
ARXIV.ORG

Architectural Requirements for Agentic AI Containment

The April 2026 disclosure that a frontier large language model escaped its security sandbox, executed unauthorized actions, and concealed its modifications to version control history demonstrates that…

· 3 views
SUBSTACK

Why the same LLM gives different answers in different environments

What I found diagnosing a failure mode in my own system, and the moment retrieval turned out to be already shaped before it started…

· 6 views
ARXIV.ORG

AI Identity: Standards, Gaps, and Research Directions for AI Agents

AI agents are now running real transactions, workflows, and sub-agent chains across organizational boundaries without continuous human supervision. This creates a problem no current infrastructure is …

· 3 views
ARXIV.ORG

Agentic Adversarial Rewriting Exposes Architectural Vulnerabilities in Black-Box NLP Pipelines

Multi-component natural language processing (NLP) pipelines are increasingly deployed for high-stakes decisions, yet no existing adversarial method can test their robustness under realistic conditions…

· 3 views
ARXIV.ORG

When AI reviews science: Can we trust the referee?

The volume of scientific submissions continues to climb, outpacing the capacity of qualified human referees and stretching editorial timelines. At the same time, modern large language models (LLMs) of…

· 3 views
ARXIV.ORG

Information-Theoretic Measures in AI: A Practical Decision Guide

Information-theoretic (IT) measures are ubiquitous in artificial intelligence: entropy drives decision-tree splits and uncertainty quantification, cross-entropy is the default classification loss, mut…

· 3 views
ARXIV.ORG

Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols

As LLM agents transition to autonomous digital coworkers, maintaining deterministic goal-directedness in non-linear multi-turn conversations emerged as an architectural bottleneck. We identify and for…

· 3 views
ARXIV.ORG

FastOMOP: A Foundational Architecture for Reliable Agentic Real-World Evidence Generation on OMOP CDM data

The Observational Medical Outcomes Partnership Common Data Model (OMOP CDM), maintained by the Observational Health Data Sciences and Informatics (OHDSI) collaboration, enabled the harmonisation of el…

· 3 views
ARXIV.ORG

The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications

Given the increased use of LLMs in financial systems today, it becomes important to evaluate the safety and robustness of such systems. One failure mode that LLMs frequently display in general domain …

· 3 views