WeSearch
SEARCH · AI ERRORS

Results for "ai errors".

20 stories match your query across our 700+ source catalog, ranked by relevance and recency.


CLAUDEAI

Claude Status Update: Claude.ai unavailable and elevated errors on the API on 2026-04-28T18:33:55.000Z

This is an automatic post triggered within 2 minutes of an official Claude system status update. Incident: Claude.ai unavailable and elevated errors on the API. Check on progress and whether or not the…

· 14 views
CLAUDEAI

Claude Status Update: Claude.ai unavailable and elevated errors on the API on 2026-04-28T17:51:36.000Z

This is an automatic post triggered within 2 minutes of an official Claude system status update. Incident: Claude.ai unavailable and elevated errors on the API. Check on progress and whether or not the…

· 13 views
DEV.TO (TOP)

Pylon: Self-Host Your Own AI Agent Pipeline That Fixes Sentry Errors via

Pylon is a self-hosted daemon that triggers sandboxed Claude Code agents from webhooks (Sentry, cron, chat) and reports results with human approval —…

· 3 views
CLAUDEAI

Claude Status Update: Elevated errors on Claude Haiku 4.5 on 2026-04-28T12:38:38.000Z

· 4 views
CLAUDE

Claude.ai is unavailable

Claude's Status Page - Claude.ai unavailable and elevated errors on the API…

· 12 views
ARXIV.ORG

CAP-CoT: Cycle Adversarial Prompt for Improving Chain of Thoughts in LLM Reasoning

Chain-of-Thought (CoT) prompting has emerged as a simple and effective way to elicit step-by-step solutions from large language models (LLMs). However, CoT reasoning can be unstable across runs on lon…

· 3 views
ARXIV.ORG

When Corrective Hints Hurt: Prompt Design in Reasoner-Guided Repair of LLM Overcaution on Entailed Negations under OWL 2 DL

We report a reproducible error pattern in GPT-5.4 on OWL 2 DL compliance queries: the model frequently answers "unknown" when the reasoner-entailed answer is "no" under FunctionalProperty c…

· 4 views
ARXIV.ORG

FinGround: Detecting and Grounding Financial Hallucinations via Atomic Claim Verification

Financial AI systems must produce answers grounded in specific regulatory filings, yet current LLMs fabricate metrics, invent citations, and miscalculate derived quantities. These errors carry direct …

· 3 views
ARXIV.ORG

CT-FineBench: A Diagnostic Fidelity Benchmark for Fine-Grained Evaluation of CT Report Generation

The evaluation of generated reports remains a critical challenge in Computed Tomography (CT) report generation, due to the large volume of text, the diversity and complexity of findings, and the prese…

· 6 views
ARXIV.ORG

Credal Concept Bottleneck Models for Epistemic-Aleatoric Uncertainty Decomposition

Concept Bottleneck Models (CBMs) predict through human-interpretable concepts, but they typically output point concept probabilities that conflate epistemic uncertainty (reducible model underspecifica…

· 3 views
ARXIV.ORG

Agentic clinical reasoning over longitudinal myeloma records: a retrospective evaluation against expert consensus

Multiple myeloma is managed through sequential lines of therapy over years to decades, with each decision depending on cumulative disease history distributed across dozens to hundreds of heterogeneous…

· 3 views
ARXIV.ORG

OpenGame: Open Agentic Coding for Games

Game development sits at the intersection of creative design and intricate software engineering, demanding the joint orchestration of game engines, real-time loops, and tightly coupled state across ma…

· 2 views
ARXIV CS.AI

RADIANT-LLM: an Agentic Retrieval Augmented Generation Framework for Reliable Decision Support in Safety-Critical Nuclear Engineering

Reliable decision support in nuclear engineering requires traceable, domain-grounded knowledge retrieval, yet safety and risk analysis workflows remain hampered by fragmented documentation and halluci…

· 1 view
ARXIV CS.AI

When VLMs 'Fix' Students: Identifying and Penalizing Over-Correction in the Evaluation of Multi-line Handwritten Math OCR

Accurate transcription of handwritten mathematics is crucial for educational AI systems, yet current benchmarks fail to evaluate this capability properly. Most prior studies focus on single-line expre…

· 4 views
ARXIV CS.AI

Parameter Efficiency Is Not Memory Efficiency: Rethinking Fine-Tuning for On-Device LLM Adaptation

Parameter-Efficient Fine-Tuning (PEFT) has become the standard for adapting large language models (LLMs). In this work we challenge the widespread assumption that parameter efficiency equates memory …

· 2 views
ARXIV CS.AI

DO-Bench: An Attributable Benchmark for Diagnosing Object Hallucination in Vision-Language Models

Object-level hallucination remains a central reliability challenge for vision-language models (VLMs), particularly in binary object existence verification. Existing benchmarks emphasize aggregate accu…

· 3 views
ARXIV.ORG

LLMs Corrupt Your Documents When You Delegate

Large Language Models (LLMs) are poised to disrupt knowledge work, with the emergence of delegated work as a new interaction paradigm (e.g., vibe coding). Delegation requires trust - the expectation t…

· 5 views
ARXIV.ORG

A Systematic Approach for Large Language Models Debugging

Large language models (LLMs) have become central to modern AI workflows, powering applications from open-ended text generation to complex agent-based reasoning. However, debugging these models remains…

· 4 views
ARXIV.ORG

Discovering Agentic Safety Specifications from 1-Bit Danger Signals

Can large language model agents discover hidden safety objectives through experience alone? We introduce EPO-Safe (Experiential Prompt Optimization for Safe Agents), a framework where an LLM iterative…

· 3 views
ARXIV.ORG

Causal Discovery as Dialectical Aggregation: A Quantitative Argumentation Framework

Constraint-based causal discovery is brittle in finite-sample regimes because erroneous conditional-independence (CI) decisions can cascade into substantial structural errors. We propose Quantitative …

· 3 views