Search: "ai verification" — WeSearch Press

10 stories match your query across our 700+ source catalog. Ranked by relevance and recency.

FinGround: Detecting and Grounding Financial Hallucinations via Atomic Claim Verification

Financial AI systems must produce answers grounded in specific regulatory filings, yet current LLMs fabricate metrics, invent citations, and miscalculate derived quantities. These errors carry direct …

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

AI Identity: Standards, Gaps, and Research Directions for AI Agents

AI agents are now running real transactions, workflows, and sub-agent chains across organizational boundaries without continuous human supervision. This creates a problem no current infrastructure is …

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

Thinking Like a Clinician: A Cognitive AI Agent for Clinical Diagnosis via Panoramic Profiling and Adversarial Debate

The application of large language models (LLMs) in clinical decision support faces significant challenges of "tunnel vision" and diagnostic hallucinations present in their processing unstructured elec…

Tue, 28 Apr 2026 04:13:21 GMT · 2 views

ARXIV.ORG

Structural Enforcement of Goal Integrity in AI Agents via Separation-of-Powers Architecture

Recent evidence suggests that frontier AI systems can exhibit agentic misalignment, generating and executing harmful actions derived from internally constructed goals, even without explicit user reque…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

Agentic AI platforms for autonomous training and rule induction of human-human and virus-human protein-protein interactions

We instruct an AI agent to construct two separate agentic AI platforms: one for autonomous training of predictive ML models for human-human and virus-human PPI, and the other for inducing explicit gen…

Tue, 28 Apr 2026 04:13:21 GMT · 2 views

LabelSets — open quality standard for AI training data (LQS v3.1) [D]

Built a third-party quality rating system for ML datasets. Multi-oracle (7 scorers across 5 algorithm families), conformal prediction intervals on downstream F1, Ed25519-signed certs, and a contaminat…

Sun, 26 Apr 2026 20:54:30 GMT · 5 views

ARXIV.ORG

Mitigating Belief Inertia via Active Intervention in Embodied Agents

Recent advancements in large language models (LLMs) have enabled agents to tackle complex embodied tasks through environmental interaction. However, these agents still make suboptimal decisions and pe…

Tue, 28 Apr 2026 08:54:13 GMT · 2 views

ARXIV.ORG

QED: An Open-Source Multi-Agent System for Generating Mathematical Proofs on Open Problems

We explore a central question in AI for mathematics: can AI systems produce original, nontrivial proofs for open research problems? Despite strong benchmark performance, producing genuinely novel proo…

Tue, 28 Apr 2026 04:13:21 GMT · 5 views

ARXIV.ORG

Certified geometric robustness -- Super-DeepG

Safety-critical applications are required to perform as expected in normal operations. Image processing functions are often required to be insensitive to small geometric perturbations such as rotation…

Tue, 28 Apr 2026 04:13:21 GMT · 2 views

ARXIV.ORG

Aligning with Your Own Voice: Self-Corrected Preference Learning for Hallucination Mitigation in LVLMs

Large Vision-Language Models (LVLMs) frequently suffer from hallucinations. Existing preference learning-based approaches largely rely on proprietary models to construct preference datasets. We identi…

Tue, 28 Apr 2026 04:13:21 GMT · 2 views
