Search: "metrics" — WeSearch Press

DEV COMMUNITY

I replaced CAPTCHA with passive biometrics after AI hit 91% bypass rate — 7 biological signals, no puzzles, free tier

CAPTCHA is broken. AI now bypasses reCAPTCHA at 91%+ success rates. Every CAPTCHA you add…

Tue, 28 Apr 2026 15:25:00 GMT · 5 views

FORTUNE

The metrics driving Verizon’s turnaround

Under new CEO Dan Schulman, Verizon posted its first positive Q1 postpaid net adds in more than a decade.…

Tue, 28 Apr 2026 12:09:59 GMT · 5 views

Kernel console live metrics

Sun, 26 Apr 2026 06:01:10 GMT · 7 views

DEV.TO (TOP)

Part 1: Intent vs State — How AWS DevOps Agent Closes the Gap Between What Your System Is and What You Decided It Should Be

When something breaks at 3am, you look at logs, metrics, traces. You don't go and re-read the ADR your team wrote in January. AWS DevOps Agent does. Here's why that changes the first hour of an incide…

Thu, 30 Apr 2026 15:39:43 GMT · 3 views

ARXIV CS.AI

RADIANT-LLM: an Agentic Retrieval Augmented Generation Framework for Reliable Decision Support in Safety-Critical Nuclear Engineering

Reliable decision support in nuclear engineering requires traceable, domain-grounded knowledge retrieval, yet safety and risk analysis workflows remain hampered by fragmented documentation and halluci…

Wed, 29 Apr 2026 04:04:25 GMT · 6 views

ARXIV CS.AI

Quantifying Divergence in Inter-LLM Communication Through API Retrieval and Ranking

Large language models (LLMs) increasingly operate as autonomous agents that reason over external APIs to perform complex tasks. However, their reliability and agreement remain poorly characterized. We…

Wed, 29 Apr 2026 04:04:25 GMT · 5 views

ARXIV CS.AI

Behavioral Intelligence Platforms: From Event Streams to Autonomous Insight via Probabilistic Journey Graphs, Behavioral Knowledge Extraction, and Grounded Language Generation

Contemporary product analytics systems require users to pose explicit queries, such as writing SQL, configuring dashboards, or constructing funnels, before insights can surface. This pull-based paradi…

Wed, 29 Apr 2026 04:04:25 GMT · 4 views

ARXIV CS.AI

When VLMs 'Fix' Students: Identifying and Penalizing Over-Correction in the Evaluation of Multi-line Handwritten Math OCR

Accurate transcription of handwritten mathematics is crucial for educational AI systems, yet current benchmarks fail to evaluate this capability properly. Most prior studies focus on single-line expre…

Wed, 29 Apr 2026 04:04:25 GMT · 8 views

ARXIV CS.AI

BiTA: Bidirectional Gated Recurrent Unit-Transformer Aggregator in a Temporal Graph Network Framework for Alert Prediction in Computer Networks

Proactive alert prediction in computer networks is critical for mitigating evolving cyber threats and enabling timely defensive actions. Temporal Graph Neural Networks (TGNs) provide a principled fram…

Wed, 29 Apr 2026 04:04:25 GMT · 5 views

ARXIV CS.AI

Applied AI-Enhanced RF Interference Rejection

AI-enhanced interference rejection in radio frequency (RF) transmissions has recently attracted interest because deep learning approaches trained on both the signal of interest (SOI) and the signal mi…

Wed, 29 Apr 2026 04:04:25 GMT · 5 views

ARXIV CS.AI

DO-Bench: An Attributable Benchmark for Diagnosing Object Hallucination in Vision-Language Models

Object level hallucination remains a central reliability challenge for vision language models (VLMs), particularly in binary object existence verification. Existing benchmarks emphasize aggregate accu…

Wed, 29 Apr 2026 04:04:25 GMT · 5 views

ARXIV CS.AI

Structure Guided Retrieval-Augmented Generation for Factual Queries

Retrieval-Augmented Generation (RAG) has been proposed to mitigate hallucinations in large language models (LLMs), where generated outputs may be factually incorrect. However, existing RAG approaches …

Wed, 29 Apr 2026 04:04:25 GMT · 5 views

ARXIV.ORG

The Controllability Trap: A Governance Framework for Military AI Agents

Agentic AI systems - capable of goal interpretation, world modeling, planning, tool use, long-horizon operation, and autonomous coordination - introduce distinct control failures not addressed by exis…

Tue, 28 Apr 2026 21:33:22 GMT · 9 views

ANDROID POLICE

I'm officially 'un-locked' and ready to scroll, as long as I'm on my couch

No more biometrics in your trusted places…

Tue, 28 Apr 2026 12:34:59 GMT · 6 views

SEEKING ALPHA

Kimco Realty: Why The Preferred Stocks Offer A Better Risk/Return Than The Common

Kimco Realty (KIM) preferreds KIM.PR.L & KIM.PR.M yield 6.5%+ below par with strong credit metrics—better than common/bonds.…

Tue, 28 Apr 2026 11:52:05 GMT · 6 views

ARXIV.ORG

Do Transaction-Level and Actor-Level AML Queues Agree? An Empirical Evaluation of Granularity Effects on the Elliptic++ Graph

Graph-based anti-money laundering (AML) systems on blockchain networks can score suspicious activity at two granularity levels -- transactions or actor addresses -- yet compliance action is conducted …

Tue, 28 Apr 2026 04:13:21 GMT · 7 views

ARXIV.ORG

MetaGAI: A Large-Scale and High-Quality Benchmark for Generative AI Model and Data Card Generation

The rapid proliferation of Generative AI necessitates rigorous documentation standards for transparency and governance. However, manual creation of Model and Data Cards is not scalable, while automate…

Tue, 28 Apr 2026 04:13:21 GMT · 6 views

ARXIV.ORG

FinGround: Detecting and Grounding Financial Hallucinations via Atomic Claim Verification

Financial AI systems must produce answers grounded in specific regulatory filings, yet current LLMs fabricate metrics, invent citations, and miscalculate derived quantities. These errors carry direct …

Tue, 28 Apr 2026 04:13:21 GMT · 5 views

ARXIV.ORG

Does Machine Unlearning Preserve Clinical Safety? A Risk Analysis for Medical Image Classification

The application of Deep Learning in medical diagnosis must balance patient safety with compliance with data protection regulations. Machine Unlearning enables the selective removal of training data fr…

Tue, 28 Apr 2026 04:13:21 GMT · 5 views

ARXIV.ORG

Context-Aware Hospitalization Forecasting Evaluations for Decision Support using LLMs

Medical and public health experts must make real-time resource decisions, such as expanding hospital bed capacity, based on projected hospitalization trends during large-scale healthcare disruptions (…

Tue, 28 Apr 2026 04:13:21 GMT · 8 views

ARXIV.ORG

CT-FineBench: A Diagnostic Fidelity Benchmark for Fine-Grained Evaluation of CT Report Generation

The evaluation of generated reports remains a critical challenge in Computed Tomography (CT) report generation, due to the large volume of text, the diversity and complexity of findings, and the prese…

Tue, 28 Apr 2026 04:13:21 GMT · 8 views

ARXIV.ORG

The Kerimov-Alekberli Model: An Information-Geometric Framework for Real-Time System Stability

This study introduces the Kerimov-Alekberli model, a novel information-geometric framework that redefines AI safety by formally linking non-equilibrium thermodynamics to stochastic control for the eth…

Tue, 28 Apr 2026 04:13:21 GMT · 5 views

ARXIV.ORG

Multi-Dimensional Evaluation of Sustainable City Trips with LLM-as-a-Judge and Human-in-the-Loop

Evaluating nuanced conversational travel recommendations is challenging when human annotations are costly and standard metrics ignore stakeholder-centric goals. We study LLMs-as-Judges for sustainable…

Tue, 28 Apr 2026 04:13:21 GMT · 7 views

ARXIV.ORG

STELLAR-E: a Synthetic, Tailored, End-to-end LLM Application Rigorous Evaluator

The increasing reliance on Large Language Models (LLMs) across diverse sectors highlights the need for robust domain-specific and language-specific evaluation datasets; however, the collection of such…

Tue, 28 Apr 2026 04:13:21 GMT · 5 views

A 14-day “Growth Forge” sprint: build an AI-powered growth agent on a real stack

Sharing something that sits at the intersection of AI agents and growth systems. VideoDB (backend for video/audio for AI agents) is running a 14-day sprint called Growth Forge for 5 builders to design…

Sun, 26 Apr 2026 20:54:40 GMT · 9 views
