26 results for "agentic ai"
Agentic AI platforms for autonomous training and rule induction of human-human and virus-human protein-protein interactions
We instruct an AI agent to construct two separate agentic AI platforms: one for autonomous training of predictive ML models for human-human and virus-human PPI, and the other for inducing explicit gen…
Agentic clinical reasoning over longitudinal myeloma records: a retrospective evaluation against expert consensus
Multiple myeloma is managed through sequential lines of therapy over years to decades, with each decision depending on cumulative disease history distributed across dozens to hundreds of heterogeneous…
Xiaomi open sources MiMo-V2.5 and MiMo-V2.5-Pro under the MIT License, saying both models are among the most efficient available for agentic "claw" tasks (Carl Franzen/VentureBeat)
Carl Franzen / VentureBeat : Xiaomi open sources MiMo-V2.5 and MiMo-V2.5-Pro under the MIT License, saying both models are among the most efficient available for agentic “claw” tasks — Xiaomi, the Chi…
AMD: Inference And Agentic AI Are Expanding Its Runway
Advanced Micro Devices is Buy-rated on expanding AI demand, strong EPYC/data center momentum, and discounted valuation. Learn more about AMD stock here.…
Agentic CEO – An AI research organism that hunts, critiques, and evolves itself
Show HN: I built a way to see if your SDK is AI-friendly
Have you ever wonder if your SDKs is friendly for Agentic AI like Claude Code or Codex? I built an opensource (Apache 2.0) CLI that answer that question for you. With it you can create a test suite ei…
Why Every Platform Team Shouldn't Build Their AI Standards From Scratch
This is Part 0 of a series on building agentic AI workflows for platform engineering. Part 1 jumped...…
FormalScience: Scalable Human-in-the-Loop Autoformalisation of Science with Agentic Code Generation in Lean
Formalising informal mathematical reasoning into formally verifiable code is a significant challenge for large language models. In scientific fields such as physics, domain-specific machinery (\textit…
A Decoupled Human-in-the-Loop System for Controlled Autonomy in Agentic Workflows
AI agents are increasingly deployed to execute tasks and make decisions within agentic workflows, introducing new requirements for safe and controlled autonomy. Prior work has established the importan…
Active Inference: A method for Phenotyping Agency in AI systems?
The proliferation of agentic artificial intelligence has outpaced the conceptual tools needed to characterize agency in computational systems. Prevailing definitions mainly rely on autonomy and goal-d…
Structural Enforcement of Goal Integrity in AI Agents via Separation-of-Powers Architecture
Recent evidence suggests that frontier AI systems can exhibit agentic misalignment, generating and executing harmful actions derived from internally constructed goals, even without explicit user reque…
LLM-Guided Agentic Floor Plan Parsing for Accessible Indoor Navigation of Blind and Low-Vision People
Indoor navigation remains a critical accessibility challenge for the blind and low-vision (BLV) individuals, as existing solutions rely on costly per-building infrastructure. We present an agentic fra…
FastOMOP: A Foundational Architecture for Reliable Agentic Real-World Evidence Generation on OMOP CDM data
The Observational Medical Outcomes Partnership Common Data Model (OMOP CDM), maintained by the Observational Health Data Sciences and Informatics (OHDSI) collaboration, enabled the harmonisation of el…
The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications
Given the increased use of LLMs in financial systems today, it becomes important to evaluate the safety and robustness of such systems. One failure mode that LLMs frequently display in general domain …
Quoting Romain Huet
Since GPT-5.4, we’ve unified Codex and the main model into a single system, so there’s no separate coding line anymore. GPT-5.5 takes this further, with strong gains in agentic coding, computer use, a…
A 14-day “Growth Forge” sprint: build an AI-powered growth agent on a real stack
Sharing something that sits at the intersection of AI agents and growth systems. VideoDB (backend for video/audio for AI agents) is running a 14-day sprint called Growth Forge for 5 builders to design…
Discovering Agentic Safety Specifications from 1-Bit Danger Signals
Can large language model agents discover hidden safety objectives through experience alone? We introduce EPO-Safe (Experiential Prompt Optimization for Safe Agents), a framework where an LLM iterative…
Agentic Adversarial Rewriting Exposes Architectural Vulnerabilities in Black-Box NLP Pipelines
Multi-component natural language processing (NLP) pipelines are increasingly deployed for high-stakes decisions, yet no existing adversarial method can test their robustness under realistic conditions…
Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols
As LLM agents transition to autonomous digital coworkers, maintaining deterministic goal-directedness in non-linear multi-turn conversations emerged as an architectural bottleneck. We identify and for…
Are there any agentic coding harnesses that AREN'T built on JS and Node?
With how often we hear about supply-chain attacks on npm I am hesitant to install any apps that use it, let alone something like an agent harness that will run constantly unsupervised.…
StoryTR: Narrative-Centric Video Temporal Retrieval with Theory of Mind Reasoning
Current video moment retrieval excels at action-centric tasks but struggles with narrative content. Models can see \textit{what is happening} but fail to reason \textit{why it matters}. This semantic …
PhysNote: Self-Knowledge Notes for Evolvable Physical Reasoning in Vision-Language Model
Vision-Language Models (VLMs) have demonstrated strong performance on textbook-style physics problems, yet they frequently fail when confronted with dynamic real-world scenarios that require temporal …
Qwen3.6-27B-3bit-mlx · Hugging Face: 3 & 5 mixed quant for RAM poor Mac users.
Just dropped a 3bit mixed quant (5bit for embeds and prediction layers) for Mac users. There was only one 3 bit version of this model (from Unsloth), but it was very heavy and painfully slow: This one…
Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model
Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model Big claims from Qwen about their latest open weight model: Qwen3.6-27B delivers flagship-level agentic coding performance, surpassing the previo…
It's a big one
This week's edition of my email newsletter (aka content from this blog delivered to your inbox) features 4 pelicans riding bicycles, 1 possum on an e-scooter, up to 5 raccoons with ham radios hiding i…
Putting Lipstyk on a pig - agents write most of my code, so I wound up making a static slop analysis tool
lipstyk — static analysis for machine-generated code patterns I've been neck deep in agentic dev for a while. Started on Pi, ended up building my own toolset on top of it, and at this point the agents…