Agentic Compilation: Reducing LLM Rerun Costs

May 23, 2026 · 9:39 PM UTC · 3 min read

#artificial intelligence #automation #cost reduction

⚡ TL;DR · AI summary

The paper proposes Compile-and-Execute, an architecture for cutting the cost of LLM-driven web automation. It targets the Rerun Crisis, the expense of repeatedly querying a model to evaluate browser state and select actions on every run of a repetitive task, by decoupling model reasoning from browser execution. Per-workflow inference cost falls to under 0.10 USD, and reported zero-shot compilation success rates range from 80% to 94% across tasks, which the summary presents as evidence that the approach can make automation economically scalable.

Key facts

▪ LLM-driven web agents face a scalability constraint known as the Rerun Crisis, which leads to high inference costs.
▪ The proposed Compile-and-Execute architecture reduces per-workflow inference costs to under 0.10 USD.
▪ Empirical evaluations indicate zero-shot compilation success rates of 80% to 94% across different tasks.
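
The decoupling described in the key facts can be illustrated with a minimal sketch. This is not the paper's implementation: every name below (`Step`, `CountingPlanner`, `WorkflowCache`, `execute`) is hypothetical, the planner is a hard-coded stand-in for an LLM call, and the "browser" is any callable that receives a step. The sketch only shows the general idea of paying for model reasoning once per workflow and then replaying the result without further inference.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Step:
    """One browser action in a compiled workflow (hypothetical format)."""
    action: str      # e.g. "goto", "fill", "click"
    target: str      # URL or CSS selector
    value: str = ""  # may contain {placeholders} filled in at run time


class CountingPlanner:
    """Stand-in for an LLM; counts how many times it is queried."""

    def __init__(self) -> None:
        self.calls = 0

    def plan(self, task: str) -> List[Step]:
        self.calls += 1
        # Assumption: a real system would prompt a model here.
        return [
            Step("goto", "https://example.com/login"),
            Step("fill", "#user", "{user}"),
            Step("click", "#submit"),
        ]


class WorkflowCache:
    """Compile a task once, then reuse the compiled steps."""

    def __init__(self, planner: CountingPlanner) -> None:
        self.planner = planner
        self._compiled: Dict[str, List[Step]] = {}

    def compile(self, task: str) -> List[Step]:
        if task not in self._compiled:
            self._compiled[task] = self.planner.plan(task)
        return self._compiled[task]


def execute(steps: List[Step], params: Dict[str, str],
            driver: Callable[[Step], None]) -> None:
    """Replay compiled steps with run-specific parameters; no model calls."""
    for step in steps:
        driver(Step(step.action, step.target, step.value.format(**params)))


if __name__ == "__main__":
    planner = CountingPlanner()
    cache = WorkflowCache(planner)
    performed: List[Step] = []
    for i in range(100):
        execute(cache.compile("log in"), {"user": f"user{i}"}, performed.append)
    print(f"runs=100 model_calls={planner.calls} actions={len(performed)}")
```

In this toy setup, 100 runs of the same task trigger a single planner call, whereas an agent that queries the model at every step of every run would make at least one call per action. How the paper handles compilation failures or page changes is not covered by this sketch.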

Original article

arXiv.org

Opening excerpt (first ~120 words)

arXiv:2604.09718 (cs)
Subject: Computer Science > Distributed, Parallel, and Cluster Computing
Submitted on 8 Apr 2026 (v1), last revised 25 Apr 2026 (v2)

Title: Agentic Compilation: Mitigating the LLM Rerun Crisis for Minimized-Inference-Cost Web Automation
Author: Jagadeesh Chundru

Abstract: LLM-driven web agents operating through continuous inference loops -- repeatedly querying a model to evaluate browser state and select actions -- exhibit a fundamental scalability constraint for repetitive tasks.

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv.org.
