Agentic Compilation: Reducing LLM Rerun Costs
The paper proposes Compile-and-Execute, an architecture for cutting the cost of LLM-driven web automation. It targets the Rerun Crisis: agents that query a model on every step to evaluate browser state and select actions pay the full inference cost again each time a repetitive task is rerun. Compile-and-Execute decouples model reasoning from browser execution, so that cost is not incurred on every run. The reported evaluations show zero-shot compilation success rates of 80% to 94% across tasks, and the author presents the approach as a route to economically scalable automation.
- LLM-driven web agents face a scalability constraint, the Rerun Crisis, which drives up inference costs for repetitive tasks.
- The proposed Compile-and-Execute architecture reduces per-workflow inference cost to under 0.10 USD.
- Empirical evaluations report zero-shot compilation success rates between 80% and 94% across different tasks.
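The decoupling described above can be illustrated with a minimal sketch. It is not the paper's implementation: `Step`, `CountingModel`, `FakeBrowser`, `compile_workflow`, and `execute` are hypothetical names invented here, the model is a stub that returns a fixed plan, and the browser only records what it is asked to do. The point is only that the model is queried once at compile time, while the resulting plan is replayed with no further inference.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Step:
    """One deterministic browser action (hypothetical schema)."""
    action: str      # e.g. "goto", "fill", "click"
    target: str      # URL or CSS selector
    value: str = ""  # optional template such as "{user}"


class CountingModel:
    """Stand-in for an LLM; it only counts how often it is queried."""

    def __init__(self) -> None:
        self.calls = 0

    def plan(self, task: str) -> List[Step]:
        self.calls += 1
        # A real model would derive these steps from the task description.
        return [
            Step("goto", "https://example.com/login"),
            Step("fill", "#user", "{user}"),
            Step("click", "#submit"),
        ]


class FakeBrowser:
    """Stand-in for a browser driver; it records the actions it receives."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, str, str]] = []

    def run(self, step: Step, params: Dict[str, str]) -> None:
        # Fill in per-run parameters without consulting the model.
        self.log.append((step.action, step.target, step.value.format(**params)))


def compile_workflow(model: CountingModel, task: str) -> List[Step]:
    """Compile phase: a single model query produces a reusable plan."""
    return model.plan(task)


def execute(plan: List[Step], browser: FakeBrowser, params: Dict[str, str]) -> None:
    """Execute phase: replay the plan with no model calls."""
    for step in plan:
        browser.run(step, params)


if __name__ == "__main__":
    model = CountingModel()
    browser = FakeBrowser()
    plan = compile_workflow(model, "log in as the given user")
    for user in ["alice", "bob", "carol"]:
        execute(plan, browser, {"user": user})
    print("model calls:", model.calls)
    print("browser actions:", len(browser.log))
```

In this sketch three runs of a three-step workflow cost one model query. An agent that consulted the model before every action would instead make one query per step per run, which is the cost pattern the summary calls the Rerun Crisis.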
Opening excerpt (first ~120 words)
arXiv:2604.09718 (cs), Computer Science > Distributed, Parallel, and Cluster Computing
Submitted on 8 Apr 2026 (v1), last revised 25 Apr 2026 (v2)
Title: Agentic Compilation: Mitigating the LLM Rerun Crisis for Minimized-Inference-Cost Web Automation
Author: Jagadeesh Chundru
Abstract: LLM-driven web agents operating through continuous inference loops -- repeatedly querying a model to evaluate browser state and select actions -- exhibit a fundamental scalability constraint for repetitive tasks.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv.org.