Cross-Lingual Token Arbitrage: Optimizing Code Agent Context Windows via Local LLM Preprocessing
The paper proposes a way to cut input-token costs for AI coding agents. It introduces a middleware that preprocesses prompts before they reach the agent, targeting non-English text in particular. Reported results show prompt-token reductions of 34-47 percent while maintaining or improving task accuracy across various coding benchmarks.
- AI-assisted coding agents face challenges due to input-token costs, especially with non-English text.
- The proposed middleware uses a local model for cross-lingual translation and prompt optimization.
- The method reduces prompt tokens by 34-47 percent and total tokens by up to 18.8 percent without sacrificing accuracy.
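To make the idea in the points above concrete, here is a minimal Python sketch of a prompt-preprocessing step and of how a prompt-token reduction percentage can be computed. It is an illustration under stated assumptions, not the paper's implementation: `preprocess_prompt`, `PreprocessResult`, `toy_rewrite`, and `whitespace_token_count` are hypothetical names, the rewrite function stands in for a local model, the whitespace counter stands in for a real tokenizer, and the fallback to the original prompt when the rewrite is not shorter is a design choice of this sketch.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PreprocessResult:
    original: str
    optimized: str
    original_tokens: int
    optimized_tokens: int

    @property
    def reduction_percent(self) -> float:
        """Percentage of prompt tokens saved by preprocessing."""
        if self.original_tokens == 0:
            return 0.0
        saved = self.original_tokens - self.optimized_tokens
        return 100.0 * saved / self.original_tokens


def preprocess_prompt(
    prompt: str,
    rewrite: Callable[[str], str],
    count_tokens: Callable[[str], int],
) -> PreprocessResult:
    """Rewrite a prompt locally; keep the original if the rewrite is not shorter.

    `rewrite` is a placeholder for a local model call (translation and/or
    restructuring). `count_tokens` is a placeholder for the target model's
    tokenizer. Both are assumptions of this sketch.
    """
    candidate = rewrite(prompt)
    original_tokens = count_tokens(prompt)
    candidate_tokens = count_tokens(candidate)
    if candidate_tokens >= original_tokens:
        # Assumed safeguard: never forward a prompt that costs more tokens.
        candidate, candidate_tokens = prompt, original_tokens
    return PreprocessResult(prompt, candidate, original_tokens, candidate_tokens)


def whitespace_token_count(text: str) -> int:
    """Toy token counter: whitespace-separated words, not a real tokenizer."""
    return len(text.split())


def toy_rewrite(text: str) -> str:
    """Toy stand-in for a local model: drops a few conversational fillers."""
    filler = {"please", "kindly", "um", "basically"}
    kept = [w for w in text.split() if w.lower().strip(",.") not in filler]
    return " ".join(kept)


if __name__ == "__main__":
    result = preprocess_prompt(
        "Please, um, fix the failing test in parser.py",
        toy_rewrite,
        whitespace_token_count,
    )
    print(result.optimized)
    print(f"{result.reduction_percent:.1f}% fewer prompt tokens")
```

In practice the two callables would be swapped for a real local model and the downstream model's tokenizer; the percentage is only meaningful when measured with the tokenizer that is actually billed.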
Opening excerpt (first ~120 words)
arXiv:2606.03618 [cs.AI], submitted on 2 Jun 2026
Authors: Mehmet Utku Colak
Abstract: AI-assisted coding agents are bottlenecked by input-token cost. Two pathologies of raw human input drive much of this overhead: tokenization inefficiency for non-English text and structural entropy in conversational prompts. Existing approaches act reactively by compressing already-bloated contexts or intervening after failures occur.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.