PopPy: Opportunistically Exploiting Parallelism in Python Compound AI Applications
The paper presents PopPy, a system that speeds up compound AI applications written in Python by uncovering parallelization opportunities. It addresses challenges such as language complexity and variable mutation, and reports significant reductions in end-to-end execution time. PopPy requires minimal developer input and preserves the sequential semantics of each program while optimizing performance.
- PopPy achieves a speedup of up to 6.4× in end-to-end execution time compared to standard Python execution.
- The system combines an ahead-of-time compiler with a runtime to address key challenges in extracting parallelism.
- PopPy supports a very expressive fragment of Python and requires minimal developer input.
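To make the idea of keeping sequential semantics while running independent model calls in parallel concrete, here is a minimal sketch. It is not PopPy's API or implementation: it parallelizes by hand with the standard-library `ThreadPoolExecutor`, whereas the summary describes PopPy as finding such opportunities with little developer input. The names `fake_model_call`, `run_sequential`, and `run_parallel` are hypothetical, and the model call is simulated with `time.sleep`.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_model_call(prompt: str, delay: float = 0.1) -> str:
    """Hypothetical stand-in for a slow, I/O-bound ML model call."""
    time.sleep(delay)
    return f"answer({prompt})"


def run_sequential(prompts):
    """Baseline: issue one call at a time, in program order."""
    return [fake_model_call(p) for p in prompts]


def run_parallel(prompts):
    """Issue independent calls concurrently.

    Executor.map yields results in input order, so the caller observes
    the same list as the sequential version.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(prompts))) as pool:
        return list(pool.map(fake_model_call, prompts))


if __name__ == "__main__":
    prompts = ["summarize", "classify", "translate", "extract"]

    start = time.perf_counter()
    sequential = run_sequential(prompts)
    sequential_time = time.perf_counter() - start

    start = time.perf_counter()
    parallel = run_parallel(prompts)
    parallel_time = time.perf_counter() - start

    # Same results in the same order; only the wall-clock time differs.
    assert sequential == parallel
    print(f"sequential: {sequential_time:.2f}s, parallel: {parallel_time:.2f}s")
```

The sketch only works because the four calls are independent of one another. Calls that read or mutate shared variables are the harder case, which is the kind of challenge the summary says PopPy's compiler and runtime are built to handle.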
Opening excerpt (first ~120 words)
arXiv:2605.18697 (cs), Distributed, Parallel, and Cluster Computing. Submitted on 18 May 2026.
Title: PopPy: Opportunistically Exploiting Parallelism in Python Compound AI Applications
Authors: Stephen Mell, David Mell, Konstantinos Kallas, Steve Zdancewic, Osbert Bastani
Abstract: Compound AI applications, which compose calls to ML models using a general-purpose programming language like Python, are widely used for a variety of user-facing tasks, from software engineering to enterprise automation, making their end-to-end latency a critical bottleneck.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv.org.