Show HN: YieldOS-Lite – A simulator for LLM inference control-plane governance
YieldOS-Lite is a Phase 1 research simulator for exploring resource governance of heterogeneous LLM inference workloads. It asks whether a slow-path governance control plane can improve SLO-valid work compared with mechanistic schedulers such as continuous batching, chunked prefill, and prefill/decode disaggregation. The simulator is not intended for production use; it serves as a tool for testing governance policies before integrating them with real inference engines.
- YieldOS-Lite is a dependency-free trace simulator for LLM inference resource governance.
- The simulator models various control-plane choices, including SLO urgency and policy cadence.
- Current findings suggest that predictive SLO governance is a promising approach for managing heterogeneous workloads.
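
To make the terms "SLO urgency", "policy cadence", and "SLO-valid work" concrete, here is a minimal toy sketch of a trace-driven loop. It is not taken from the YieldOS-Lite repository: the `Request` fields, the `simulate` function, the earliest-deadline ordering used as a stand-in for SLO urgency, and the sample trace are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical request model; field names are illustrative only.
    arrival: int   # tick at which the request enters the queue
    work: int      # ticks of service needed (must be >= 1)
    deadline: int  # absolute tick by which it must finish to be SLO-valid


def simulate(trace, cadence=None):
    """Serve one request at a time, without preemption.

    cadence=None: plain FIFO baseline.
    cadence=k: every k ticks a slow-path policy re-sorts the queue by
    deadline (an assumed proxy for SLO urgency).
    Returns the total work of requests that finished by their deadline.
    """
    pending = sorted(trace, key=lambda r: r.arrival)  # stable sort
    queue, current, remaining = [], None, 0
    valid_work, tick, i = 0, 0, 0
    while i < len(pending) or queue or current is not None:
        # Admit every request that has arrived by this tick.
        while i < len(pending) and pending[i].arrival <= tick:
            queue.append(pending[i])
            i += 1
        # Slow path: runs only on the policy cadence.
        if cadence is not None and tick % cadence == 0:
            queue.sort(key=lambda r: r.deadline)
        # Fast path: take the head of the queue when idle.
        if current is None and queue:
            current = queue.pop(0)
            remaining = current.work
        # Do one tick of service on the running request.
        if current is not None:
            remaining -= 1
            if remaining == 0:
                if tick + 1 <= current.deadline:
                    valid_work += current.work
                current = None
        tick += 1
    return valid_work


# Made-up trace: two loose-deadline jobs and one short, tight-deadline job.
trace = [
    Request(arrival=0, work=2, deadline=20),
    Request(arrival=1, work=3, deadline=20),
    Request(arrival=1, work=1, deadline=4),
]

print(simulate(trace))             # FIFO baseline: 5
print(simulate(trace, cadence=1))  # policy runs every tick: 6
print(simulate(trace, cadence=5))  # policy runs too rarely to help: 5
```

In this toy trace the tight-deadline request is only served in time when the slow-path policy runs often enough to reorder the queue before the fast path picks the next job. That is the kind of cadence trade-off the summary describes, shown here under heavily simplified assumptions.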
Opening excerpt (first ~120 words)
YieldOS-Lite MVP Simulator

YieldOS-Lite is a Phase 1 research artifact for asking one question: When LLM inference workloads become heterogeneous, does a slow-path resource-governance control plane improve SLO-valid work over mechanistic schedulers such as continuous batching, chunked prefill, and prefill/decode disaggregation?

This repository contains the simulator, paper draft, generated figures, experiment summaries, replay traces, and tests used to explore that question. It is meant to be easy to read cold: start with this README, skim the paper, run the smoke tests, then reproduce or extend the trace-driven experiments.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at GitHub.