Show HN: Brokkr - Scalable cluster management for GPU/HPC workloads
Brokkr is a self-hosted, open-source compute platform written in Rust that enables distributed execution of tasks like builds, tests, and ML training across a cluster of Linux machines. It implements the Bazel Remote Execution API v2, allowing compatibility with existing tools while building core distributed systems components from scratch for educational purposes. Currently in active development, Brokkr supports end-to-end job execution with caching and sandboxing but is not yet production-ready.
- Brokkr is implemented in Rust and uses no third-party container runtimes like Docker or runc.
- It supports the Bazel Remote Execution API v2, enabling integration with tools such as Bazel, Buck2, and Pants.
- The platform includes a content-addressable storage (CAS) layer with features like rendezvous hashing, quorum replication, and tiered backends.
- Brokkr implements hermetic sandboxing using Linux namespaces and cgroup-v2 for isolation without relying on external runtimes.
- Development is structured into phases, with Phase 2 focusing on sandboxing and Phase 3 on distributed CAS enhancements.
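
To make the CAS bullet more concrete, here is a minimal sketch of how rendezvous hashing and a majority quorum can fit together. It is not taken from Brokkr's source: the function names (`score`, `pick_replicas`, `quorum`), the node names, and the use of the standard library's `DefaultHasher` are all assumptions chosen for illustration, and a real system would likely use a stable cryptographic or seeded hash instead.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Score a (node, key) pair. `DefaultHasher` is a stand-in (assumption);
/// its output is not guaranteed stable across Rust releases.
fn score(node: &str, key: &str) -> u64 {
    let mut h = DefaultHasher::new();
    node.hash(&mut h);
    key.hash(&mut h);
    h.finish()
}

/// Rendezvous (highest-random-weight) hashing: every node is scored
/// against the key, and the `replicas` highest-scoring nodes hold the blob.
fn pick_replicas<'a>(nodes: &[&'a str], key: &str, replicas: usize) -> Vec<&'a str> {
    let mut scored: Vec<(u64, &'a str)> =
        nodes.iter().map(|&n| (score(n, key), n)).collect();
    // Highest score first; break ties on the node name so the order is deterministic.
    scored.sort_by(|a, b| b.0.cmp(&a.0).then(a.1.cmp(b.1)));
    scored.into_iter().take(replicas).map(|(_, n)| n).collect()
}

/// Majority quorum: how many replicas must acknowledge a write
/// before it counts as durable (hypothetical policy).
fn quorum(replicas: usize) -> usize {
    replicas / 2 + 1
}
```

Calling `pick_replicas(&["node-a", "node-b", "node-c", "node-d"], "blob-digest-123", 3)` yields three of the four hypothetical nodes, and `quorum(3)` says two of them must confirm a write. Because each node's score depends only on that node and the key, removing a node that was not selected leaves the placement of the blob unchanged, which is the property that makes this scheme attractive for a storage layer whose membership changes over time.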
Opening excerpt (first ~120 words)
Brokkr A distributed build & compute grid, written in Rust. Many hammers. One forge. Brokkr is a self-hosted, open-source compute platform that turns a fleet of Linux machines into a single, coherent grid for executing arbitrary jobs — builds, tests, ML training, transcoding, anything that fits inside a sandbox. It speaks the Bazel Remote Execution API v2 so existing tooling (bazel, buck2, pants, custom REAPI clients) works unchanged. The interesting parts of distributed computing — content-addressable storage, hermetic sandboxing, scheduling, and consensus — are implemented from scratch as the project's educational core. There is no Docker, no runc, no embedded etcd, no third-party Raft. Status: Phase 1 complete; Phases 2 and 3 in flight.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at GitHub.