Show HN: Llama-dash – local LLM operators dashboard and proxy
Llama-dash is a local AI gateway that provides a unified interface for managing AI model states and request histories. It supports OpenAI and Anthropic clients while offering features like request logging, model management, and customizable routing policies. The dashboard also includes GPU monitoring and metrics for performance tracking.
- ▪Llama-dash turns a self-hosted local inference box into an observable AI gateway.
- ▪It supports OpenAI-compatible and Anthropic-compatible clients with a single public entrypoint.
- ▪The dashboard features live stats, model management, request logging, and GPU monitoring.
Opening excerpt (first ~120 words) tap to expand
llama-dash llama-dash turns a self-hosted local inference box into an observable, policy-controlled AI gateway: one UI for model state, request history, API keys, routing rules, proxy metrics, and client setup. The implemented inference backend is currently llama-swap over llama.cpp. It is the single public entrypoint for OpenAI-compatible and Anthropic-compatible clients. llama-dash owns proxy policy, logging, auth, routing, and backend normalization, your selected inference backend owns local model processes and inference when traffic is routed to local models.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at GitHub.