Show HN: Llama-dash – local LLM operators dashboard and proxy

May 19, 2026 · 5:15 PM UTC ·6 min read · 0 reactions · 0 comments · 23 views

⚡ TL;DR · AI summary

Llama-dash is a local AI gateway that provides a unified interface for managing AI model states and request histories. It supports OpenAI and Anthropic clients while offering features like request logging, model management, and customizable routing policies. The dashboard also includes GPU monitoring and metrics for performance tracking.

Key facts

▪Llama-dash turns a self-hosted local inference box into an observable AI gateway.
▪It supports OpenAI-compatible and Anthropic-compatible clients with a single public entrypoint.
▪The dashboard features live stats, model management, request logging, and GPU monitoring.

Original article

GitHub

Read full at GitHub →

Opening excerpt (first ~120 words) tap to expand

llama-dash llama-dash turns a self-hosted local inference box into an observable, policy-controlled AI gateway: one UI for model state, request history, API keys, routing rules, proxy metrics, and client setup. The implemented inference backend is currently llama-swap over llama.cpp. It is the single public entrypoint for OpenAI-compatible and Anthropic-compatible clients. llama-dash owns proxy policy, logging, auth, routing, and backend normalization, your selected inference backend owns local model processes and inference when traffic is routed to local models.

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at GitHub.

Anonymous · no account needed

Discussion

0 comments

Show HN: Llama-dash – local LLM operators dashboard and proxy

Discussion

More from GitHub