Show HN: AI/ML benchmark for local LLM inference and XGBoost training on GPU/CPU

May 16, 2026 · 8:48 AM UTC · 5 min read

#ai #machine learning #benchmarking #gpu #python

TL;DR · WeSearch summary

The AI/ML GPU Bench Suite is a Python-based tool for benchmarking local GPU and CPU performance on AI and machine learning workloads with a single command. It covers Ollama LLM inference and XGBoost training on GPU or CPU, generates an interactive HTML report, and can optionally upload results to a public Streamlit dashboard. The suite is open-source, reproducible, and designed for easy setup and comparison against reference systems.

Key facts

▪ The benchmark suite tests Ollama LLMs (3B to 14B parameters) for token latency and throughput.
▪ It evaluates XGBoost training and inference on the HIGGS dataset with data sizes from 100k to over 10M rows.
▪ Results are stored in CSVs, visualized in an auto-generated Jupyter notebook, and optionally uploaded securely to a public Streamlit dashboard.
▪ The tool uses a YAML configuration file and a single runner script to automate the entire benchmark process.
▪ Users can contribute anonymized results to help grow a public database of AI/ML hardware performance.
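The excerpt does not show how the suite computes token latency and throughput, so the following is only a minimal sketch of one plausible approach. It assumes the timing fields that Ollama's REST API reports in a non-streaming generate response (`eval_count`, plus `eval_duration` and `prompt_eval_duration` in nanoseconds). The function name `summarize_ollama_response` and the sample numbers are hypothetical, not taken from the project.

```python
from typing import Any, Dict

NS_PER_SECOND = 1_000_000_000  # Ollama reports durations in nanoseconds


def summarize_ollama_response(resp: Dict[str, Any]) -> Dict[str, float]:
    """Derive throughput and per-token latency from one generate response.

    Hypothetical helper: the benchmark suite may compute these differently.
    """
    eval_count = resp["eval_count"]  # number of generated tokens
    eval_seconds = resp["eval_duration"] / NS_PER_SECOND
    prompt_seconds = resp.get("prompt_eval_duration", 0) / NS_PER_SECOND
    return {
        "tokens_per_second": eval_count / eval_seconds if eval_seconds > 0 else 0.0,
        "ms_per_token": 1000.0 * eval_seconds / eval_count if eval_count else 0.0,
        "prompt_eval_seconds": prompt_seconds,
    }


# Made-up sample values shaped like the timing fields of a real response.
sample = {
    "eval_count": 200,
    "eval_duration": 4_000_000_000,
    "prompt_eval_duration": 500_000_000,
}
summary = summarize_ollama_response(sample)
print(summary)
```

Working from the server-reported counters rather than wall-clock time around the HTTP call keeps network and model-load overhead out of the throughput figure, which is one reason a benchmark might prefer them.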

Original article

GitHub

Opening excerpt (first ~120 words)

AI & ML GPU Bench Suite for Python

Objective

One command → a full GPU/CPU benchmark & an interactive HTML report

You can now measure your consumer GPU and/or CPU performance on typical Artificial Intelligence and Machine Learning workloads in a controlled way, with some pre‑set reference results.

The reproducible benchmarks cover:

▪ Ollama LLMs (token latency & throughput on various 3B → 14B parameter models)
▪ XGBoost (training & inference on the HIGGS dataset, on 100k → 10M+ rows)

Everything is orchestrated by a single YAML file (ai_bench_suite.yaml) and a runner script (run_suite.py), so you can launch an entire set of tests with one command.
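To illustrate the "one config file plus one runner" pattern described above, here is a small sketch of a config-driven benchmark loop that times each entry and emits CSV rows. It is not the project's `run_suite.py`: the config schema, the `WORKLOADS` registry, and the stand-in `cpu_sum_workload` are all assumptions made for illustration, and the config is an inline dict where a real runner would presumably parse the YAML file.

```python
import csv
import io
import time
from typing import Callable, Dict, List


def cpu_sum_workload(n_rows: int) -> int:
    """Stand-in workload; a real suite would train or run a model here."""
    return sum(range(n_rows))


# Hypothetical registry mapping config names to benchmark callables.
WORKLOADS: Dict[str, Callable[[int], object]] = {"cpu_sum": cpu_sum_workload}

# Hypothetical schema; a real runner would load this from a YAML file.
CONFIG = {
    "benchmarks": [
        {"name": "cpu_sum", "rows": 100_000},
        {"name": "cpu_sum", "rows": 1_000_000},
    ]
}


def run_suite(config: dict) -> List[dict]:
    """Run every configured benchmark once and record its wall-clock time."""
    results = []
    for entry in config["benchmarks"]:
        workload = WORKLOADS[entry["name"]]
        start = time.perf_counter()
        workload(entry["rows"])
        elapsed = time.perf_counter() - start
        results.append(
            {"name": entry["name"], "rows": entry["rows"], "seconds": round(elapsed, 6)}
        )
    return results


def to_csv(rows: List[dict]) -> str:
    """Serialize result rows to CSV text with a fixed column order."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["name", "rows", "seconds"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


results = run_suite(CONFIG)
csv_text = to_csv(results)
print(csv_text)
```

Keeping the workloads behind a name-to-callable registry means a new benchmark only needs a function and a config entry, which is one way a single command can drive a whole set of tests.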

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at GitHub.
