2 stories tagged with #agent-benchmarks, in publish-time order across the WeSearch catalog. Tag pages update as new stories are ingested.
HUGGING FACE - BLOG
AI evals are becoming the new compute bottleneck
A blog post by EvalEval Coalition on Hugging Face…
FIRETHERING
Xiaomi releases MiMo-v2.5 Family weights with strong coding and agent benchmarks
Peking University gives its computer science students a compiler project every semester. Build a complete SysY compiler in Rust including lexer, parser, abstract syntax tree, IR co…