7 results for "performance optimization"
CachyOS Linux Performance Leading Over Ubuntu 26.04 LTS, Fedora Workstation 44
It's not too entirely surprising given the aggressive stance that the CachyOS Linux distribution has taken on out-of-the-box performance, but for those curious, it continues largely leading over the n…
Neural Network Optimization Reimagined: Decoupled Techniques for Scratch and Fine-Tuning
With the accumulation of resources in the era of big data and the rise of pre-trained models in deep learning, optimizing neural networks for various tasks often involves different strategies for fine…
Escher-Loop: Mutual Evolution by Closed-Loop Self-Referential Optimization
While recent autonomous agents demonstrate impressive capabilities, they predominantly rely on manually scripted workflows and handcrafted heuristics, inherently limiting their potential for open-ende…
Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing
Serving transformer language models with high throughput requires caching Key-Values (KVs) to avoid redundant computation during autoregressive generation. The memory footprint of KV caching is signif…
Evaluating CUDA Tile for AI Workloads on Hopper and Blackwell GPUs
NVIDIA's CUDA Tile (CuTile) introduces a Python-based, tile-centric abstraction for GPU kernel development that aims to simplify programming while retaining Tensor Core and Tensor Memory Accelerator (…
Xiaomi releases MiMo-v2.5 Family weights with strong coding and agent benchmarks
Peking University gives its computer science students a compiler project every semester. Build a complete SysY compiler in Rust including lexer, parser, abstract syntax tree, IR code generation, assem…
Discovering Agentic Safety Specifications from 1-Bit Danger Signals
Can large language model agents discover hidden safety objectives through experience alone? We introduce EPO-Safe (Experiential Prompt Optimization for Safe Agents), a framework where an LLM iterative…