10 results for "backend"
How would you price a usage-based backend for a VPN-like SaaS? (Pay-per-GB vs subscription)
I’ve been working on infrastructure for a networking-heavy SaaS over the past couple of years, and I’ve hit an interesting question around pricing. The system is fairly resource-driven (bandwidth, act…
A crazy Claude Code conversation that happened to a colleague the other day
This didn't happen to me but to a colleague. He was working on a Java/Go backend service with Claude Code when it suddenly started hallucinating about Discord.js (a framework that has nothing to do wi…
Xiaomi releases MiMo-v2.5 Family weights with strong coding and agent benchmarks
Peking University gives its computer science students a compiler project every semester. Build a complete SysY compiler in Rust including lexer, parser, abstract syntax tree, IR code generation, assem…
Tutorial: Build High-Throughput APIs with Go 1.24 and Gin 1.10
In 2024, API throughput remains the single biggest bottleneck for 68% of backend teams, with 42% of…
Time-Series Forecasting in Safety-Critical Environments: An EU-AI-Act-Compliant Open-Source Package
With spotforecast2-safe we present an integrated Compliance-by-Design approach to Python-based point forecasting of time series in safety-critical environments. A review of the relevant open-source to…
We benchmarked gpt-oss-120b across 6 inference providers and found a 10x throughput spread
We ran a benchmark across 10+ LLM routers, providers, and inference backends to answer the questions that come up every time someone picks a provider. Key findings: Do LLM routers add latency? No, Ope…
Intel B70: LLama.cpp SYCL vs LLama.cpp OpenVINO vs LLM-Scaler
In case anyone is interested, I decided to test out LLama.cpp's new OpenVINO backend to see how it compares on Intel GPUs. At first glance, it stomps all over the previous best-case, SYCL, but lags be…
Using PaddleOCR-VL-1.5 with llama-server for book OCR
I've been running PaddleOCR-VL-1.5 via llama.cpp's server for OCR on book pages. It handles complex layouts, tables, and mixed text/figure pages surprisingly well. Setup: - Model: PaddleOCR-VL-1.5-GGU…
Qwen 3.6 27B in Claude Code says it will do something then stops and prompts for user reply (not failing a tool call)
I'm running Qwen/Qwen3.6-27B-FP8 via vLLM using this command: vllm serve Qwen/Qwen3.6-27B-FP8 --tensor-parallel-size 4 --gpu-memory-utilization 0.95 --max-num-seqs 8 \ --enable-auto-tool-choice --tool…
A 14-day “Growth Forge” sprint: build an AI-powered growth agent on a real stack
Sharing something that sits at the intersection of AI agents and growth systems. VideoDB (backend for video/audio for AI agents) is running a 14-day sprint called Growth Forge for 5 builders to design…