21 stories tagged with #pytorch, in publish-time order across the WeSearch catalog. Tag pages update as new stories ingest.
I rewrote WarpFactory in PyTorch so anyone can simulate Warp Drives for free
Holonomy_lib, exact non-Euclidean geometry primitives for PyTorch
Research-grade PyTorch math: differential geometry, spectral graph theory, discrete Ricci flow, simplicial topology, persistent homology, cellular sheaves, SO(3) Lie primitives, in…
What I learned building a debugger for PyTorch training loops and how it changed how I think about failure diagnosis [D]
I made a CPU-only spiking neural network lib that comes pretty close to PyTorch
Profiling PyTorch training without accidentally stalling the GPU [D]
Prefix caching in vLLM under multi-tenant agent traffic
TL;DR: We turned on vLLM's prefix cache for our agent workloads at Nexus Labs and watched TTFT drop…
Need a Linux distro to do some work with OpenCV + PyTorch but unsure which to pick based on the suggestions on the PyTorch website. I have some Linux experience but am certainly not an expert or anything
Characterization of machine learning compilers for LLM inference on NVIDIA GPUs
AI inference is conflicted between Performance, developer Productivity, and device Portability–the P3 problem. Machine learning compilers (MLCs) aim to address this, but their ecos…
I built a Mamba1 variant I call SM1 with d_state=1 that runs on Blackwell in pure PyTorch [P]
PyTorch 2.12 Release
End of a Semester
A story on how I plan to spend my Holiday…
Why your diffusion model is slow at batch size 1 (and what actually helps)
TL;DR: Single-image diffusion inference is bottlenecked by kernel launch overhead and attention…
PyTorch Landscape
Installing ComfyUI + PyTorch for AMD ROCm 7.2, using official drivers.
Your PyTorch Model File Can Execute Arbitrary Code — Here's How I Built a Scanner to Detect It
Every time you run torch.load("model.pt"), you're executing arbitrary Python code. Not "could…
Running PyTorch Models on Apple Silicon GPUs with the ExecuTorch MLX Delegate
ImpactArbiter – A PyTorch autograd trap for LLM memory bugs
Contribute to msunda17/impactarbiter-cli development by creating an account on GitHub.…
Softmax in front of CrossEntropyLoss: 16 other bugs PyTorch won't catch
A walkthrough of the 17-rule design-time linter inside Neurarch: what each rule catches, why it matters, and where static analysis stops being useful for neural networks.…
Programmers Spend Their Time – Probably Dance
I submitted a tiny patch to flash attention. The necessary typing for the change takes less than ten seconds, but the overall change took more than ten hours. So where does the time go? …
PyTorch NaNs Are Silent Killers — So I Built a 3ms Hook to Catch Them at the Exact Layer
NaNs don’t crash your training — they quietly destroy it. After losing hours to a silent failure in a ResNet training run, I built a lightweight detector that pinpoints the exact l…
Porting a Scratch-Built 500M LLM Training Pipeline to ROCm on Strix Halo
A lightweight transformer language model built from scratch in PyTorch, trained on a single consumer GPU with a full pipeline for data processing, pretraining, and instruction tuni…