WeSearch
TAG · #CUDA

CUDA coverage.

Every story in the WeSearch catalog tagged with #cuda, chronological, with view counts. Subscribe to the per-tag RSS feed to follow this topic in your reader of choice.

23 stories tagged with #cuda, in publish-time order across the WeSearch catalog. Tag pages update as new stories ingest.


RELATED TAGS
#gpu (4) · #ai (4) · #nvidia (3) · #programming (3) · #wrestling (2) · #girls-sports (2) · #lily-calzadilla (2) · #tom-mcmath (2) · #team-barracuda (2) · #apple-m4 (2) · #go (2) · #youth-athletics (1)
R/LOCALLLAMA

Tensor split mode: CUDA error on latest llama.cpp with Qwen-3.6-27b

12 views
TECHMEME

Nvidia says RTX Spark offers up to 20 CPU cores and a Blackwell GPU with 6,144 CUDA cores, capable of "100 FPS 1440p gaming" or running 120B-parameter models (Jeffrey Kampman/Tom's Hardware)

Jeffrey Kampman / Tom's Hardware: Nvidia says RTX Spark offers up to 20 CPU cores and a Blackwell GPU with 6,144 CUDA cores, capable of “100 FPS 1440p gaming” or running 120B-para…

24 views
GITHUB

Show HN: Tiny-vLLM – high performance LLM inference engine in C++ and CUDA

Build your own high performance LLM inference engine in C++ and CUDA - a smaller version of vLLM - jmaczan/tiny-vllm…

16 views
#technology #programming #machine learning
PHORONIX

NVIDIA CUDA 13.3 Rolls Out CUDA Python 1.0, CUDA Tile For C++

22 views
#nvidia #programming
R/RUST

NVIDIA Releases CUDA-Oxide 0.1 For Experimental Rust-To-CUDA Compiler

15 views
ARXIV CS.AI

Towards Feedback-to-Plan Decisions for Self-Evolving LLM Agents in CUDA Kernel Generation

Large language models (LLMs) have shown strong empirical gains as self-evolving agents for CUDA kernel generation, driven by feedback-conditioned planning across generations. Howev…

24 views
#artificial intelligence #machine learning
R/LOCALLLAMA

CUDA: add fast walsh-hadamard transform by am17an · Pull Request #23615 · ggml-org/llama.cpp

17 views
YAHOO SPORTS

Barracuda Defenseman Set to Return to Sweden for 2026-27 Season

One of the San Jose Sharks' pending unrestricted free agents has already gotten his plans for the 2026-27 season sorted out.…

15 views
#hockey #sweden #nhl
DEV.TO (TOP)

The Microsecond Lie: Why your Go timers are lying about the GPU

TL;DR: I thought my CUDA kernel was running in 160 microseconds. I was wrong. Here is how I used CUDA...…

10 views
#programming #gpu #go
DEV.TO (TOP)

Profiling a CUDA Python Program with GPUFlight

In the previous post, I used a C++ CUDA example to look at memory coalescing and how memory access...…

12 views
#python #profiling
PHORONIX

chipStar 1.3 Released For Running HIP/CUDA Code On SPIR-V With OpenCL

A new release of chipStar is now available as the open-source tool for compiling and running HIP/CUDA code in a vendor-neutral manner with the SPIR-V intermediate representation on…

16 views
#programming #open-source #software
DEV.TO (TOP)

Deleting the 8.4GB Python Sidecar: Pure Go + CUDA with `CGO_ENABLED=0`

TL;DR: I built gocudrv so Go services can talk directly to NVIDIA GPUs — no cgo, no CUDA toolkit, no...…

10 views
#programming #ai #go
R/STABLEDIFFUSION

Generated with Flux on AMD RX 580 (2017 GPU) — Vulkan only, no CUDA

18 views
R/SYSADMIN

Barracuda Cloud Email Archive - broken indexing

16 views
REAL CLEAR DEFENSE

Army Orders 3,000 Container-Launched Barracuda-500M Cruise Missiles

Anduril will provide the U.S. Army with a minimum of 3,000 Surface-Launched Barracuda-500M, starting in 2027 with the first 1,000 along with the associated containerized launch sys…

12 views
DEV.TO (TOP)

Calling CUDA from Go without cgo

Go is great at infrastructure. It gives us fast builds, simple deployment, lightweight concurrency,...…

15 views
#go #gpu
PROBABLY DANCE

Programmers Spend Their Time – Probably Dance

I submitted a tiny patch to flash attention. The necessary typing for the change takes less than ten seconds, but the overall change took more than ten hours. So where does the time go? …

16 views
#software development #debugging
GITHUB

Molecular Dynamics on Apple M4

Molecular dynamics on Apple M4 — NEON intrinsics, SME2, Metal compute shaders, OpenMP. Pushing Apple Silicon to its limits. - vyasgiridhar/moleqular…

19 views
#molecular dynamics #apple m4 #high-performance computing
XDA DEVELOPERS

13 years later, the GTX Titan is still the most important GPU Nvidia ever made

Borrowing a $7,000 Tesla and selling it for $1,000 changed everything…

17 views
#gpu #nvidia
HUGGINGFACE

Asynchronicity in Continuous Batching

We’re on a journey to advance and democratize artificial intelligence through open source and open science.…

18 views
#llm inference #gpu optimization #asynchronous processing
ARXIV.ORG

Evaluating CUDA Tile for AI Workloads on Hopper and Blackwell GPUs

NVIDIA's CUDA Tile (CuTile) introduces a Python-based, tile-centric abstraction for GPU kernel development that aims to simplify programming while retaining Tensor Core and Tensor …

20 views
#machine learning #artificial intelligence #hardware architecture
YAHOO SPORTS

'Why can't I?' launched Florida's first girls-duals wrestling team

"Why can't I?" Lily Calzadilla, 14, asked in 2017. Her persistence led to Florida's first all girls-duals wrestling team in Jensen Beach.…

17 views
#wrestling #girls sports #youth athletics
YAHOO SPORTS

'Why can't I?' launched Florida's first girls-duals wrestling team

"Why can't I?" Lily Calzadilla, 14, asked in 2017. Her persistence led to Florida's first all girls-duals wrestling team in Jensen Beach.…

15 views
#wrestling #sports #girls sports