#cuda — Tagged Stories | WeSearch Press

Every story in the WeSearch catalog tagged with #cuda, in reverse chronological order, with view counts. Subscribe to the per-tag RSS feed to follow this topic in your reader of choice.

23 stories tagged with #cuda, ordered by publish time across the WeSearch catalog. Tag pages update as new stories are ingested.

RELATED TAGS

#gpu (4) #ai (4) #nvidia (3) #programming (3) #wrestling (2) #girls-sports (2) #lily-calzadilla (2) #tom-mcmath (2) #team-barracuda (2) #apple-m4 (2) #go (2) #youth-athletics (1)

R/LOCALLLAMA

Tensor split mode: CUDA error on latest llama.cpp with Qwen-3.6-27b

12 views · Wed, 03 Jun 2026 12:42:09 GMT

TECHMEME

Nvidia says RTX Spark offers up to 20 CPU cores and a Blackwell GPU with 6,144 CUDA cores, capable of "100 FPS 1440p gaming" or running 120B-parameter models (Jeffrey Kampman/Tom's Hardware)

24 views · Mon, 01 Jun 2026 05:57:23 GMT

GITHUB

Show HN: Tiny-vLLM – high performance LLM inference engine in C++ and CUDA

Build your own high performance LLM inference engine in C++ and CUDA - a smaller version of vLLM - jmaczan/tiny-vllm…

16 views · Fri, 29 May 2026 19:45:02 GMT

#technology #programming #machine learning

PHORONIX

NVIDIA CUDA 13.3 Rolls Out CUDA Python 1.0, CUDA Tile For C++

22 views · Wed, 27 May 2026 20:33:04 GMT

#nvidia #programming

R/RUST

NVIDIA Releases CUDA-Oxide 0.1 For Experimental Rust-To-CUDA Compiler

15 views · Wed, 27 May 2026 14:08:05 GMT

ARXIV CS.AI

Towards Feedback-to-Plan Decisions for Self-Evolving LLM Agents in CUDA Kernel Generation

Large language models (LLMs) have shown strong empirical gains as self-evolving agents for CUDA kernel generation, driven by feedback-conditioned planning across generations. Howev…

24 views · Wed, 27 May 2026 04:07:56 GMT

#artificial intelligence #machine learning

R/LOCALLLAMA

CUDA: add fast walsh-hadamard transform by am17an · Pull Request #23615 · ggml-org/llama.cpp

17 views · Mon, 25 May 2026 17:37:43 GMT

YAHOO SPORTS

Barracuda Defenseman Set to Return to Sweden for 2026-27 Season

One of the San Jose Sharks' pending unrestricted free agents has already gotten his plans for the 2026-27 season sorted out.…

15 views · Mon, 25 May 2026 00:37:35 GMT

#hockey #sweden #nhl

DEV.TO (TOP)

The Microsecond Lie: Why your Go timers are lying about the GPU

TL;DR: I thought my CUDA kernel was running in 160 microseconds. I was wrong. Here is how I used CUDA…

10 views · Sat, 23 May 2026 19:37:27 GMT

#programming #gpu #go

DEV.TO (TOP)

Profiling a CUDA Python Program with GPUFlight

In the previous post, I used a C++ CUDA example to look at memory coalescing and how memory access…

12 views · Fri, 22 May 2026 06:02:00 GMT

#python #profiling

PHORONIX

chipStar 1.3 Released For Running HIP/CUDA Code On SPIR-V With OpenCL

A new release of chipStar is now available as the open-source tool for compiling and running HIP/CUDA code in a vendor-neutral manner with the SPIR-V intermediate representation on…

16 views · Thu, 21 May 2026 10:26:10 GMT

#programming #open-source #software

DEV.TO (TOP)

Deleting the 8.4GB Python Sidecar: Pure Go + CUDA with `CGO_ENABLED=0`

TL;DR: I built gocudrv so Go services can talk directly to NVIDIA GPUs: no cgo, no CUDA toolkit, no…

10 views · Wed, 20 May 2026 05:04:59 GMT

#programming #ai #go

R/STABLEDIFFUSION

Generated with Flux on AMD RX 580 (2017 GPU) — Vulkan only, no CUDA

18 views · Tue, 19 May 2026 23:35:01 GMT

R/SYSADMIN

Barracuda Cloud Email Archive - broken indexing

16 views · Tue, 19 May 2026 04:35:00 GMT

REAL CLEAR DEFENSE

Army Orders 3,000 Container-Launched Barracuda-500M Cruise Missiles

Anduril will provide the U.S. Army with a minimum of 3,000 Surface-Launched Barracuda-500M, starting in 2027 with the first 1,000 along with the associated containerized launch sys…

12 views · Mon, 18 May 2026 10:39:56 GMT

DEV.TO (TOP)

Calling CUDA from Go without cgo

Go is great at infrastructure. It gives us fast builds, simple deployment, lightweight concurrency,…

15 views · Sun, 17 May 2026 10:22:13 GMT

#go #gpu

PROBABLY DANCE

Programmers Spend Their Time – Probably Dance

I submitted a tiny patch to flash attention. The necessary typing for the change takes less than ten seconds, but the overall change took more than ten hours. So where does the time go? …

16 views · Sun, 17 May 2026 06:03:59 GMT

#software development #debugging

GITHUB

Molecular Dynamics on Apple M4

Molecular dynamics on Apple M4 — NEON intrinsics, SME2, Metal compute shaders, OpenMP. Pushing Apple Silicon to its limits. - vyasgiridhar/moleqular…

19 views · Sun, 17 May 2026 05:33:58 GMT

#molecular dynamics #apple m4 #high-performance computing

XDA DEVELOPERS

13 years later, the GTX Titan is still the most important GPU Nvidia ever made

Borrowing a $7,000 Tesla and selling it for $1,000 changed everything…

17 views · Sat, 16 May 2026 23:05:19 GMT

#gpu #nvidia

HUGGINGFACE

Asynchronicity in Continuous Batching

18 views · Sat, 16 May 2026 07:10:17 GMT

#llm inference #gpu optimization #asynchronous processing

ARXIV.ORG

Evaluating CUDA Tile for AI Workloads on Hopper and Blackwell GPUs

NVIDIA's CUDA Tile (CuTile) introduces a Python-based, tile-centric abstraction for GPU kernel development that aims to simplify programming while retaining Tensor Core and Tensor …

20 views · Wed, 29 Apr 2026 00:27:34 GMT

#machine learning #artificial intelligence #hardware architecture

YAHOO SPORTS