29 stories tagged with #llama, in publish-time order across the WeSearch catalog. Tag pages update as new stories are ingested.
Show HN-style: Blue Arrow – modular orchestration system with state-driven execution, local LLaMA integration and post-execution verification
Normalized Categories: One Filter for "Polos" Across Every Supplier
If you've ever tried to search "polos under $10 in navy" across more than one supplier, you already…
Meta abandons open-source Llama for proprietary Muse Spark
Meta has shifted from Llama to its new proprietary AI model Muse Spark, leaving open-source developers searching for alternatives and migration paths.…
Mastering On-Device GenAI: How to Fine-Tune LLMs for Android Using LoRA and Kotlin 2.x
The dream of a truly personal AI—one that lives entirely on your smartphone, understands your medical…
llama.cpp's Preliminary SM120 Native NVFP4 MMQ Is Merged
And somehow we already got some GGUFs for it! (the one below is from the PR author himself)…
Pebble – Menu-bar text polisher running on local Ollama
Menu-bar text-polish tool that rewrites your clipboard with a local Ollama model. One global shortcut, seven presets, no cloud. - gashiartim/pebble…
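A minimal sketch of the pattern such a tool sits on, assuming Ollama's standard /api/generate endpoint on its default port; the preset names and prompt wording below are invented for illustration, not taken from Pebble:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

# Hypothetical presets; Pebble's actual seven presets aren't listed in the post.
PRESETS = {
    "concise": "Rewrite the following text to be shorter and clearer:",
    "formal": "Rewrite the following text in a formal register:",
}

def polish(text: str, preset: str = "concise", model: str = "llama3.2") -> str:
    """Send clipboard text to a local Ollama model and return the rewrite."""
    payload = {
        "model": model,
        "prompt": f"{PRESETS[preset]}\n\n{text}",
        "stream": False,  # get one JSON object back instead of a token stream
    }
    resp = requests.post(OLLAMA_URL, json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()["response"].strip()

if __name__ == "__main__":
    print(polish("me and him was discussing the roadmap yesterday"))
```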
Arc Gate — LLM proxy that hits P=1.00 R=1.00 F1=1.00 on indirect/roleplay prompt injection (beats OpenAI Moderation and LlamaGuard)
Benchmarked on 40 out-of-distribution prompts: indirect requests, roleplay framings, hypothetical scenarios, technical phrasings. The stuff that slips past everything else. Arc Gat…
convert : add support for Nemotron Nano 3 Omni by danbev · Pull Request #22481 · ggml-org/llama.cpp
NVIDIA Nemotron 3 Nano Omni is a multimodal large language model that unifies video, audio, image, and text understanding to support enterprise-grade Q&A, summarization, transcript…
Arc Gate — LLM proxy that catches 100% of indirect/roleplay prompt injection attacks (beats OpenAI Moderation and LlamaGuard)
Built an LLM proxy that sits in front of any OpenAI-compatible endpoint and blocks prompt injection before it reaches your model. Benchmarked against OpenAI Moderation API and Llam…
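The post doesn't reveal Arc Gate's detector, but the proxy pattern it describes is simple to sketch; the phrase-matching check below is a crude stand-in for whatever classifier Arc Gate actually runs:

```python
import requests

UPSTREAM = "http://localhost:8000/v1/chat/completions"  # any OpenAI-compatible server

# Stand-in heuristic, NOT Arc Gate's method: it only marks where the check
# sits in the request path before traffic reaches the model.
SUSPECT_PHRASES = ("ignore previous instructions", "pretend you are", "hypothetically, as")

def looks_injected(messages: list[dict]) -> bool:
    text = " ".join(str(m.get("content", "")) for m in messages).lower()
    return any(p in text for p in SUSPECT_PHRASES)

def proxy_chat(payload: dict) -> dict:
    """Screen the prompt; forward to the upstream model only if it passes."""
    if looks_injected(payload.get("messages", [])):
        return {"error": "blocked: possible prompt injection"}
    resp = requests.post(UPSTREAM, json=payload, timeout=120)
    resp.raise_for_status()
    return resp.json()
```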
Step-by-Step Guide to Building RAG with LlamaIndex 0.10 and Vector 0.4 for Docs Search
80% of engineering teams building RAG pipelines for internal documentation search waste 3+ weeks…
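The core loop of a LlamaIndex 0.10 docs-search pipeline is short; a minimal sketch, assuming the default in-memory vector store and an already-configured embedding/LLM backend (e.g. an OpenAI key in the environment) rather than the article's specific "Vector 0.4" store:

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Load internal docs from a local folder (path is illustrative)
documents = SimpleDirectoryReader("./docs").load_data()

# Embed the documents and build an in-memory vector index
index = VectorStoreIndex.from_documents(documents)

# Retrieve relevant chunks and synthesize an answer
query_engine = index.as_query_engine()
print(query_engine.query("How do we rotate the staging API keys?"))
```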
Show HN: DeadNet – Watch AI agents debate, play games, and write stories live
DeadNet is a live arena where AI agents debate, play games, and write stories while humans watch and vote. Watch matches or build your own agent.…
A Primer on LLM Post-Training
Duality of r/LocalLLaMA
Step-by-Step Guide to Setting Up Local AI Code Review with Continue.dev 0.9, Ollama 0.5, and ESLint 9
82% of engineering teams report that cloud-based AI code review tools leak sensitive IP, cost 4x more…
Offline Agentic Coding
Offline Agentic Coding: Ollama and Claude Code…
VRAM.cpp: Running llama-fit-params directly in your browser
Lots of people are always asking on this subreddit if their system can run a certain model. A lot of the "VRAM calculators" that I've found only provide either very rough estimates…
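For contrast, this is the back-of-envelope math rough calculators start from: quantized weights plus KV cache, with everything else ignored. The shapes below are Llama-3-8B-like assumptions; real usage adds compute buffers and backend overhead, which is exactly the gap a tool like llama-fit-params tries to close:

```python
def estimate_vram_gib(n_params_b: float, quant_bits: float,
                      n_layers: int, n_kv_heads: int, head_dim: int,
                      ctx_len: int, kv_bytes: int = 2) -> float:
    """Rough VRAM estimate: quantized weights + FP16 KV cache, nothing else."""
    weights = n_params_b * 1e9 * quant_bits / 8                            # model weights
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * ctx_len * kv_bytes   # K and V tensors
    return (weights + kv_cache) / 1024**3

# Example: an 8B model at ~4.5 bits/weight with 8K context
# (32 layers, 8 KV heads, head_dim 128 -- Llama-3-8B-like shapes).
print(f"{estimate_vram_gib(8, 4.5, 32, 8, 128, 8192):.1f} GiB")  # ~5.2 GiB
```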
Intel B70: llama.cpp SYCL vs llama.cpp OpenVINO vs LLM-Scaler
In case anyone is interested, I decided to test out llama.cpp's new OpenVINO backend to see how it compares on Intel GPUs. At first glance, it stomps all over the previous best-cas…
The cost math behind routing Claude Code through Ollama (~90% cut)
Pair Claude Desktop on Anthropic with Claude Code routed through Ollama. Visual walkthrough + copy-paste prompt that cuts your Claude Code bill ~90%. - Coherence-Daddy/use-ollama-t…
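One way the arithmetic can come out to ~90%, with placeholder token prices and volumes (these are assumptions for illustration, not current Anthropic rates):

```python
# Assumed prices in USD per million tokens -- placeholders, not real rates.
PRICE_IN, PRICE_OUT = 3.00, 15.00

def monthly_cost(in_mtok: float, out_mtok: float, local_share: float) -> float:
    """API spend when `local_share` of traffic is routed to a free local Ollama model."""
    cloud_fraction = 1.0 - local_share
    return cloud_fraction * (in_mtok * PRICE_IN + out_mtok * PRICE_OUT)

baseline = monthly_cost(50, 10, 0.0)  # everything on the API: $300
routed   = monthly_cost(50, 10, 0.9)  # ~90% of tokens handled locally: $30
print(f"${baseline:.0f} -> ${routed:.0f} ({1 - routed / baseline:.0%} cut)")
```

The cut tracks the routed share directly, so the real question the post has to answer is how much of Claude Code's traffic a local model can absorb without degrading results.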
Mesa PR with 37–130% llama.cpp pp perf gain for Vulkan on Linux on Intel Xe2
Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model
Big claims from Qwen about their latest open-weight model: Qwen3.6-27B delivers flagship-level agentic coding performance, s…
r/LocalLLaMA Rule Updates
As the sub has grown to over 1M weekly visitors (and as AI-based tools have gotten better), we've seen a marked increase in slop, spam, etc. This has been on the mod team's mind …
Using PaddleOCR-VL-1.5 with llama-server for book OCR
I've been running PaddleOCR-VL-1.5 via llama.cpp's server for OCR on book pages. It handles complex layouts, tables, and mixed text/figure pages surprisingly well. Setup: - Model: …
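A minimal client for that setup could look like the sketch below, assuming llama-server was launched with the model plus its multimodal projector so its OpenAI-compatible endpoint accepts images; the prompt wording is illustrative:

```python
import base64
import requests

SERVER = "http://localhost:8080/v1/chat/completions"  # llama-server default port

def ocr_page(image_path: str) -> str:
    """Send one scanned book page to the vision model behind llama-server."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    payload = {
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Transcribe all text on this page, preserving layout."},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
        "temperature": 0.0,  # deterministic decoding helps OCR consistency
    }
    resp = requests.post(SERVER, json=payload, timeout=300)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

print(ocr_page("page_001.png"))
```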
Benchmark: Windows 11 vs Lubuntu 26.04 on Llama.cpp (RTX 5080 + i9-14900KF). I didn't expect the gap to be this big.
UPDATE: Vulkan benches are now included. And yes, I used AI to help me write this post. As a life-long Windows user (don't hate me, I was exposed to it at a young age) I was wonde…
llama.cpp DeepSeek v4 Flash experimental inference
Hi, here you can find experimental llama.cpp support for DeepSeek v4, and here is the GGUF you can use to run the inference with "just" (lol) 128GB of RAM. The model, even qu…
Will llama.cpp multislot improve speed?
I've heard mostly bad opinions about multiple slots with llama.cpp (--parallel > 1). I guess compared to vLLM it might be worse at this, but I recently tried vLLM on 4 slots and i…
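An easy way to answer this for your own hardware is to fire the same batch of requests sequentially and then concurrently and compare aggregate throughput. A rough sketch, assuming llama-server is running with --parallel 4 (keep in mind the total context set by -c is shared across slots):

```python
import time
import requests
from concurrent.futures import ThreadPoolExecutor

SERVER = "http://localhost:8080/completion"  # llama-server's native completion endpoint

def one_request(prompt: str) -> int:
    """Run one completion and return the number of generated tokens."""
    r = requests.post(SERVER, json={"prompt": prompt, "n_predict": 128}, timeout=300)
    r.raise_for_status()
    return r.json()["tokens_predicted"]

prompts = [f"Write a limerick about GPU #{i}." for i in range(4)]

# Same four prompts: first one at a time, then all in flight at once.
for workers in (1, 4):
    start = time.time()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        tokens = sum(pool.map(one_request, prompts))
    print(f"{workers} in flight: {tokens / (time.time() - start):.1f} tok/s aggregate")
```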
Experts/volunteers needed for Vulkan on ik_llama.cpp
ik_llama.cpp is great for both CPU & CUDA. Need legends to make Vulkan better as well. So, after bringing the Vulkan back-end up to speed some time ago, I felt that I simply don't …
This is where we are right now, LocalLLaMA
the future is now…
CUDA: reduce MMQ stream-k overhead by JohannesGaessler · Pull Request #22298 · ggml-org/llama.cpp
CUDA prompt processing speedup on MoE. Check this…
FP4 inference in llama.cpp (NVFP4) and ik_llama.cpp (MXFP4) landed - Finally
Both llama.cpp and ik_llama.cpp now have FP4 support — but with different flavors worth knowing about. llama.cpp recently merged NVFP4 (Nvidia's block-scaled FP4, `GGML_TYPE_NVFP4 …
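As a toy illustration of what block-scaled FP4 means (not the actual GGML bit packing), here is the rounding behaviour with a shared per-block scale; the power-of-two variant mimics MXFP4's coarse scale format, while the unconstrained variant approximates NVFP4's finer-grained scales:

```python
import numpy as np

E2M1_LEVELS = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])  # FP4 (E2M1) magnitudes

def fake_quantize_block(x: np.ndarray, pow2_scale: bool) -> np.ndarray:
    """Quantize one block to FP4 against a shared scale, then dequantize."""
    scale = np.abs(x).max() / E2M1_LEVELS[-1]          # map block max onto FP4's max (6.0)
    if pow2_scale:
        scale = 2.0 ** np.ceil(np.log2(scale))         # MXFP4-style power-of-two scale
    codebook = np.concatenate([-E2M1_LEVELS[::-1], E2M1_LEVELS])  # signed FP4 values
    idx = np.abs(x[:, None] / scale - codebook[None, :]).argmin(axis=1)
    return codebook[idx] * scale

block = np.random.randn(32).astype(np.float32)  # MXFP4 blocks are 32 wide, NVFP4 16 wide
err_mx = np.abs(block - fake_quantize_block(block, pow2_scale=True)).mean()
err_nv = np.abs(block - fake_quantize_block(block, pow2_scale=False)).mean()
print(f"mean abs error  pow2 scale: {err_mx:.4f}   free scale: {err_nv:.4f}")
```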