#transformer — Tagged Stories

Every story in the WeSearch catalog tagged with #transformer, listed newest first with view counts. Subscribe to the per-tag RSS feed to follow this topic in your reader of choice.

56 stories tagged with #transformer, in publish-time order across the WeSearch catalog. Tag pages update as new stories ingest.


RELATED TAGS

#transformers (15), #ai (14), #ml (13), #transformer-models (4), #entertainment (2), #hasbro (2), #machinelearning (2), #anniversary (2), #model-serving (1), #state-space-models (1), #distributed-systems (1), #llm-architecture (1)

GIZMODO

Two Years Later, We’re Finally Learning How a Transformers-Inspired Rover Fared on the Moon

SORA-Q showed that tiny robots could do big things on the Moon.…

11 views · Wed, 10 Jun 2026 18:03:35 GMT

ARXIV CS.AI

Evaluating Transformer and LSTM Frameworks for Prediction in Ungauged Basins

Watershed networks exhibit convergent topologies in which multiple tributaries merge into downstream channels, integrating diverse upstream hydrological processes. In ungauged basin…

15 views · Wed, 03 Jun 2026 04:11:55 GMT

#artificial intelligence #machine learning #hydrology

ARXIV CS.AI

RelGT-AC: A Relational Graph Transformer for Autocomplete Tasks in Relational Databases

Relational databases underpin modern enterprise, scientific, and healthcare systems, yet predictive machine learning on such data remains challenging due to their multi-table, hete…

17 views · Wed, 03 Jun 2026 04:11:55 GMT

#artificial intelligence #machine learning #databases

KDNUGGETS

Practical NLP in the Browser with Transformers.js

This tutorial covers three NLP tasks: text classification, zero-shot labelling, and question answering using Transformers.js's pipeline() API.…

23 views · Fri, 29 May 2026 14:05:00 GMT

#nlp #javascript #transformers

R/STABLEDIFFUSION

PrismML just released Binary and Ternary Bonsai Image 4B: 1-bit/ternary text-to-image diffusion transformers that can even run 100% locally in your browser on WebGPU.

23 views · Tue, 26 May 2026 20:07:57 GMT

ALEKSAGORDIC

The Transformer: The Life of a Token

A deep dive into a modern dense transformer: YaRN, hybrid attention, soft capping, QK normalization, FLOPs/token, cluster sizing, and more.…

13 views · Tue, 26 May 2026 17:37:50 GMT

#technology #artificial intelligence #machine learning

YAHOO FINANCE

Transformers Foundation Outlines Why Many Traceability Tools Fail to Meet the Standards of New Regulation

13 views · Tue, 26 May 2026 11:22:51 GMT

DEV.TO (TOP)

Transformer as an Incomplete Cognitive Architecture: What It Captures Well and What It Misses (A11 Perspective)

Since its introduction, the transformer architecture has become the cornerstone of modern artificial...…

14 views · Tue, 26 May 2026 11:07:48 GMT

#ai #architecture #machinelearning

R/SINGULARITY

One of the authors of "Attention is All You Need" just argued we should move past it. Pathway’s Post-Transformer debate is worth watching

24 views · Mon, 25 May 2026 17:07:42 GMT

R/SINGULARITY

One of the authors of "Attention is All You Need" just argued we should move past it. Pathway’s Post-Transformer debate is worth watching

11 views · Mon, 25 May 2026 16:07:42 GMT

THE HINDU — TOP

Ahead of monsoon, CESC sets up transformer banks across taluks

CESC establishes transformer banks across taluks to ensure uninterrupted power supply during the monsoon season in Mysuru and surrounding districts.…

16 views · Mon, 25 May 2026 14:42:37 GMT

#electricity #monsoon #infrastructure

TOWARDS DATA SCIENCE

From TF-IDF to Transformers: Implementing Four Generations of Semantic Search

How did semantic search evolve from simple keyword matching into modern transformer-based language understanding? This hands-on article builds four generations of semantic search s…

14 views · Mon, 25 May 2026 13:32:37 GMT

#ai #machine learning #semantic search

ARXIV CS.AI

Tensor Cache: Eviction-conditioned Associative Memory for Transformers

Autoregressive Transformer KV caches grow linearly with context length; sliding-window caching bounds memory but discards evicted tokens entirely, so relevant evidence outside the …

16 views · Mon, 25 May 2026 04:07:35 GMT

#machine learning #artificial intelligence #transformers

ARXIV CS.AI

Every Component is a Lookup: Token Attribution and Composition from a Single Decomposition

Mechanistic interpretability of transformers requires identifying not just which components matter but how they compose into the computational route that produced a prediction. Bot…

15 views · Mon, 25 May 2026 04:07:35 GMT

#machine learning #artificial intelligence #transformers

XDA DEVELOPERS

Most gamers aren't actually using DLSS 4.5's new transformer model — here's why and how to fix it

I wouldn't go with Nvidia's recommended defaults.…

13 views · Sat, 23 May 2026 23:07:28 GMT

#gaming #technology #nvidia

R/STABLEDIFFUSION

SEGA: Spectral-Energy Guided Attention for Resolution Extrapolation in Diffusion Transformers

18 views · Sat, 23 May 2026 08:07:28 GMT

THE HINDU — TOP

Transformer failure halts Rapid Metro, triggers blackout in half-a-dozen Gurugram sectors

Transformer failure disrupts Rapid Metro services in Gurugram, causing widespread power outages for nearly 45 minutes.…

15 views · Sat, 23 May 2026 03:17:24 GMT

#transportation #power outage #gurugram

TIMES OF INDIA — TOP

Gurugram hit by major power outage after transformer blaze disrupts supply

NEW DELHI: Power outage hit Gurugram after the main transformer at the 220 KVA power station in Sector-72 caught fire, disrupting electricity supply across several parts of the cit…

12 views · Fri, 22 May 2026 17:27:02 GMT

#gurugram #power outage #infrastructure

ARXIV.ORG

Coda: Rewriting Transformer Blocks as GEMM-Epilogue Programs

Transformer training systems are built around dense linear algebra, yet a nontrivial fraction of end-to-end time is spent on surrounding memory-bound operators. Normalization, acti…

11 views · Fri, 22 May 2026 05:02:00 GMT

#machine learning #transformers #gpu

ARXIV CS.AI

Plug-and-Play Spiking Operators: Breaking the Nonlinearity Bottleneck in Spiking Transformers

ANN-to-SNN conversion offers a practical, training-free route to spiking large language models. However, current pipelines primarily focus on spike-driven realizations for Transfor…

15 views · Fri, 22 May 2026 04:02:00 GMT

#machine learning #artificial intelligence #neuroscience

ARXIV CS.AI

Weight Decay Regimes in Grokking Transformers: Cheap Online Diagnostics

Transformers trained on modular arithmetic exhibit sharp transitions between memorization, generalization, and collapse. We show that weight decay acts as a scalar empirical contro…

15 views · Fri, 22 May 2026 04:02:00 GMT

#machine learning #artificial intelligence #neural computing

ARXIV CS.AI

Rethinking Cross-Layer Information Routing in Diffusion Transformers

Diffusion Transformers (DiTs) have become a de facto backbone of modern visual generation, and nearly every major axis of their design -- tokenization, attention, conditioning, obj…

17 views · Fri, 22 May 2026 04:02:00 GMT

#computer vision #artificial intelligence #transformers

INVESTING.COM — NEWS

Enphase Energy stock surges on data center transformer opportunity

19 views · Thu, 21 May 2026 16:41:36 GMT

LIVE SCIENCE

China's real-life 'transformer' mech is a giant humanoid robot that can switch from bounding on 4 legs to walking on 2

The new 'mecha' robot, which weighs over 1,000 pounds and stands nearly 10 feet tall, is designed for urban mobility.…

14 views · Thu, 21 May 2026 15:06:11 GMT

#robotics #technology #innovation

UBER

Scaling Real-Time Traffic Forecasting with a Graph-Aware Transformer

Learn how Uber deployed a deep transformer model with graph data pipelines to solve real-time traffic forecasting, improving route quality and arrival times for millions of custome…

13 views · Wed, 20 May 2026 17:05:02 GMT

#technology #transportation #artificial intelligence

ARXIV.ORG

WorldParticle: Unified World Simulation of Lagrangian Particles via Transformer

A unified simulator that can model diverse physical phenomena without solver-specific redesign is a long-standing goal across simulation science. We present a learning-based partic…

15 views · Wed, 20 May 2026 06:05:00 GMT

#computer science #graphics #machine learning

ARXIV CS.AI

Position: The Turing-Completeness of Real-World Autoregressive Transformers Relies Heavily on Context Management

Many works make the eye-catching claim that Transformers are Turing-complete. However, the literature often conflates two distinct settings: (i) a fixed Transformer system setting,…

12 views · Wed, 20 May 2026 04:04:59 GMT

#artificial intelligence #machine learning #transformers

ARXIV CS.AI

Robust Basis Spline Decoupling for the Compression of Transformer Models

Decoupling is a powerful modeling paradigm for representing multivariate functions as compositions of linear transformations and univariate nonlinear functions. A single-layer deco…

13 views · Wed, 20 May 2026 04:04:59 GMT

#machine learning #artificial intelligence #neural networks

ARXIV CS.AI

Simply Stabilizing the Loop via Fully Looped Transformer

Scaling model performance typically requires increasing model size. Looped Transformer offers a compelling alternative by iteratively reusing the same Transformer blocks, trading a…

13 views · Wed, 20 May 2026 04:04:59 GMT

#machine learning #artificial intelligence #transformer models

ARXIV CS.AI

Block-Based Double Decoders

Encoder-decoder models offer substantial inference-time savings over decoder-only models, but their pretraining objectives suffer from sparse supervision and dynamic sequence lengt…

20 views · Wed, 20 May 2026 04:04:59 GMT

#machine learning #artificial intelligence #transformer models

ARXIV CS.AI

Emergence of Frontier Superposition: Möbius attractor and Cascade Supervision

Superposition allows Transformers to reason in depth, carrying an entire reasoning frontier in parallel through a bounded-depth forward pass instead of unrolling serial chain-of-th…

14 views · Wed, 20 May 2026 04:04:59 GMT

#machine learning #artificial intelligence #transformers

ARXIV CS.AI

Precision Tracked Transformer via Kalman Filtering, Kriging and Process Noise

The Transformer is the foundational building block of modern AI, yet offers no principled handling of uncertainty, which is prevalent in real applications: cold-start tokens…

17 views · Wed, 20 May 2026 04:04:59 GMT

#machine learning #artificial intelligence #transformers

ARXIV CS.AI

Transformers Linearly Represent Highly Structured World Models

Do transformers, when trained on sequential reasoning traces, build internal models of the underlying task? And if so, does the structure of those internal representations mirror t…

13 views · Wed, 20 May 2026 04:04:59 GMT

#machine learning #artificial intelligence #transformers

ARXIV CS.AI

Exact Linear Attention

This paper introduces Exact Linear Attention (ELA), a mechanism that achieves linear computational complexity for Transformer attention by leveraging the exact decomposition proper…

14 views · Wed, 20 May 2026 04:04:59 GMT

#machine learning #artificial intelligence #transformer models

ARXIV CS.AI

From Sparsity to Simplicity: Enabling Simpler Sequential Replacements via Sparse Attention Distillation

Self-attention serves as the core foundation of large-scale transformer pretraining, but its quadratic token interaction cost makes inference expensive. Replacing attention with si…

11 views · Wed, 20 May 2026 04:04:59 GMT

#machine learning #artificial intelligence #transformers

DEV.TO (TOP)

I Tested KTransformers on My Laptop — 5 Hidden Features That Made 671B Models Actually Work 🔥

In May 2026, a GitHub project with 17,179 stars quietly achieved what cloud providers spend millions...…

13 views · Wed, 20 May 2026 03:34:59 GMT

#artificial intelligence #technology #software

DEV.TO (TOP)

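5 Hidden Uses of KTransformers: A 671B Model Running at 286 tokens/s on a Single Machine 🔥

In May 2026, an open-source project with only 17,179 stars on GitHub achieved what the major cloud vendors spent millions of dollars to barely manage: on a single machine, at 286...…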

13 views · Wed, 20 May 2026 03:34:59 GMT

#technology #artificial intelligence #machine learning

GIZMODO

This Week Feels Like Christmas for Fans of ‘Transformers: The Movie’

'The Apology Tour' for the classic 1986 animated film continues with a few re-releases.…

13 views · Tue, 19 May 2026 19:04:57 GMT

#transformers #movies #anniversary

BILLBOARD

Hasbro Is Celebrating 40 Years of ‘The Transformers: The Movie’ With ‘Reformatted’ Soundtrack — And Yes, Stan Bush Is Back

'Transformers: The Movie' at 40: New soundtrack taps Stan Bush, Sebastian Bach and more.…

16 views · Tue, 19 May 2026 12:04:57 GMT

#entertainment #music #anniversary

R/MACHINELEARNING

Need reliable source for 30+ years of S&P 500 historical data for LSTM/Transformer research [P]

18 views · Tue, 19 May 2026 03:35:00 GMT

DEV.TO (TOP)

[Day 7] Does Giving an AI More 'Thinking Time' Really Make It Smarter? Training an OpenMythos-Style Mini Model on DGX

Day 7 of my 100-experiment local LLM challenge. Trained a tiny OpenMythos-style mini model (theoretical reconstruction of the rumored Claude Mythos architecture) on multi-digit add…

18 views · Tue, 19 May 2026 03:34:57 GMT

#ai #machinelearning #transformers

R/MOVIES

Official 40th Anniversary Poster for ‘The Transformers: The Movie’ Returning to Theaters September 17

84 views · Mon, 18 May 2026 17:05:00 GMT

HUGGING FACE BLOG

PaddleOCR 3.5: Running OCR and Document Parsing Tasks with a Transformers Backend

A Blog post by PaddlePaddle on Hugging Face…

17 views · Mon, 18 May 2026 15:14:56 GMT

#ocr #transformers #document parsing

VARIETY

‘Transformers’ Attraction to Launch in Brazil This Year as Hasbro Expands Global Experiences Biz (EXCLUSIVE)

A "Transformers" attraction will open in Brazil later this year, marking the latest live experience from toy giant Hasbro and a huge push into LATAM.…

12 views · Mon, 18 May 2026 14:04:56 GMT

#entertainment #hasbro #transformers

GIST

Usual implementation of attention transformers (SDPA) is kind of bad, actually

The usual implementation of attention transformers (SDPA) is kind of bad, actually - antisdpa.md…

14 views · Mon, 18 May 2026 04:34:54 GMT

#artificial intelligence #machine learning #technology

ARXIV CS.AI

MR2-ByteTrack: CNN and Transformer-based Video Object Detection for AI-augmented Embedded Vision Sensor Nodes

Modern smart vision sensors need on-device intelligence to process video streams, as cloud computing is often impractical due to bandwidth, latency, and privacy constraints. Howeve…

15 views · Mon, 18 May 2026 04:04:54 GMT

#computer vision #artificial intelligence #embedded systems

ARXIV CS.AI

Grokking as Structural Inference: Transformers Need Bayesian Lottery Tickets

Why does a Transformer that has memorized its training set wait thousands of steps before it generalizes? Existing accounts locate this delay in norm minimization, feature emergenc…

16 views · Mon, 18 May 2026 04:04:54 GMT

#machine learning #artificial intelligence #transformers

DEV.TO (TOP)

Taming the Spike: Predicting Glucose Peaks 30 Minutes Ahead with Transformers and TensorFlow 🩸🚀

Managing blood glucose is like trying to drive a car where the steering wheel has a 20-minute lag....…

12 views · Mon, 18 May 2026 01:33:21 GMT

#health #technology #machinelearning

MEDIUM

Autoregressive next token prediction and KV Cache in transformers

Understand the optimization technique in LLMs to speed up token generation…

14 views · Sun, 17 May 2026 20:33:20 GMT

#technology #artificial intelligence #machine learning

R/STABLEDIFFUSION

What is transformer architecture?

16 views · Sun, 17 May 2026 08:04:02 GMT

R/MACHINELEARNING

Made and Published a Paper Comparing Analysis of CNN and Vision Transformer Architectures for Brain Tumor Detection [R]

16 views · Sat, 16 May 2026 16:10:22 GMT

HACKER NEWS (AI / LLM)

Recent Developments in LLM Architectures: KV Sharing, MHC, Compressed Attention

From Gemma 4 to DeepSeek V4, How New Open-Weight LLMs Are Reducing Long-Context Costs…

12 views · Sat, 16 May 2026 15:00:18 GMT

#llm architecture #attention mechanisms #memory efficiency

HINDUSTAN TIMES — TOP

Madras HC orders CBI probe into alleged money laundering in procuring transformers

The bench directed the State Directorate of Vigilance and Anti-Corruption (DVAC) that had been probing the matter until now, to “hand over” all papers and records related to the ca…

12 views · Wed, 29 Apr 2026 06:50:31 GMT

#corruption #investigation #tenders

THE HINDU — TOP

Madras High Court orders CBI probe into ₹397-crore transformer procurement during Senthilbalaji’s tenure

Madras High Court orders CBI investigation into ₹397 crore transformer procurement scam during V. Senthilbalaji's tenure as Electricity Minister.…

14 views · Wed, 29 Apr 2026 06:50:31 GMT

#corruption #investigation #politics

VERCEL

Disaggregated Serving for Hybrid SSM Models in vLLM

Hybrid architectures that interleave Mamba-style SSM layers with standard full-attention (FA) layers — such as NVIDIA Nemotron-H — are gaining traction as a way…

10 views · Tue, 28 Apr 2026 20:44:39 GMT

#machine learning #model serving #state-space models

ALL NEWS

Astor Enerji shares rise 3% on $51.5M transformer deal

12 views · Tue, 28 Apr 2026 08:00:23 GMT
