#transformer-models — Tagged Stories

The 5 stories in the WeSearch catalog tagged with #transformer-models, listed newest first with view counts. Tag pages update as new stories are ingested. Subscribe to the per-tag RSS feed to follow this topic in your reader of choice.

RELATED TAGS

#ml (4) · #ai (3) · #model-serving (1) · #state-space-models (1) · #distributed-systems (1)

ARXIV CS.AI

Robust Basis Spline Decoupling for the Compression of Transformer Models

Decoupling is a powerful modeling paradigm for representing multivariate functions as compositions of linear transformations and univariate nonlinear functions. A single-layer deco…

14 views · Wed, 20 May 2026 04:04:59 GMT

#machine learning #artificial intelligence #neural networks

ARXIV CS.AI

Simply Stabilizing the Loop via Fully Looped Transformer

Scaling model performance typically requires increasing model size. Looped Transformer offers a compelling alternative by iteratively reusing the same Transformer blocks, trading a…

14 views · Wed, 20 May 2026 04:04:59 GMT

#machine learning #artificial intelligence

ARXIV CS.AI

Block-Based Double Decoders

Encoder-decoder models offer substantial inference-time savings over decoder-only models, but their pretraining objectives suffer from sparse supervision and dynamic sequence lengt…

21 views · Wed, 20 May 2026 04:04:59 GMT

#machine learning #artificial intelligence

ARXIV CS.AI

Exact Linear Attention

This paper introduces Exact Linear Attention (ELA), a mechanism that achieves linear computational complexity for Transformer attention by leveraging the exact decomposition proper…

15 views · Wed, 20 May 2026 04:04:59 GMT

#machine learning #artificial intelligence

VERCEL

Disaggregated Serving for Hybrid SSM Models in vLLM

Hybrid architectures that interleave Mamba-style SSM layers with standard full-attention (FA) layers — such as NVIDIA Nemotron-H — are gaining traction as a way…

11 views · Tue, 28 Apr 2026 20:44:39 GMT

#machine learning #model serving #state-space models
