9 results for "transformer models"
The Spectral Lifecycle of Transformer Training: Transient Compression Waves, Persistent Spectral Gradients, and the Q/K–V Asymmetry
We present the first systematic study of weight matrix singular value spectra *during* transformer pretraining, tracking the full singular value decomposition (SVD) of every weight matrix at 25-step intervals across…
BiTA: Bidirectional Gated Recurrent Unit-Transformer Aggregator in a Temporal Graph Network Framework for Alert Prediction in Computer Networks
Proactive alert prediction in computer networks is critical for mitigating evolving cyber threats and enabling timely defensive actions. Temporal Graph Neural Networks (TGNs) provide a principled fram…
LingBot-Map: Streaming 3D reconstruction with geometric context transformer
Technology-driven and application-oriented. We build foundational large models for embodied AI: spatial perception (LingBot-Depth), VLA (LingBot-VLA), world models (LingBot-World), video action (LingB…
The Randomness Floor: Measuring Intrinsic Non-Randomness in Language Model Token Distributions
Language models cannot be random. This paper introduces Entropic Deviation (ED), the normalised KL divergence between a model's token distribution and the uniform distribution, and measures it systema…
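The metric described above is concrete enough to sketch. The snippet below is a minimal illustration, not the paper's implementation: it computes KL divergence from a token distribution to the uniform distribution, and assumes the normalisation is division by log(V) (the maximum attainable KL for vocabulary size V), which the truncated abstract does not confirm.

```python
import numpy as np

def entropic_deviation(probs):
    """Sketch of Entropic Deviation (ED): the KL divergence between a
    model's token distribution and the uniform distribution, normalised.

    Assumption (not stated in the snippet above): we normalise by
    log(V), the maximum possible KL value, so ED lies in [0, 1].
    """
    p = np.asarray(probs, dtype=float)
    p = p / p.sum()                     # ensure a valid distribution
    V = len(p)
    u = 1.0 / V                         # uniform probability per token
    nz = p > 0                          # zero-probability terms contribute 0
    kl = np.sum(p[nz] * np.log(p[nz] / u))
    return kl / np.log(V)
```

Under this convention a uniform distribution scores 0 (perfectly random) and a one-hot distribution scores 1 (fully deterministic).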
Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing
Serving transformer language models with high throughput requires caching Key-Values (KVs) to avoid redundant computation during autoregressive generation. The memory footprint of KV caching is signif…
Applied AI-Enhanced RF Interference Rejection
AI-enhanced interference rejection in radio frequency (RF) transmissions has recently attracted interest because deep learning approaches trained on both the signal of interest (SOI) and the signal mi…
MAE-Based Self-Supervised Pretraining for Data-Efficient Medical Image Segmentation Using nnFormer
Transformer architectures, including nnFormer, have demonstrated promising results in volumetric medical image segmentation by capturing long-range spatial interactions. Although they have …
Ernie VS Qwen and ZiT - Big Test
A large test of 100 images in a gallery. Big image generator showdown: 100 prompts, 3 models, 1 winner. This comparison brings together three open image models with very different strengths. ERNIE-Imag…
Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols
As LLM agents transition to autonomous digital coworkers, maintaining deterministic goal-directedness in non-linear multi-turn conversations has emerged as an architectural bottleneck. We identify and for…