Compositional Literary Primitives in Instruction-Tuned LLMs: Cross-Architectural SAE Features for Self, Style, and Affect
The paper characterizes a compositional architecture of literary primitives in two instruction-tuned large language models, Llama 3.1 8B-Instruct and Gemma 2 9B-IT, using sparse autoencoders on mid-depth residual streams. It identifies four feature classes, among them naming-gates and stylistic register modulators, that relate to emotional expression and stylistic modulation. A validation pipeline assesses how well each model generates affect-categorizable outputs.
- The study characterizes literary primitives in two instruction-tuned large language models, Llama and Gemma.
- Four feature classes were identified, including naming-gates and stylistic register modulators.
- Llama achieved full coverage of a 27-category emotion taxonomy, while Gemma reached 23 out of 27.
Opening excerpt (first ~120 words)
Computer Science > Machine Learning
arXiv:2605.18808 (cs) [Submitted on 11 May 2026]
Authors: Joao Paulo Cavalcante Presa, Savio Salvarino Teles de Oliveira
Abstract: We characterize a compositional architecture of literary primitives in two instruction-tuned large language models (Llama 3.1 8B-Instruct and Gemma 2 9B-IT) via sparse autoencoders on mid-depth residual streams.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.