Interpretable Discriminative Text Representations via Agreement and Label Disentanglement

May 22, 2026 · 4:00 AM UTC · 3 min read

#machine learning #artificial intelligence #text classification

TL;DR · WeSearch summary

The paper presents a new method for creating interpretable text representations that are both predictive and meaningful. It introduces LLM-assisted Feature Discovery (LFD), which enhances feature clarity and reduces label entanglement. The results demonstrate that LFD achieves high agreement among human annotators and maintains predictive performance across various text classification tasks.

Key facts

▪ The proposed method focuses on conceptual clarity and label disentanglement in text representations.
▪ LLM-assisted Feature Discovery (LFD) screens candidate features for reliability using cross-LLM Cohen's kappa.
▪ LFD features show higher agreement and are judged less label-leaking than baseline concepts.
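
The kappa-based screening mentioned above can be illustrated with a minimal sketch. The summary does not give the paper's exact procedure, so everything here is an assumption for illustration: the function names `cohens_kappa` and `screen_features`, the feature names, the 0.6 threshold, and the choice to score undefined kappa as 0.0. The idea is that two LLMs label the same texts for each candidate feature, and only features whose chance-corrected agreement clears a threshold are kept.

```python
from collections import Counter


def cohens_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(a)
    # Observed agreement: fraction of items where both annotators agree.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement under independence, from each annotator's label counts.
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[k] * count_b[k] for k in set(count_a) | set(count_b)) / (n * n)
    if p_e == 1.0:
        # Kappa is undefined when both annotators use one identical label.
        # Assumption: score it 0.0 so constant features get screened out.
        return 0.0
    return (p_o - p_e) / (1.0 - p_e)


def screen_features(labels_a, labels_b, threshold=0.6):
    """Keep features whose cross-annotator kappa is at least `threshold`.

    `labels_a` and `labels_b` map feature name -> list of labels, one per text.
    The 0.6 default is a hypothetical cutoff, not a value from the paper.
    """
    kept = {}
    for name in labels_a:
        kappa = cohens_kappa(labels_a[name], labels_b[name])
        if kappa >= threshold:
            kept[name] = kappa
    return kept


# Hypothetical binary annotations from two LLMs over the same eight texts.
llm_a = {
    "mentions_price": [1, 1, 0, 0, 1, 0, 1, 0],
    "sounds_positive": [1, 0, 1, 0, 1, 0, 1, 0],
}
llm_b = {
    "mentions_price": [1, 1, 0, 0, 1, 0, 0, 0],
    "sounds_positive": [0, 0, 1, 1, 1, 0, 0, 1],
}

print(screen_features(llm_a, llm_b))
```

In this toy data the two annotators agree on 7 of 8 texts for `mentions_price` (kappa 0.75) but only at chance level for `sounds_positive` (kappa 0.0), so only the first feature survives the screen.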

Original article

arXiv cs.AI

Opening excerpt (first ~120 words)

Computer Science > Computation and Language

arXiv:2605.20693 (cs) [Submitted on 20 May 2026]

Title: Interpretable Discriminative Text Representations via Agreement and Label Disentanglement

Authors: Tong Wang, Yiqing Xu, Leo Yang Yang

Abstract: Interpretable text representations should expose coordinates that are not only predictive, but also meaningful enough for independent auditors to apply.

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.
