Neural Estimation of Pairwise Mutual Information in Masked Discrete Sequence Models
A new paper proposes a neural framework for estimating pairwise conditional mutual information (MI) in masked discrete sequence models. The method aims to make masked diffusion models (MDMs) more interpretable and more efficient at generation by capturing dependencies between variables, which these models do not represent explicitly. On Sudoku and protein sequence generation, it reportedly cuts inference-time forward passes by 3-5x while maintaining generative quality.
- The proposed framework estimates pairwise conditional mutual information directly from the hidden states of pretrained masked diffusion models.
- It enables MI-guided parallel decoding by identifying conditionally independent subsets of variables.
- The evaluation on Sudoku and protein sequence generation demonstrated a 3-5x reduction in inference-time forward passes compared to traditional methods.
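To make the idea of MI-guided parallel decoding concrete, here is a minimal sketch in Python. It is not the paper's method: the paper estimates pairwise conditional MI with a neural network from hidden states, whereas this sketch assumes the MI values are already available, and computes MI from an explicit joint probability table only for illustration. The function names `pairwise_mi` and `select_parallel_set`, the greedy selection rule, and the threshold value are all hypothetical choices made for this example.

```python
import numpy as np


def pairwise_mi(joint):
    """Mutual information (in nats) of a 2-D joint probability table.

    Illustration only: the paper estimates this quantity neurally
    instead of computing it from an explicit table.
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()            # normalize to a distribution
    px = joint.sum(axis=1, keepdims=True)  # marginal of the row variable
    py = joint.sum(axis=0, keepdims=True)  # marginal of the column variable
    mask = joint > 0                       # skip zero cells (0 * log 0 = 0)
    ratio = joint[mask] / (px @ py)[mask]
    return float(np.sum(joint[mask] * np.log(ratio)))


def select_parallel_set(mi, candidates, threshold):
    """Greedily pick positions that are nearly independent of each other.

    A position is added only if its MI with every already chosen
    position is below `threshold` (an assumed, hypothetical rule).
    """
    chosen = []
    for i in candidates:
        if all(mi[i, j] < threshold for j in chosen):
            chosen.append(i)
    return chosen


# Hypothetical symmetric MI matrix for four masked positions.
mi = np.array([
    [0.00, 0.90, 0.01, 0.02],
    [0.90, 0.00, 0.03, 0.50],
    [0.01, 0.03, 0.00, 0.01],
    [0.02, 0.50, 0.01, 0.00],
])

# Positions 0 and 1 are strongly dependent, so they are not decoded together.
print(select_parallel_set(mi, [0, 1, 2, 3], threshold=0.1))  # [0, 2, 3]
```

In this toy setting, positions 0, 2, and 3 could be unmasked in one forward pass, while position 1 waits for a later pass. Decoding several weakly dependent positions per pass is the kind of saving the reported reduction in forward passes refers to.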
Opening excerpt (first ~120 words)
Computer Science > Machine Learning, arXiv:2605.20187 (cs) [Submitted on 27 Jan 2026]
Title: Neural Estimation of Pairwise Mutual Information in Masked Discrete Sequence Models
Authors: Jai Sharma, Yifan Wang, Bryan Li
Abstract: Understanding dependencies between variables is critical for interpretability and efficient generation in masked diffusion models (MDMs), yet these models primarily expose marginal conditional distributions and do not explicitly represent inter-variable dependence.
…
The full article is at arXiv cs.AI.