OmniToM: Benchmarking Theory of Mind in LLMs via Explicit Belief Modeling

May 27, 2026 · 4:00 AM UTC · 3 min read

#artificial intelligence #language models #theory of mind

TL;DR · WeSearch summary

The paper introduces OmniToM, a benchmark that evaluates the Theory of Mind (ToM) capabilities of large language models (LLMs) through explicit belief modeling. It argues that current evaluation methods, which score only the final answer to a question, may not accurately reflect a model's reasoning. By requiring models to spell out the belief structures behind their answers, OmniToM aims to give a more complete assessment of how well LLMs understand social dynamics.

Key facts

▪ OmniToM evaluates the ability of LLMs to infer the knowledge, intentions, and emotions of actors within narratives.
▪ The benchmark consists of two stages, Belief Extraction and Belief Labeling, which assess how well models track beliefs.
▪ Current LLMs struggle to transform narrative facts into actors' beliefs and shared mental states.
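To make the two-stage idea concrete, here is a minimal Python sketch of how extraction and labeling could be scored separately. It is an illustration under stated assumptions, not the paper's actual data format or metrics: the `Belief` record, the `"true"`/`"false"` label set, and the use of F1 and accuracy as scores are all hypothetical, and the toy narrative is the classic Sally-Anne setup rather than an item from the benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Belief:
    """Hypothetical belief record: who believes what."""
    holder: str
    proposition: str


def extraction_f1(predicted, gold):
    """Stage 1 (assumed metric): F1 overlap between predicted and gold beliefs."""
    pred, ref = set(predicted), set(gold)
    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


def labeling_accuracy(predicted_labels, gold_labels):
    """Stage 2 (assumed metric): share of gold beliefs given the correct label."""
    if not gold_labels:
        return 0.0
    correct = sum(
        1 for belief, label in gold_labels.items()
        if predicted_labels.get(belief) == label
    )
    return correct / len(gold_labels)


# Toy narrative: Sally puts a marble in the basket and leaves;
# Anne then moves it to the box.
sally = Belief("Sally", "the marble is in the basket")
anne = Belief("Anne", "the marble is in the box")
extra = Belief("Anne", "Sally saw the marble being moved")

gold_beliefs = [sally, anne]
model_beliefs = [sally, anne, extra]  # one spurious extraction

# Hypothetical label set: whether the belief matches the world state.
gold_labels = {sally: "false", anne: "true"}
model_labels = {sally: "true", anne: "true"}  # misses Sally's false belief

f1 = extraction_f1(model_beliefs, gold_beliefs)
accuracy = labeling_accuracy(model_labels, gold_labels)
print(f"extraction F1 = {f1:.2f}, labeling accuracy = {accuracy:.2f}")
```

Scoring the stages separately shows where a model goes wrong: in this toy case it recovers every gold belief but adds a spurious one, and it fails to mark Sally's belief as false, which a single end-point answer would not reveal.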

Original article

arXiv cs.AI


Opening excerpt (first ~120 words)

arXiv:2605.26322 (cs) [Submitted on 25 May 2026]
Title: OmniToM: Benchmarking Theory of Mind in LLMs via Explicit Belief Modeling
Authors: Adam Bawatneh, Sagar Sapkota, Amrit Singh Bedi, Santu Karmaker, Mubarak Shah
Abstract: Theory of Mind (ToM), the ability to infer others' knowledge, intentions, and emotions, is commonly evaluated in large language models (LLMs) using end-point question answering, where performance is judged solely by the final answer to a social reasoning query.

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.
