
OmniToM: Benchmarking Theory of Mind in LLMs via Explicit Belief Modeling

#artificial intelligence #language models #theory of mind
⚡ TL;DR · AI summary

The paper introduces OmniToM, a benchmark that evaluates Theory of Mind (ToM) in large language models (LLMs) through explicit belief modeling. It highlights a limitation of current evaluation methods: they judge only the final answer to a question, which may not reflect how a model actually reasons. By requiring models to state belief structures explicitly, OmniToM aims to give a more complete assessment of how well LLMs reason about social situations.

Original article
arXiv cs.AI
Opening excerpt (first ~120 words)

arXiv:2605.26322 (cs) [Submitted on 25 May 2026]
Title: OmniToM: Benchmarking Theory of Mind in LLMs via Explicit Belief Modeling
Authors: Adam Bawatneh, Sagar Sapkota, Amrit Singh Bedi, Santu Karmaker, Mubarak Shah
Abstract: Theory of Mind (ToM), the ability to infer others' knowledge, intentions, and emotions, is commonly evaluated in large language models (LLMs) using end-point question answering, where performance is judged solely by the final answer to a social reasoning query.

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.
