Evaluating Large Language Models in a Complex Hidden Role Game

Tags: artificial intelligence, language models, game theory
⚡ TL;DR · AI summary

The study evaluates the deceptive capabilities of Large Language Models (LLMs) in the social deduction game Secret Hitler. It introduces a framework and metrics to measure performance, revealing a gap between conversational ability and strategic depth. Findings indicate that current LLM architectures struggle with complex manipulation and deception in multi-turn scenarios.

Original article: arXiv cs.AI
Opening excerpt (first ~120 words)

Computer Science > Computation and Language
arXiv:2605.22826 (cs)
[Submitted on 9 Apr 2026]

Title: Evaluating Large Language Models in a Complex Hidden Role Game
Authors: Niklas Bauer

Abstract: Quantifying the deceptive potential of Large Language Models (LLMs) is critical for AI safety, yet difficult to achieve in uncontrolled environments. This work investigates the reasoning, persuasion, and deceptive capabilities of LLMs within the social deduction game Secret Hitler. I introduce an open-source framework and novel metrics to measure performance: Role Identification Accuracy, Deception Retention Rate, and Game State Impact Rate.

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.
