Teaching AI Through Benchmark Construction: QuestBench as a Course-Based Practice for Accountable Knowledge Work

May 22, 2026 · 4:00 AM UTC · 3 min read

#education #artificial intelligence #benchmarking

TL;DR · WeSearch summary

The article describes a course-based approach to teaching AI through benchmark construction, centered on a benchmark called QuestBench. Students write expert-level questions and use them to evaluate AI systems, which is meant to deepen their understanding of AI's role in knowledge work. The evaluation found that AI systems passed relatively few of these questions on average, underscoring the importance of critical evaluation in AI education.

Key facts

▪ QuestBench consists of 256 questions across 14 humanities and social-science domains.
▪ Across thirteen AI systems evaluated on QuestBench, the mean question-level pass rate was only 16.85%.
▪ The best-performing system, GPT-5.5, achieved a pass rate of 57.58%, indicating significant room for improvement.
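
The pass-rate figures above are simple proportions, and a small sketch may make the arithmetic concrete. The code below is only an illustration under stated assumptions: it treats a system's pass rate as the percentage of questions it passed and the mean as the unweighted average across systems, which may differ from the paper's exact scoring procedure. The function names and the toy data are hypothetical and are not taken from QuestBench.

```python
from statistics import mean


def pass_rate(results):
    """Percentage of questions passed, given one boolean per question."""
    return 100.0 * sum(results) / len(results)


def mean_pass_rate(per_system):
    """Unweighted mean of the per-system pass rates (an assumed definition)."""
    return mean(pass_rate(results) for results in per_system.values())


# Made-up toy data: three hypothetical systems, four questions each.
# These are NOT QuestBench results.
toy_results = {
    "system_a": [True, False, False, False],
    "system_b": [True, True, False, False],
    "system_c": [False, False, False, False],
}

rates = {name: pass_rate(results) for name, results in toy_results.items()}
best_system = max(rates, key=rates.get)

print(rates)                        # per-system pass rates in percent
print(mean_pass_rate(toy_results))  # mean across systems
print(best_system)                  # system with the highest pass rate
```

Under this definition, a large gap between the best system and the mean, as in the reported 57.58% versus 16.85%, implies that most of the other systems score well below the leader.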

Original article

arXiv cs.AI

Opening excerpt (first ~120 words)

arXiv:2605.21413 (cs), Computer Science > Artificial Intelligence
Submitted on 20 May 2026 (v1); last revised 21 May 2026 (this version, v2)

Title: Teaching AI Through Benchmark Construction: QuestBench as a Course-Based Practice for Accountable Knowledge Work

Authors: Haiyang Shen, Jiuzheng Wang, Taian Guo, Mugeng Liu, Wenchun Jing, Chongyang Pan, Siqi Zhong, Zhiyang Chen, Weichen Bi, Yudong Han, Xiaoying Bai, Yun Ma

Abstract: As AI becomes part of everyday learning, many courses teach students to use it mainly as a productivity tool: how to prompt, search, summarize, write, code,…

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.
