ArchSIBench: Benchmarking the Architectural Spatial Intelligence of Vision-Language Models

May 22, 2026 · 4:00 AM UTC · 3 min read

#computer vision #artificial intelligence #benchmarking

⚡ TL;DR · AI summary

The paper introduces ArchSIBench, a benchmark designed to evaluate the architectural spatial intelligence of Vision-Language Models (VLMs). It focuses on higher-level cognitive tasks related to architectural space, which have been largely overlooked in previous research. The findings indicate significant performance gaps between VLMs and human evaluators, particularly in spatial reasoning tasks.

Key facts

▪ ArchSIBench covers five core dimensions: perception, reasoning, navigation, transformation, and configuration.
▪ The benchmark includes 3,000 question-answer pairs for comprehensive evaluation.
▪ Most VLMs show a significant gap in architectural spatial intelligence relative to human baselines.

Original article

arXiv cs.AI


Opening excerpt (first ~120 words)

Computer Science > Computer Vision and Pattern Recognition

arXiv:2605.20837 (cs)

[Submitted on 20 May 2026]

Title: ArchSIBench: Benchmarking the Architectural Spatial Intelligence of Vision-Language Models

Authors: Qirui Shen, Wenda Wang, Jiachen Lu, Zilong Huang, Jin Bai, Lei He, Hongxuan Chen, Weixin Huang

Abstract: Architectural spatial intelligence, the ability to recognize and infer architectural space, is fundamental to tasks such as robot navigation, embodied interaction, and 3D scene understanding and generation.

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.
