STELLAR: Scaling 3D Perception Large Models for Autonomous Driving
STELLAR scales up large models to improve 3D perception for autonomous driving. It takes multiple sensor inputs, including LiDAR and radar, and was trained on a large dataset of driving examples. The model achieves state-of-the-art performance on the Waymo Open Dataset challenge, which suggests that large-scale training pays off in this field.
- STELLAR is based on a Sparse Window Transformer and integrates multiple input modalities.
- It was trained on a dataset of 50 million driving examples and has up to 500 million parameters.
- The model shows clear scaling trends that link performance to model size, data, and compute.
- STELLAR sets a new state of the art on the Waymo Open Dataset challenge, outperforming previous models.
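A scaling trend of the kind mentioned above is often summarized by fitting a power law, y = a * x^b, to measurements of a metric against model size, data, or compute. The sketch below shows that fit as a straight-line regression in log-log space. It is illustrative only: the summary does not say which functional form the authors use, the function name `fit_power_law` is hypothetical, and the data points are synthetic values generated from a known curve, not results from the paper.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b in log-log space; returns (a, b)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope of the regression line in log-log space is the exponent b.
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx
    # The intercept gives log(a).
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic data (NOT from the paper): an error metric generated from a
# known power law, sampled at four hypothetical parameter counts.
params = [1e6, 1e7, 1e8, 5e8]
errors = [2.0 * p ** -0.1 for p in params]

a, b = fit_power_law(params, errors)
print(f"fitted a={a:.3f}, b={b:.3f}")
```

Because the synthetic points lie exactly on a power law, the fit recovers the generating coefficient and exponent. Real measurements would scatter around the line, and the fitted exponent would then indicate how quickly the metric improves with scale.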
Opening excerpt (first ~120 words)
arXiv:2605.20390 (cs), Computer Science > Computer Vision and Pattern Recognition. Submitted on 19 May 2026.
Title: STELLAR: Scaling 3D Perception Large Models for Autonomous Driving
Authors: Yingwei Li, Xin Huang, Yang Liu, Yang Fu, Alex Zihao Zhu, Chen Song, Junwen Yao, Anant Subramanian, Hao Xiang, Weijing Shi, Yuliang Zou, Tom Hoddes, Zhaoqi Leng, Govind Thattai, Dragomir Anguelov, Mingxing Tan
Abstract: Model scaling has demonstrated remarkable success through large-scale training on diverse datasets.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.