DRIVE: Modeling Skills at the Reasoning and Interaction Levels for Web Agents under Continual Learning
DRIVE is a dual-level skill modeling framework for web agents, intended to strengthen both their reasoning and their interaction capabilities. It targets the difficulty of separating abstract reasoning knowledge from concrete interaction knowledge, a difficulty that often limits the effectiveness of web agents. Experimental results indicate that DRIVE improves task success rates compared to existing methods.
- DRIVE separates historical experiences into natural language reasoning skills and programmatic interaction skills.
- The framework uses a scene-aware coordination mechanism to adaptively retrieve and invoke skills based on task semantics.
- Experiments show that DRIVE achieves an average task success rate of 52.8%, outperforming the skill-free baseline by 7.3 percentage points.
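The two skill levels and the retrieval step described in the points above can be illustrated with a small sketch. It is not the paper's implementation: the class names, the `retrieve` helper, the word-overlap score, and the threshold are all hypothetical stand-ins, since the summary does not specify how the scene-aware coordination mechanism measures task semantics.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ReasoningSkill:
    """Hypothetical container for a natural language reasoning skill."""
    name: str
    guidance: str  # high-level advice, e.g. how to decompose a task


@dataclass
class InteractionSkill:
    """Hypothetical container for a programmatic interaction skill."""
    name: str
    description: str
    run: Callable[[dict], List[str]]  # maps arguments to low-level page actions


def overlap(a: str, b: str) -> float:
    """Jaccard word overlap; an assumed stand-in for semantic similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(task: str, skills: list, text_of: Callable, threshold: float = 0.1) -> list:
    """Return skills whose text overlaps the task, best match first."""
    scored = [(overlap(task, text_of(s)), s) for s in skills]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for score, s in scored if score >= threshold]


# Toy skill stores, one per level (contents are invented for illustration).
reasoning_skills = [
    ReasoningSkill(
        "plan_purchase",
        "to buy a product first search then add to cart then checkout",
    ),
]
interaction_skills = [
    InteractionSkill(
        "search_product",
        "search product by keyword in the search box",
        lambda args: [f"type(search_box, {args['query']})", "click(search_button)"],
    ),
]

task = "buy a product matching the keyword laptop"

# Retrieve from each level separately, then invoke the interaction skill.
plans = retrieve(task, reasoning_skills, lambda s: s.guidance)
actions = retrieve(task, interaction_skills, lambda s: s.description)
steps = actions[0].run({"query": "laptop"}) if actions else []
```

Keeping the two stores separate means the natural language guidance can shape the plan while the programmatic skill supplies concrete page actions. A real system would presumably replace the word-overlap score with a learned or embedding-based measure.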
Opening excerpt (first ~120 words)
arXiv:2605.23939 (cs), Computer Science > Artificial Intelligence. Submitted on 28 Apr 2026.
Authors: Xirui Liu, Sihang Zhou, Yanning Hou, Rong Zhou, Haoyuan Chen, Maolin He, Siwei Wang, Hao Chen, Jian Huang
Abstract: Web agents require both high-level reasoning (for task decomposition) and low-level interactions (for page elements manipulation) to conduct different tasks.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.