STAR: A Stage-attributed Triage and Repair framework for RCA Agents in Microservices
STAR is a framework intended to make LLM-based root cause analysis (RCA) agents in microservices more reliable. It decomposes the RCA workflow into four structured stages so that an error can be attributed to the stage where it arose and repaired there. According to the authors, experiments show that STAR improves fault localization and fault type classification and lets RCA systems repair their own diagnoses.
- STAR is a Stage-attributed Triage and Repair framework designed for RCA agents in microservices.
- The framework breaks down the RCA process into four stages: Evidence Package, Hypothesis Set, Analysis Structure, and Decision Report.
- Experimental evaluations show that STAR enhances root cause localization and fault type classification compared to existing methods.
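
To make the idea of stage attribution concrete, here is a minimal Python sketch. Only the four stage names come from the summary above. The `RCATrace` container, the per-stage check functions, and the rule "re-run the first failing stage and everything after it" are illustrative assumptions, not the paper's actual method.

```python
from dataclasses import dataclass, field

# Stage names are taken from the summary; everything else is hypothetical.
STAGES = ["Evidence Package", "Hypothesis Set", "Analysis Structure", "Decision Report"]


@dataclass
class RCATrace:
    """Hypothetical container holding one RCA run's output per stage."""
    outputs: dict = field(default_factory=dict)


def first_faulty_stage(trace, checks):
    """Return the earliest stage whose check fails, or None if all pass.

    `checks` maps a stage name to a predicate over that stage's output.
    Stages without a check are skipped.
    """
    for stage in STAGES:
        check = checks.get(stage)
        if check is not None and not check(trace.outputs.get(stage)):
            return stage
    return None


def repair_from(trace, stage, rerun):
    """Re-run the faulty stage and every stage downstream of it (assumed policy)."""
    for name in STAGES[STAGES.index(stage):]:
        trace.outputs[name] = rerun[name](trace)
    return trace


# Usage: the hypothesis set is empty, so the fault is attributed to that stage.
trace = RCATrace(outputs={
    "Evidence Package": ["cpu spike on cart-service"],
    "Hypothesis Set": [],
    "Analysis Structure": {},
    "Decision Report": "unknown",
})
checks = {
    "Evidence Package": lambda out: bool(out),
    "Hypothesis Set": lambda out: bool(out),
}
faulty = first_faulty_stage(trace, checks)
print(faulty)
```

Scanning the stages in order reflects the concern raised in the abstract that an early error propagates downstream: under this assumed policy, the earliest failing stage is treated as the one to repair, and later stages are recomputed from it.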
Opening excerpt (first ~120 words)
arXiv:2605.15581 (cs), Computer Science > Artificial Intelligence. Submitted on 15 May 2026.
Authors: Junle Wang, Xingchuang Liao, Wenjun Wu
Abstract: LLM-based root cause analysis (RCA) agents have recently emerged as a promising paradigm for incident diagnosis in microservice AIOps. However, their reliability remains fragile: an error in early evidence collection, hypothesis formulation, or causal analysis can propagate through the reasoning trace and eventually corrupt the final diagnosis.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.