
From Skeletons to Pixels: Few-Shot Precise Event Spotting via Representation and Prediction Distillation

Precise Event Spotting (PES) is essential in fast-paced sports such as tennis, where fine-grained events occur within very short temporal windows. Accurate frame-level localization is challenging because of motion blur, subtle action differences, and limited annotated data. We study two complementary distillation strategies for few-shot PES: Adaptive Weight Distillation (AWD), a prediction-level method that adaptively weights teacher supervision on unlabeled data, and Annealed Multimodal Distillation for Few-Shot Event Detection (AMD-FED), a representation-level framework that transfers robust skeleton knowledge into visual modalities through annealed pseudo-labeling. Both methods use multimodal distillation to improve generalization under limited supervision. We evaluate them on F3Set-Tennis(sub) under few-shot k-clip settings, where they consistently outperform single-modality baselines and prior PES approaches. After observing the stronger performance of representation-level distillation on tennis, we further validate AMD-FED on a second sports dataset, Figure Skating, where it also shows robust performance in the k-clip scenario. These results highlight the effectiveness of multimodal distillation, especially representation-level transfer, for few-shot precise event spotting.
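
The abstract gives no formulas for AWD or AMD-FED. As a rough, generic illustration of the two ideas it names (adaptively weighting teacher supervision on unlabeled frames, and annealing the influence of pseudo-labels over training), the sketch below assumes an entropy-based confidence weight and a linear annealing schedule. Both choices, and every function name, are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one frame's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_weight(teacher_probs):
    """Assumed weighting rule: 1 minus the normalized entropy of the teacher.

    Gives 0 for a uniform (uninformative) teacher and approaches 1 for a
    near one-hot teacher. Requires at least two classes.
    """
    n = len(teacher_probs)
    entropy = -sum(p * math.log(p) for p in teacher_probs if p > 0)
    return 1.0 - entropy / math.log(n)


def annealed_coefficient(step, total_steps, start=1.0, end=0.0):
    """Assumed schedule: linear interpolation from `start` to `end`."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * frac


def weighted_distillation_loss(student_logits, teacher_logits, step, total_steps):
    """Mean per-frame cross-entropy against teacher soft labels.

    Each frame is scaled by the teacher's confidence weight, and the whole
    term is scaled by the annealed coefficient for the current step.
    """
    total = 0.0
    for s_log, t_log in zip(student_logits, teacher_logits):
        student = softmax(s_log)
        teacher = softmax(t_log)
        cross_entropy = -sum(t * math.log(s) for t, s in zip(teacher, student))
        total += confidence_weight(teacher) * cross_entropy
    coeff = annealed_coefficient(step, total_steps)
    return coeff * total / len(student_logits)


# Two frames, three classes: a skeleton-based teacher and a pixel-based student.
teacher = [[4.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # confident frame, uniform frame
student = [[1.0, 0.5, 0.0], [0.2, 0.1, 0.0]]
loss_early = weighted_distillation_loss(student, teacher, step=0, total_steps=100)
loss_late = weighted_distillation_loss(student, teacher, step=100, total_steps=100)
```

In this toy setup the uniform teacher frame contributes nothing to the loss, and the distillation term vanishes at the final step because the assumed schedule anneals to zero. A real system would combine this term with a supervised loss on the few labeled clips.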

Original article
arXiv cs.AI

arXiv:2604.22839 (cs), Computer Science > Computer Vision and Pattern Recognition
Submitted on 21 Apr 2026
Authors: Zhong Han Ervin Yeoh, Jiang Kan
