Pretraining Objective Matters in Extreme Low-Data FGVC: A Backbone-Controlled Study
The study examines how the pretraining objective affects fine-grained visual classification (FGVC) when labeled data is extremely limited. It compares pretrained encoders to determine which yields the best representations for downstream tasks. The findings suggest that the preferred pretraining objective depends on how much labeled data is available and on the type of classifier used.
- The research focuses on emerald inclusion grading, using a custom dataset of labeled images across three classes.
- Four frozen ViT-B/16 encoders were compared, pretrained with different objectives, including supervised classification and contrastive learning.
- Supervised and contrastive encoders showed the strongest linear separability, while masked reconstruction improved results under nonlinear probes.
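The distinction between linear and nonlinear probes on a frozen encoder can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the features are synthetic stand-ins for frozen ViT-B/16 embeddings, the linear probe is plain softmax regression, the nonlinear probe is k-nearest-neighbour, and all function names are hypothetical. The probes actually used in the paper may differ.

```python
import math
import random


def make_features(n_per_class, dim, n_classes, seed):
    """Synthetic stand-in for frozen-encoder embeddings (hypothetical data)."""
    rng = random.Random(seed)
    X, y = [], []
    for c in range(n_classes):
        for _ in range(n_per_class):
            # Class c is shifted along coordinate c so the classes are separable.
            X.append([rng.gauss(3.0 if j == c else 0.0, 1.0) for j in range(dim)])
            y.append(c)
    return X, y


def train_linear_probe(X, y, n_classes, lr=0.1, epochs=200):
    """Softmax regression on fixed features; the encoder itself is never updated."""
    dim = len(X[0])
    W = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, label in zip(X, y):
            logits = [sum(w * v for w, v in zip(W[c], x)) + b[c] for c in range(n_classes)]
            m = max(logits)  # subtract the max for numerical stability
            exps = [math.exp(l - m) for l in logits]
            total = sum(exps)
            for c in range(n_classes):
                # Gradient of cross-entropy w.r.t. logit c: softmax minus one-hot.
                grad = exps[c] / total - (1.0 if c == label else 0.0)
                b[c] -= lr * grad
                for j in range(dim):
                    W[c][j] -= lr * grad * x[j]
    return W, b


def predict_linear(W, b, x):
    scores = [sum(w * v for w, v in zip(Wc, x)) + bc for Wc, bc in zip(W, b)]
    return scores.index(max(scores))


def predict_knn(X_train, y_train, x, k=3):
    """k-nearest-neighbour probe: a simple nonlinear classifier on the same features."""
    nearest = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))[:k]
    votes = [y_train[i] for i in nearest]
    return max(set(votes), key=votes.count)


def accuracy(preds, y):
    return sum(p == t for p, t in zip(preds, y)) / len(y)


# Extreme low-data setting: only 5 labeled examples per class for training.
X_train, y_train = make_features(n_per_class=5, dim=8, n_classes=3, seed=0)
X_test, y_test = make_features(n_per_class=20, dim=8, n_classes=3, seed=1)

W, b = train_linear_probe(X_train, y_train, n_classes=3)
linear_acc = accuracy([predict_linear(W, b, x) for x in X_test], y_test)
knn_acc = accuracy([predict_knn(X_train, y_train, x) for x in X_test], y_test)
print(f"linear probe accuracy: {linear_acc:.2f}")
print(f"k-NN probe accuracy:   {knn_acc:.2f}")
```

In a real comparison, `make_features` would be replaced by embeddings extracted once from each frozen encoder, so that differences in probe accuracy reflect the pretraining objective rather than the probe or the backbone.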
Opening excerpt (first ~120 words)
arXiv:2605.15599 (cs), Computer Vision and Pattern Recognition. Submitted on 15 May 2026.
Authors: Alexander Hackett, Srikanth Thudumu, Ginny Fisher, Mahule Roy, Aisha Sartaj, Jason Fisher
Abstract: Extreme low-data fine-grained classification is common in expert domains where labeling is expensive, yet practitioners still need principled guidance for selecting pretrained encoders.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.