Meow-Omni 1: a multi-modal feline LLM
Meow-Omni 1 is a new multimodal large language model (MLLM) designed for understanding feline behavior. It integrates video, audio, and physiological data with textual reasoning to improve intent recognition. The authors report state-of-the-art intent-recognition accuracy and have released the model as open source, with the aim of improving veterinary diagnostics and wildlife conservation efforts.
- Meow-Omni 1 is the first open-source quad-modal MLLM specifically for computational ethology.
- It combines video, audio, and physiological time-series data with textual reasoning.
- The model achieved 71.16% intent-recognition accuracy on the MeowBench benchmark.
Opening excerpt (first ~120 words)
Computer Science > Computation and Language
arXiv:2605.09152 (cs) [Submitted on 9 May 2026]
Title: Meow-Omni 1: A Multimodal Large Language Model for Feline Ethology
Authors: Jucheng Hu, Zhangquan Chen, Yulin Chen, Chengjie Hong, Liang Zhou, Tairan Wang, Sifei Li, Giulio Zhu, Feng Zhou, Yiheng Zeng, Suorong Yang, Dongzhan Zhou
Abstract: Deciphering animal intent is a fundamental challenge in computational ethology, largely because of semantic aliasing, the phenomenon where identical external signals (e.g., a cat's purr) correspond to radically different internal states depending on physiological context.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv.org.