Chronicle: A Multimodal Foundation Model for Joint Language and Time Series Understanding
Chronicle is a multimodal foundation model for joint understanding of language and time series data. It is presented as the first model pretrained from scratch on both modalities within a single unified architecture. It reports strong performance on natural language understanding tasks and time series classification benchmarks.
- Chronicle is a 324M-parameter decoder-only transformer model.
- It shares transformer blocks and attention mechanisms between text and time series data.
- The model outperforms existing multimodal baselines in forecasting and classification tasks.
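To make the idea of shared transformer blocks concrete, here is a minimal NumPy sketch of a single causal self-attention layer whose weights are applied to text tokens and time series patches alike. It illustrates the general technique only and is not Chronicle's actual implementation. The sizes, the patch-based series embedding, the single-head attention, and all function names (`embed_text`, `embed_series`, `shared_causal_attention`, `forward`) are assumptions introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
D_MODEL = 16  # hypothetical hidden width
VOCAB = 50    # hypothetical text vocabulary size
PATCH = 4     # hypothetical number of time steps per patch

# Modality-specific input layers (an assumed design, not confirmed by the source).
text_embedding = rng.normal(0, 0.02, size=(VOCAB, D_MODEL))
patch_projection = rng.normal(0, 0.02, size=(PATCH, D_MODEL))

# One set of attention weights, shared by both modalities.
W_q, W_k, W_v, W_o = (
    rng.normal(0, 0.02, size=(D_MODEL, D_MODEL)) for _ in range(4)
)


def embed_text(token_ids):
    """Look up one embedding vector per text token id."""
    return text_embedding[np.asarray(token_ids)]


def embed_series(values):
    """Split a 1-D series into non-overlapping patches and project each one."""
    values = np.asarray(values, dtype=float)
    usable = len(values) // PATCH * PATCH  # drop any incomplete trailing patch
    patches = values[:usable].reshape(-1, PATCH)
    return patches @ patch_projection


def shared_causal_attention(x):
    """Single-head causal self-attention with a residual connection."""
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    scores = q @ k.T / np.sqrt(x.shape[-1])
    # Mask out future positions so each row attends only to itself and the past.
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return x + (weights @ v) @ W_o


def forward(token_ids, series):
    """Concatenate text and series embeddings, then run the shared layer."""
    x = np.concatenate([embed_text(token_ids), embed_series(series)], axis=0)
    return shared_causal_attention(x)


series = np.sin(np.linspace(0, 6, 8))
out = forward([3, 7, 11], series)
print(out.shape)  # (5, 16): 3 text tokens + 2 series patches
```

Only the two input layers differ by modality in this sketch. Everything after the concatenation uses the same weights, which is the property the bullet list describes.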
Opening excerpt (first ~120 words)
arXiv:2605.20268 (cs), Computer Science > Machine Learning
Submitted on 18 May 2026
Authors: Paul Quinlan, Jeremy Levasseur, Qingguo Li, Xiaodan Zhu
Abstract: Real-world time series come with text: metadata, descriptions, news, reports. Yet time series foundation models process numerical sequences in isolation, and the multimodal text-and-time-series models that attempt to bridge the two all adapt a pretrained language model post hoc, inheriting representations shaped without ever seeing temporal data.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.