Ideogram 4.0: A 9.3B open-weight image model

Jun 3, 2026 · 4:40 PM UTC · 38 min read

#technology #artificial intelligence #machine learning #Ideogram Team #Hugging Face #GitHub

⚡ TL;DR · AI summary

Ideogram 4.0 is a new open-weight text-to-image model with 9.3 billion parameters. It pairs a vision-language text encoder with a single-stream Diffusion Transformer (DiT), in which text and image tokens pass through one shared self-attention sequence. The model is trained on structured JSON captions rather than free-form text prompts.

Key facts

▪ Ideogram 4.0 is a 9.3B parameter open-weight text-to-image model.
▪ The model employs a vision-language text encoder and a single-stream Diffusion Transformer.
▪ It is trained exclusively on structured JSON captions with detailed descriptions of image elements.
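To make the idea of a structured JSON caption concrete, here is a minimal sketch in Python. The excerpt does not publish the actual caption schema, so every field name below (`summary`, `style`, `elements`, `background`) and the helper `to_prompt` are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical caption: these field names are illustrative assumptions,
# not the schema Ideogram 4.0 was actually trained on.
caption = {
    "summary": "A poster for a jazz festival",
    "style": "flat vector illustration",
    "elements": [
        {"type": "text", "content": "JAZZ NIGHT", "position": "top center"},
        {"type": "object", "description": "golden saxophone", "position": "center"},
    ],
    "background": "deep navy gradient",
}


def to_prompt(caption: dict) -> str:
    """Serialize a caption dict to a compact JSON string with sorted keys.

    Sorting keys and fixing separators makes the string deterministic,
    so the same caption always produces the same prompt text.
    """
    return json.dumps(
        caption, sort_keys=True, ensure_ascii=False, separators=(",", ":")
    )


prompt = to_prompt(caption)
print(prompt)
```

The point of the structure is that each image element (rendered text, objects, background) gets its own explicit description instead of being packed into one sentence. Consult the model's published documentation for the real field names before writing prompts.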

Original article

Ideogram

Opening excerpt (first ~120 words)

Technical · Model release · June 3, 2026

Ideogram 4.0 Technical Details: Open model at the forefront of design

Our first open-weight foundation model. A 9.3B single-stream Diffusion Transformer, trained from scratch, with a vision-language text encoder and structured JSON prompts.

Authors: Ideogram Team
Reading time: 5 min
Weights: Hugging Face
Code: GitHub

Overview

Ideogram 4.0 is a 9.3B parameter open-weight text-to-image model. Recent open-weight releases have converged on a single self-attention sequence over text and image tokens[1][2][3], and Ideogram 4.0 follows the same pattern: text and image tokens share the same projections at every layer of a 34-layer DiT. Two design choices distinguish it from peer releases.
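The single-stream pattern described in the overview can be sketched as one self-attention pass over the concatenated text and image tokens, with the same query, key, and value projections applied to both modalities. This is a toy illustration under stated assumptions, not Ideogram's implementation: it uses a single head, a width of 4, random weights, and no positional encoding, normalization, or timestep conditioning. The names `single_stream_attention`, `random_matrix`, and `DIM` are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def single_stream_attention(text_tokens, image_tokens, w_q, w_k, w_v):
    """Run one self-attention pass over the joint text+image sequence.

    The same w_q, w_k, and w_v are applied to every token regardless of
    modality, which is what "shared projections" means in this sketch.
    """
    sequence = text_tokens + image_tokens  # one joint sequence, text first
    q = matmul(sequence, w_q)
    k = matmul(sequence, w_k)
    v = matmul(sequence, w_v)
    scale = 1.0 / math.sqrt(len(w_q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose keys to (dim x seq)
    scores = matmul(q, k_t)
    weights = [softmax([s * scale for s in row]) for row in scores]
    return matmul(weights, v), weights


rng = random.Random(0)
DIM = 4  # toy width, chosen only to keep the example small


def random_matrix(rows, cols):
    """Build a rows x cols matrix of uniform random values in [-1, 1]."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


text_tokens = random_matrix(3, DIM)   # 3 toy text tokens
image_tokens = random_matrix(5, DIM)  # 5 toy image tokens
w_q, w_k, w_v = (random_matrix(DIM, DIM) for _ in range(3))

output, weights = single_stream_attention(text_tokens, image_tokens, w_q, w_k, w_v)
print(len(output), len(output[0]))  # sequence length and width of the result
```

Because every token attends over the full joint sequence, each image token can read from each text token and vice versa within the same layer. A dual-stream design would instead keep separate projection weights per modality.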

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at Ideogram.
