Google Introduces Gemini Omni, a Multimodal AI That Knows the World
Google has unveiled Gemini Omni, a multimodal AI capable of generating realistic videos from various inputs. This new tool allows users to edit videos in unprecedented ways while incorporating advanced physics capabilities. With built-in safeguards like the SynthID watermark, Google aims to navigate the complexities of AI-generated content responsibly.
- Gemini Omni can create videos from text, images, and existing videos, making it a versatile tool for content creation.
- The AI includes advanced editing features that allow users to modify videos by replacing individual elements.
- Omni will be available through the redesigned Gemini app, Google Flow, and YouTube Shorts for paid subscribers.
Opening excerpt (first ~120 words)
Google announced its latest AI product, Gemini Omni, during its I/O conference on Tuesday. Unlike existing text-to-video products such as Veo, Omni can take in virtually any input to create realistic, lifelike videos. Built on the Gemini modeling architecture, Omni is a true multimodal input and output system, allowing you to create videos from text, images and existing videos. At launch, you'll be able to create videos with the aforementioned inputs, but image and text generations will be available in a future update. With Gemini at its core, Omni can process and interpret multiple types of inputs to produce a consistent, sophisticated final product.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at CNET — News.