Seven models, one generator. Each Gemini Omni model handles a different creative task — from prompt-led video and reference-locked animation to motion transfer and AI image creation. Use the comparison table below to find the right model for your workflow, then start generating.
Explore each model page for full technical specs, use cases, and generation examples.
Text-to-video, image-to-video, multi-shot sequencing, optional audio, and 4K-capable renders with physics-accurate motion.
Reference-guided video with style preservation, character consistency, and visual identity lock across every frame.
Side-by-side specs for every Gemini Omni model. Credits shown are for a 5-second standard clip (720p 16:9, no audio) — actual cost scales with resolution, duration, and audio.
Not sure which model to pick? Match your creative task to the right Gemini Omni model.
Prompt-led video from text or image
Veo 3.1 is the most versatile video model — text-to-video, image-to-video, Draft Mode for fast iteration, and physics-accurate motion from a single prompt.
Reference-controlled video with style lock
Veo 3.1 preserves the visual identity of your reference image across every frame. Style, character, and composition stay locked — no drift.
Movement transfer from a reference video
Upload a dance, gesture, or camera-movement reference and transfer that motion to any still subject. Full-body capture at 1080p.
Generate a reference frame before video
Create the style frame, product concept, or character reference first, then feed it into Veo 3.1 for video generation.
Common questions about choosing between Gemini Omni models.