Video Generator
Create stunning videos from text descriptions or images — with synchronized audio, camera movement, and cinematic effects.
Best for
- Creating short video clips from text descriptions or images
- Product reveals and scene transitions with real camera movement
- Text-to-video with native audio, so no separate voice step is needed
- Cinematic shots that need real motion (not just pan/zoom on a still)
Tips
- Provide detailed text prompts describing the scene, motion, and desired audio
- Use a start image for consistent framing; the model animates from it
- Works best for product reveals, scene changes, and dramatic transformations
- For simple pan/zoom on stills, image-motion is 10x faster and cheaper
- Native audio is generated with the video, so no separate TTS step is needed
Video Generator — AI Video from Text or Images
Generate stunning video clips from text prompts or images. Native audio generation creates synchronized sound effects, dialogue, and ambient audio. AI automatically enhances your prompt for cinematic results.
How It Works
- Describe the scene: write what you want to see, including subject, action, camera movement, and style
- AI enhances your prompt: it adds cinematic details, camera techniques, and audio descriptions
- AI generates the video: 4-, 6-, or 8-second clips with native audio at up to 4K resolution
Prompt Tips
- Include camera angles: close-up, aerial, tracking shot, dolly in
- Describe lighting: golden hour, neon glow, film noir shadows, volumetric fog
- Set the style: cinematic, anime, watercolor, 35mm film look
- For audio: describe sounds in a separate sentence — "Rain pattering on concrete, distant thunder"
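The tips above can be sketched as a small prompt-assembly helper. This is only an illustration: `build_prompt` and its parameter names are hypothetical and not part of the Video Generator itself, which simply accepts a text prompt. The sketch joins the visual details into one sentence and puts the audio cue in a separate sentence, as the last tip suggests.

```python
def build_prompt(scene, camera=None, lighting=None, style=None, audio=None):
    """Compose a text prompt (hypothetical helper, not a product API).

    Visual details are joined into one sentence; the audio cue, if any,
    is appended as its own sentence.
    """
    visual_parts = [scene.strip().rstrip(".")]
    for detail in (camera, lighting, style):
        if detail:
            visual_parts.append(detail.strip())
    prompt = ", ".join(visual_parts) + "."
    if audio:
        prompt += " " + audio.strip().rstrip(".") + "."
    return prompt


prompt = build_prompt(
    "A ceramic mug rotates slowly on a wooden table",
    camera="close-up, dolly in",
    lighting="golden hour",
    style="35mm film look",
    audio="Rain pattering on concrete, distant thunder",
)
print(prompt)
```

The resulting string can be pasted into the prompt field as-is; the prompt enhancer may still add further detail.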
Three Quality Tiers
- Lite — fastest generation, great for drafts and iteration
- Fast — balanced speed and quality, supports 4K
- HD — highest quality, supports reference images and video extension
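One way to think about the tier choice is as a lookup from required features to the cheapest tier that lists them. The sketch below records only what this page states for each tier; the feature labels and the `pick_tier` function are made up for illustration. The page does not say whether HD also supports 4K, so the sketch does not assume it.

```python
# Features as listed on this page, ordered from fastest to highest quality.
# The label strings are illustrative, not product identifiers.
TIER_FEATURES = {
    "lite": frozenset(),
    "fast": frozenset({"4k"}),
    "hd": frozenset({"reference_images", "video_extension"}),
}


def pick_tier(required=()):
    """Return the first tier whose listed features cover `required`, else None."""
    required = frozenset(required)
    for name, features in TIER_FEATURES.items():
        if required <= features:
            return name
    return None


print(pick_tier())                      # drafts with no special needs
print(pick_tier(["4k"]))                # 4K output
print(pick_tier(["reference_images"]))  # subject consistency
```

A result of `None` means no single tier is documented here as covering that combination.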
Frequently Asked Questions
How long are generated videos?
4, 6, or 8 seconds per clip. Use the Video Reel pipeline to combine multiple clips into longer sequences.
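When planning a longer sequence, it can help to split the target length into clips of the allowed durations up front. The following is a minimal sketch under that assumption; `plan_clips` is a hypothetical helper and says nothing about how the Video Reel pipeline itself combines clips. It handles even totals of at least 4 seconds, since every such total can be written as a sum of 4, 6, and 8.

```python
ALLOWED_CLIP_SECONDS = (4, 6, 8)  # clip lengths stated in the answer above


def plan_clips(total_seconds):
    """Split an even total (>= 4 s) into clip lengths of 4, 6, or 8 seconds."""
    if total_seconds < 4 or total_seconds % 2 != 0:
        raise ValueError("total must be an even number of seconds, at least 4")
    full, remainder = divmod(total_seconds, 8)
    clips = [8] * full
    if remainder == 2:
        # A 2-second clip is not allowed, so trade one 8 for a 6 and a 4.
        clips[-1:] = [6, 4]
    elif remainder:
        clips.append(remainder)  # remainder is 4 or 6 here
    return clips


print(plan_clips(30))
print(plan_clips(18))
```

Using 8-second clips first keeps the number of clips, and therefore the number of generations, as low as possible.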
Does it generate audio?
Yes — AI generates synchronized audio natively. Describe desired sounds in your prompt for best results. You can disable audio generation if needed.
What aspect ratios are supported?
16:9 (landscape) and 9:16 (portrait/vertical), covering landscape formats such as YouTube and vertical social media formats.
How does the prompt enhancer work?
AI analyzes your prompt and adds cinematic details — camera techniques, lighting descriptions, and audio cues — for optimal results.
What's the difference between quality tiers?
Lite is the fastest and cheapest. Fast balances speed and quality and supports 4K. HD provides the highest fidelity and supports video extension and reference images for subject consistency.