Audio Generator
Turn text into natural speech with 30 voices, 73 languages, and custom style directions for tone, accent, and pacing.
Auto: AI picks the best voice based on your instructions and text
Or pick a specific voice:
- Zephyr: F · Bright, Enthusiastic
- Puck: M · Upbeat, Casual
- Charon: M · Deep, Calm
- Kore: F · Firm, Clear
- Fenrir: M · Warm, Friendly
- Leda: F · Youthful, Energetic
- Orus: M · Firm, Casual
- Aoede: F · Breezy, Professional
- Callirrhoe: F · Easy-going, Friendly
- Autonoe: F · Bright, Warm
- Enceladus: M · Breathy, Confident
- Iapetus: M · Clear, Professional
- Umbriel: M · Easy-going, Resonant
- Algieba: M · Smooth, Warm
- Despina: F · Smooth, Warm
- Erinome: F · Clear, Sophisticated
- Algenib: M · Gravelly, Calm
- Rasalgethi: M · Informative, Energetic
- Laomedeia: F · Upbeat, Approachable
- Achernar: F · Soft, Inviting
- Alnilam: M · Firm, Optimistic
- Schedar: M · Even, Casual
- Gacrux: F · Mature, Engaging
- Pulcherrima: M · Forward, Youthful
- Achird: M · Friendly, Articulate
- Zubenelgenubi: M · Casual, Deep
- Vindemiatrix: F · Gentle, Smooth
- Sadachbia: M · Lively, Resonant
- Sadaltager: M · Knowledgeable, Calm
- Sulafat: F · Warm, Enthusiastic
Best for
- Narration and voiceover in any of 73 supported languages
- Text-to-speech with natural language style control (e.g. 'speak warmly', 'whisper', 'excited tone')
- Multi-voice content with 30 distinct voices (Gemini TTS)
- Professional-quality voiceover with expressive delivery
Tips
- Write narration in short, clear sentences for better pacing
- Use punctuation to control pauses: periods for full stops, commas for brief pauses, ellipses for longer pauses
- AI auto-selects the best voice for your content, or you can specify a voice name
- Use natural language style hints: 'speak slowly and dramatically', 'cheerful and upbeat', 'calm and reassuring'
- Narration duration determines the minimum video length: write the script first, then match the video segment count to the narration length
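To make the last tip concrete, here is a minimal Python sketch of matching segment count to narration length. The speaking rate of 150 words per minute and the idea of fixed-length video segments are assumptions made for illustration, not figures from this page. Measure the generated audio for a real duration.

```python
import math


def estimate_narration_seconds(script: str, words_per_minute: float = 150.0) -> float:
    """Rough spoken duration of a script.

    The default rate is an assumed average, not a value published for this tool.
    """
    word_count = len(script.split())
    return word_count / words_per_minute * 60.0


def segments_needed(narration_seconds: float, segment_seconds: float) -> int:
    """Smallest number of fixed-length video segments that covers the narration."""
    if segment_seconds <= 0:
        raise ValueError("segment_seconds must be positive")
    # Round up so the video is never shorter than the narration; always plan at least one segment.
    return max(1, math.ceil(narration_seconds / segment_seconds))
```

For example, a 25-second narration with hypothetical 8-second segments needs four segments, because three would cover only 24 seconds.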
Audio Generator — AI Voice Synthesis
Convert any text into natural-sounding speech with 30 expressive voices and 73 languages. Control tone, accent, pacing, and emotion through natural language instructions, and the AI shapes the delivery to match.
How It Works
- Enter your text — paste or type the content you want spoken
- Choose a voice — 30 voices with different characters (bright, deep, warm, firm...)
- Add instructions — describe the style: "Warm British accent, slow pace with dramatic pauses"
- Generate — AI synthesizes expressive speech
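The first three steps can be pictured as assembling one small request before generation. The Python sketch below is only an illustration: the function name `build_request` and the field names `text`, `voice`, and `instructions` are hypothetical and do not describe this tool's actual API. The voice names are the 30 listed on this page.

```python
# Voice names as listed on this page; "Auto" lets the AI choose.
VOICES = {
    "Zephyr", "Puck", "Charon", "Kore", "Fenrir", "Leda", "Orus", "Aoede",
    "Callirrhoe", "Autonoe", "Enceladus", "Iapetus", "Umbriel", "Algieba",
    "Despina", "Erinome", "Algenib", "Rasalgethi", "Laomedeia", "Achernar",
    "Alnilam", "Schedar", "Gacrux", "Pulcherrima", "Achird", "Zubenelgenubi",
    "Vindemiatrix", "Sadachbia", "Sadaltager", "Sulafat",
}


def build_request(text: str, voice: str = "Auto", instructions: str = "") -> dict:
    """Assemble a hypothetical generation request (field names are assumptions)."""
    text = text.strip()
    if not text:
        raise ValueError("text must not be empty")
    if voice != "Auto" and voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    return {"text": text, "voice": voice, "instructions": instructions.strip()}


request = build_request(
    "Welcome to the show.",
    voice="Charon",
    instructions="Warm British accent, slow pace with dramatic pauses",
)
```

Validating the voice name up front catches typos such as "Zephir" before any credits are spent.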
Use Cases
- Documentary narration with authoritative, professional voices
- Podcast intros, outros, and dialogue
- Multilingual voiceovers for video content
- Audiobook narration with emotional delivery
- Accessibility — make text content available as audio
Frequently Asked Questions
What voices are available?
30 voices with different characters — from deep and authoritative (Charon) to bright and youthful (Zephyr). Each voice has a distinct personality that responds to style instructions.
What languages are supported?
73 languages, with automatic language detection. They range from major languages such as English, Chinese, Japanese, and Spanish to regional languages such as Cebuano, Konkani, and Luxembourgish.
How do instructions work?
Describe how you want the voice to sound using natural language. For example: 'A 40-year-old British journalist, speaking slowly with dramatic pauses on key points.' The AI interprets your directions and crafts the vocal performance.
What audio format is the output?
The output is an MP3 file, ready to download or use in other pipelines such as Talking Avatar or Video Reel.
How many credits does it cost?
0.5 credits (Fast) or 1 credit (HD). Results are ready in seconds.
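As a quick budgeting aid, the arithmetic can be sketched in Python. This assumes the quoted price applies once per generation, which the page does not state explicitly. The function name `total_credits` is hypothetical.

```python
# Rates quoted above; treating them as per-generation prices is an assumption.
CREDITS_PER_GENERATION = {"fast": 0.5, "hd": 1.0}


def total_credits(generations: int, quality: str = "fast") -> float:
    """Credits consumed by a number of generations at the given quality."""
    if generations < 0:
        raise ValueError("generations must be non-negative")
    key = quality.lower()
    if key not in CREDITS_PER_GENERATION:
        raise ValueError(f"unknown quality: {quality!r}")
    return generations * CREDITS_PER_GENERATION[key]
```

Under that assumption, ten Fast generations cost 5 credits and three HD generations cost 3 credits.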