Audio Generator

Turn text into natural speech with dozens of voices, 70+ languages, and custom style directions for tone, accent, and pacing.

Voice

Auto

AI picks the best voice based on your instructions and text

Or pick a specific voice...

Zephyr

F · Bright, Enthusiastic

Puck

M · Upbeat, Casual

Charon

M · Deep, Calm

Kore

F · Firm, Clear

Fenrir

M · Warm, Friendly

Leda

F · Youthful, Energetic

Orus

M · Firm, Casual

Aoede

F · Breezy, Professional

Callirrhoe

F · Easy-going, Friendly

Autonoe

F · Bright, Warm

Enceladus

M · Breathy, Confident

Iapetus

M · Clear, Professional

Umbriel

M · Easy-going, Resonant

Algieba

M · Smooth, Warm

Despina

F · Smooth, Warm

Erinome

F · Clear, Sophisticated

Algenib

M · Gravelly, Calm

Rasalgethi

M · Informative, Energetic

Laomedeia

F · Upbeat, Approachable

Achernar

F · Soft, Inviting

Alnilam

M · Firm, Optimistic

Schedar

M · Even, Casual

Gacrux

F · Mature, Engaging

Pulcherrima

M · Forward, Youthful

Achird

M · Friendly, Articulate

Zubenelgenubi

M · Casual, Deep

Vindemiatrix

F · Gentle, Smooth

Sadachbia

M · Lively, Resonant

Sadaltager

M · Knowledgeable, Calm

Sulafat

F · Warm, Enthusiastic

Voice Instructions (optional)

Best for

• Narration and voiceover in 70+ languages
• Text-to-speech with natural language style control (e.g. 'speak warmly', 'whisper', 'excited tone')
• Multi-voice content with dozens of expressive voices
• Multilingual scripts: the language is auto-detected from the text

Tips

✓ Write narration in short, clear sentences for better pacing
✓ Use punctuation to control pauses: periods for full stops, commas for brief pauses, ellipses for longer pauses
✓ Leave the voice on Auto to let the AI pick one that matches your style instructions, or pin a specific voice if you want a recurring character
✓ Use natural language style hints: 'speak slowly and dramatically', 'cheerful and upbeat', 'calm and reassuring'
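
The first tip can be checked mechanically before you paste a script in. Below is a minimal Python sketch, not part of the tool itself: the function names and the 20-word threshold are assumptions chosen for illustration, and the sentence splitter is deliberately naive (it breaks after '.', '!', '?' or '…' followed by whitespace).

```python
import re

# Assumed threshold for "short" sentences; not a documented limit of the tool.
MAX_WORDS = 20


def split_sentences(script: str) -> list[str]:
    """Naively split a script after '.', '!', '?' or '…' followed by whitespace."""
    parts = re.split(r"(?<=[.!?…])\s+", script.strip())
    return [p for p in parts if p]


def long_sentences(script: str, max_words: int = MAX_WORDS) -> list[str]:
    """Return the sentences whose word count exceeds max_words."""
    return [s for s in split_sentences(script) if len(s.split()) > max_words]


if __name__ == "__main__":
    draft = "Welcome back. Today we look at three ideas... and one surprise."
    for sentence in long_sentences(draft, max_words=5):
        print("Consider shortening:", sentence)
```

Sentences it flags are candidates for splitting with a period or a comma, which also gives the voice a natural place to pause.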

Frequently Asked Questions

What voices are available?

Dozens of expressive voices spanning bright, deep, warm, firm, and dramatic characters. Each responds to plain-language style instructions for tone and delivery.

What languages are supported?

70+ languages, detected automatically from your text. Coverage ranges from major languages such as English, Chinese, Japanese, and Spanish to regional languages including Cebuano, Konkani, and Luxembourgish.

How do instructions work?

Describe how you want the voice to sound in plain English. For example: 'A 40-year-old British journalist, speaking slowly with dramatic pauses on key points.' The AI interprets your direction and shapes the vocal performance to match.
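
If you reuse the same character across many clips, it can help to assemble the instruction from parts so the wording stays consistent. The following Python sketch is purely illustrative: `build_instruction` is a hypothetical helper, not an API of the tool, and the output is simply text to paste into the Voice Instructions field.

```python
def build_instruction(persona: str, *details: str) -> str:
    """Join a persona and optional delivery details into one instruction sentence.

    Hypothetical helper for illustration; the tool only sees the final string.
    """
    parts = [p.strip() for p in (persona, *details) if p and p.strip()]
    if not parts:
        return ""
    text = ", ".join(parts)
    return text if text.endswith(".") else text + "."


if __name__ == "__main__":
    print(
        build_instruction(
            "A 40-year-old British journalist",
            "speaking slowly with dramatic pauses on key points",
        )
    )
```

Keeping the persona fixed and varying only the delivery details is one way to keep a recurring character recognizable between takes.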

What audio format is the output?

MP3, ready to download or to chain into pipelines such as Video Reel as a narration track.

Which model should I pick?

Flash TTS is the cheaper, faster option: great for drafts, short clips, and iteration. Pro TTS uses a higher-fidelity model with stronger prosody and pacing, so pick it for production narration. Both render in seconds; Pro takes slightly longer.

How many credits does it cost?

0.6 credits for Flash TTS, 1.2 credits for Pro TTS. Results are ready in seconds.
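
To budget a project, the arithmetic can be sketched in a few lines of Python. This assumes the listed price is charged once per generation regardless of text length, which the page does not state explicitly; the function and dictionary names are made up for illustration.

```python
from decimal import Decimal

# Prices taken from the answer above; per-generation billing is an assumption.
CREDITS_PER_GENERATION = {
    "flash": Decimal("0.6"),
    "pro": Decimal("1.2"),
}


def estimate_credits(flash_runs: int = 0, pro_runs: int = 0) -> Decimal:
    """Estimate total credits for a mix of Flash TTS and Pro TTS generations."""
    if flash_runs < 0 or pro_runs < 0:
        raise ValueError("run counts must be non-negative")
    return (
        flash_runs * CREDITS_PER_GENERATION["flash"]
        + pro_runs * CREDITS_PER_GENERATION["pro"]
    )


if __name__ == "__main__":
    # Five Flash drafts followed by one Pro render of the final script.
    print(estimate_credits(flash_runs=5, pro_runs=1))
```

Under that assumption, iterating on Flash and rendering only the final take on Pro keeps the total well below running every draft on Pro.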