Voice studio
AI Audio & Text-to-Speech
AI Audio lets Maximo turn text into natural speech, from chat or voice, using Maximo Pandora Voice (Google Gemini text-to-speech). Use Voice Flash for fast, expressive, multilingual narration and Voice Pro for the highest-fidelity, most controllable delivery. Maximo can set the voice, language, emotion, pacing, and single- or multi-speaker layout for each request.
- Ask Maximo in chat or voice to read text aloud, narrate a script, voice a product update, record a greeting, or perform a two-person dialogue.
- Single-speaker uses one of 30 built-in voices (such as Kore, Puck, Charon, Zephyr, Aoede). Multi-speaker supports up to 2 speakers, each with its own voice and style, for natural conversations in one pass.
- Steer delivery with natural-language style prompts and inline tags like [whispers], [laughs], or [excited] placed directly in the text; speech can span 70+ languages.
- Maximo checks AI Audio credits, personal monthly cap, daily cap, model access, characters per request, File Manager count, and storage before it starts.
- Outputs are saved to Files as AI-generated audio (24 kHz WAV) and shown in chat as a player card with inline playback, download, open, and Files actions.
- Voice Flash is available on Plus, Pro, and Max. Voice Pro starts on Pro and is also available on Max.
- Free has no AI audio generation. Plus gets 60 shared business credits monthly, 30 personal monthly credits, 6 daily credits, 5,000 characters per request, up to 2 speakers, and Voice Flash only.
- Pro gets 400 shared business credits monthly, 120 personal monthly credits, 25 daily credits, 15,000 characters per request, up to 2 speakers, and both Voice Flash and Voice Pro.
- Max gets 1,200 shared business credits monthly, 320 personal monthly credits, 60 daily credits, 30,000 characters per request, up to 2 speakers, and both Voice Flash and Voice Pro.
- Credit cost scales with text length: 1 credit covers about 1,000 characters (~1 minute of speech), rounded up; Voice Pro costs twice as many credits as Voice Flash.
- Access to audio generation follows the signed-in user's role. Owners, administrators, and members can use the tools when the plan allows it; viewers and employee viewers cannot run audio jobs.
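
To make the credit rule and plan limits above concrete, here is a minimal Python sketch of how a request's cost could be estimated and checked. It is illustrative only and not Maximo's actual implementation. The names `PlanLimits`, `PLANS`, `estimate_credits`, and `check_request` are hypothetical, and two details are assumptions: that rounding up happens before the Voice Pro 2x multiplier is applied, and that the daily cap is compared against the credits already used that day.

```python
import math
from dataclasses import dataclass

# "1 credit is about 1,000 characters", taken from the list above.
CHARS_PER_CREDIT = 1000


@dataclass(frozen=True)
class PlanLimits:
    """Hypothetical container for the per-plan figures listed above."""
    shared_monthly: int
    personal_monthly: int
    daily: int
    max_chars: int
    models: tuple


# Figures copied from the plan bullets; Free is absent because it has no AI audio.
PLANS = {
    "plus": PlanLimits(60, 30, 6, 5_000, ("flash",)),
    "pro": PlanLimits(400, 120, 25, 15_000, ("flash", "pro")),
    "max": PlanLimits(1_200, 320, 60, 30_000, ("flash", "pro")),
}


def estimate_credits(text: str, model: str = "flash") -> int:
    """Estimate credits: round characters up to whole credits, then double for Voice Pro.

    Assumption: rounding is applied before the 2x multiplier.
    """
    base = math.ceil(len(text) / CHARS_PER_CREDIT)
    return base * 2 if model == "pro" else base


def check_request(plan: str, text: str, model: str = "flash", daily_used: int = 0) -> list:
    """Return a list of problems found; an empty list means the request looks allowed."""
    limits = PLANS.get(plan)
    if limits is None:
        return ["plan has no AI audio generation"]
    problems = []
    if model not in limits.models:
        problems.append("model not available on this plan")
    if len(text) > limits.max_chars:
        problems.append("text exceeds characters per request")
    # Assumption: the daily cap is checked against credits already used today.
    if daily_used + estimate_credits(text, model) > limits.daily:
        problems.append("request would exceed the daily credit cap")
    return problems
```

Under these assumptions, a 2,500-character script rounds up to 3 credits on Voice Flash and costs 6 credits on Voice Pro. The real pre-flight check also covers shared and personal monthly credits, File Manager count, and storage, which this sketch leaves out.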

