Google expands Gemini 2.5 Text-to-Speech with enhanced expressivity, context-aware pacing, and multilingual support


Google has announced updates to its Gemini 2.5 Flash and Gemini 2.5 Pro Text-to-Speech (TTS) preview models. The improvements include a wider range of tones, context-aware pacing, and better handling of multi-speaker dialogue. These models will replace the TTS models released in May.

Key Updates
  • Enhanced Expressivity: The models now follow style prompts more accurately and provide a wider range of tones, from cheerful to serious, suitable for various roles such as virtual assistants, narrators, or game characters.
  • Precision Pacing: Gemini 2.5 TTS can adjust speech speed based on context. It slows down for emphasis, accelerates for excitement, and follows explicit pacing instructions more accurately.
  • Seamless Multi-Speaker Dialogue: For applications such as podcasts, interviews, or multi-character narratives, the models maintain consistent character voices and handle transitions between speakers smoothly.

Gemini 2.5 Flash and Pro Models
  • Flash TTS: Optimized for low latency.
  • Pro TTS: Optimized for high-quality output.

These models support applications requiring granular control over style, tone, pace, and accents, including audiobooks, e-learning, product tutorials, marketing videos, and creator content. They also maintain multilingual capabilities across 24 supported languages, preserving individual character tones and styles.
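
The style, pace, and multi-speaker controls described above are driven by natural-language prompts. The following is a minimal sketch of how such a prompt might be assembled; the `Line` class, the `build_tts_prompt` helper, and the exact wording of the instructions are hypothetical illustrations, not a format prescribed by Google. The prompting guide remains the authority on phrasing that works well.

```python
from dataclasses import dataclass


@dataclass
class Line:
    speaker: str  # label for the voice, e.g. "Host" (hypothetical)
    text: str     # what that speaker says


def build_tts_prompt(style: str, pacing: str, lines: list[Line]) -> str:
    """Combine a style note, a pacing note, and a speaker-labelled transcript.

    The layout (instruction header, blank line, "Speaker: text" rows) is an
    assumption made for illustration, not an official prompt schema.
    """
    # Collect speaker names in order of first appearance.
    speakers: list[str] = []
    for line in lines:
        if line.speaker not in speakers:
            speakers.append(line.speaker)

    header = (
        f"Read the following conversation between {' and '.join(speakers)}. "
        f"Style: {style}. Pacing: {pacing}."
    )
    transcript = "\n".join(f"{line.speaker}: {line.text}" for line in lines)
    return f"{header}\n\n{transcript}"


prompt = build_tts_prompt(
    style="warm and upbeat, like a podcast intro",
    pacing="slow down on product names, speed up slightly on the sign-off",
    lines=[
        Line("Host", "Welcome back to the show."),
        Line("Guest", "Thanks, it's great to be here."),
        Line("Host", "Let's get started."),
    ],
)
print(prompt)
```

Keeping the style and pacing notes separate from the transcript makes it easy to reuse the same dialogue with different deliveries, for example a serious read versus a cheerful one.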

Demo Applications
  • Synergy Intro: Demonstrates the models’ expressive tone and style versatility.
  • Voices from History: Highlights multi-speaker and multilingual performance.

Availability

Gemini 2.5 Flash and Pro TTS models are accessible through the Gemini API in Google AI Studio. Developers can refer to the developer documentation, prompting guide, or Gemini API Cookbook to explore features and integrate the models into their applications.
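
As a rough sketch of what an integration might involve, the snippet below builds a JSON request body for a single-voice TTS call and wraps returned audio in a WAV container. Everything specific here is an assumption to be checked against the developer documentation: the model IDs, the JSON field names, the `Kore` voice name, and the 24 kHz 16-bit mono PCM output format. No network call is made; a fake response stands in for the API.

```python
import base64
import io
import json
import wave

# Assumed model identifiers; confirm the current names in the documentation.
FLASH_TTS = "gemini-2.5-flash-preview-tts"  # low latency
PRO_TTS = "gemini-2.5-pro-preview-tts"      # higher quality


def build_request(text: str, voice_name: str = "Kore") -> dict:
    """Build a request body asking for audio output (assumed field names)."""
    return {
        "contents": [{"parts": [{"text": text}]}],
        "generationConfig": {
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": voice_name}}
            },
        },
    }


def extract_pcm(response: dict) -> bytes:
    """Decode base64 audio from the first part of the first candidate."""
    part = response["candidates"][0]["content"]["parts"][0]
    return base64.b64decode(part["inlineData"]["data"])


def pcm_to_wav(pcm: bytes, sample_rate: int = 24000) -> bytes:
    """Wrap raw 16-bit mono PCM in a WAV header (sample rate is an assumption)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 16-bit samples
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)
    return buf.getvalue()


body = build_request("Say cheerfully: Have a wonderful day!")
print(json.dumps(body, indent=2))

# Stand-in for an API response: 240 samples of silence.
fake_pcm = b"\x00\x00" * 240
fake_response = {
    "candidates": [
        {"content": {"parts": [
            {"inlineData": {"data": base64.b64encode(fake_pcm).decode("ascii")}}
        ]}}
    ]
}
wav_bytes = pcm_to_wav(extract_pcm(fake_response))
```

In a real application, `body` would be sent to the Gemini API with an API key, using either `FLASH_TTS` or `PRO_TTS` depending on whether latency or output quality matters more, and `wav_bytes` would be written to a file or streamed to the client.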

Ivan Solovyev, Product Manager at Google DeepMind, said that developers can start vibe coding apps in Google AI Studio and explore the Gemini 2.5 TTS model capabilities today in the Playground.