Thursday, December 11, 2025

Google Enhances Gemini TTS with 24 Languages and Realistic Voice Options

Google has updated its Gemini 2.5 Flash and Gemini 2.5 Pro Text-to-Speech (TTS) models, which are accessible via the Gemini API in Google AI Studio. The models target applications that need nuanced vocal delivery, such as audiobooks, e-learning, and podcasts. The main improvements are greater voice expressivity, better context-aware pacing, and multi-speaker support across 24 languages. Gemini 2.5 Flash prioritizes low latency for interactive uses, while the Pro variant prioritizes high-fidelity audio. Developers get finer control over pacing, tone, and character identity, and the models follow stylistic cues more closely, which makes them better suited to cinematic voiceovers and precisely directed dialogue. The release strengthens Google’s position in generative voice technology as demand grows for realistic, customizable speech synthesis. The updated models are available globally in Google AI Studio.
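For readers who want a sense of what a multi-speaker request with style cues might look like, the following is a minimal sketch that only builds a request body and does not call any service. The model identifiers, the voice names (Kore, Puck), and the JSON field names are assumptions based on the public Gemini API documentation at the time of writing and should be checked against the current reference; `build_tts_request` and `choose_model` are hypothetical helper names, not part of any Google library.

```python
import json

# Assumed model identifiers; verify against the current Gemini model list.
FLASH_TTS_MODEL = "gemini-2.5-flash-preview-tts"
PRO_TTS_MODEL = "gemini-2.5-pro-preview-tts"


def choose_model(interactive):
    """Pick Flash for low-latency interactive use, Pro for higher fidelity."""
    return FLASH_TTS_MODEL if interactive else PRO_TTS_MODEL


def build_tts_request(script, speaker_voices, style_note=""):
    """Build a JSON-serializable body for a multi-speaker TTS request.

    speaker_voices maps each speaker label used in the script to a voice
    name. The field names below are an assumption about the REST schema.
    """
    if not speaker_voices:
        raise ValueError("at least one speaker is required")
    # Style cues (pacing, tone) are given in natural language ahead of the script.
    prompt = f"{style_note}\n{script}" if style_note else script
    speaker_configs = [
        {
            "speaker": name,
            "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": voice}},
        }
        for name, voice in speaker_voices.items()
    ]
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "multiSpeakerVoiceConfig": {
                    "speakerVoiceConfigs": speaker_configs
                }
            },
        },
    }


if __name__ == "__main__":
    body = build_tts_request(
        "Ana: Welcome back to the show.\nBen: Glad to be here.",
        {"Ana": "Kore", "Ben": "Puck"},  # assumed voice names
        style_note="Read as a relaxed podcast intro, with Ben slightly faster.",
    )
    print(choose_model(interactive=True))
    print(json.dumps(body, indent=2))
```

The body would then be sent to the model's content-generation endpoint with an API key; that step is left out here because the endpoint details and audio response format are not covered in the article.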
