Thursday, October 30, 2025

Soul App’s Open-Source Approach Infuses AI Podcasts with Human-Like Authenticity

Soul AI Lab has announced the open-source release of SoulX-Podcast, a speech generation model built for multi-speaker, multi-turn podcast dialogue. The model supports multiple languages and dialects, including Mandarin, English, Sichuanese, and Cantonese, and can produce natural, fluent dialogues longer than 60 minutes. It performs zero-shot voice cloning, reproducing a speaker’s timbre and adapting prosody to the conversation so that exchanges sound more authentic. It also offers controllable paralinguistic cues such as laughter and breaths.

Designed to address limitations in existing speech synthesis systems, SoulX-Podcast provides broad dialect coverage and also performs strongly in single-speaker synthesis. The model uses an “LLM + Flow Matching” architecture to model semantic and acoustic features, and Soul AI Lab reports top-tier results on podcast generation benchmarks. The lab says it aims to create an immersive, emotionally resonant interaction experience and plans to work with the open-source community to extend the model’s capabilities.
