NVIDIA Unveils Open Dataset and Models for Multilingual Speech AI Development

August 15, 2025

NVIDIA is addressing limited AI language support with its Granary dataset and two new models, which improve speech recognition and translation for 25 European languages, including Croatian, Estonian, and Maltese. Granary is an open-source corpus of approximately one million hours of audio, intended for applications such as multilingual chatbots and real-time translation services. Its processing pipeline uses NVIDIA’s NeMo Speech Data Processor to convert unlabeled audio into structured data without labor-intensive manual annotation, which lowers the barrier to supporting underrepresented languages. The two new models, Canary-1b-v2 and Parakeet-tdt-0.6b-v3, offer fast, high-quality transcription; Canary-1b-v2 also expands language support and delivers speed and accuracy comparable to larger models. For developers, these resources make it easier to build and scale speech AI applications. The Granary dataset and the models are available on Hugging Face and GitHub.
