NVIDIA Pushes for True Multilingual AI

AI may feel ubiquitous, but it works fluently in only a fraction of the world’s roughly 7,000 languages. That leaves millions of people unable to use the AI-powered tools others take for granted. NVIDIA is looking to change that with a new initiative focused on expanding speech AI across Europe.

Bridging the Language Gap

NVIDIA has released a set of open-source tools for building high-quality speech AI in 25 European languages. The set covers widely spoken languages such as German and Spanish and, more importantly, extends to languages that are often overlooked, such as Croatian, Estonian, and Maltese.

The mission is clear: give developers the resources to build multilingual chatbots, faster translation services, and voice-powered applications that actually understand the user, regardless of where they live.

Granary: A Massive Speech Library

At the core of this initiative is Granary, a collection of around one million hours of human speech audio. It’s designed to capture the subtleties of language and improve both transcription and translation accuracy.

To put Granary to use, NVIDIA also released two new AI models:

  • Canary-1b-v2 – built for precision, handling complex transcription and translation tasks with top-tier accuracy.
  • Parakeet-tdt-0.6b-v3 – optimized for real-time applications where speed matters most.

Developers can already access both models and the dataset on Hugging Face.

Smarter Data, Smarter AI

Traditionally, building AI-ready datasets requires extensive human labeling, a process that is time-consuming and expensive. NVIDIA’s team, working with researchers from Carnegie Mellon University and Fondazione Bruno Kessler, addressed this with an automated data pipeline powered by the company’s NeMo toolkit.

This approach allowed them to turn raw audio into structured, high-quality training data quickly. The results are impressive: reaching the same accuracy takes about half as much Granary data as it does with other popular datasets.

Performance That Matters

The models showcase the payoff:

  • Canary delivers transcription and translation quality comparable to models three times its size while running up to ten times faster.
  • Parakeet can process a full 24-minute meeting recording seamlessly, detecting languages automatically and generating polished transcripts with punctuation, capitalization, and word-level timestamps.
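
To illustrate why word-level timestamps are useful, here is a minimal Python sketch that groups timestamped words into subtitle captions in the common SRT format. It does not call Parakeet or NeMo. The `Word` structure, the function names, and the `max_gap` and `max_words` thresholds are all hypothetical, chosen only for illustration, and a real model's output would need to be mapped into this shape first.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """One transcribed word with timing (hypothetical structure, not a NeMo type)."""
    text: str
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def format_srt_time(seconds: float) -> str:
    """Render seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def group_words(words: List[Word], max_gap: float = 0.6,
                max_words: int = 8) -> List[List[Word]]:
    """Start a new caption after a long pause or when the caption is full."""
    captions: List[List[Word]] = []
    current: List[Word] = []
    for word in words:
        long_pause = bool(current) and word.start - current[-1].end > max_gap
        if current and (long_pause or len(current) >= max_words):
            captions.append(current)
            current = []
        current.append(word)
    if current:
        captions.append(current)
    return captions


def to_srt(words: List[Word], max_gap: float = 0.6, max_words: int = 8) -> str:
    """Build SRT text: a numbered block per caption, separated by blank lines."""
    blocks = []
    for index, caption in enumerate(group_words(words, max_gap, max_words), start=1):
        start = format_srt_time(caption[0].start)
        end = format_srt_time(caption[-1].end)
        text = " ".join(w.text for w in caption)
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = [
        Word("Hello", 0.0, 0.4),
        Word("world.", 0.5, 0.9),
        Word("Next", 2.0, 2.3),
    ]
    print(to_srt(sample))
```

Splitting on pauses is only one possible heuristic. Because the transcripts are described as including punctuation, a caption could instead be closed at sentence-ending punctuation.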

These features make it possible to create professional-grade AI tools without the usual heavy resource demands.

Building a More Inclusive AI Future

By sharing these resources openly, NVIDIA isn’t just strengthening its AI ecosystem—it’s taking a big step toward digital inclusivity. Developers across Europe can now build AI tools that finally speak their local languages, enabling a wider range of applications from customer service bots to real-time translation.

The ultimate vision? A world where AI can understand you no matter where you’re from—or what language you speak.

Source: https://www.artificialintelligence-news.com/news/nvidia-aims-solve-ai-issues-with-many-languages/
