Press Network of India

Gnani AI launches Prisma v2.5, ranked #1 in 8 of 9 Indian languages on real-world and noisy ASR benchmarks


Bengaluru: Gnani AI, the frontier voice AI company, has announced the launch of its latest speech-to-text model, Prisma v2.5, which the company says delivers the most accurate speech recognition across India’s linguistic spectrum. Gnani AI reports 15% lower word error rates (WER) for rural Hindi dialects and 18% lower WER in noisy Dravidian-language environments compared with global and Indic model providers such as ElevenLabs Scribe v2, Deepgram Nova-3, and Sarvam Saaras v3, and says Prisma v2.5 sets a new benchmark for Indic speech recognition.

Gnani Prisma v2.5 ranks first in 8 out of 9 Indian languages on both real-world and acoustically noisy independent benchmarks, including Gramvaani, which the company describes as the only benchmark that captures how semi-urban and rural Indians speak. The company calls this a significant milestone towards pan-India coverage for millions of Indians with sovereign AI models as part of the IndiaAI mission, and towards reducing enterprises' dependence on non-Indian providers. Gnani Prisma v2.5 is trained on 14 million hours of proprietary Indic speech, 14 times more than the training data publicly disclosed for any competing model. The corpus spans 12 languages, with real dialect variation, ambient noise, and natural code-switching built into the training distribution.

“Consider a single sentence: mereko loan chahiye, thees karod rupaye ka. Kab tak milega,” said Ananth Nagaraj, Co-founder and CTO, Gnani AI. “A model that mishears ‘thees’ as ‘theen’ produces a 10% WER on that utterance. In a loan origination call, that single error misrepresents the amount by thirty crore rupees. At 30 million calls per day, error rates are not a quality metric. They are a huge business risk.”
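The 10% figure in the quote follows from the standard definition of word error rate: the number of word-level substitutions, deletions, and insertions divided by the number of words in the reference transcript. The sketch below is an illustration only, not Gnani AI's evaluation code. The function name `word_error_rate` and the simple lowercase-and-strip-punctuation normalisation are assumptions made for this example; real benchmarks typically apply more careful text normalisation.

```python
import re


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count."""
    # Assumed normalisation: lowercase and keep only word characters.
    ref = re.findall(r"\w+", reference.lower())
    hyp = re.findall(r"\w+", hypothesis.lower())
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


reference = "mereko loan chahiye, thees karod rupaye ka. Kab tak milega"
hypothesis = "mereko loan chahiye, theen karod rupaye ka. Kab tak milega"

# One substituted word out of ten reference words.
print(f"WER: {word_error_rate(reference, hypothesis):.0%}")
```

The utterance has ten words, so a single substitution yields a WER of 1/10. The metric weights every word equally, which is why a small percentage can still hide an error in the one word that carries the loan amount.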

Prisma v2.5 closes the transcription gaps that matter most in BFSI, insurance, healthcare, and several other industries: short utterances, numerals, alphanumerics, named entities, and domain-specific vocabulary. These are the categories where errors carry direct downstream consequences in compliance workflows, CRM logging, and agent-assist applications.

“Most speech models we evaluated were trained on clean audio. Our calls come in over telephony lines, from Tier 2 and Tier 3 cities, in accents no benchmark dataset captures. Gnani Prisma v2.5 was the only model where out-of-the-box accuracy matched what we needed in production,” shared Akshay Singhal, Senior Vice President, WeRize.

“CODEC handling for GSM and VoIP is native. Code-switching across Hindi-English, Tamil-English, and regional-English pairs works at the word level without language tagging. Architectural improvements through post-training optimization have also doubled throughput over the previous version without accuracy loss,” added Bharath Shankar, Co-founder and Chief Product and Engineering Officer, Gnani AI.

“Every enterprise voice AI deployment in India eventually runs into the same wall: the model was not trained for how Indians actually speak. Accents, noise, code-switching, compressed telephony audio, these are not edge cases in India; they are the norm. Prisma v2.5 is the first model India has built where the training data and the real world are the same thing,” concluded Ganesh Gopalan, Co-founder and CEO, Gnani AI.
