Sarvam AI Debuts 'OpenHathi', India's First Hindi Large Language Model, a Week After Raising $41 Million
In Brief
Sarvam AI's OpenHathi is a significant step in building large language models for Hindi, and aims to establish open resources that spur AI innovation for Indian languages.

Indian generative AI startup Sarvam AI has released OpenHathi-Hi-v0.1, part of its OpenHathi series of Hindi language models. The release comes just one week after the company raised $41 million in a funding round led by Lightspeed Ventures.
Hindi is the most widely spoken language in India, with approximately 43% of citizens identifying it as their first language. The model is built on Meta AI's Llama2-7B architecture and, according to the company, performs comparably to GPT-3.5 on Indic languages.
The company stated on social media platform X (formerly known as Twitter) that their model's performance on Hindi tasks is equal to or surpasses that of GPT-3.5 while still delivering strong results in English.
Highlighting its versatility, the model demonstrates competitive results across various tasks in Hindi while upholding standards in English. In addition to conventional NLG tasks, it has undergone assessments of numerous practical, real-world scenarios.
— Sarvam AI (@SarvamAI) December 12, 2023
Sarvam AI emphasizes that the OpenHathi series aims to advance open model and dataset development, fueling innovation in Indian-language AI. Collaboration with academic partners at AI4Bharat has been instrumental, as they contributed language resources and evaluation benchmarks for the initiative.
AI4Bharat operates out of the Indian Institute of Technology (IIT) Madras, a prestigious public technical university, and works on building open-source datasets, tools, models, and applications for Indian languages.
OpenHathi extends Llama's tokenizer to a 48,000-token vocabulary and is trained in two phases. The first phase aligns the randomly initialized Hindi embeddings. The second phase uses bilingual language modeling so that the model learns cross-lingual relationships between tokens.
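The vocabulary extension described above can be illustrated with a toy sketch. This is not Sarvam AI's training code: the function names, the tiny sizes, the Gaussian initialization, and the idea of freezing the original rows during the first phase are all assumptions made for illustration. As background that is not stated in the article, Llama 2's tokenizer has a 32,000-entry vocabulary, so a 48,000-token vocabulary would imply roughly 16,000 added tokens.

```python
import random

# Toy sizes, chosen only to keep the example readable (assumptions).
BASE_VOCAB = 8       # stand-in for the base tokenizer's vocabulary size
EXTENDED_VOCAB = 12  # stand-in for the extended 48,000-token vocabulary
EMBED_DIM = 4        # stand-in for the model's embedding width


def extend_embeddings(base_rows, extended_vocab, seed=0, scale=0.02):
    """Return a copy of base_rows with randomly initialized rows appended.

    The existing rows are copied unchanged; each new token gets a small
    Gaussian-initialized vector (the initialization scheme is an assumption).
    """
    num_new = extended_vocab - len(base_rows)
    if num_new < 0:
        raise ValueError("extended vocabulary is smaller than the base one")
    rng = random.Random(seed)
    dim = len(base_rows[0])
    new_rows = [[rng.gauss(0.0, scale) for _ in range(dim)]
                for _ in range(num_new)]
    return [list(row) for row in base_rows] + new_rows


def trainable_rows(base_vocab, extended_vocab, phase):
    """Return one boolean per embedding row: may this row be updated?

    Hypothetical schedule: phase 1 updates only the newly added rows so
    they can align with the existing embedding space; phase 2 updates all.
    """
    if phase == 1:
        return [i >= base_vocab for i in range(extended_vocab)]
    return [True] * extended_vocab


base = [[0.1] * EMBED_DIM for _ in range(BASE_VOCAB)]
extended = extend_embeddings(base, EXTENDED_VOCAB)
phase_one_mask = trainable_rows(BASE_VOCAB, EXTENDED_VOCAB, phase=1)
print(len(extended), phase_one_mask)
```

In a real model the same idea would apply to the input embedding matrix and the output projection, and the second phase would train on mixed Hindi and English text rather than toggling a mask.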
Full-Stack Generative AI Platform to Be Launched Soon
The company encourages a culture of creativity, urging users to innovate and enhance this new offering. Developers are invited to tailor specialized models for various applications using the OpenHathi-Hi-v0.1 model as a foundation.
Additionally, Sarvam AI is preparing to roll out models geared towards business applications on its generative AI platform, which it expects to launch soon. During its Series A fundraising, the company disclosed plans to build a comprehensive 'full-stack' offering for generative AI, ranging from research-driven advances in custom AI model training to a robust platform for content creation and deployment.
This holistic approach is expected to significantly accelerate the penetration of generative AI technology in India, especially since many businesses recognize its potential yet struggle to integrate it effectively.
Sarvam AI was founded in July 2023 by Vivek Raghavan and Pratyush Kumar, both of whom have backgrounds with AI4Bharat, and has support from notable figures such as Infosys co-founder Nandan Nilekani.
In a related development, India recently saw the introduction of BharatGPT, a large language model created in partnership with CoRover.ai. It aims to be a homegrown alternative in the generative AI landscape, supporting more than a dozen Indian languages across video, voice, and text.
Kumar is a seasoned tech journalist who specializes in the rapidly evolving intersections of AI/ML, marketing tech, and emerging sectors like blockchain and NFTs. With over three years of experience, Kumar has built a solid reputation for crafting engaging narratives, conducting impactful interviews, and delivering in-depth analysis. His work includes articles and research papers for leading industry platforms. By blending technical insight with storytelling, Kumar conveys intricate technological ideas to a broad audience in an accessible way.