NANDA: The AI Model Bringing Hindi Into the Digital Age - All You Have To Know
In a significant stride toward linguistic inclusivity in artificial intelligence, UAE-based AI firm G42 has unveiled NANDA, a large language model (LLM) designed specifically for Hindi-speaking users. The model has 13 billion parameters and was trained on 2.13 trillion tokens, including an extensive collection of Hindi-language datasets, and it aims to transform the way AI interacts with one of the world’s most widely spoken languages. The project was announced at the India-UAE business forum in Mumbai during the state visit to India of Sheikh Khaled bin Mohammed bin Zayed Al Nahyan, Crown Prince of Abu Dhabi.
Named after Nanda Devi, one of India’s highest peaks, the model is the product of a collaboration between G42’s subsidiary Inception, Cerebras Systems, and researchers from the Mohamed bin Zayed University of Artificial Intelligence in the UAE. The project is a testament to the growing importance of AI in shaping the future of regional languages and bringing them into the digital fold.
The Need for NANDA: Bridging the Hindi AI Gap
India, with over 600 million Hindi speakers, represents a vast, largely untapped market for language-specific AI tools. While AI and language models have made significant progress in languages like English, Chinese, and Arabic, Hindi has remained underrepresented in the digital and AI landscape. NANDA aims to change that by catering specifically to the needs of Hindi-speaking users.
The importance of this model extends far beyond language. It signals a new chapter in AI development, one where technology becomes more inclusive, equitable, and culturally aware. As AI continues to shape how we live and work, models like NANDA ensure that large sections of the global population, particularly in non-English-speaking regions, are not left behind.
Training NANDA: A Technological Feat
NANDA’s development is a notable technical undertaking. The model was trained on Condor Galaxy, one of the world’s most powerful AI supercomputers, built for both training and inference. Developed by G42 and Cerebras Systems, the system is designed to process very large datasets at high speed, which makes it well suited to training large language models like NANDA.
With 13 billion parameters, NANDA is designed to understand and generate nuanced Hindi text and to handle a wide range of language tasks, from conversational AI and content generation to translation. Dr. Andrew Jackson, acting CEO of Inception, highlighted the significance of NANDA’s development, stating that it “heralds a new era of AI inclusivity, ensuring that the rich heritage and depth of Hindi language is represented in the digital and AI landscape.”
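To put the headline figures in perspective, the short Python sketch below works through two back-of-the-envelope numbers: how many training tokens NANDA saw per parameter, and roughly how much memory its weights alone would occupy. The parameter and token counts come from the announcement; the 2-bytes-per-parameter figure (16-bit weights) is an assumption made purely for illustration, since the storage precision has not been stated, and the function names are hypothetical.

```python
# Rough arithmetic on the figures quoted for NANDA.
PARAMETERS = 13_000_000_000           # 13 billion parameters (from the announcement)
TRAINING_TOKENS = 2_130_000_000_000   # 2.13 trillion tokens (from the announcement)

# ASSUMPTION: weights stored in a 16-bit format, i.e. 2 bytes per parameter.
# The actual precision used for NANDA is not stated in the article.
BYTES_PER_PARAMETER = 2


def tokens_per_parameter(tokens: int, parameters: int) -> float:
    """Return the number of training tokens seen per model parameter."""
    return tokens / parameters


def weight_memory_gb(parameters: int, bytes_per_parameter: int) -> float:
    """Return the approximate size of the raw weights in gigabytes (10^9 bytes)."""
    return parameters * bytes_per_parameter / 1e9


ratio = tokens_per_parameter(TRAINING_TOKENS, PARAMETERS)
memory = weight_memory_gb(PARAMETERS, BYTES_PER_PARAMETER)

print(f"Training tokens per parameter: {ratio:.1f}")
print(f"Approximate weight size at 16-bit precision: {memory:.1f} GB")
```

Under these assumptions the model saw roughly 164 training tokens per parameter, and its weights would take up about 26 GB before counting any memory needed to actually run it.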
NANDA’s Role in AI Inclusivity
The launch of NANDA reflects G42’s commitment to creating AI models that serve underrepresented languages and cultures. This move mirrors the company’s earlier success with JAIS, the world’s first open-source Arabic LLM, which launched in August 2023. With NANDA, G42 seeks to replicate this success in the Indian market, ensuring that Hindi speakers have access to AI technologies that cater specifically to their language and cultural nuances.
Language models like NANDA play a critical role in making AI more inclusive and accessible. They not only break down language barriers but also help preserve linguistic diversity by incorporating regional languages into the digital age. For a language as rich and complex as Hindi, this could mean a significant transformation in how native speakers interact with technology.
A Broader AI Vision: G42 and Microsoft’s Collaboration
NANDA’s development is part of a broader vision by G42 to position itself as a leader in the AI space, with a focus on creating language- and domain-specific LLMs. The firm’s commitment to AI excellence is further underscored by its collaboration with Microsoft, which recently invested $1.5 billion in G42. This partnership has already borne fruit, with G42 rolling out two upgraded versions of its Med42 LLM earlier this year. Med42, an AI assistant designed for the healthcare sector, is used by clinicians, healthcare administrators, and insurers to improve efficiency and decision-making in medical settings.
The introduction of NANDA reflects G42’s strategy of addressing the linguistic gaps in AI by focusing on underrepresented languages and regions. By launching this model in the Indian market, G42 is setting a new standard for AI-driven language solutions, highlighting the importance of local languages in the global AI landscape.
The Future of Hindi AI
With the imminent launch of NANDA, India is set to enter a new era of AI-powered language tools. The model’s potential applications are vast, ranging from improving customer service and content generation to advancing education and research. As more organizations adopt NANDA, the benefits of a language-specific AI tool will become evident in multiple sectors, including government services, healthcare, e-commerce, and education.
Moreover, NANDA’s ability to process and generate Hindi language content will help bridge the digital divide in India, where a significant portion of the population remains digitally excluded due to language barriers. With AI tools that understand and generate native-language content, users will be able to access information, communicate, and engage with digital platforms more effectively.
NANDA is not just another AI model—it represents a critical step toward AI inclusivity and linguistic diversity. As G42 and its partners continue to push the boundaries of what AI can achieve, the launch of this Hindi-language model demonstrates how technology can be tailored to serve specific linguistic and cultural needs. With its focus on inclusivity, NANDA is poised to become a game-changer in the Indian AI landscape, helping to ensure that Hindi-speaking users are not left behind in the AI revolution.
