Indian AI research firm Soket AI Labs has partnered with Google Cloud to enhance the capabilities of its open-source multilingual foundation model, Pragna-1B.
Pragna-1B, launched on May 1, 2024, is designed to support Indian languages including Hindi, Bengali, and Gujarati, alongside English.
The partnership aims to make AI more accessible and efficient in India.
Pragna-1B's features and performance
Pragna-1B has been developed with a focus on Indian languages and contexts. The model, which uses a decoder-only Transformer architecture with 1.25 billion parameters and a context length of 2048 tokens, aims to provide state-of-the-art performance despite having fewer parameters than similar models.
It has been pre-trained on approximately 150 billion tokens covering multiple languages, ensuring robust support and balanced language representation.
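To put the reported figures in proportion, the short Python sketch below derives two rough ratios from them: pre-training tokens per parameter, and the number of full-length sequences the corpus would fill. Only the three constants come from the article. The helper names are illustrative, and the sequence count assumes every training sequence is packed to the full 2048-token context, which the article does not state.

```python
# Figures reported in the article.
PARAMETERS = 1.25e9         # model size (1.25 billion parameters)
PRETRAINING_TOKENS = 150e9  # approximate pre-training corpus size
CONTEXT_LENGTH = 2048       # maximum tokens per sequence


def tokens_per_parameter(tokens: float, parameters: float) -> float:
    """Ratio of pre-training tokens to model parameters."""
    return tokens / parameters


def full_context_sequences(tokens: float, context_length: int) -> int:
    """Number of sequences if every one were packed to the full context.

    Assumption for illustration only: real training data is rarely packed
    this uniformly.
    """
    return int(tokens // context_length)


ratio = tokens_per_parameter(PRETRAINING_TOKENS, PARAMETERS)
sequences = full_context_sequences(PRETRAINING_TOKENS, CONTEXT_LENGTH)

print(f"Tokens per parameter: {ratio:.0f}")
print(f"Full-context sequences: {sequences:,}")
```

On these numbers the model saw roughly 120 tokens per parameter, or about 73 million full-length sequences under the packing assumption.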
The model's tokenizer is reported to outperform other tokenizers on Indian languages such as Kannada, Gujarati, Tamil, and Urdu.
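The article does not say how the tokenizer was evaluated. One common way to compare tokenizers is "fertility", the average number of tokens produced per word, where lower values mean text fits into fewer tokens. The Python sketch below shows the calculation under that assumption. The two toy tokenizers and the sample sentence are hypothetical stand-ins for illustration and are not Pragna-1B's tokenizer or its evaluation data.

```python
from typing import Callable, Sequence


def fertility(tokenize: Callable[[str], Sequence], text: str) -> float:
    """Average number of tokens per whitespace-separated word."""
    words = text.split()
    if not words:
        raise ValueError("text must contain at least one word")
    return len(tokenize(text)) / len(words)


def word_tokenizer(text: str) -> list:
    """Toy tokenizer: one token per whitespace-separated word."""
    return text.split()


def utf8_byte_tokenizer(text: str) -> list:
    """Toy tokenizer: one token per UTF-8 byte.

    Devanagari characters take three bytes each in UTF-8, so byte-level
    tokenization is costly for Hindi text.
    """
    return list(text.encode("utf-8"))


sample = "नमस्ते दुनिया"  # illustrative Hindi sample ("hello world")

word_score = fertility(word_tokenizer, sample)
byte_score = fertility(utf8_byte_tokenizer, sample)

print(f"word-level fertility: {word_score:.2f}")
print(f"byte-level fertility: {byte_score:.2f}")
```

A tokenizer whose vocabulary covers Indian scripts well would land much closer to the word-level figure than to the byte-level one, which in turn lets more text fit within a fixed context length.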
Technical and marketplace integration
The collaboration between Soket AI Labs and Google Cloud extends beyond model development to include technical enhancements and marketplace integration. The AI Developer Platform by Soket AI Labs, along with the Pragna series models, will be listed on the Google Cloud Marketplace and the Google Vertex AI model registry. This integration will provide developers with streamlined access to high-performance resources such as Vertex AI and TPUs, enabling more efficient model fine-tuning and scaling of AI projects.
Contributions and innovations in AI development
The partnership also encompasses foundational work on training large-scale models and curating high-quality datasets for Indian languages.
Soket AI Labs has created 'Bhasha', a series of high-quality datasets, including 'Bhasha-wiki' with over 44.1 million articles in six Indian languages, and 'Bhasha-wiki-indic', focusing on content relevant to India.
These efforts aim to promote AI innovation while maintaining transparency and cost-effectiveness by using Google Cloud’s AI infrastructure.
Leadership comments
Abhishek Upperwal, founder of Soket AI Labs, highlighted the significance of this collaboration, stating, “By leveraging Google Cloud, Pragna-1B, despite being trained on fewer parameters, is efficient and compares performance in language processing tasks to similar category models.”
Bikram Singh Bedi, Vice President and Country Managing Director at Google Cloud India, expressed enthusiasm about the partnership, noting, “We are thrilled to partner with Soket AI Labs to democratize AI innovation in India. Built on Google Cloud, the launch of Pragna-1B marks a pioneering leap in Indian language technology, offering enhanced scalability and efficiency for organizations.”