Llama-3.1-Sherkala-8B-Chat: An Open Large Language Model for Kazakh

3 March 2025
Fajri Koto
Rituraj Joshi
Nurdaulet Mukhituly
Yuxia Wang
Zhuohan Xie
Rahul Pal
Daniil Orel
Parvez Mullah
Diana Turmakhan
Maiya Goloburda
Mohammed Kamran
Samujjwal Ghosh
Bokang Jia
Jonibek Mansurov
Mukhammed Togmanov
Debopriyo Banerjee
Nurkhan Laiyk
Akhmed Sakip
Xudong Han
Ekaterina Kochmar
Alham Fikri Aji
Aaryamonvikram Singh
Alok Anil Jadhav
Satheesh Katipomu
Samta Kamboj
Monojit Choudhury
Gurpreet Gosal
Gokul Ramakrishnan
Biswajit Mishra
Sarath Chandran
Avraham Sheinin
Natalia Vassilieva
Neha Sengupta
Larry Murray
Preslav Nakov
Abstract

Llama-3.1-Sherkala-8B-Chat, or Sherkala-Chat (8B) for short, is a state-of-the-art instruction-tuned open generative large language model (LLM) designed for Kazakh. Sherkala-Chat (8B) aims to enhance the inclusivity of LLM advancements for Kazakh speakers. Adapted from the LLaMA-3.1-8B model, Sherkala-Chat (8B) is trained on 45.3B tokens across Kazakh, English, Russian, and Turkish. With 8 billion parameters, it demonstrates strong knowledge and reasoning abilities in Kazakh, significantly outperforming existing open Kazakh and multilingual models of similar scale while achieving competitive performance in English. We release Sherkala-Chat (8B) as an open-weight instruction-tuned model and provide a detailed overview of its training, fine-tuning, safety alignment, and evaluation, aiming to advance research and support diverse real-world applications.

@article{koto2025_2503.01493,
  title={Llama-3.1-Sherkala-8B-Chat: An Open Large Language Model for Kazakh},
  author={Fajri Koto and Rituraj Joshi and Nurdaulet Mukhituly and Yuxia Wang and Zhuohan Xie and Rahul Pal and Daniil Orel and Parvez Mullah and Diana Turmakhan and Maiya Goloburda and Mohammed Kamran and Samujjwal Ghosh and Bokang Jia and Jonibek Mansurov and Mukhammed Togmanov and Debopriyo Banerjee and Nurkhan Laiyk and Akhmed Sakip and Xudong Han and Ekaterina Kochmar and Alham Fikri Aji and Aaryamonvikram Singh and Alok Anil Jadhav and Satheesh Katipomu and Samta Kamboj and Monojit Choudhury and Gurpreet Gosal and Gokul Ramakrishnan and Biswajit Mishra and Sarath Chandran and Avraham Sheinin and Natalia Vassilieva and Neha Sengupta and Larry Murray and Preslav Nakov},
  journal={arXiv preprint arXiv:2503.01493},
  year={2025}
}