We introduce xGen-small, a family of 4B and 9B Transformer decoder models optimized for long-context applications. Our vertically integrated pipeline unites domain-balanced, frequency-aware data curation; multi-stage pre-training with quality annealing and length extension to 128k tokens; and targeted post-training via supervised fine-tuning, preference learning, and online reinforcement learning. xGen-small delivers strong performance across a broad range of tasks, with particular strength in math and coding, while excelling on long-context benchmarks.
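For illustration, the sketch below shows how a released xGen-small checkpoint could be loaded and queried through the standard Hugging Face transformers API; the repository id, prompt, and generation settings are assumptions for the example, not details specified in the report.

```python
# Minimal usage sketch with the Hugging Face transformers API.
# The checkpoint id below is an assumption chosen for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Salesforce/xgen-small-9B-instruct-r"  # hypothetical repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision helps fit long contexts in memory
    device_map="auto",
)

# The 128k-token context window targets long-context workloads such as
# repository-level code or multi-document question answering.
prompt = "Summarize the key ideas of multi-stage pre-training."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```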
@article{nijkamp2025_2505.06496,
  title   = {xGen-small Technical Report},
  author  = {Erik Nijkamp and Bo Pang and Egor Pakhomov and Akash Gokul and Jin Qu and Silvio Savarese and Yingbo Zhou and Caiming Xiong},
  journal = {arXiv preprint arXiv:2505.06496},
  year    = {2025}
}