Qwen2 Technical Report

15 July 2024 · arXiv:2407.10671

An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jianxin Yang, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, Kai Dang, Keming Lu, Keqin Chen, Kexin Yang, Mei Li, Mingfeng Xue, Na Ni, Pei Zhang, Peng Wang, Ru Peng, Rui Men, Ruize Gao, Runji Lin, Shijie Wang, Shuai Bai, Sinan Tan, Tianhang Zhu, Tianhao Li, Tianyu Liu, Wenbin Ge, Xiaodong Deng, Xiaohuan Zhou, Xingzhang Ren, Xinyu Zhang, Xipin Wei, Xuancheng Ren, Xuejing Liu, Yang Fan, Yang Yao, Yichang Zhang, Yu Wan, Yunfei Chu, Yuqiong Liu, Zeyu Cui, Zhenru Zhang, Zhifang Guo, Zhihao Fan

Communities: OSLM · VLM · MU

Abstract

This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models spanning 0.5 to 72 billion parameters, comprising dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, and performs competitively with proprietary models across diverse benchmarks covering language understanding, generation, multilingual proficiency, coding, mathematics, and reasoning. The flagship model, Qwen2-72B, performs strongly as a base language model: 84.2 on MMLU, 37.9 on GPQA, 64.6 on HumanEval, 89.5 on GSM8K, and 82.4 on BBH. The instruction-tuned variant, Qwen2-72B-Instruct, attains 9.1 on MT-Bench, 48.1 on Arena-Hard, and 35.7 on LiveCodeBench. Qwen2 also demonstrates robust multilingual capabilities, proficient in approximately 30 languages, including English, Chinese, Spanish, French, German, Arabic, Russian, Korean, Japanese, Thai, Vietnamese, and more, underscoring its versatility and global reach. To foster community innovation and accessibility, we have made the Qwen2 model weights openly available on Hugging Face and ModelScope, along with supplementary materials, including example code, on GitHub. These platforms also host resources for quantization, fine-tuning, and deployment, facilitating a wide range of applications and research endeavors.
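
Since the weights are distributed through Hugging Face, the sketch below shows one way to load an instruction-tuned Qwen2 checkpoint with the transformers library and run a single chat turn. It is a minimal illustration rather than code from the report: the checkpoint size, prompt, and generation settings are arbitrary choices, and any released size (0.5B to 72B) follows the same pattern.

    # Minimal sketch: load a Qwen2 instruction-tuned checkpoint from Hugging Face
    # and run one chat turn. Requires `pip install transformers torch` and
    # network access to download the weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "Qwen/Qwen2-7B-Instruct"  # illustrative choice among the released checkpoints

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype="auto",  # use the dtype stored in the checkpoint
        device_map="auto",   # place layers across available devices automatically
    )

    # Build the prompt with the model's own chat template.
    messages = [{"role": "user", "content": "Summarize the Qwen2 model family in one sentence."}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    # Generate, then decode only the newly produced tokens.
    output_ids = model.generate(input_ids, max_new_tokens=128)
    print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))

For the quantized use the abstract mentions, the same from_pretrained call also accepts a quantization_config argument (for example, a 4-bit BitsAndBytesConfig); that is a general feature of the transformers library rather than anything specific to Qwen2.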

View on arXiv