Kimi K2.5: Visual Agentic Intelligence

Kimi Team
Tongtong Bai
Yifan Bai
Yiping Bao
S.H. Cai
Yuan Cao
Y. Charles
H.S. Che
Cheng Chen
Guanduo Chen
Huarong Chen
Jia Chen
Jiahao Chen
Jianlong Chen
Jun Chen
Kefan Chen
Liang Chen
Ruijue Chen
Xinhao Chen
Yanru Chen
Yanxu Chen
Yicun Chen
Yimin Chen
Yingjiang Chen
Yuankun Chen
Yujie Chen
Yutian Chen
Zhirong Chen
Ziwei Chen
Dazhi Cheng
Minghan Chu
Jialei Cui
Jiaqi Deng
Muxi Diao
Hao Ding
Mengfan Dong
Mengnan Dong
Yuxin Dong
Yuhao Dong
Angang Du
Chenzhuang Du
Dikang Du
Lingxiao Du
Yulun Du
Yu Fan
Shengjun Fang
Qiulin Feng
Yichen Feng
Garimugai Fu
Kelin Fu
Hongcheng Gao
Tong Gao
Yuyao Ge
Shangyi Geng
Chengyang Gong
Xiaochen Gong
Zhuoma Gongque
Qizheng Gu
Xinran Gu
Yicheng Gu
Longyu Guan
Yuanying Guo
Xiaoru Hao
Weiran He
Wenyang He
Yunjia He
Chao Hong
Hao Hu
Jiaxi Hu
Yangyang Hu
Zhenxing Hu
Ke Huang
Ruiyuan Huang
Weixiao Huang
Zhiqi Huang
Tao Jiang
Zhejun Jiang
Xinyi Jin
Yu Jing
Guokun Lai
Aidi Li
C. Li
Cheng Li
Fang Li
Guanghe Li
Guanyu Li
Haitao Li
Haoyang Li
Jia Li
Jingwei Li
Junxiong Li
Lincan Li
Mo Li
Weihong Li
Wentao Li
Xinhang Li
Xinhao Li
Yang Li
Yanhao Li
Yiwei Li

Abstract

We introduce Kimi K2.5, an open-source multimodal agentic model designed to advance general agentic intelligence. K2.5 emphasizes the joint optimization of text and vision so that the two modalities enhance each other. This is achieved through a series of techniques, including joint text-vision pre-training, zero-vision SFT, and joint text-vision reinforcement learning. Building on this multimodal foundation, K2.5 introduces Agent Swarm, a self-directed parallel agent orchestration framework that dynamically decomposes complex tasks into heterogeneous sub-problems and executes them concurrently. Extensive evaluations show that Kimi K2.5 achieves state-of-the-art results across various domains, including coding, vision, reasoning, and agentic tasks. Agent Swarm also reduces latency by up to 4.5× over single-agent baselines. We release the post-trained Kimi K2.5 model checkpoint to facilitate future research and real-world applications of agentic intelligence.
