Apple Intelligence Foundation Language Models
Tom Gunter, Zirui Wang, Chong-Jun Wang, Ruoming Pang, Andy Narayanan, Aonan Zhang, Bowen Zhang, Chen Chen, Chung-Cheng Chiu, David Qiu, Deepak Gopinath, Dian Ang Yap, Dong Yin, Feng Nan, Floris Weers, Guoli Yin, Haoshuo Huang, Jianyu Wang, Jiarui Lu, John Peebles, Kewei Ye, Mark Lee, Nan Du, Qibin Chen, Quentin Keunebroek, Sam Wiseman, Syd Evans, Tao Lei, Vivek Rathod, Xiang Kong, Xianzhi Du, Yanghao Li, Yongqiang Wang, Yuan Gao, Zaid Ahmed, Zhaoyang Xu, Zhiyun Lu, Al Rashid, Albin Madappally Jose, Alec Doane, Alfredo Bencomo, Allison Vanderby, Andrew Hansen, Ankur Jain, A. Anupama, Areeba Kamal, Bugu Wu, Carolina Brum, Charlie Maalouf, Chinguun Erdenebileg, Chris Dulhanty, Dominik Moritz, Doug Kang, Eduardo Jimenez, Evan Ladd, Fang Shi, Felix Bai, Frank Chu, Fred Hohman, Hadas Kotek, Hannah Gillis Coleman, Jane Li, Jeffrey P. Bigham, Jeffery Cao, Jeff Lai, Jessica Cheung, Jiulong Shan, Joe Zhou, John Li, Jun Qin, Karanjeet Singh, Karla Vega, Kelvin Zou, Laura Heckman, Lauren Gardiner, Margit Bowler, Maria Cordell, Meng Cao, Nicole Hay, Nilesh Shahdadpuri, Otto Godwin, Pranay Dighe, Pushyami Rachapudi, Ramsey Tantawi, Roman Frigg, Sam Davarnia, Sanskruti Shah, Saptarshi Guha, Sasha Sirovica, Shen Ma, Shuang Ma, Simon Wang, Sulgi Kim, Suma Jayaram, Vaishaal Shankar, Varsha Paidi, Vivek Kumar, Xin Eric Wang, Xin Zheng, Walker Cheng, Y. Shrager, Yang Ye, Yasu Tanaka, Yihao Guo, Yunsong Meng, Zhaoping Luo, Ouyang Zhi, Alp Aygar, Alvin Wan, Andrew D. Walkingshaw, Andy Narayanan, Tzu-Hsiang Lin, Arsalan Farooq, Brent Ramerth, Colorado Reed, Chris Bartels, Chris Chaney, David Riazati, Eric Liang Yang, Erin Feldman, Gabriel Hochstrasser, Guillaume Seguin, Irina Belousova, J. Pelemans, Karen Yang, Keivan Alizadeh Vahid, Liangliang Cao, Mahyar Najibi, Marco Zuliani, Max Horton, Minsik Cho, Nikhil Bhendawade, Patrick Dong, Piotr Maj, Pulkit Agrawal, Qi Shan, Qichen Fu, R. Poston, Sam Xu, Shuangning Liu, Sushma Rao, Tashweena Heeramun, Thomas Merth, Uday Rayala, Victor Cui, Vivek Rangarajan Sridhar, Wencong Zhang, Wenqi Zhang, Wentao Wu, Xingyu Zhou, Xinwen Liu, Yang Zhao, Yin Xia, Zhile Ren, Zhongzheng Ren

Abstract
We present foundation language models developed to power Apple Intelligence features, including a ~3 billion parameter model that runs efficiently on device and a large server-based language model designed for Private Cloud Compute. These models are built to perform a wide range of tasks efficiently, accurately, and responsibly. This report describes the model architectures, the data used to train the models, the training process, how the models are optimized for inference, and the evaluation results. We highlight our focus on Responsible AI and how its principles are applied throughout model development.