Title |
---|
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization Mohammad Samragh Iman Mirzadeh Keivan Alizadeh Vahid Fartash Faghri Minsik Cho Moin Nabi Devang Naik Mehrdad Farajtabar |
52B to 1T: Lessons Learned via Tele-FLM Series Xiang Li Yiqun Yao Xin Jiang Xuezhi Fang Chao Wang ...Yequan Wang Zhongjiang He Zhongyuan Wang Xuelong Li Tiejun Huang |
Tele-FLM Technical Report Xiang Li Yiqun Yao Xin Jiang Xuezhi Fang Chao Wang ...Yequan Wang Zhongjiang He Zhongyuan Wang Xuelong Li Tiejun Huang |