Title |
---|
![]() MoEfication: Transformer Feed-forward Layers are Mixtures of Experts Zhengyan Zhang Yankai Lin Zhiyuan Liu Peng Li Maosong Sun Jie Zhou |
![]() Pre-Trained Models: Past, Present and Future Xu Han Zhengyan Zhang Ning Ding Yuxian Gu Xiao Liu ...Jie Tang Ji-Rong Wen Jinhui Yuan Wayne Xin Zhao Jun Zhu |
![]() Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model Jiangning Zhang Chao Xu Jian Li Wenzhou Chen Yabiao Wang Ying Tai Shuo Chen Chengjie Wang Feiyue Huang Yong Liu |