Title |
---|
![]() State Space Model for New-Generation Network Alternative to
Transformers: A Survey Xiao Wang Shiao Wang Yuhe Ding Yuehang Li Wentao Wu ...Bowei Jiang Chenglong Li Yaowei Wang Yonghong Tian Jin Tang |
![]() Distilling Step-by-Step! Outperforming Larger Language Models with Less
Training Data and Smaller Model Sizes Lokesh Nagalapatti Chun-Liang Li Chih-Kuan Yeh Hootan Nakhost Yasuhisa Fujii Alexander Ratner Ranjay Krishna Chen-Yu Lee Tomas Pfister |