RecycleGPT: An Autoregressive Language Model with Recyclable Module
7 August 2023 (arXiv:2308.03421)
Yu Jiang, Qiaozhi He, Xiaomin Zhuang, Zhihua Wu, Kunpeng Wang, Wenlai Zhao, Guangwen Yang
KELM

Papers citing "RecycleGPT: An Autoregressive Language Model with Recyclable Module"

8 papers
Think Big, Generate Quick: LLM-to-SLM for Fast Autoregressive Decoding
Benjamin Bergner, Andrii Skliar, Amelie Royer, Tijmen Blankevoort, Yuki Markus Asano, B. Bejnordi
26 Feb 2024
ProPD: Dynamic Token Tree Pruning and Generation for LLM Parallel Decoding
Shuzhang Zhong, Zebin Yang, Meng Li, Ruihao Gong, Runsheng Wang, Ru Huang
21 Feb 2024
SiDA-MoE: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models
Zhixu Du, Shiyu Li, Yuhao Wu, Xiangyu Jiang, Jingwei Sun, Qilin Zheng, Yongkai Wu, Ang Li, Hai Helen Li, Yiran Chen
MoE
29 Oct 2023
Lossless Acceleration for Seq2seq Generation with Aggressive Decoding
Tao Ge, Heming Xia, Xin Sun, Si-Qing Chen, Furu Wei
20 May 2022
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy
AIMat
31 Dec 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier
MoE
12 Mar 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
MoE
17 Sep 2019
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma, Z. Yao, A. Gholami, Michael W. Mahoney, Kurt Keutzer
MQ
12 Sep 2019