
| Title |
|---|
| ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models. International Conference on Learning Representations (ICLR), 2023 |
| Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 |